☆1 Notes on plotting the learning curve
When plotting the learning curve of error_train and error_val against a growing number of training examples,
theta must be re-trained inside every loop iteration on the first i examples, and that theta is then used to compute J_train and J_cv. Note that J_train is evaluated over the i-example subset (its m is i), whereas J_cv is always evaluated over the entire cross-validation set.
In other words, as the number of training examples grows, the trained theta keeps changing, and the cross-validation set measures how well the theta learned from the current i examples generalizes, so the whole validation set is used to compute error_val. error_train, by contrast, is the error on just the i examples actually used for training.
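The loop described above can be sketched in NumPy. This is only an illustration: it assumes an unregularized least-squares fit (np.linalg.lstsq) in place of the exercise's trainLinearReg, and the helper names cost and learning_curve are mine, not the course's.

```python
import numpy as np

def cost(X, y, theta):
    """Unregularized squared-error cost J = 1/(2m) * sum((X@theta - y)^2)."""
    m = len(y)
    return np.sum((X @ theta - y) ** 2) / (2 * m)

def learning_curve(X, y, Xval, yval):
    m = X.shape[0]
    error_train, error_val = np.zeros(m), np.zeros(m)
    for i in range(1, m + 1):
        Xi, yi = X[:i], y[:i]                            # first i training examples only
        theta, *_ = np.linalg.lstsq(Xi, yi, rcond=None)  # re-train on the subset
        error_train[i - 1] = cost(Xi, yi, theta)         # error on the i-example subset
        error_val[i - 1] = cost(Xval, yval, theta)       # error on the FULL validation set
    return error_train, error_val
```

The key point the sketch makes concrete: theta is re-fit on X[:i] each iteration, but error_val is always computed over all of Xval.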
☆2 Notes on computing the error
Set λ = 0 when computing error_train and error_val: regularization is used only while training theta, not when measuring the error itself.
The code:
function [J, grad] = linearRegCostFunction(X, y, theta, lambda)
  m = length(y);               % number of training examples

  % You need to return the following variables correctly
  J = 0;
  grad = zeros(size(theta));   % n*1

  % ====================== YOUR CODE HERE ======================
  h = X * theta;               % theta = n*1, X = m*n, h = m*1
  J = 1/(2*m) * sum((h - y).^2) + lambda/(2*m) * sum(theta(2:end).^2);
  grad(2:end) = 1/m * (X' * (h - y))(2:end) + lambda/m * theta(2:end);
  grad(1)     = 1/m * (X' * (h - y))(1);
  % =========================================================================

  grad = grad(:);
end
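For reference, the same cost and gradient can be written in NumPy. This is a sketch under my own naming (linear_reg_cost is not a course function); the essential detail it mirrors is that theta[0], the bias, is excluded from regularization in both J and the gradient.

```python
import numpy as np

def linear_reg_cost(X, y, theta, lam):
    """Regularized linear-regression cost and gradient.

    X: (m, n) design matrix whose first column is all ones;
    theta: (n,) parameters; theta[0] (the bias) is not regularized."""
    m = len(y)
    h = X @ theta                                   # predictions, shape (m,)
    J = np.sum((h - y) ** 2) / (2 * m) \
        + lam / (2 * m) * np.sum(theta[1:] ** 2)    # penalty skips theta[0]
    grad = X.T @ (h - y) / m
    grad[1:] += lam / m * theta[1:]                 # regularize all but the bias
    return J, grad
```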
function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
  m = size(X, 1);

  % You need to return these values correctly
  error_train = zeros(m, 1);
  error_val   = zeros(m, 1);

  % ====================== YOUR CODE HERE ======================
  for i = 1:m
    sample_x = X(1:i, :);
    sample_y = y(1:i);
    theta = trainLinearReg(sample_x, sample_y, lambda);
    % Both errors are evaluated with lambda = 0
    [J, grad] = linearRegCostFunction(sample_x, sample_y, theta, 0);
    error_train(i) = J;
    [J, grad] = linearRegCostFunction(Xval, yval, theta, 0);
    error_val(i) = J;
  end
end
function [X_poly] = polyFeatures(X, p)
  X_poly = zeros(numel(X), p);

  % ====================== YOUR CODE HERE ======================
  for i = 1:p
    X_poly(:, i) = X.^i;
  end
  % =========================================================================
end
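The same feature mapping is a one-liner in NumPy (poly_features is my name for it, not the course's):

```python
import numpy as np

def poly_features(x, p):
    """Map a 1-D vector x to the columns [x, x^2, ..., x^p]."""
    x = np.asarray(x).ravel()
    return np.column_stack([x ** i for i in range(1, p + 1)])
```

Because high powers of x vary wildly in scale, these columns should be mean-normalized before training, as the exercise does.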
function [lambda_vec, error_train, error_val] = ...
    validationCurve(X, y, Xval, yval)
  lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';

  % You need to return these variables correctly.
  error_train = zeros(length(lambda_vec), 1);
  error_val   = zeros(length(lambda_vec), 1);

  % ====================== YOUR CODE HERE ======================
  for i = 1:length(lambda_vec)
    lambda = lambda_vec(i);
    theta = trainLinearReg(X, y, lambda);
    % Errors are again computed with lambda = 0
    [error_train(i), grad] = linearRegCostFunction(X, y, theta, 0);
    [error_val(i), grad]   = linearRegCostFunction(Xval, yval, theta, 0);
  end
end
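The λ-sweep can be sketched in NumPy as follows. This is an assumption-laden illustration: it uses a closed-form regularized normal equation (train_linear_reg, my helper) instead of the exercise's iterative trainLinearReg, but the structure is the same — train with each λ, then report both errors with λ = 0.

```python
import numpy as np

def train_linear_reg(X, y, lam):
    """Closed-form regularized fit; the bias (column 0) is not penalized."""
    n = X.shape[1]
    L = lam * np.eye(n)
    L[0, 0] = 0                                   # do not regularize the bias term
    return np.linalg.solve(X.T @ X + L, X.T @ y)

def validation_curve(X, y, Xval, yval, lambdas):
    m, mv = len(y), len(yval)
    err_tr, err_val = [], []
    for lam in lambdas:
        theta = train_linear_reg(X, y, lam)       # train WITH regularization
        # ...but report both errors with lambda = 0 (no penalty term)
        err_tr.append(np.sum((X @ theta - y) ** 2) / (2 * m))
        err_val.append(np.sum((Xval @ theta - yval) ** 2) / (2 * mv))
    return np.array(err_tr), np.array(err_val)
```

Plotting err_tr and err_val against lambdas then shows the usual trade-off: training error rises with λ while validation error dips at the best-generalizing λ.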