Let the training set for linear regression be
$$T = [X \mid Y]$$
where $X \in \mathbb{R}^{m \times p}$, $Y \in \mathbb{R}^{m \times 1}$, $m$ is the number of training samples, and $p$ is the number of features per sample.
For the linear regression, let the hypothesis for the $i$-th sample be
$$h_i(X) = [1, x_1^{(i)}, x_2^{(i)}, \cdots, x_p^{(i)}] \cdot [\theta_0, \theta_1, \theta_2, \cdots, \theta_p]^T = \mathbf{X}^{(i)} \boldsymbol{\theta}$$
where $\mathbf{X}^{(i)}$ and $\boldsymbol{\theta}$ are both $(p+1)$-dimensional vectors; the leading $1$ corresponds to the intercept term $\theta_0$.
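In the formulas that follow, the $m$ augmented row vectors $\mathbf{X}^{(i)}$ are stacked into an $m \times (p+1)$ design matrix, and the symbol $X$ is reused for it (the code below builds it as [ones(m, 1) X_norm]):
$$X = \begin{bmatrix} 1 & x_1^{(1)} & \cdots & x_p^{(1)} \\ 1 & x_1^{(2)} & \cdots & x_p^{(2)} \\ \vdots & \vdots & & \vdots \\ 1 & x_1^{(m)} & \cdots & x_p^{(m)} \end{bmatrix} \in \mathbb{R}^{m \times (p+1)}$$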
Minimize the cost function
$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_i(X) - y^{(i)}\bigr)^2 = \frac{1}{2m}\,\bigl(H(X) - Y\bigr)^T \bigl(H(X) - Y\bigr)$$
Its gradient is
$$\nabla J(\theta) = \frac{1}{m}\, X^T \bigl(H(X) - Y\bigr)$$
where $H(X) = [h_1(X), h_2(X), \cdots, h_m(X)]^T$, or equivalently $H(X) = X\boldsymbol{\theta}$ in matrix form. Note that the product with $X^T$ already sums over the $m$ samples, so no explicit summation sign is needed.
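Batch gradient descent then repeats the following vectorized update with learning rate $\alpha$; this is exactly the update line in the Matlab code below:
$$\theta := \theta - \alpha\, \nabla J(\theta) = \theta - \frac{\alpha}{m}\, X^T \bigl(X\theta - Y\bigr)$$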
The Matlab implementation is as follows:
clear ; close all; clc
data = load('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);
% Feature normalization (zero mean, unit variance)
mu = mean(X); % mean value
sigma = std(X); % standard deviation
X_norm = (X - repmat(mu,size(X,1),1)) ./ repmat(sigma,size(X,1),1);
X = [ones(m, 1) X_norm]; % Add intercept term to the normalized features
% Choose some alpha value
alpha = 0.01;
num_iters = 8500;
theta = zeros(size(X, 2), 1); % initialize the p+1 = 3 parameters
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    theta = theta - alpha / m * X' * (X * theta - y); % vectorized gradient-descent update
    J_history(iter) = computeCostMulti(X, y, theta);  % record cost to monitor convergence
end
function J = computeCostMulti(X, y, theta)
    % Vectorized mean-squared-error cost for linear regression
    m = length(y);
    J = sum((X * theta - y).^2) / (2*m);
end
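As a quick sanity check, the gradient-descent result can be compared with the closed-form normal-equation solution, and the recorded cost history can be plotted to confirm convergence. A minimal sketch, assuming these lines are added to the script above right after the gradient-descent loop (before the local function, since Matlab requires local functions at the end of a script); the variables X, y, theta, J_history, and num_iters come from that script:
% Closed-form solution for comparison; once gradient descent has
% converged, theta should be close to theta_normal
theta_normal = pinv(X' * X) * X' * y;
fprintf('Gradient descent theta:\n'); disp(theta);
fprintf('Normal equation theta:\n');  disp(theta_normal);
% Plot the cost history to verify that J(theta) decreases with each iteration
figure;
plot(1:num_iters, J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');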