1. Linear Regression

【Installation】
https://ftp.gnu.org/gnu/octave/windows/

【Univariate Linear Regression】
1. Plotting the training data
cd D:\study\AI\data\ex1
data = load('ex1data1.txt'); % read comma separated data
x = data(:, 1); % feature 
y = data(:, 2); % target
m = length(y); % number of training examples

plot(x, y, 'rx', 'MarkerSize', 10); % Plot the data
ylabel('Profit in $10,000s'); % Set the y-axis label
xlabel('Population of City in 10,000s'); % Set the x-axis label
2. Gradient descent
2-1. Scalar (algebraic) form
iterations = 2500;
alpha = 0.01;
theta0 = 0;
theta1 = 0;    
sumX = sum(x)
sumY = sum(y)
sumXX = sum(x.^ 2)
sumXY = sum(x.* y)
sumYY = sum(y.^ 2)
for i = 1:iterations
   grad0 = theta0 + (theta1*sumX - sumY)/m; % gradient w.r.t. theta0
   grad1 = (theta0*sumX + theta1*sumXX - sumXY)/m; % gradient w.r.t. theta1
   theta0 = theta0 - alpha*grad0; % gradient descent update
   theta1 = theta1 - alpha*grad1; % gradient descent update
   costFun = (theta0^2 * m + sumYY + theta1^2 * sumXX + 2 * theta0 * theta1 * sumX - 2 * theta0 * sumY - 2 * theta1 * sumXY)/(2*m);
   % cost function -- derive this expansion carefully; verify it against a plain for-loop version
   disp(sprintf('%0.0f => %0.6f, %0.6f, %0.6f, %0.6f, %0.6f', i, theta0, theta1, grad0, grad1, costFun))
end
disp(sprintf('%0.6f, %0.6f', theta0, theta1))
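The comment in the loop above suggests verifying the expanded cost formula against a plain for-loop. A minimal Python/NumPy sketch of that check, using synthetic data with hypothetical coefficients (ex1data1.txt is not reproduced here):

```python
import numpy as np

# Synthetic stand-in for ex1data1.txt; the true slope/intercept are hypothetical.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=97)
y = 1.2 * x - 4.0 + rng.normal(0.0, 1.0, size=97)
m = len(y)

theta0, theta1 = 0.5, 0.8  # arbitrary test point

# Closed-form cost from the precomputed sums, mirroring the Octave loop.
sumX, sumY = x.sum(), y.sum()
sumXX, sumXY, sumYY = (x**2).sum(), (x * y).sum(), (y**2).sum()
cost_sums = (theta0**2 * m + sumYY + theta1**2 * sumXX
             + 2*theta0*theta1*sumX - 2*theta0*sumY - 2*theta1*sumXY) / (2*m)

# Direct definition: J = (1/2m) * sum((theta0 + theta1*x_i - y_i)^2)
cost_loop = sum((theta0 + theta1*xi - yi)**2 for xi, yi in zip(x, y)) / (2*m)
```

The two values agree up to floating-point rounding, which confirms the algebraic expansion used inside the gradient-descent loop.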

2-2. Vectorized (linear-algebra) form
iterations = 2500;
alpha = 0.01;
theta = [0; 0];
one = ones(m, 1);
X = [one x];
for i = 1:iterations
    grad = X' * (X * theta - y)/m;
    theta = theta - alpha * grad;
    costFun = (X * theta - y)' * (X * theta - y)/(2*m);
    disp(sprintf('%0.0f => %0.6f, %0.6f, %0.6f, %0.6f, %0.6f', i, theta(1), theta(2), grad(1), grad(2), costFun))
end
        
predict1 = [1, 3.5] * theta % predicted profit for a population of 35,000
predict2 = [1, 7] * theta % predicted profit for a population of 70,000
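The vectorized loop translates almost line-for-line into Python/NumPy. A sketch on synthetic data (ex1data1.txt is not reproduced here; the coefficients are hypothetical), with a least-squares reference solution to confirm convergence:

```python
import numpy as np

# Synthetic stand-in for ex1data1.txt.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=97)
y = 1.2 * x - 4.0 + rng.normal(0.0, 1.0, size=97)
m = len(y)

X = np.column_stack([np.ones(m), x])  # design matrix with intercept column
theta = np.zeros(2)
alpha, iterations = 0.01, 2500

for _ in range(iterations):
    grad = X.T @ (X @ theta - y) / m  # vectorized gradient, as in the Octave loop
    theta -= alpha * grad

cost = (X @ theta - y) @ (X @ theta - y) / (2 * m)

# Reference solution via least squares for comparison.
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
```

After 2500 iterations the gradient-descent estimate is close to the least-squares optimum; predictions are then just `[1, x_new] @ theta`.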

3. Plotting the cost function
x = -10: 0.05: 10;
y = -10: 0.05: 10;
[X,Y]=meshgrid(x,y); % build a coordinate grid over (theta0, theta1)
z = (97 * X.^2 + 7896.2 * Y.^2 + 2 * 791.50 * X.*Y - 2 * 566.40 * X - 2 * 6336.9 * Y + 6222.1)/(2*97); % cost surface using m, sumXX, sumX, sumY, sumXY, sumYY computed from ex1data1.txt
3-1. Surface plot
surf(X,Y,z); % draw the 3-D cost surface
colorbar;
xlabel('θ0');
ylabel('θ1');
zlabel('costFunction'); % axis labels
shading interp; % smooth the surface shading
3-2. Contour plot
[C, h] = contour(X, Y, z); % draw the contour plot
set(h, 'ShowText', 'on', 'TextStep', get(h, 'LevelStep')*2);

【Multivariate Linear Regression】
cd D:\study\AI\data\ex1
data = load('ex1data2.txt'); % read comma separated data
X = data(:,[1, 2]);
y = data(:, 3); % target;
m = length(y); 
1. Feature scaling (min-max normalization)
maxX = [max(X(:, 1)), max(X(:, 2))]
minX = [min(X(:, 1)), min(X(:, 2))]
sizeX = [maxX(1)-minX(1), maxX(2)-minX(2)]
X = [(X(:, 1) - minX(1))/sizeX(1)  (X(:, 2) - minX(2))/sizeX(2)]
one = ones(m, 1);
X = [one X];

maxY = max(y);
minY = min(y);
sizeY = max(y) - min(y);
y = (y - minY)/sizeY;
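The same min-max scaling, sketched in Python/NumPy on a small hypothetical dataset standing in for ex1data2.txt (house size, bedrooms, price):

```python
import numpy as np

# Hypothetical stand-in rows for ex1data2.txt: [size in sq ft, bedrooms], price.
X = np.array([[2104., 3.], [1600., 3.], [2400., 3.], [1416., 2.], [3000., 4.]])
y = np.array([399900., 329900., 369000., 232000., 539900.])

# Min-max scaling to [0, 1], column by column, matching the Octave code above.
minX, maxX = X.min(axis=0), X.max(axis=0)
sizeX = maxX - minX
X_scaled = (X - minX) / sizeX

minY, sizeY = y.min(), y.max() - y.min()
y_scaled = (y - minY) / sizeY
```

Each scaled column then spans exactly [0, 1], which keeps the two features on comparable scales so a single learning rate works for both.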

2. Gradient descent
iterations = 80000;
alpha = 0.01;
theta = zeros(3, 1);
for i = 1:iterations
    grad = X' * (X * theta - y)/m;
    theta = theta - alpha * grad;
    costFun = (X * theta - y)' * (X * theta - y)/(2*m);
    disp(sprintf('%0.0f => %0.6f, %0.6f, %0.6f, %0.6f, %0.6f, %0.6f, %0.6f', i, theta(1), theta(2), theta(3),grad(1), grad(2), grad(3), costFun))
end

Sample output at iterations 2500 and 80000:
2500 => 0.053323, 0.575918, 0.164230, 0.003152, -0.011447, 0.001258, 0.009569
80000 => 0.055787, 0.952411, -0.065947, -0.000000, -0.000000, 0.000000, 0.007274

% map theta back to the original (unscaled) units
theta(1) = theta(1)*sizeY + minY - theta(2)*minX(1)*sizeY/sizeX(1) - theta(3)*minX(2)*sizeY/sizeX(2);
theta(2) = theta(2)*sizeY/sizeX(1);
theta(3) = theta(3)*sizeY/sizeX(2);
Resulting theta in original units: 89597.73320   139.21059    -8737.91281
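The un-scaling step can be verified numerically: a model fitted on scaled data, with theta mapped back through these formulas, must give the same predictions as evaluating the scaled model and un-scaling its output. A Python/NumPy sketch on a hypothetical stand-in dataset (least squares stands in for converged gradient descent):

```python
import numpy as np

# Hypothetical stand-in rows for ex1data2.txt.
X = np.array([[2104., 3.], [1600., 3.], [2400., 3.], [1416., 2.], [3000., 4.]])
y = np.array([399900., 329900., 369000., 232000., 539900.])
m = len(y)

minX, sizeX = X.min(axis=0), X.max(axis=0) - X.min(axis=0)
minY, sizeY = y.min(), y.max() - y.min()
Xs = (X - minX) / sizeX
ys = (y - minY) / sizeY

# Fit on scaled data (exact least squares instead of iterating gradient descent).
A = np.column_stack([np.ones(m), Xs])
t = np.linalg.lstsq(A, ys, rcond=None)[0]

# Map theta back to original units, mirroring the Octave formulas above.
t0 = t[0]*sizeY + minY - t[1]*minX[0]*sizeY/sizeX[0] - t[2]*minX[1]*sizeY/sizeX[1]
t1 = t[1]*sizeY/sizeX[0]
t2 = t[2]*sizeY/sizeX[1]

# Predictions in original units must match the scaled model's un-scaled output.
pred_orig = t0 + X @ np.array([t1, t2])
pred_scaled = (A @ t) * sizeY + minY
```

Substituting the scaling definitions into the scaled hypothesis shows algebraically why the two prediction paths coincide.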

3. Closed-form solution (normal equation)
cd D:\study\AI\data\ex1
data = load('ex1data2.txt'); % read comma separated data
X = data(:,[1, 2]);
y = data(:, 3); % target;
m = length(y); 
one = ones(m, 1);
X = [one X];
theta = pinv(X' * X) * X' * y % normal equation
costFun = (X * theta - y)' * (X * theta - y)/(2*m)
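The normal equation gives the exact minimizer in one step, which makes it a good reference for validating an iterative implementation. A Python/NumPy sketch on synthetic data (coefficients are hypothetical) comparing the two:

```python
import numpy as np

# Synthetic stand-in for ex1data2.txt, pre-scaled to [0, 1].
rng = np.random.default_rng(2)
X_raw = rng.uniform(0.0, 1.0, size=(47, 2))
y = X_raw @ np.array([2.0, -1.0]) + 0.5 + rng.normal(0.0, 0.01, size=47)
m = len(y)

X = np.column_stack([np.ones(m), X_raw])

# Normal equation: theta = pinv(X'X) X'y, as in the Octave line above.
theta_ne = np.linalg.pinv(X.T @ X) @ X.T @ y

# Cross-check: run gradient descent long enough and it lands on the same theta.
theta_gd = np.zeros(3)
for _ in range(20000):
    theta_gd -= 0.1 * X.T @ (X @ theta_gd - y) / m
```

Agreement between the two estimates is exactly the sanity check recommended in the notes below: if gradient descent does not approach the closed-form theta, either the learning rate, the iteration count, or the gradient itself is wrong.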

【Notes】
1. Make sure gradient descent has actually converged (the gradient is sufficiently small) before trusting the result.
2. On a medium-sized dataset, compare gradient descent against the closed-form solution to verify that the gradient descent implementation is correct.
