Machine Learning: Multiple Linear Regression

License: CC 4.0 BY-SA

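Linear regression:

$x_{1}$ is the first feature and $x_{2}$ is the second; features are also called attributes. $y$ is the true value and $h$ is the predicted value.

$h_{\theta }(x) = \theta _{0} + \theta_{1}x_{1} + \theta_{2}x_{2}$

Setting $x_{0} = 1$, this generalizes to $n$ features as:

$h(x) = \sum_{i=0}^{n}\theta_{i}x_{i} = \theta^{T}x$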
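The vectorized form $\theta^{T}x$ can be sketched in NumPy as follows. This is only an illustration: the function name `predict`, the parameter values, and the layout with one sample per row are assumptions made here, not part of the original post.

```python
import numpy as np

def predict(theta, X):
    """Return h_theta(x) = theta^T x for every row of X.

    X holds one sample per row (an assumed layout); a column of ones
    is prepended so that x_0 = 1 multiplies the intercept theta_0.
    """
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    return X_aug @ theta

# Illustrative values: theta_0 = 1, theta_1 = 2, theta_2 = 3
theta = np.array([1.0, 2.0, 3.0])
X = np.array([[1.0, 1.0],
              [2.0, 0.5]])
predictions = predict(theta, X)
print(predictions)  # first sample: 1 + 2*1 + 3*1 = 6, second: 1 + 2*2 + 3*0.5 = 6.5
```

Prepending the constant feature keeps the intercept inside the same dot product as the other parameters, which is what the summation starting at $i = 0$ expresses.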

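The loss function is therefore defined as:

$J(\theta) =\frac{1}{2} \sum_{i = 1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})^{2}$

Here $x^{(i)}$ denotes the $i$-th sample, $y^{(i)}$ its true value, and $m$ the number of samples.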
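As a small sketch of this loss, assuming the samples are stored one per row with the constant feature $x_{0} = 1$ already in the first column (the name `loss` and the toy data are hypothetical):

```python
import numpy as np

def loss(theta, X, y):
    """J(theta) = 1/2 * sum over samples of (theta^T x^(i) - y^(i))^2."""
    residual = X @ theta - y
    return 0.5 * np.sum(residual ** 2)

# Toy data generated by y = 1 + x; the first column is x_0 = 1
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([2.0, 3.0, 4.0])

exact_fit = loss(np.array([1.0, 1.0]), X, y)  # the generating parameters
off_by_one = loss(np.array([0.0, 1.0]), X, y)  # every residual is -1
print(exact_fit, off_by_one)
```

With the generating parameters every residual is zero, so the loss is 0; dropping the intercept leaves three residuals of -1, giving 0.5 * 3 = 1.5.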

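Update the parameters with gradient descent, where $\alpha$ is the learning rate:

$\theta_{j} = \theta_{j} -\alpha\frac{\partial }{\partial \theta_{j}}J(\theta)$

For a single sample $(x, y)$, the partial derivative is:

$\begin{aligned}\frac{\partial}{\partial\theta_{j}}J(\theta) &= \frac{\partial}{\partial\theta_{j}}\frac{1}{2}(h_{\theta}(x)-y)^{2}\\&=(h_{\theta}(x)-y)\frac{\partial}{\partial\theta_{j}}(h_{\theta}(x)-y)\\&=(h_{\theta}(x)-y)\frac{\partial}{\partial\theta_{j}}\left(\sum_{i=0}^{n}\theta_{i}x_{i}-y\right)\\&=(h_{\theta}(x)-y)x_{j}\end{aligned}$

Here $x_{i}$ does not denote the $i$-th sample; it denotes the $i$-th feature (attribute) of the sample.

Substituting this into the update rule gives, for the $i$-th sample:

$\theta_{j}=\theta_{j}-\alpha(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_{j}$

where $x^{(i)}_{j}$ denotes the $j$-th feature of the $i$-th sample.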
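One way to sanity-check the derived gradient $(h_{\theta}(x)-y)x_{j}$ is to compare it with a central finite difference of the single-sample loss. The sketch below does this; the names `sample_loss` and `sample_grad` and all numeric values are made up for illustration.

```python
import numpy as np

def sample_loss(theta, x, y):
    """Single-sample loss 1/2 * (theta^T x - y)^2."""
    return 0.5 * (theta @ x - y) ** 2

def sample_grad(theta, x, y):
    """Analytic gradient from the derivation: (h_theta(x) - y) * x_j for each j."""
    return (theta @ x - y) * x

# Illustrative values; x[0] = 1 plays the role of x_0
theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 3.0, -2.0])
y = 4.0

# Central difference along each coordinate direction of theta
eps = 1e-6
numeric = np.array([
    (sample_loss(theta + eps * e, x, y) - sample_loss(theta - eps * e, x, y)) / (2 * eps)
    for e in np.eye(len(theta))
])
analytic = sample_grad(theta, x, y)
print(analytic)
print(numeric)
```

Here $h_{\theta}(x) - y = -10.5$, so the analytic gradient is $-10.5 \cdot x$, and the numerical estimate should agree with it up to floating-point rounding.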

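There are two rules for updating the parameters:

1. Compute the gradient with respect to $\theta_{j}$ for each of the $m$ samples, sum them, and then update (batch gradient descent).

$\theta_{j} = \theta_{j}-\alpha\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_{j}$

2. Use a single sample per update to compute the gradient with respect to $\theta_{j}$, and loop $m$ times so that all $m$ samples are used once (stochastic gradient descent).

$\theta_{j}=\theta_{j}-\alpha(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_{j} \qquad \text{for } i = 1, \dots, m$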
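The two rules can be put side by side in a minimal sketch. Everything here is an assumption for illustration: the function names `batch_step` and `stochastic_epoch`, the noise-free synthetic data with small feature values, the learning rate, and the number of passes.

```python
import numpy as np

def batch_step(theta, X, y, alpha):
    """Rule 1: sum the gradients of all m samples, then update once."""
    grad = X.T @ (X @ theta - y)
    return theta - alpha * grad

def stochastic_epoch(theta, X, y, alpha):
    """Rule 2: update after every sample; one pass uses all m samples once."""
    theta = theta.copy()
    for x_i, y_i in zip(X, y):
        theta -= alpha * (theta @ x_i - y_i) * x_i
    return theta

# Synthetic data: 50 samples, x_0 = 1 plus two features in [-1, 1]
rng = np.random.default_rng(0)
X = np.hstack([np.ones((50, 1)), rng.uniform(-1, 1, size=(50, 2))])
true_theta = np.array([1.0, 2.0, 3.0])
y = X @ true_theta  # noise-free targets, so an exact fit exists

theta_b = np.zeros(3)
theta_s = np.zeros(3)
for _ in range(500):
    theta_b = batch_step(theta_b, X, y, alpha=0.01)
    theta_s = stochastic_epoch(theta_s, X, y, alpha=0.01)

print(theta_b)
print(theta_s)
```

On this well-scaled toy problem both variants should approach `true_theta`. The batch rule makes one update per pass over the data, while the stochastic rule makes $m$ smaller updates per pass.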

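import numpy as np
import matplotlib.pyplot as plt


# x = np.array([1, 1.3, 1.4, 2, 2.4, 3.6, 4, 6, 7, 10, 13, 18])
# 30 random inputs, sorted so the fitted curve is drawn from left to right
x = np.sort(10 * np.random.randn(30))
print(x)
m = x.shape[0]
# Two features per sample, x and x^2, each stored as a row of shape (1, m)
x1 = x.reshape(1, m)
x2 = np.square(x).reshape(1, m)

# X has shape (2, m): one column per sample, one row per feature
X = np.concatenate((x1, x2), axis=0)
# Targets from the true parameters (2, 3) plus small Gaussian noise
y = 2 * x1 + 3 * x2 + 0.1 * np.random.normal(0, 1, m)
# Start from theta = [1, 1]^T (no intercept, matching how y is generated)
theta = np.ones((2, 1), dtype=np.float32)
# The choice of learning rate is very important: x^2 can be large here,
# so a step that is too big makes a single-sample update overshoot
alpha = 0.00001
# Regularization strength (defined but not used below)
reg = 0.001
print(X.shape)
iters = 1
for k in range(iters):
    for i in range(m):
        # Error on sample i, computed once so that every theta_j is
        # updated from the same (old) theta
        error = np.dot(theta.T, X[:, i]) - y[:, i]
        theta -= alpha * error * X[:, i].reshape(-1, 1)
print(theta)
plt.subplot(2, 1, 1)
# Predictions for all samples, shape (1, m)
h_x = np.dot(theta.T, X)
print(y)
print(h_x)
plt.plot(x, y.reshape(m), '*', label='y')
plt.plot(x, h_x.reshape(m), label='h_x')
plt.legend()
plt.show()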