Andrew Ng Machine Learning, Notes 2: Regression

This post covers the principles and implementation of univariate and multivariate linear regression, including the hypothesis function, the cost function, gradient descent, and their use in TensorFlow.


1. Univariate Linear Regression

  Suppose the training set contains $m$ pairs of data $(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\dots,(x^{(m)},y^{(m)})$.
  Hypothesis function

$$h_\theta(x)=\theta_0+\theta_1 x$$
  Cost function
$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$
  Optimization objective
$$(\theta_0,\theta_1)=\arg\min_{\theta_0,\theta_1}J(\theta_0,\theta_1)$$
  A commonly used solution method is gradient descent. The gradient is closely related to the directional derivative, and the method can be justified from the Taylor expansion; the derivation is simple and omitted here. The algorithm proceeds as follows (a NumPy sketch is given after the notes below):
$$\text{temp}_0=\theta_0-\alpha\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)$$
$$\text{temp}_1=\theta_1-\alpha\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)$$
$$\theta_0=\text{temp}_0$$
$$\theta_1=\text{temp}_1$$

Notes

  • Never swap steps two and three of the gradient descent procedure, because the computation in step two still depends on $\theta_0$; all parameters must be updated simultaneously

  • Apply mean normalization to the training data to speed up convergence

  • Choose the learning rate (step size) carefully
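  As an illustration, here is a minimal NumPy sketch of the simultaneous update above; the toy data, learning rate, and iteration count are made up for the example:

import numpy as np

# toy data: y ≈ 3x + 1 plus noise (made-up values, just for illustration)
m = 100
x = np.random.rand(m)
y = 3.0 * x + 1.0 + 0.1 * np.random.randn(m)

theta0, theta1 = 0.0, 0.0
alpha = 0.1
for _ in range(2000):
    h = theta0 + theta1 * x             # h_theta(x)
    grad0 = np.mean(h - y)              # dJ/dtheta0
    grad1 = np.mean((h - y) * x)        # dJ/dtheta1
    temp0 = theta0 - alpha * grad0      # compute both temps first...
    temp1 = theta1 - alpha * grad1
    theta0, theta1 = temp0, temp1       # ...then update simultaneously

print(theta0, theta1)                   # should approach 1.0 and 3.0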

2. Multivariate Linear Regression

  Multivariate linear regression is a direct extension of the univariate case. The $m$ training examples are $(x_1^{(1)},x_2^{(1)},\dots,x_n^{(1)},y^{(1)}),\ (x_1^{(2)},x_2^{(2)},\dots,x_n^{(2)},y^{(2)}),\ \dots,\ (x_1^{(m)},x_2^{(m)},\dots,x_n^{(m)},y^{(m)})$.
  Hypothesis function

$$h_{\vec\theta}(\vec x)=\vec\theta^T\vec x=\theta_0+\theta_1x_1+\theta_2x_2+\dots+\theta_nx_n$$
where $\vec\theta=(\theta_0,\theta_1,\dots,\theta_n)^T$ and $\vec x=(1,x_1,x_2,\dots,x_n)^T$.
  Cost function
$$J(\vec\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_{\vec\theta}(\vec x^{(i)})-y^{(i)}\right)^2$$
  Optimization objective
$$\vec\theta=\arg\min_{\vec\theta}J(\vec\theta)$$
2.1 Gradient Descent

  Multivariate linear regression can still be solved by gradient descent; the only difference is that the descent now takes place in a higher-dimensional space and is no longer as easy to visualize as in three dimensions. The algorithm proceeds as follows (a vectorized sketch follows the equations):

$$\text{temp}_0=\theta_0-\alpha\frac{\partial}{\partial\theta_0}J(\vec\theta)$$
$$\text{temp}_1=\theta_1-\alpha\frac{\partial}{\partial\theta_1}J(\vec\theta)$$
$$\vdots$$
$$\text{temp}_n=\theta_n-\alpha\frac{\partial}{\partial\theta_n}J(\vec\theta)$$
$$\theta_0=\text{temp}_0$$
$$\theta_1=\text{temp}_1$$
$$\vdots$$
$$\theta_n=\text{temp}_n$$
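  With the design matrix $X$ whose rows are $\vec x^{(i)T}$ (defined in the next subsection), the whole update can be written in one vectorized step. A minimal NumPy sketch, assuming X already contains the leading column of ones; the names are illustrative:

import numpy as np

def gradient_descent(X, y, alpha=0.01, steps=5000):
    # X: (m, n+1) design matrix with a leading column of ones, y: (m,) targets
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / m    # gradient of J(theta)
        theta = theta - alpha * grad        # all components updated simultaneously
    return theta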
2.2 Least Squares

  Another way to solve multivariate linear regression is the least-squares (normal equation) method. The derivation is as follows:
  The whole prediction process can be written as the system of equations

$$\vec x^{(1)T}\vec\theta=y^{(1)}$$
$$\vec x^{(2)T}\vec\theta=y^{(2)}$$
$$\vdots$$
$$\vec x^{(m)T}\vec\theta=y^{(m)}$$

  Writing the system in matrix form,
$$X\vec\theta=\vec y$$
where $X=[\vec x^{(1)T};\vec x^{(2)T};\dots;\vec x^{(m)T}]$. Solving gives
$$\vec\theta=(X^TX)^{-1}X^T\vec y$$
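  A minimal NumPy sketch of this closed-form solution (using the pseudo-inverse, which is numerically safer than an explicit inverse when $X^TX$ is near-singular; the function name is illustrative):

import numpy as np

def normal_equation(X, y):
    # theta = (X^T X)^(-1) X^T y
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# equivalently: theta, *_ = np.linalg.lstsq(X, y, rcond=None)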
2.3 Geometric Interpretation of Least Squares

  Rearranging the system of equations, let

$$\vec x_i=(x_i^{(1)},x_i^{(2)},\dots,x_i^{(m)})^T$$
$$\vec y=(y^{(1)},y^{(2)},\dots,y^{(m)})^T$$
Then
$$\vec y=\theta_0\vec x_0+\theta_1\vec x_1+\dots+\theta_n\vec x_n$$
  In general this equation has no exact solution, but we can find the solution that minimizes the cost function. Geometrically, minimizing the cost function means searching, in the linear subspace spanned by $\vec x_0,\vec x_1,\dots,\vec x_n$, for the point closest to $\vec y$. That point is exactly the projection of $\vec y$ onto the subspace, and this is the essence of the least-squares solution.
  In fact, each weight measures the contribution of the corresponding feature to the final result, and more training samples make that measurement of each feature's contribution more accurate.
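  The projection view is easy to verify numerically: the residual of the least-squares solution is orthogonal to every column of $X$. A small sketch with made-up data:

import numpy as np

m, n = 50, 3
X = np.column_stack([np.ones(m), np.random.randn(m, n)])   # made-up design matrix
y = np.random.randn(m)                                      # made-up targets

theta = np.linalg.pinv(X.T @ X) @ X.T @ y
residual = y - X @ theta
print(X.T @ residual)   # ≈ 0: the residual is orthogonal to the column space of X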
2.4 Choosing Between Least Squares and Gradient Descent

  When the number of features $n$ exceeds a certain threshold, inverting the matrix in the least-squares solution becomes very expensive, and gradient descent is generally preferred. As a rough reference, that threshold is around $10^5$ to $10^6$.

3. Polynomial Regression

  Polynomial regression can be converted into multivariate linear regression; the key is to combine known features into new ones. For example, the hypothesis function

$$h_{\vec\theta}(\vec x)=\theta_0+\theta_1x_1+\theta_2x_1^2+\theta_3\sqrt{x_1}$$
where every term can be computed from the known feature.
  As for which higher-order terms to include in the polynomial, an initial choice can be made from the rough shape of the data. Practical constraints also matter: for example, the total price of a house generally does not decrease as its area grows, so in that case a quadratic term alone is not enough and a cubic term is also needed.
  In polynomial regression, feature scaling becomes especially important, because the model contains different powers of the same feature whose ranges differ greatly; yet since they come from the same feature, they cannot simply be rescaled by different amounts. A small sketch of one workable approach follows.
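  Consistent with the note above, one option is to mean-normalize the base feature first and only then form its powers as columns of the design matrix. A hedged sketch; the function name and the example data are made up:

import numpy as np

def polynomial_design_matrix(x, degree):
    # scale the base feature once, then raise the scaled feature to higher powers
    x_scaled = (x - x.mean()) / x.std()
    # columns: 1, x, x^2, ..., x^degree
    return np.column_stack([x_scaled ** k for k in range(degree + 1)])

# example: cubic features for (hypothetical) house areas
area = np.array([60.0, 80.0, 100.0, 120.0, 150.0])
X = polynomial_design_matrix(area, degree=3)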
4. Univariate Linear Regression in TensorFlow

  The code below draws on blog posts found online. This is my first time writing TensorFlow, and every beginning is hard, but I finally got the hang of it. I will put together a separate summary of commonly used TensorFlow functions later; after all, only what you have memorized thoroughly can be used freely.

'''
Author       :  vivalazxp
Date         :  8/23/2018
Description  :  linear regression with one value
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
'''
Description   :  create data for linear regression with one value
Param         :  @weight    weight of the line needed to be fitting
                 @bias      bias of the line needed to be fitting
                 @numData   number of training data
                 @sigma     power of noises
Return        :  @data_horizon   horizontal-axis of training data    shape=(numData,)
                 @data_vertical  vertical-axis of training data    shape=(numData,)
'''
def data_create_lin_reg_one_val(numData, weight, bias, horizon_limit, sigma):
    data_horizon = horizon_limit * 2*(np.random.rand(numData)-0.5)
    data_vertical = weight * data_horizon + bias
    # add noise
    data_vertical += sigma * np.random.randn(numData)
    print('------------- create training data successfully --------------')
    return data_horizon, data_vertical
'''
Description   :  use tensorflow to complete linear regression with one value
Param         :  @alpha     learning rate
                 @steps     sum learning steps
Return        :  @weight_fitted    weight of the fitting line
                 @bias_fitted      bias of the fitting line
'''
def tf_lin_reg_one_val(data_horizon, data_vertical, steps, alpha):
    horizon_from_data = tf.placeholder(tf.float32)
    vertical_from_data = tf.placeholder(tf.float32)
    # randomly initialize weight and bias
    weight_fitted = tf.Variable(tf.random_normal([1]))
    bias_fitted = tf.Variable(tf.random_normal([1]))
    # cost function and optimizer
    vertical_pred = tf.multiply(weight_fitted, horizon_from_data) + bias_fitted
    cost = tf.reduce_mean(tf.pow(vertical_pred - vertical_from_data, 2))
    optimizer = tf.train.GradientDescentOptimizer(alpha).minimize(cost)
    # session initialization
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    print('---------------- train started ------------------------')
    loss = np.zeros(steps)
    for step in range(steps):
        sess.run(optimizer, feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
        loss[step] = sess.run(cost,feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
    print('---------------- train finished ------------------------')
    weight_fitted = sess.run(weight_fitted)
    bias_fitted = sess.run(bias_fitted)
    return weight_fitted, bias_fitted, loss


if __name__ == "__main__":
    weight = 100
    bias = 2.0
    horizon_limit = 10
    numData = 1000
    sigma = weight
    steps = 10000
    alpha = 0.0001
    data_horizon, data_vertical = data_create_lin_reg_one_val(numData, weight, bias, horizon_limit, sigma)
    weight_fitted, bias_fitted, loss = tf_lin_reg_one_val(data_horizon, data_vertical, steps, alpha)
    # log
    print('expected  weight = ', weight, ', expected  bias = ', bias)
    print('regression weight = ', weight_fitted, ', regression bias = ', bias_fitted)
    # fitting line
    plt.figure(1)
    horizon_fit = np.linspace(-horizon_limit, horizon_limit, 200)
    vertical_fit = weight_fitted*horizon_fit + bias_fitted
    plt.plot(data_horizon, data_vertical, 'o', label='training data')
    plt.plot(horizon_fit, vertical_fit, 'r', label='regression line')
    plt.legend()
    plt.xlabel('horizontal axis')
    plt.ylabel('vertical axis')
    plt.title('linear regression with one value')
    # cost variation
    plt.figure(2)
    plt.plot(range(steps), loss)
    plt.xlabel('step')
    plt.ylabel('loss')
    plt.title('loss variation in linear regression with one value')

    plt.show()

[Figure: training data and the fitted regression line]
[Figure: loss versus training step]

5. Polynomial Regression with Gradient Descent in TensorFlow

  This code also draws on examples found online. While tuning the parameters I ran into NaN values; I will keep investigating this (see the note after the code).

'''
Author       :  vivalazxp
Date         :  11/9/2018
Description  :  non-linear (polynomial) regression of sin(x)
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

'''
Description   :  create data for non-linear regression of sin(x)
Param         :  @numData   number of training data
                 @sigma     power of noises                      
Return        :  @data_horizon   horizontal-axis of training data    shape=(numData,)
                 @data_vertical  vertical-axis of training data    shape=(numData,)
'''
def data_create_sin_non_lin_reg(numData, sigma, horizon_limit):
     data_horizon = np.linspace(-horizon_limit, horizon_limit, numData)
     data_vertical = np.sin(data_horizon)
     # add noise
     data_vertical += sigma * np.random.randn(numData)
     print('---------- create data successfully ----------')
     return data_horizon, data_vertical
'''
Description   :  use tensorflow to complete non-linear regression of sin(x)
Param         :  @alpha    learning rate
                 @steps    sum learning steps
                 @n_order  use n-order polynomial to fit sin(x)
Return        :  @theta    weights of fitting sin(x)  shape=(n_order+1,)
'''
def tf_non_lin_reg(n_order,data_horizon, data_vertical, alpha, steps):
    numData = data_vertical.shape[0]
    #placeholder for training data
    horizon_from_data = tf.placeholder(tf.float32)
    vertical_from_data = tf.placeholder(tf.float32)
    # randomly initialize theta
    theta = tf.Variable(tf.random_normal([n_order+1]))
    # build the polynomial prediction theta_0 + theta_1*x + ... + theta_n*x^n
    vertical_pred = tf.zeros(numData)
    for index_n in range(n_order+1):
        vertical_pred = tf.add( vertical_pred, tf.multiply( theta[index_n], tf.pow( horizon_from_data, index_n*tf.ones([1,numData]))))

    #cost function and optimizer
    cost = tf.reduce_mean(tf.square(vertical_pred - vertical_from_data))
    optimizer = tf.train.GradientDescentOptimizer(alpha).minimize(cost)
    #session
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    print('-------- train started --------')
    loss = np.zeros(steps)
    for step in range(steps):
        sess.run(optimizer, feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
        loss[step] = sess.run(cost, feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
    print('-------- train finished --------')
    theta = sess.run(theta)
    return theta, loss

def main():
    numData = 100
    sigma = 0.2
    n_order = 3
    horizon_limit = 3
    alpha = 0.005
    steps = 1000
    data_horizon, data_vertical = data_create_sin_non_lin_reg(numData, sigma, horizon_limit)
    theta, loss = tf_non_lin_reg(n_order, data_horizon, data_vertical, alpha, steps)
    # fitting line
    plt.figure(1)
    horizon_fit = np.linspace(-horizon_limit, horizon_limit, 200)
    vertical_fit = np.zeros(200)
    for index in range(n_order+1):
        vertical_fit = np.add(vertical_fit, theta[index]* horizon_fit ** index)

    plt.plot(data_horizon, data_vertical, 'o', label='training data')
    plt.plot(horizon_fit, vertical_fit, 'r', label='regression curve')
    plt.legend()
    plt.xlabel('horizontal axis')
    plt.ylabel('vertical axis')
    plt.title('non-linear regression')

    # cost variation
    plt.figure(2)
    plt.plot(range(steps), loss)
    plt.xlabel('step')
    plt.ylabel('loss')
    plt.title('loss variation in non-linear regression')
    plt.show()

if __name__ == "__main__":
    main()
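Regarding the NaN values mentioned before the code: one plausible (unverified) cause is that the unscaled powers of data_horizon make the gradients large enough to diverge at higher learning rates. A minimal, hypothetical tweak along the lines of Section 3 would be to normalize the input before training (the plotting grid horizon_fit would then need the same scaling):

# hypothetical preprocessing, applied in main() before calling tf_non_lin_reg
data_horizon = (data_horizon - data_horizon.mean()) / data_horizon.std()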

[Figure: training data and the fitted polynomial curve]
[Figure: loss versus training step]
