Gradient Descent in Python

Gradient Descent

Today, I’m going to use gradient descent to solve a simple linear regression problem.

The hypothesis function can be written as:

$h_\theta(x) = \theta_0 + \theta_1 x$

The cost function, the “squared error function” or “mean squared error”, is:

$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2 = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x_i) - y_i\right)^2$

Iterate until the cost function $J(\theta_0, \theta_1)$ reaches its minimum:

$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

Each iteration updates the parameters by:

$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$

Here “:=” means the value is overwritten with the update, and both parameters are updated simultaneously.
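For the linear hypothesis above, the two partial derivatives have a simple closed form, which is exactly what the update code below implements:

$\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x_i) - y_i\right), \qquad \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x_i) - y_i\right) x_i$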

Imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
from sklearn import preprocessing
import warnings
%matplotlib inline 
%config InlineBackend.figure_format = 'retina'
warnings.filterwarnings('ignore')

data = pd.read_csv("./Salary_Data.csv")
display(data)
    YearsExperience    Salary
0               1.1   39343.0
1               1.3   46205.0
2               1.5   37731.0
3               2.0   43525.0
4               2.2   39891.0
5               2.9   56642.0
6               3.0   60150.0
7               3.2   54445.0
8               3.2   64445.0
9               3.7   57189.0
10              3.9   63218.0
11              4.0   55794.0
12              4.0   56957.0
13              4.1   57081.0
14              4.5   61111.0
15              4.9   67938.0
16              5.1   66029.0
17              5.3   83088.0
18              5.9   81363.0
19              6.0   93940.0
20              6.8   91738.0
21              7.1   98273.0
22              7.9  101302.0
23              8.2  113812.0
24              8.7  109431.0
25              9.0  105582.0
26              9.5  116969.0
27              9.6  112635.0
28             10.3  122391.0
29             10.5  121872.0

Preprocessing

The raw values are on very different scales, so we rescale them first. The code below uses sklearn’s preprocessing.normalize, which by default divides a vector by its L2 norm:

$z = \frac{x}{\lVert x \rVert_2}$

(Note this is not the min-max scaling $z = \frac{x - \min(x)}{\max(x) - \min(x)}$; either choice brings the values down to a comparable, small scale.)

x = data.values[:, 0]                # years of experience
y = data.values[:, 1]                # salary
x = preprocessing.normalize([x]).T   # unit L2 norm, then reshape to a (30, 1) column
y = preprocessing.normalize([y]).T
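As a quick sanity check (my addition, not in the original notebook), each normalized column should now have unit L2 norm:

print(np.linalg.norm(x), np.linalg.norm(y))   # both print 1.0 (up to floating-point error)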

Functions

Then define some functions:

def h(t0, t1):
    '''Linear hypothesis h(x) = t0 + t1 * x (uses the global x).'''
    return t0 + t1 * x


def J(t0, t1):
    '''Cost function: squared error with the 1/(2m) convention.'''
    return 0.5 * (1 / len(x)) * np.sum(np.power((t0 + t1 * x) - y, 2))


def gd(t0, t1, alpha, n_iter):
    '''Batch gradient descent on (t0, t1).'''
    theta = [t0, t1]
    temp = np.array([t0, t1])   # parameter history
    cost = []                   # cost history

    for _ in range(n_iter):
        # Simultaneous update: both gradients are evaluated at the old theta.
        t0 = theta[0] - alpha * (1 / len(x)) * np.sum((theta[0] + theta[1] * x) - y)
        t1 = theta[1] - alpha * (1 / len(x)) * np.sum(((theta[0] + theta[1] * x) - y) * x)
        theta = [t0, t1]
        cost.append(J(t0, t1))
        temp = np.vstack([temp, theta])
    return cost, temp, theta
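The loop above always runs the full n_iter iterations. A common refinement, not used in the original run, is to stop early once the cost barely improves between iterations; a minimal sketch of that convergence check, where gd_tol and tol are hypothetical names of my own:

def gd_tol(t0, t1, alpha, n_iter, tol=1e-10):
    '''Gradient descent with an additional convergence check on the cost.'''
    theta = [t0, t1]
    cost = [J(t0, t1)]
    for _ in range(n_iter):
        t0 = theta[0] - alpha * (1 / len(x)) * np.sum((theta[0] + theta[1] * x) - y)
        t1 = theta[1] - alpha * (1 / len(x)) * np.sum(((theta[0] + theta[1] * x) - y) * x)
        theta = [t0, t1]
        cost.append(J(t0, t1))
        if abs(cost[-2] - cost[-1]) < tol:   # stop once the cost has effectively plateaued
            break
    return cost, theta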

Output

cost, temp, theta = gd(0, 1, 1, 500)   # alpha = 1 only behaves because the data were normalized
print("The result is: h = %.2f + %.2f * x" % (theta[0], theta[1]))
yh = h(theta[0], theta[1])
fig1 = plt.figure(dpi=150)
plt.plot(x, yh)
plt.plot(x, y, 'o', markersize=1.5)
for i in range(len(x)):
    plt.plot([x[i], x[i]], [y[i], yh[i]], "r--", linewidth=0.8)
plt.title("Fit")
plt.tight_layout()
plt.show()
The result is: h = 0.06 + 0.71 * x
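One caveat worth adding: these coefficients are in the normalized units. Since x and y were each divided by their L2 norms, a prediction on the original salary scale is recovered as $y \approx \lVert y \rVert\,\theta_0 + \theta_1\,\frac{\lVert y \rVert}{\lVert x \rVert}\,x$.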

[Figure: fitted line through the data points, with red dashed residual segments (output_8_1.png)]

fig2 = plt.figure(dpi=150)
plt.plot(range(len(cost)), cost, 'r')
plt.title("Cost Function")
plt.tight_layout()
plt.show()

[Figure: cost function value versus iteration (output_9_0.png)]

Compared to the Normal Equation

$\theta = (X^T X)^{-1} X^T y$

X = x
X.shape
(30, 1)
one = np.ones((30, 1))
one.shape
(30, 1)
X = np.concatenate([one, X], axis=1)   # design matrix: a column of ones for the intercept, then x
theta = np.linalg.pinv(X.T @ X) @ X.T @ y   # pinv: pseudo-inverse, robust even if X.T @ X is ill-conditioned
theta
array([[0.05839456],
       [0.70327706]])
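As a cross-check (my addition; sklearn is already a dependency above, and LinearRegression lives in its linear_model module), the same coefficients should fall out of sklearn’s least-squares solver:

from sklearn.linear_model import LinearRegression

lr = LinearRegression(fit_intercept=False)   # the intercept is already X's first column
lr.fit(X, y)
print(lr.coef_)   # should closely match the normal-equation theta above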
print("The result is: h = %.2f + %.2f * x" % (theta[0], theta[1]))
yh = h(theta[0], theta[1])
fig1 = plt.figure(dpi=150)
plt.plot(x, yh)
plt.plot(x, y, 'o', markersize=1.5)
for i in range(len(x)):
    plt.plot([x[i], x[i]], [y[i], yh[i]], "r--", linewidth=0.8)
plt.title("Fit")
plt.tight_layout()
plt.show()
The result is: h = 0.06 + 0.70 * x

[Figure: normal-equation fit with red dashed residual segments (output_14_1.png)]

print("gd cost:", cost[-1])
print("ne cost:", J(theta[0], theta[1]))
gd cost: 8.042499159722341e-05
ne cost: 8.014548878265756e-05
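This is the expected ordering: the normal equation gives the exact least-squares minimizer, so its cost is the lowest attainable, and 500 iterations of gradient descent land within about $3 \times 10^{-7}$ of it.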