a quick code walkthrough of linear regression using gradient descent

This post walks through a Python implementation of linear regression for predicting house prices. Using a dataset from the Coursera machine learning course, it covers data loading, preprocessing, a gradient descent implementation, and the normal equation method, and shows how to predict the price of a specific house.


use linear regression to predict house prices

the original data comes from the Coursera machine learning course.
after I finished the exercise in MATLAB, the idea of implementing the algorithm in Python came up,
so I'd like to refresh the knowledge and have fun with the data :)
the algorithm is so simple that you can scan it quickly and save some time :)

1. focus on the data

as a data scientist, the data you have determines how deep you can dig beneath its surface.
I need to load the data and keep an eye on the schema it is stored in.

import numpy as np

def load_data(filename):
    # read the comma-separated file and return a list of integer rows
    data = []
    with open(filename, 'rb') as f:
        for line in f:
            line = line.decode('utf-8').strip().split(',')
            data.append([int(v) for v in line])
    return data

filename = 'ex1data2.txt'
data = load_data(filename)
# look at the first three lines of the data
print('\n'.join([str(data[i]) for i in range(3)]))

[2104, 3, 399900]
[1600, 3, 329900]
[2400, 3, 369000]

2. math model

so, what do the three integers in the first line mean?
the first element, 2104, is the house size in square feet, the second element, 3, is the number of bedrooms, and the last one is the price.
it is time to choose a math model for the data.
since this post is about linear regression, the model is, of course, linear regression.

to find the parameters θ0, θ1, θ2 of the hypothesis price = θ0 + θ1·x1 + θ2·x2:

  1. initialize the vector θ = [θ0, θ1, θ2]

  2. minimize the error: error = (1/2m)·Σ_{i=1}^{m} (price(x_i) - y_i)²

  3. to achieve the minimization we use the gradient descent algorithm, since this cost function is convex (the vectorized update is sketched right below).
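
for reference, here is the vectorized form of the cost and its gradient, where X is the m×3 design matrix (a column of ones plus the two normalized features), y is the column of prices, and m is the number of training examples; this is the update the code below implements:

J(θ) = (1/2m)·(Xθ - y)ᵀ(Xθ - y)

∇J(θ) = (1/m)·(XᵀXθ - Xᵀy)

θ ← θ - α·∇J(θ)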

talk is cheap, show me the code.

3. implementation

# feature normalization: scale each feature to zero mean and unit standard deviation
data = np.array(data)

x = data[:, [0, 1]]    # features: size in square feet, number of bedrooms
y = data[:, 2]         # target: price

mu = np.mean(x, axis=0)
std = np.std(x, axis=0)

x = (x - mu) / std

# build the design matrix X: a leading column of ones for the intercept, then the two features
row = x.shape[0]
X = np.ones((row, 3))
X[:, [1, 2]] = x
X = np.matrix(X)

theta = np.matrix(np.zeros((3, 1)))   # theta starts at zero
y = np.matrix(y)                      # note: this makes y a 1 x m row matrix

# implement the gradient descent method
def grad_descent(X, y, theta, iter_num, alpha):
    m = len(y)   # careful: y is a 1 x m row matrix, so len(y) is 1 here and the
                 # effective step is alpha*(X'X*theta - X'y); with the normalized
                 # features and alpha = 0.01 this still converges
    for _ in range(iter_num):
        theta -= alpha/m * (X.T*X*theta - X.T*y.T)
    return theta

# initialize the parameters
iter_num = 900
alpha = 0.01

new_theta = grad_descent(X, y, theta, iter_num, alpha)
print('the theta parameter is:')
print(new_theta)
# Estimate the price of a 1650 sq-ft, 3 br house;
# the query features must be normalized with the same mu and std as the training data
price = np.dot(np.array([1, (1650-mu[0])/std[0], (3-mu[1])/std[1]]), new_theta)
print('for a 1650 sq-ft, 3 br house,the price is')
print(price)

the theta parameter is:
[[ 340412.65957447]
 [ 109447.79646964]
 [  -6578.35485416]]
for a 1650 sq-ft, 3 br house,the price is
[[ 293081.4643349]]
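
a quick sanity check that is not part of the original exercise: monitor the cost J(θ) every 100 iterations to confirm that alpha and iter_num are enough for convergence. compute_cost is a small helper added here for illustration; it reuses the X, y, alpha, iter_num and grad_descent defined above.

def compute_cost(X, y, theta):
    # J(theta) = 1/(2m) * sum of squared prediction errors
    m = X.shape[0]                 # number of training examples
    residual = X*theta - y.T       # (m, 1) column of errors
    return float(residual.T*residual) / (2*m)

theta_check = np.matrix(np.zeros((3, 1)))
for i in range(0, iter_num, 100):
    theta_check = grad_descent(X, y, theta_check, 100, alpha)
    print('after %d iterations, cost = %.2f' % (i + 100, compute_cost(X, y, theta_check)))

if the printed cost stops decreasing well before 900 iterations, the run has converged.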

4. normal equation

when the number of features in the data is small (say, below 1000), we usually use the normal equation to compute θ directly.

what is the relationship between these two methods?

θ_{n+1} = θ_n - (α/m)·(XᵀXθ_n - Xᵀy)

as n goes to infinity, θ_{n+1} = θ_n, so XᵀXθ - Xᵀy = 0

so θ = (XᵀX)⁻¹·Xᵀy

# build the design matrix from the raw (un-normalized) features this time
new_X = np.ones((data.shape[0], 3))
new_X[:, 1:] = data[:, :2]
new_X = np.matrix(new_X)
# closed-form solution: theta = pinv(X'X) * X' * y
new_theta1 = np.linalg.pinv(new_X.T*new_X) * new_X.T * y.T
print(new_theta1)

[[ 89597.90954435]
 [   139.21067402]
 [ -8738.01911278]]



# no normalization here, because new_theta1 was fit on the raw features
new_price = np.dot(np.array([1, 1650, 3]), new_theta1)
print(new_price)

[[ 293081.46433506]]

the two results are close enough.
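
one more check, not in the original post: the gradient-descent parameters were fit on the normalized features, so they can be mapped back to the raw feature scale and compared with new_theta1 directly. mu, std and new_theta are the values computed above.

gd = np.asarray(new_theta).ravel()     # [theta0, theta1, theta2] on the normalized scale
theta_raw = np.array([
    gd[0] - gd[1]*mu[0]/std[0] - gd[2]*mu[1]/std[1],   # intercept on the raw scale
    gd[1]/std[0],                                      # price change per square foot
    gd[2]/std[1],                                      # price change per bedroom
])
print(theta_raw)     # should roughly match new_theta1 from the normal equation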
