Linear regression problem
Define the model: y = w*x + b
Fitting and gradient descent
- Define the loss function
For a single point, the squared error is loss = (w*x + b - y)**2
Average this over all data points to get the mean squared error:
```python
import numpy as np

# Mean squared error of the line y = w*x + b over all points
def compute_error_for_line_given_points(b, w, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (w * x + b)) ** 2
    return totalError / float(len(points))
```
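As a quick sanity check, here is a minimal usage sketch on a hand-made set of points (the numbers below are my own example, not from the original data):

```python
import numpy as np

# Three points that lie exactly on y = 2x + 1
points = np.array([[0.0, 1.0],
                   [1.0, 3.0],
                   [2.0, 5.0]])

# Perfect fit: the mean squared error is 0
print(compute_error_for_line_given_points(b=1.0, w=2.0, points=points))  # 0.0

# A bad fit (w = 0, b = 0) gives (1**2 + 3**2 + 5**2) / 3, about 11.67
print(compute_error_for_line_given_points(b=0.0, w=0.0, points=points))
```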
- Compute the gradients and update w and b
Partial derivative of the loss with respect to w: 2 * x * (w*x + b - y)
Partial derivative of the loss with respect to b: 2 * (w*x + b - y)
Loop over all data points, average these partial derivatives, then update w and b:
```python
def step_gradient(b_current, w_current, points, learningRate):
    b_gradient = 0
    w_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        # Accumulate the averaged partial derivatives of the MSE loss
        b_gradient += -(2/N) * (y - ((w_current * x) + b_current))
        w_gradient += -(2/N) * x * (y - ((w_current * x) + b_current))
    # One gradient descent step: move against the gradient
    new_b = b_current - (learningRate * b_gradient)
    new_m = w_current - (learningRate * w_gradient)
    return [new_b, new_m]
```
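The loop above can also be written with vectorized NumPy operations. This is an equivalent sketch of my own (not part of the original code); it computes the same averaged gradients in a single pass:

```python
import numpy as np

def step_gradient_vectorized(b_current, w_current, points, learningRate):
    # Same update as step_gradient, using array operations instead of a Python loop
    x = points[:, 0]
    y = points[:, 1]
    error = y - (w_current * x + b_current)
    b_gradient = -2.0 * np.mean(error)
    w_gradient = -2.0 * np.mean(x * error)
    new_b = b_current - learningRate * b_gradient
    new_w = w_current - learningRate * w_gradient
    return [new_b, new_w]
```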
- Iterative updates
Repeat the update step many times to arrive at the final result:
```python
def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    # Apply one full-batch gradient step per iteration
    for i in range(num_iterations):
        b, m = step_gradient(b, m, np.array(points), learning_rate)
    return [b, m]
```
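A small end-to-end sketch on synthetic data (my own example, assuming the functions above are already defined) shows the runner converging toward the true parameters:

```python
import numpy as np

# 100 points scattered around y = 2x + 1
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, size=100)
points = np.column_stack([x, y])

b, m = gradient_descent_runner(points, starting_b=0, starting_m=0,
                               learning_rate=0.001, num_iterations=10000)
print(b, m)  # should come out close to b = 1, m = 2
```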
Main function
```python
def run():
    points = np.genfromtxt("data.csv", delimiter=",")
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_m = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, m = {1}, error = {2}"
          .format(initial_b, initial_m,
                  compute_error_for_line_given_points(initial_b, initial_m, points)))
    print("Running...")
    [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
    print("After {0} iterations b = {1}, m = {2}, error = {3}"
          .format(num_iterations, b, m,
                  compute_error_for_line_given_points(b, m, points)))

if __name__ == '__main__':
    run()
```
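The script expects a data.csv with two comma-separated columns (x, y). If you do not have the original file, a synthetic one can be written with a sketch like this (the line y = 1.5x + 4 and the noise level are my own choices):

```python
import numpy as np

# Write a synthetic data.csv with points scattered around y = 1.5x + 4
rng = np.random.default_rng(42)
x = rng.uniform(20, 80, size=100)
y = 1.5 * x + 4.0 + rng.normal(0, 5.0, size=100)
np.savetxt("data.csv", np.column_stack([x, y]), delimiter=",")
```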
Summary
I typed this out by hand; I can reason about and build the overall framework, but there are still many details I did not think through carefully:
1. Why use the average loss instead of the raw sum? What are the benefits?
2. Why compute the average gradient over the whole dataset before updating the parameters, instead of updating after every single point? (See the per-sample sketch after this list.)
3. I had not thought about splitting the work into separate functions: isolating gradient computation from the iterative training loop makes the pieces easier to compose.
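On question 2: updating after every single point is also valid; it is the stochastic (per-sample) variant of gradient descent, whereas the code above is full-batch. A minimal sketch of the per-sample version (my own illustration, assuming the same (x, y) points layout):

```python
import numpy as np

def step_gradient_per_sample(b, w, points, learning_rate):
    # Update w and b right after seeing each point,
    # instead of averaging the gradient over the whole dataset first
    for i in range(len(points)):
        x = points[i, 0]
        y = points[i, 1]
        error = y - (w * x + b)
        b = b - learning_rate * (-2 * error)
        w = w - learning_rate * (-2 * x * error)
    return [b, w]
```

Per-sample updates are cheaper per step but noisier; the full-batch update is smoother but needs a complete pass over the data for every parameter change.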
Code:
- Variable names are inconsistent and easy to misread
- Function responsibilities were not thought through in full; pin down the main behaviour first, then refine the details
- Add suitable docstrings or short descriptions of what each function does, so the code is easier to revisit (see the sketch below)
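As an illustration of the last two bullets, a sketch of how the error function might look with clearer names and a docstring (the names here are my own choices, not from the original):

```python
import numpy as np

def mean_squared_error(intercept, slope, points):
    """Mean squared error of y = slope * x + intercept over an (N, 2) array of (x, y) points."""
    x = points[:, 0]
    y = points[:, 1]
    predictions = slope * x + intercept
    return float(np.mean((y - predictions) ** 2))
```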