Machine Learning (8) -- Nonlinear Regression

Latest recommended article published on 2023-11-15 22:03:27

License: CC 4.0 BY-SA

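This article shows how to generate simulated data in Python and fit it with linear regression, using gradient descent to minimize the loss function and find the weights of the best-fit line. It walks through the data generation, the loss computation, and the gradient descent weight update.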
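
For reference, the objective and update rule that the code implements can be written out as follows. This is a sketch in standard least-squares notation, not taken from the original post: $m$ corresponds to `itemsCont`, $X$ to `x`, $\theta$ to `theta`, and $\alpha$ to `alpha`; $x_i^{\top}$ denotes the $i$-th row of $X$.

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x_i^{\top}\theta - y_i \right)^2
\qquad
\nabla_{\theta} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( x_i^{\top}\theta - y_i \right) x_i
                          = \frac{1}{m} X^{\top} \left( X\theta - y \right)
\qquad
\theta \leftarrow \theta - \alpha \, \nabla_{\theta} J(\theta)
```

Note that each residual is multiplied by the input $x_i$, not by $\theta$, which is what `np.dot(xTran, loss)/itemsCont` computes in the code.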

import numpy as np
import random


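def genData(pointCont, bias, variance):
    """
    Generate points scattered near the line y = x + b, where b is bias.
    variance is a constant base offset added to every y (not a statistical
    variance); the total offset is this base offset plus uniform random noise.
    :param pointCont: number of points to generate
    :param bias: intercept b of the underlying line
    :param variance: constant base offset added to y
    :return: x: array of shape (pointCont, 2), with column 0 fixed at 1
             (intercept term) and column 1 holding the feature value;
             y: the target value for each point
    """
    x = np.zeros(shape=(pointCont, 2))
    y = np.zeros(shape=(pointCont))
    for i in range(0, pointCont):
        x[i][0] = 1  # constant column, so theta[0] acts as the intercept
        x[i][1] = i  # feature value
        y[i] = (i + bias) + random.uniform(0, 1) + variance
    return x, y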


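def gradientDescent(x, y, theta, alpha, itemsCont, iters):
    """
    Minimize the cost (m = itemsCont, h = x*theta):
        cost = sum(loss**2) / 2m
             = sum((h - y)**2) / 2m
             = sum((x*theta - y)**2) / 2m
    Gradient with respect to theta:
        D(cost) = sum(2 * (x*theta - y) * x) / 2m
                = sum(2 * loss * x) / 2m
                = sum(loss * x) / m
    :param x: input matrix, one row per sample
    :param y: target values
    :param theta: initial weights
    :param alpha: learning rate
    :param itemsCont: number of samples in the data set
    :param iters: number of iterations
    :return: the updated weights
    """
    xTran = np.transpose(x)
    for i in range(iters):
        hypothesis = np.dot(x, theta)   # predictions
        loss = hypothesis - y      # residuals
        # The cost function can be swapped for another; this is the simplest choice.
        # It is computed for reference only and does not feed into the update.
        cost = np.sum(loss**2)/(2*itemsCont)
        gradient = np.dot(xTran, loss)/itemsCont
        theta = theta - alpha*gradient
    return theta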


x, y = genData(100,25,10)
print(x, y)
theta = np.ones(2)
theta = gradientDescent(x, y, theta, 0.0005, 100, 10000)
print(theta)
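
One way to sanity-check the printed weights is to compare them with the closed-form least-squares solution. The sketch below is not part of the original post: it regenerates comparable data with a seeded NumPy generator (an assumption, chosen for reproducibility), reruns the same update rule, and uses `np.linalg.lstsq` as the reference. The names `cost`, `gradient_descent`, `theta_gd`, `theta_long`, and `theta_ls` are illustrative.

```python
import numpy as np


def cost(x, y, theta):
    """Half mean squared error, the same cost the post minimizes."""
    loss = x @ theta - y
    return np.sum(loss ** 2) / (2 * len(y))


def gradient_descent(x, y, theta, alpha, iters):
    """Batch gradient descent using the gradient x^T (x theta - y) / m."""
    m = len(y)
    for _ in range(iters):
        loss = x @ theta - y
        theta = theta - alpha * (x.T @ loss) / m
    return theta


# Data comparable to genData(100, 25, 10): y = i + 25 + 10 + uniform noise.
rng = np.random.default_rng(0)  # seeded generator: an assumption for repeatability
m = 100
features = np.arange(m, dtype=float)
x = np.column_stack([np.ones(m), features])  # column of ones for the intercept
y = features + 25 + 10 + rng.uniform(0, 1, size=m)

theta_gd = gradient_descent(x, y, np.ones(2), 0.0005, 10_000)     # the post's settings
theta_long = gradient_descent(x, y, np.ones(2), 0.0005, 100_000)  # ten times more steps
theta_ls, *_ = np.linalg.lstsq(x, y, rcond=None)                  # closed-form reference

print("gradient descent, 10k iters: ", theta_gd)
print("gradient descent, 100k iters:", theta_long)
print("least squares:               ", theta_ls)
```

Because the feature ranges up to 99 while the intercept column is constant, the intercept tends to converge much more slowly than the slope at this learning rate, so the 10,000-iteration result may still sit some distance from the least-squares intercept. Running more iterations, or rescaling the feature before fitting, are two common ways to narrow the gap.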