【机器学习】欠拟合与过拟合 02

最新推荐文章于 2025-05-16 13:17:09 发布

相洋同学

最新推荐文章于 2025-05-16 13:17:09 发布

阅读量647

点赞数 21

分类专栏：人工智能之机器学习篇文章标签：机器学习人工智能算法

本文链接：https://blog.youkuaiyun.com/CXY819394/article/details/135930921

版权

1 提升扩展阶数产生的现象

2 常用的正则化方法

2.1 正则化对权重的影响：以L2正则化为例

2.2 正则化对权重的影响：以L1正则化为例

2.3 三种正则化方法解决过拟合：

3 交叉验证调整超参数

#学习记录#

1 提升扩展阶数产生的现象

过拟合现象

通过之前的程序我们发现，使用多项式扩展完美的解决了欠拟合问题。如果我们使用更多阶的多项式，甚至可以将拟合度提高为1，但是问题来了，是否阶数越多越好呢？

我们通过代码来看一下：

from sklearn.model_selection import train_test_split
def true_fun(x):
    """数据分布的函数，用于生成由x -> y的映射。
    根据x数组中的每个元素，返回该元素的余弦计算值y。
    Parameters
    ----------
    x : array-like
    训练数据集。
    Returns
    -------
    y : array-like
    x对应的余弦映射结果。
    """
    return np.cos(1.5 * np.pi * x)
def fit_and_plot(model):
    """用来训练模型，并且绘制模型的拟合效果。
    Parameters
    ----------
    model : object
    模型对象。
    """
    np.random.seed(0)
    x = np.random.rand(50)
    # 在映射函数上，增加一定的误差（噪声）。这样更符合现实中数据的分布。
    # 误差服从正态分布。
    y = true_fun(x) + np.random.randn(len(x)) * 0.1
    X = x[:, np.newaxis]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
    random_state=0)
    model.fit(X_train, y_train)
    train_score = model.score(X_train, y_train)
    test_score = model.score(X_test, y_test)
    plt.scatter(X_train, y_train, c="g", label="训练集")
    plt.scatter(X_test, y_test, c="orange", marker="D", label="测试集")
    s = np.linspace(0, 1, 100).reshape(-1, 1)
    plt.plot(s, model.predict(s), c="r", label="拟合线")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.xlim((0, 1))
    plt.ylim((-2, 2))
    plt.legend(loc="upper center")
    plt.title(f"训练集：{train_score:.3f} 测试集：{test_score:.3f}")
# 定义多项式扩展的阶数。
degrees = [1, 3, 8, 15]
plt.figure(figsize=(15, 12))
for i, n in enumerate(degrees):
    plt.subplot(2, 2, i + 1)
    pipe = Pipeline([("poly", PolynomialFeatures(degree=n, include_bias=False)),("lr", LinearRegression())])
    fit_and_plot(pipe)
    # named_steps返回字典对象，提供流水线中每个步骤的名称（key）与对象（value）的映射。
    print(pipe.named_steps["lr"].coef_)

输出：

最低0.47元/天解锁文章