1. Common solutions to reduce the generalization error are listed as follows:
• Introduce a penalty for complexity via regularization (L1, L2; see the sketch after this list)
• Choose a simpler model with fewer parameters
• Reduce the dimensionality of the data
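To make the penalty idea concrete, here is a minimal sketch (the weight vector, base cost, and lambda_ below are made-up values for illustration) of how an L1 or L2 penalty term is added to an unregularized cost:

import numpy as np

# Made-up weight vector and unregularized cost, for illustration only
w = np.array([1.5, -0.3, 0.0, 2.1])
unregularized_cost = 0.8
lambda_ = 0.1  # regularization strength; in scikit-learn, C = 1/lambda

l1_penalty = lambda_ * np.sum(np.abs(w))  # L1: sum of absolute weights, encourages sparsity
l2_penalty = lambda_ * np.sum(w ** 2)     # L2: sum of squared weights, shrinks all weights

print('L1-regularized cost:', unregularized_cost + l1_penalty)
print('L2-regularized cost:', unregularized_cost + l2_penalty)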
2. Preprocessing
from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
# Data source: https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
df_wine = pd.read_csv('./datasets/wine/wine.data', header=None)
df_wine.columns = ['Class label', 'Alcohol', 'Malic acid', 'Ash',
                   'Alcalinity of ash', 'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins',
                   'Color intensity', 'Hue', 'OD280/OD315 of diluted wines',
                   'Proline']
print('Class labels', np.unique(df_wine['Class label']))
print('\nWine data excerpt:\n\n', df_wine.head())

# Separate features and class labels, then split into training and test sets
X, y = df_wine.iloc[:, 1:].values, df_wine.iloc[:, 0].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

print('Section: Bringing features onto the same scale')

# Min-max normalization (scales each feature to [0, 1])
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)

# Standardization (zero mean, unit variance per feature)
stdsc = StandardScaler()
X_train_std = stdsc.fit_transform(X_train)
X_test_std = stdsc.transform(X_test)
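As a quick sanity check (a minimal sketch that only reuses the arrays defined above), both scalers can be reproduced by hand: min-max normalization computes (x - min) / (max - min) per column, and standardization computes (x - mean) / std per column:

# Manual min-max normalization, column-wise
manual_norm = (X_train - X_train.min(axis=0)) / (X_train.max(axis=0) - X_train.min(axis=0))
print('Matches MinMaxScaler:', np.allclose(manual_norm, X_train_norm))

# Manual standardization, column-wise
manual_std = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)
print('Matches StandardScaler:', np.allclose(manual_std, X_train_std))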
3. Common ways to reduce overfitting: regularization and dimensionality reduction via feature selection
L1 regularization
Scikit-learn supports L1 regularization via the penalty parameter of LogisticRegression:
from sklearn.linear_model import LogisticRegression

# solver='liblinear' is needed because not all solvers support the L1 penalty
lr = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
lr.fit(X_train_std, y_train)
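Once the model is fit, the effect of the L1 penalty can be checked directly. The short sketch below (assuming the lr object from above) reports accuracy on both splits and counts how many weights were driven exactly to zero:

print('Training accuracy:', lr.score(X_train_std, y_train))
print('Test accuracy:', lr.score(X_test_std, y_test))
# lr.coef_ holds one weight vector per class; L1 sets many entries exactly to zero
print('Zero weights per class:', (lr.coef_ == 0).sum(axis=1))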
4. Plot the regularization path: weight coefficients for varying regularization strength C

fig = plt.figure()
ax = plt.subplot(111)
colors = ['blue', 'green', 'red', 'cyan', 'magenta', 'yellow', 'black',
          'pink', 'lightgreen', 'lightblue', 'gray', 'indigo', 'orange']

weights, params = [], []
# Use float exponents so 10**c stays a float for negative powers
for c in np.arange(-4., 6.):
    lr = LogisticRegression(penalty='l1', C=10**c, solver='liblinear', random_state=0)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])  # weight vector of the second class
    params.append(10**c)

weights = np.array(weights)
for column, color in zip(range(weights.shape[1]), colors):
    plt.plot(params, weights[:, column],
             label=df_wine.columns[column + 1], color=color)

plt.axhline(0, color='black', linestyle='--', linewidth=3)
plt.xlim([10**(-5), 10**5])
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.xscale('log')
plt.legend(loc='upper left')
ax.legend(loc='upper center', bbox_to_anchor=(1.38, 1.03), ncol=1, fancybox=True)
plt.show()
5. Analysis
The resulting plot provides further insight into the behavior of L1 regularization. As we can see, all feature weights are zero if we penalize the model with a strong regularization parameter (C < 0.1); C is the inverse of the regularization parameter λ.
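Because the L1 penalty drives many weights exactly to zero, the same sparsity can be used for feature selection, as mentioned in section 3. The following is a minimal sketch using scikit-learn's SelectFromModel; the threshold value is an assumption chosen for illustration:

from sklearn.feature_selection import SelectFromModel

# Keep only features whose L1-regularized weights are (effectively) non-zero
sfm = SelectFromModel(LogisticRegression(penalty='l1', C=0.1, solver='liblinear'),
                      threshold=1e-5)
sfm.fit(X_train_std, y_train)
X_train_selected = sfm.transform(X_train_std)
print('Selected features:', list(df_wine.columns[1:][sfm.get_support()]))
print('Reduced training set shape:', X_train_selected.shape)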
Reference: Python Machine Learning (Sebastian Raschka)