1. Common solutions to reduce the generalization error are listed as follows:
• Introduce a penalty for complexity via regularization (L1, L2; see the sketch after this list)
• Choose a simpler model with fewer parameters
• Reduce the dimensionality of the data
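To make the penalty idea concrete, here is a minimal sketch (the weight vector, base cost, and lambda_ below are made-up values for illustration) of how an L1 or L2 penalty term is added to an unregularized cost:

import numpy as np

# Made-up weight vector and unregularized cost, for illustration only
w = np.array([1.5, -0.3, 0.0, 2.1])
unregularized_cost = 0.8
lambda_ = 0.1  # regularization strength; in scikit-learn, C = 1/lambda

l1_penalty = lambda_ * np.sum(np.abs(w))  # L1: sum of absolute weights, encourages sparsity
l2_penalty = lambda_ * np.sum(w ** 2)     # L2: sum of squared weights, shrinks all weights

print('L1-regularized cost:', unregularized_cost + l1_penalty)
print('L2-regularized cost:', unregularized_cost + l2_penalty)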
2. Preprocessing
from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
# Data source: https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
df_wine = pd.read_csv('./datasets/wine/wine.data', header=None)
df_wine.columns = ['Class label', 'Alcohol', 'Malic acid', 'Ash',
                   'Alcalinity of ash', 'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins',
                   'Color intensity', 'Hue', 'OD280/OD315 of diluted wines',
                   'Proline']
print('Class labels', np.unique(df_wine['Class label']))
print('\nWine data excerpt:\n\n', df_wine.head())

# Separate features and class labels, then split into training and test sets
X, y = df_wine.iloc[:, 1:].values, df_wine.iloc[:, 0].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

print('Section: Bringing features onto the same scale')

# Min-max normalization (scales each feature to [0, 1])
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)

# Standardization (zero mean, unit variance per feature)
stdsc = StandardScaler()
X_train_std = stdsc.fit_transform(X_train)
X_test_std = stdsc.transform(X_test)
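As a quick sanity check (a minimal sketch that only reuses the arrays defined above), both scalers can be reproduced by hand: min-max normalization computes (x - min) / (max - min) per column, and standardization computes (x - mean) / std per column:

# Manual min-max normalization, column-wise
manual_norm = (X_train - X_train.min(axis=0)) / (X_train.max(axis=0) - X_train.min(axis=0))
print('Matches MinMaxScaler:', np.allclose(manual_norm, X_train_norm))

# Manual standardization, column-wise
manual_std = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)
print('Matches StandardScaler:', np.allclose(manual_std, X_train_std))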
3. Common ways to reduce overfitting: regularization and dimensionality reduction via feature selection
L1 regularization
Scikit-learn supports L1 regularization via the penalty parameter of LogisticRegression:
from sklearn.linear_model import LogisticRegression

# solver='liblinear' is needed because not all solvers support the L1 penalty
lr = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
lr.fit(X_train_std, y_train)
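Once the model is fit, the effect of the L1 penalty can be checked directly. The short sketch below (assuming the lr object from above) reports accuracy on both splits and counts how many weights were driven exactly to zero:

print('Training accuracy:', lr.score(X_train_std, y_train))
print('Test accuracy:', lr.score(X_test_std, y_test))
# lr.coef_ holds one weight vector per class; L1 sets many entries exactly to zero
print('Zero weights per class:', (lr.coef_ == 0).sum(axis=1))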
4. Plot the regularization path: weight coefficients for varying regularization strength C

fig = plt.figure()
ax = plt.subplot(111)
colors = ['blue', 'green', 'red', 'cyan', 'magenta', 'yellow', 'black',
          'pink', 'lightgreen', 'lightblue', 'gray', 'indigo', 'orange']

weights, params = [], []
# Use float exponents so 10**c stays a float for negative powers
for c in np.arange(-4., 6.):
    lr = LogisticRegression(penalty='l1', C=10**c, solver='liblinear', random_state=0)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])  # weight vector of the second class
    params.append(10**c)

weights = np.array(weights)
for column, color in zip(range(weights.shape[1]), colors):
    plt.plot(params, weights[:, column],
             label=df_wine.columns[column + 1], color=color)

plt.axhline(0, color='black', linestyle='--', linewidth=3)
plt.xlim([10**(-5), 10**5])
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.xscale('log')
plt.legend(loc='upper left')
ax.legend(loc='upper center', bbox_to_anchor=(1.38, 1.03), ncol=1, fancybox=True)
plt.show()
5. Analysis
The resulting plot provides further insight into the behavior of L1 regularization. As we can see, all feature weights are zero if we penalize the model with a strong regularization parameter (C < 0.1); C is the inverse of the regularization parameter λ.
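Because the L1 penalty drives many weights exactly to zero, the same sparsity can be used for feature selection, as mentioned in section 3. The following is a minimal sketch using scikit-learn's SelectFromModel; the threshold value is an assumption chosen for illustration:

from sklearn.feature_selection import SelectFromModel

# Keep only features whose L1-regularized weights are (effectively) non-zero
sfm = SelectFromModel(LogisticRegression(penalty='l1', C=0.1, solver='liblinear'),
                      threshold=1e-5)
sfm.fit(X_train_std, y_train)
X_train_selected = sfm.transform(X_train_std)
print('Selected features:', list(df_wine.columns[1:][sfm.get_support()]))
print('Reduced training set shape:', X_train_selected.shape)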
Reference: Python Machine Learning (Sebastian Raschka)