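Use sklearn.preprocessing.PolynomialFeatures to construct new features.
It generates polynomial combinations of the input features. Given two features a and b, the degree-2 expansion is (1, a, b, a^2, ab, b^2).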
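As a quick sanity check, the degree-2 expansion (1, a, b, a^2, ab, b^2) can be built by hand and compared with what PolynomialFeatures returns. This is only a minimal sketch: the sample values in X and the names manual and expanded are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two illustrative samples; column 0 plays the role of a, column 1 of b.
X = np.array([[2.0, 3.0],
              [4.0, 5.0]])
a, b = X[:, 0], X[:, 1]

# Build the columns (1, a, b, a^2, ab, b^2) by hand, in that order.
manual = np.column_stack([np.ones_like(a), a, b, a**2, a * b, b**2])

# The same expansion produced by scikit-learn.
expanded = PolynomialFeatures(degree=2).fit_transform(X)

print(manual)
print(np.allclose(manual, expanded))
```

If the two arrays agree, the second print shows True, which confirms the column order used by the transformer.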
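PolynomialFeatures has three main parameters:
degree: the maximum degree of the generated polynomial terms. The default is 2.
interaction_only: defaults to False. If set to True, terms in which a feature is multiplied by itself are dropped, so the degree-2 expansion above loses a^2 and b^2.
include_bias: defaults to True. If True, the output includes the constant term 1 shown above.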
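The effect of interaction_only and include_bias is easiest to see on a single sample. The sketch below assumes one made-up sample with a=2 and b=3; the variable names full, inter and no_bias are illustrative only.

```python
from sklearn.preprocessing import PolynomialFeatures

X = [[2, 3]]  # one illustrative sample: a=2, b=3

# Defaults: constant column plus every term up to degree 2 -> 1, a, b, a^2, ab, b^2
full = PolynomialFeatures(degree=2).fit_transform(X)

# interaction_only=True drops a^2 and b^2 -> 1, a, b, ab
inter = PolynomialFeatures(degree=2, interaction_only=True).fit_transform(X)

# include_bias=False drops the leading constant column -> a, b, a^2, ab, b^2
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

print(full)
print(inter)
print(no_bias)
```

The three outputs have 6, 4 and 5 columns respectively.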
from sklearn.preprocessing import PolynomialFeatures

# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out().
# .tolist() turns the returned array into a plain list for printing.

# Case 1: a single input feature
X_train = [[1],[2],[3],[4]]
quadratic_featurizer_2 = PolynomialFeatures(degree=2)
X_train_quadratic_2 = quadratic_featurizer_2.fit_transform(X_train)
print("feature names")
print(quadratic_featurizer_2.get_feature_names_out().tolist())
print(X_train_quadratic_2)
quadratic_featurizer_3 = PolynomialFeatures(degree=3)
X_train_quadratic_3 = quadratic_featurizer_3.fit_transform(X_train)
print("feature names")
print(quadratic_featurizer_3.get_feature_names_out().tolist())
print(X_train_quadratic_3)

# Case 2: two input features
X_train = [[1,3],[2,6],[3,7],[4,8]]
quadratic_featurizer_2 = PolynomialFeatures(degree=2)
X_train_quadratic_2 = quadratic_featurizer_2.fit_transform(X_train)
print("feature names")
print(quadratic_featurizer_2.get_feature_names_out().tolist())
print(X_train_quadratic_2)
quadratic_featurizer_3 = PolynomialFeatures(degree=3)
X_train_quadratic_3 = quadratic_featurizer_3.fit_transform(X_train)
print("feature names")
print(quadratic_featurizer_3.get_feature_names_out().tolist())
print(X_train_quadratic_3)
Output:
feature names
['1', 'x0', 'x0^2']
[[ 1.  1.  1.]
 [ 1.  2.  4.]
 [ 1.  3.  9.]
 [ 1.  4. 16.]]
feature names
['1', 'x0', 'x0^2', 'x0^3']
[[ 1.  1.  1.  1.]
 [ 1.  2.  4.  8.]
 [ 1.  3.  9. 27.]
 [ 1.  4. 16. 64.]]
feature names
['1', 'x0', 'x1', 'x0^2', 'x0 x1', 'x1^2']
[[ 1.  1.  3.  1.  3.  9.]
 [ 1.  2.  6.  4. 12. 36.]
 [ 1.  3.  7.  9. 21. 49.]
 [ 1.  4.  8. 16. 32. 64.]]
feature names
['1', 'x0', 'x1', 'x0^2', 'x0 x1', 'x1^2', 'x0^3', 'x0^2 x1', 'x0 x1^2', 'x1^3']
[[  1.   1.   3.   1.   3.   9.   1.   3.   9.  27.]
 [  1.   2.   6.   4.  12.  36.   8.  24.  72. 216.]
 [  1.   3.   7.   9.  21.  49.  27.  63. 147. 343.]
 [  1.   4.   8.  16.  32.  64.  64. 128. 256. 512.]]
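The outputs above have 3, 4, 6 and 10 columns. With the default settings (include_bias=True, interaction_only=False), the column count for n input features and degree d equals the number of monomials of total degree at most d, which is the binomial coefficient C(n + d, d). The sketch below checks this against the transformer; the helper n_poly_columns is a hypothetical name introduced only for this illustration.

```python
from math import comb
from sklearn.preprocessing import PolynomialFeatures

def n_poly_columns(n_features, degree):
    # Count of monomials with total degree <= degree in n_features variables,
    # including the constant term (hypothetical helper, not part of scikit-learn).
    return comb(n_features + degree, degree)

# The four (n_features, degree) combinations used in the example above.
for n_features, degree in [(1, 2), (1, 3), (2, 2), (2, 3)]:
    X = [[1.0] * n_features]  # a dummy sample; only its width matters here
    n_out = PolynomialFeatures(degree=degree).fit_transform(X).shape[1]
    print(n_features, degree, n_poly_columns(n_features, degree), n_out)
```

The count grows quickly with both n and d, so high degrees on wide inputs can produce very large feature matrices.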
Reference:
https://blog.youkuaiyun.com/tiange_xiao/article/details/79755793