【Abstract】
1. We can combine sklearn.MultiOutputRegressor with flaml.AutoML to train a regression model with multiple output variables. Started once, it runs an automated machine learning search for each target separately, which is what makes this approach convenient.
2. The code below creates random input data with 5 input features and 3 output variables.
from sklearn.datasets import make_regression
# number of target variables (n_targets) and input features (n_features)
n_targets = 3
X, y = make_regression(n_targets=n_targets, n_features=5)
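For reference, make_regression generates 100 samples by default when n_samples is not specified, so the arrays can be sanity-checked as follows (a minimal sketch):
# quick shape check: 100 samples (default), 5 features, 3 targets
print(X.shape)  # expected (100, 5)
print(y.shape)  # expected (100, 3)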
【Official Reference】
【Key Output】
# python test-flaml-mo.py
[flaml.automl.logger: 10-25 00:34:32] {1728} INFO - task = regression
[flaml.automl.logger: 10-25 00:34:32] {1739} INFO - Evaluation method: cv
[flaml.automl.logger: 10-25 00:34:32] {1838} INFO - Minimizing error metric: 1-r2
[flaml.automl.logger: 10-25 00:34:32] {1955} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth', 'sgd']
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 0, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2393} INFO - Estimated sufficient time budget=207s. Estimated necessary time budget=1s.
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.0s, estimator lgbm's best error=0.8370, best estimator lgbm's best error=0.8370
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 1, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.0s, estimator lgbm's best error=0.8370, best estimator lgbm's best error=0.8370
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 2, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.1s, estimator lgbm's best error=0.5070, best estimator lgbm's best error=0.5070...
[flaml.automl.logger: 10-25 00:34:44] {2258} INFO - iteration 125, current learner sgd
[flaml.automl.logger: 10-25 00:34:44] {2442} INFO - at 4.0s, estimator sgd's best error=0.0716, best estimator sgd's best error=0.0716
[flaml.automl.logger: 10-25 00:34:44] {2685} INFO - retrain sgd for 0.0s
[flaml.automl.logger: 10-25 00:34:44] {2688} INFO - retrained model: SGDRegressor(alpha=9.820909968561154e-05, learning_rate='optimal',
loss='epsilon_insensitive', penalty=None, tol=0.0001)
[flaml.automl.logger: 10-25 00:34:44] {1985} INFO - fit succeeded
[flaml.automl.logger: 10-25 00:34:44] {1986} INFO - Time taken to find the best model: 3.348128318786621
first result: [[-68.01898084 114.18330859 76.12119675]]
R2 Score: 0.8804327929345547
(MSE )Mean Squared Error : 2064.864234989927
(RMSE)Root Mean Squared Error: 45.440777226956925
(MAE )Mean Absolute Error : 26.20185200402915
R2 Score 0: 0.8857770110995447
(MSE )Mean Squared Error 0 : 637.5601445655168
(RMSE)Root Mean Squared Error 0: 25.249953357689925
(MAE )Mean Absolute Error 0 : 17.936946049090224
R2 Score 1: 0.8838947860589346
(MSE )Mean Squared Error 1 : 3615.804943742679
(RMSE)Root Mean Squared Error 1: 60.131563622964926
(MAE )Mean Absolute Error 1 : 35.68283164689059
R2 Score 2: 0.8716265816451845
(MSE )Mean Squared Error 2 : 1941.2276166615861
(RMSE)Root Mean Squared Error 2: 44.05936468744853
(MAE )Mean Absolute Error 2 : 24.985778316106625
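Note that the overall scores above are consistent with the per-target scores: with their default multioutput='uniform_average' setting, sklearn's r2_score and mean_squared_error simply average the per-target values, which can be verified directly:
# the overall R2 and MSE are the unweighted means of the three per-target values
print((0.8857770110995447 + 0.8838947860589346 + 0.8716265816451845) / 3)  # ~0.8804, matches the overall R2
print((637.5601445655168 + 3615.804943742679 + 1941.2276166615861) / 3)    # ~2064.86, matches the overall MSE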
【Example with Overall Evaluation and Per-Output-Variable Evaluation】
from flaml import AutoML
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
import numpy as np
import math
# number of target variables (n_targets) and input features (n_features)
n_targets = 3
X, y = make_regression(n_targets=n_targets, n_features=5)
# split into train and test data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)
first_row = X_test[0,:]
first_col = X_test[:,0]
# train the model
model = MultiOutputRegressor(AutoML(task="regression", time_budget=4))
model.fit(X_train, y_train)
# predict
y_pred = model.predict(X_test)
# single prediction on the first test row
npa = np.array([first_row])
print('first result:', model.predict(npa))
# overall evaluation (averaged across all targets)
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = math.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
print(f"R2 Score: {r2}")
print(f"(MSE )Mean Squared Error : {mse}")
print(f"(RMSE)Root Mean Squared Error: {rmse}")
print(f"(MAE )Mean Absolute Error : {mae}")
print()
# per-target (per-column) evaluation
for i in range(n_targets):
    yi_test = y_test[:, i]
    yi_pred = y_pred[:, i]
    r2 = r2_score(yi_test, yi_pred)
    mse = mean_squared_error(yi_test, yi_pred)
    rmse = math.sqrt(mse)
    mae = mean_absolute_error(yi_test, yi_pred)
    print(f"R2 Score {i}: {r2}")
    print(f"(MSE )Mean Squared Error {i} : {mse}")
    print(f"(RMSE)Root Mean Squared Error {i}: {rmse}")
    print(f"(MAE )Mean Absolute Error {i} : {mae}")
    print()
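Since MultiOutputRegressor fits one AutoML instance per target, the per-target search results can be inspected after fit through its estimators_ attribute, e.g. (a small sketch; best_estimator and best_config are FLAML AutoML attributes in recent releases, so verify against the installed version):
# each fitted AutoML instance is kept in model.estimators_, one per output variable
for i, automl in enumerate(model.estimators_):
    print(f"target {i}: best learner = {automl.best_estimator}, best config = {automl.best_config}")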
【Full Output】
# python test-flaml-mo.py
[flaml.automl.logger: 10-25 00:34:32] {1728} INFO - task = regression
[flaml.automl.logger: 10-25 00:34:32] {1739} INFO - Evaluation method: cv
[flaml.automl.logger: 10-25 00:34:32] {1838} INFO - Minimizing error metric: 1-r2
[flaml.automl.logger: 10-25 00:34:32] {1955} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth', 'sgd']
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 0, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2393} INFO - Estimated sufficient time budget=207s. Estimated necessary time budget=1s.
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.0s, estimator lgbm's best error=0.8370, best estimator lgbm's best error=0.8370
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 1, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.0s, estimator lgbm's best error=0.8370, best estimator lgbm's best error=0.8370
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 2, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.1s, estimator lgbm's best error=0.5070, best estimator lgbm's best error=0.5070
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 3, current learner sgd
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.1s, estimator sgd's best error=0.8813, best estimator lgbm's best error=0.5070
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 4, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.1s, estimator lgbm's best error=0.3054, best estimator lgbm's best error=0.3054
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 5, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.1s, estimator lgbm's best error=0.3054, best estimator lgbm's best error=0.3054
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 6, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.1s, estimator lgbm's best error=0.2864, best estimator lgbm's best error=0.2864
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 7, current learner xgboost
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.2s, estimator xgboost's best error=0.7272, best estimator lgbm's best error=0.2864
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 8, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.2s, estimator lgbm's best error=0.2864, best estimator lgbm's best error=0.2864
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 9, current learner lgbm
[flaml.automl.logger: 10-25 00:34:32] {2442} INFO - at 0.2s, estimator lgbm's best error=0.2864, best estimator lgbm's best error=0.2864
[flaml.automl.logger: 10-25 00:34:32] {2258} INFO - iteration 10, current learner lgbm
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.2s, estimator lgbm's best error=0.2021, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 11, current learner extra_tree
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.4s, estimator extra_tree's best error=0.6197, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 12, current learner rf
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.6s, estimator rf's best error=0.6211, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 13, current learner xgboost
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.6s, estimator xgboost's best error=0.7272, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 14, current learner xgboost
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.6s, estimator xgboost's best error=0.4619, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 15, current learner xgboost
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.7s, estimator xgboost's best error=0.4619, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 16, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.7s, estimator sgd's best error=0.8813, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 17, current learner lgbm
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.7s, estimator lgbm's best error=0.2021, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 18, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.7s, estimator sgd's best error=0.2627, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 19, current learner lgbm
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.8s, estimator lgbm's best error=0.2021, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 20, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.8s, estimator sgd's best error=0.2627, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 21, current learner xgboost
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 0.8s, estimator xgboost's best error=0.4619, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 22, current learner extra_tree
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.0s, estimator extra_tree's best error=0.4389, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 23, current learner lgbm
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.0s, estimator lgbm's best error=0.2021, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 24, current learner lgbm
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.0s, estimator lgbm's best error=0.2021, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 25, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.0s, estimator sgd's best error=0.2627, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 26, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.0s, estimator sgd's best error=0.2627, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 27, current learner rf
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.2s, estimator rf's best error=0.4799, best estimator lgbm's best error=0.2021
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 28, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.2s, estimator sgd's best error=0.1883, best estimator sgd's best error=0.1883
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 29, current learner sgd
[flaml.automl.logger: 10-25 00:34:33] {2442} INFO - at 1.2s, estimator sgd's best error=0.1532, best estimator sgd's best error=0.1532
[flaml.automl.logger: 10-25 00:34:33] {2258} INFO - iteration 30, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.2s, estimator sgd's best error=0.1532, best estimator sgd's best error=0.1532
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 31, current learner xgboost
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.3s, estimator xgboost's best error=0.4619, best estimator sgd's best error=0.1532
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 32, current learner lgbm
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.3s, estimator lgbm's best error=0.2021, best estimator sgd's best error=0.1532
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 33, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.3s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 34, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.3s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 35, current learner xgboost
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.4s, estimator xgboost's best error=0.3604, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 36, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.4s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 37, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.4s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 38, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.4s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 39, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.4s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 40, current learner extra_tree
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.6s, estimator extra_tree's best error=0.4389, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 41, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.6s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 42, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.6s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 43, current learner lgbm
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.6s, estimator lgbm's best error=0.2021, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 44, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.6s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 45, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.7s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 46, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.7s, estimator sgd's best error=0.1300, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 47, current learner rf
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.8s, estimator rf's best error=0.4799, best estimator sgd's best error=0.1300
[flaml.automl.logger: 10-25 00:34:34] {2258} INFO - iteration 48, current learner sgd
[flaml.automl.logger: 10-25 00:34:34] {2442} INFO - at 1.8s, estimator sgd's best error=0.1248, best estimator sgd's best error=0.1248
[flaml.autom