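Machine learning workflow

1. Collect data with a web crawler.
2. Import the data with pandas: read_csv, read_excel, read_json.
3. Take a first look at the data:
   .describe, .shape, .head
   values and categories
   rough visualization
4. Separate train_set and test_set:
   1. train_test_split, with 80% going to train_set
   2. stratified split
5. Preprocessing:
   1. Outliers
      1. ellipse
      2. RobustScaler
      3. IQR
   2. Missing data
      1. drop
      2. Imputer (SimpleImputer in current scikit-learn)
   3. Data scaling
      1. MinMaxScaler
      2. StandardScaler
      3. Normalizer
   4. Discretize continuous values with digitize
6. Cross-validation: testing_set and validation_set.
7. Train ML models and compare the scores of the different models.
8. Tune the hyperparameters of each model: GridSearchCV, RandomizedSearchCV.
9. Evaluate on the test set.
10. Launch.

If the evaluation on the test set is not good enough, try new features_combinations and start again from preprocessing.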
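The steps from the first look at the data through the test-set evaluation can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the workflow's actual code: the toy DataFrame, the column names `x1`, `x2`, `label`, the bin edges, and the choice of LogisticRegression are all hypothetical, and SimpleImputer stands in for the older Imputer class named in the outline.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the crawled dataset
# (in practice this would come from pd.read_csv / read_excel / read_json).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["label"] = (df["x1"] + df["x2"] > 0).astype(int)
df.loc[::25, "x2"] = np.nan  # inject a few missing values

# First look at the data.
print(df.shape)
print(df.head())
print(df.describe())

# Stratified split: 80% train_set, 20% test_set.
X, y = df[["x1", "x2"]], df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, stratify=y, random_state=0
)

# Outliers: keep training rows whose x1 lies within 1.5 * IQR of the quartiles.
q1, q3 = X_train["x1"].quantile([0.25, 0.75])
iqr = q3 - q1
inlier = X_train["x1"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
X_train, y_train = X_train[inlier], y_train[inlier]

# Discretize a continuous column with np.digitize (bin edges are arbitrary here).
bins = np.array([-1.0, 0.0, 1.0])
x1_bin = np.digitize(X_train["x1"], bins)

# Missing data and scaling live inside a Pipeline, so each cross-validation
# fold fits the imputer and scaler on its own training part only.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])

# Hyperparameter tuning with 5-fold cross-validation on the training set.
search = GridSearchCV(pipe, {"model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Final evaluation on the held-out test set.
test_score = search.score(X_test, y_test)
print(search.best_params_, test_score)
```

Splitting before computing the IQR bounds, the imputation median, and the scaling statistics keeps information from the test set out of the preprocessing. RandomizedSearchCV could replace GridSearchCV here when the parameter grid is too large to search exhaustively.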