从图片到x_train,y_train的过程汇总

1、这个是从.npz 到x_train,y_train,至于从jpg到npz以后再总结:

(x_tain,y_train),(x_test,y_test) = mnist.loda_data('mnist.npz')
if K.image_data_format() == 'channels first':
	x_train = x_train.reshape([batch_size,1,img_rows,img_cols])  #step 1
	x_test = x_test.reshape([batch_size,1,img_rows,img_cols])
	input_shape = [1,img_rows,img_cols]
else:
	x_train = x_train.reshape([batch_size,img_rows,img_cols,1])
	x_test = x_test.reshape([batch_size,img_rows,img_cols,1])
	input_shape = [img_rows,img_cols,1]

x_train = x_train.astype('float32')     #step 2  GPU对单精度的计算性能更好
x_test = x_test.astype('float32')     #更正,第一遍写的这两块的先后顺序弄反了

x_train /=255
x_test /=255       #step 3



y_train = keras.utils.to_catagorical(y_train,num_classes)
y_test = keras.utils.to_categorical(y_test,num_classes)

得到了x_train ,y_train 就可以作为model.fit(x_train,y_train )的输入了
逻辑:以模型为中心,首先了解模型需要的输入,Keras 这里的model.fit()方法 就是需要x_train,y_train,shape 是 [总样本数,img_rows,img_cols,通道数],注意这个方法里的是总样本数,要实

对不同模型的超参数进行网格搜索和交叉验证,选取不同模型最佳的超参数 def ANN(X_train, X_test, y_train, y_test, StandScaler=None): if StandScaler: scaler = StandardScaler() # 标准化转换 X_train = scaler.fit_transform(X_train) X_test = scaler.transform(X_test) clf = MLPClassifier(activation='relu', alpha=1e-05, batch_size='auto', beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08, hidden_layer_sizes=(10, 8), learning_rate='constant', learning_rate_init=0.001, max_iter=200, momentum=0.9, nesterovs_momentum=True, power_t=0.5, random_state=1, shuffle=True, solver='lbfgs', tol=0.0001, validation_fraction=0.1, verbose=False, warm_start=False) clf.fit(X_train, y_train.ravel()) y_pred = clf.predict(X_test) # 获取预测值 y_scores = clf.predict_proba(X_test) # 获取预测概率 return y_pred, y_scores def SVM(X_train, X_test, y_train, y_test): from sklearn.svm import SVC from sklearn.metrics import accuracy_score model = SVC(probability=True) # 确保启用概率估计 model.fit(X_train, y_train) y_pred = model.predict(X_test) y_prob = model.predict_proba(X_test) return y_pred def PLS_DA(X_train, X_test, y_train, y_test): # 将y_train转换为one-hot编码形式 y_train_one_hot = pd.get_dummies(y_train) # 创建PLS模型 model = PLSRegression(n_components=2) model.fit(X_train, y_train_one_hot) # 预测测试集 y_pred = model.predict(X_test) # 将预测结果转换为数值标签 y_pred_labels = np.array([np.argmax(i) for i in y_pred]) # 尝试计算近似概率值 # 通过将预测结果归一化到 [0, 1] 区间 y_scores = y_pred / np.sum(y_pred, axis=1, keepdims=True) return y_pred, y_scores def RF(X_train, X_test, y_train, y_test): """ 使用随机森林模型进行分类,并返回预测值和预测概率。 参数: X_train: 训练集特征 X_test: 测试集特征 y_train: 训练集标签 y_test: 测试集标签(用于评估,但不直接用于模型训练) 返回: y_pred: 预测的类别标签 y_prob: 预测的概率 """ # 创建随机森林模型 RF = Rand
03-25
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值