Deep Learning Notes: TensorFlow 2.2.0 Code Practice (Lesson 2)

Article link: https://blog.youkuaiyun.com/zdswyh123/article/details/106242648

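This post is a set of deep learning code exercises based on TensorFlow 2.2.0, following the KGP Talkie tutorial. It covers the whole process from data processing to building an ANN, including feature standardization, setting up the input and hidden layers, and training and evaluating the model. Links to the tutorial and the data are provided so readers can follow along.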


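Before the main text:
This post follows directly on from the previous one,
Deep Learning Notes: TensorFlow 2.2.0 Code Practice (Lesson 1).
It is mainly code practice for TensorFlow 2.0, following KGP Talkie's hands-on advanced [TensorFlow 2.0] tutorial, and it fixes some of the code in the tutorial that no longer works.
This post follows Lesson 2 of that video series, a very popular YouTube course: the hands-on advanced [TensorFlow 2.0] tutorial (Chinese and English subtitles plus code walkthrough).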

Data needed for the course: https://pan.baidu.com/s/1Lpo3l3UaPANOGE_HGJf2TQ
Extraction code: dqo4
Note: the data file must be placed in the Jupyter working directory.

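How to build your first ANN

1 Process the data
2 Build the input layer
3 Randomly initialize the input weights W
4 Build the hidden layers
5 Choose the optimizer, loss, and accuracy metric
6 Compile the model
7 Train the model with model.fit
8 Evaluate the model
9 Tune the model if needed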

# Import TensorFlow and the Keras building blocks (via the public tensorflow.keras API)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Import the data-handling packages
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split  # splits the data into training and test sets

dataset = pd.read_csv('customer_Churn_Modelling.csv')  # load the data; the CSV must sit in the same directory as this notebook

dataset.head()  # preview the first rows

	RowNumber	CustomerId	Surname	CreditScore	Geography	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	Exited
0	1	15634602	Hargrave	619	France	Female	42	2	0.00	1	1	1	101348.88	1
1	2	15647311	Hill	608	Spain	Female	41	1	83807.86	1	0	1	112542.58	0
2	3	15619304	Onio	502	France	Female	42	8	159660.80	3	1	0	113931.57	1
3	4	15701354	Boni	699	France	Female	39	1	0.00	2	0	0	93826.63	0
4	5	15737888	Mitchell	850	Spain	Female	43	2	125510.82	1	1	1	79084.10	0

X = dataset.drop(labels=['CustomerId', 'Surname', 'RowNumber', 'Exited'], axis=1)  # drop the ID columns and the target; what remains are the features
y = dataset['Exited']  # the target labels

X.head()

	CreditScore	Geography	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary
0	619	France	Female	42	2	0.00	1	1	1	101348.88
1	608	Spain	Female	41	1	83807.86	1	0	1	112542.58
2	502	France	Female	42	8	159660.80	3	1	0	113931.57
3	699	France	Female	39	1	0.00	2	0	0	93826.63
4	850	Spain	Female	43	2	125510.82	1	1	1	79084.10

y.head()

0    1
1    0
2    1
3    0
4    0
Name: Exited, dtype: int64

# Encode the categorical columns
# Convert the strings in Geography (country) and Gender to numbers
from sklearn.preprocessing import LabelEncoder
label1 = LabelEncoder()
X['Geography'] = label1.fit_transform(X['Geography'])  # encode the country names as integers

X.head()

	CreditScore	Geography	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary
0	619	0	Female	42	2	0.00	1	1	1	101348.88
1	608	2	Female	41	1	83807.86	1	0	1	112542.58
2	502	0	Female	42	8	159660.80	3	1	0	113931.57
3	699	0	Female	39	1	0.00	2	0	0	93826.63
4	850	2	Female	43	2	125510.82	1	1	1	79084.10

label2 = LabelEncoder()
X['Gender'] = label2.fit_transform(X['Gender'])  # encode gender as integers, using its own encoder

X.head()

	CreditScore	Geography	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary
0	619	0	0	42	2	0.00	1	1	1	101348.88
1	608	2	0	41	1	83807.86	1	0	1	112542.58
2	502	0	0	42	8	159660.80	3	1	0	113931.57
3	699	0	0	39	1	0.00	2	0	0	93826.63
4	850	2	0	43	2	125510.82	1	1	1	79084.10
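
To see what LabelEncoder is doing here, a minimal sketch on a toy list may help. The encoder sorts the distinct labels and numbers them in that order, which is consistent with France becoming 0 and Spain becoming 2 above. The country list below is made up for illustration, and Germany as the middle category is an assumption, since it does not appear in the rows shown.

```python
from sklearn.preprocessing import LabelEncoder

# Toy data for illustration only; 'Germany' is an assumed third category
countries = ['France', 'Spain', 'Germany', 'France']

encoder = LabelEncoder()
codes = encoder.fit_transform(countries)

# classes_ holds the sorted labels; a label's position is its integer code
mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}
print(mapping)
print(codes.tolist())
```

The integer codes suggest an ordering between countries that has no real meaning, which is presumably why Geography is turned into 0/1 indicator columns with pd.get_dummies afterwards.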

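# One-hot encode Geography into 0/1 indicator columns: 1 if the customer is from that country, 0 otherwise (drop_first=True drops the first category's column)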
X = pd.get_dummies(X, drop_first=True, columns=['Geography'])
X.head(30)

	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	Geography_1	Geography_2
0	619	0	42	2	0.00	1	1	1	101348.88	0	0
1	608	0	41	1	83807.86	1	0	1	112542.58	0	1
2	502	0	42	8	159660.80	3	1	0	113931.57	0	0
3	699	0	39	1	0.00	2	0	0	93826.63	0	0
4	850	0	43	2	125510.82	1	1	1	79084.10	0	1
5	645	1	44	8	113755.78	2	1	0	149756.71	0	1
6	822	1	50	7	0.00	2	1	1	10062.80	0	0
7	376	0	29	4	115046.74	4	1	0	119346.88	1	0
8	501	1	44	4	142051.07	2	0	1	74940.50	0	0
9	684	1	27	2	134603.88	1	1	1	71725.73	0	0
10	528	1	31	6	102016.72	2	0	0	80181.12	0	0
11	497	1	24	3	0.00	2	1	0	76390.01	0	1
12	476	0	34	10	0.00	2	1	0	26260.98	0	0
13	549	0	25	5	0.00	2	0	0	190857.79	0	0
14	635	0	35	7	0.00	2	1	1	65951.65	0	1
15	616	1	45	3	143129.41	2	0	1	64327.26	1	0
16	653	1	58	1	132602.88	1	1	0	5097.67	1	0
17	549	0	24	9	0.00	2	1	1	14406.41	0	1
18	587	1	45	6	0.00	1	0	0	158684.81	0	1
19	726	0	24	6	0.00	2	1	1	54724.03	0	0
20	732	1	41	8	0.00	2	1	1	170886.17	0	0
21	636	0	32	8	0.00	2	1	0	138555.46	0	1
22	510	0	38	4	0.00	1	1	0	118913.53	0	1
23	669	1	46	3	0.00	2	0	1	8487.75	0	0
24	846	0	38	5	0.00	1	1	1	187616.16	0	0
25	577	1	25	3	0.00	2	0	1	124508.29	0	0
26	756	1	36	2	136815.64	1	1	1	170041.95	1	0
27	571	1	44	9	0.00	2	0	0	38433.35	0	0
28	574	0	43	3	141349.43	1	1	1	100187.43	1	0
29	411	1	29	0	59697.17	2	1	1	53483.21	0	0
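
As a small illustration of what drop_first=True does, the sketch below one-hot encodes a toy column holding the three codes 0, 1 and 2. The toy frame is an assumption for illustration, not the course data. The column for code 0 is dropped, so a row with both indicators at 0 stands for code 0.

```python
import pandas as pd

# Toy frame for illustration: three category codes, as a label encoder would produce
toy = pd.DataFrame({'Geography': [0, 2, 1, 0]})

# drop_first=True removes the Geography_0 column; code 0 is implied when the others are 0
dummies = pd.get_dummies(toy, columns=['Geography'], drop_first=True).astype(int)
print(dummies)
```

The .astype(int) call is only there so the result prints as 0/1 regardless of pandas version, since newer releases return boolean columns.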

Feature standardization

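# Use scikit-learn's built-in preprocessing tools
from sklearn.preprocessing import StandardScaler

# Hold out 20% as the test set; random_state fixes the shuffle so the split is reproducible,
# and stratify=y keeps the class ratio of y in both splits to avoid an unbalanced split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn mean and std on the training set and standardize it
X_test = scaler.transform(X_test)  # standardize the test set with the training statistics, without refitting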
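
A common convention is to fit the scaler on the training set and reuse its statistics for the test set, so that both are scaled with exactly the same numbers and nothing about the test set leaks into preprocessing. A minimal sketch of that idea, using made-up single-feature arrays rather than the course data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up single-feature data for illustration
train = np.array([[1.0], [2.0], [3.0], [4.0]])
test = np.array([[2.5], [10.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean_ and scale_ from train
test_scaled = scaler.transform(test)        # reuses the training mean and std

# The same result by hand: z = (x - mean_train) / std_train
manual = (test - train.mean(axis=0)) / train.std(axis=0)
print(test_scaled.ravel(), manual.ravel())
```

Calling fit_transform on the test set instead would recompute the mean and standard deviation from the test rows, so the two sets would no longer share one scale.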

y_test

1344    1
8167    0
4747    0
5004    1
3124    1
       ..
9107    0
8249    0
8337    0
6279    1
412     0
Name: Exited, Length: 2000, dtype: int64
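
What stratify=y does can be checked on synthetic labels. The sketch below assumes a made-up label vector with 20% positives (not the course data) and a placeholder feature column, and shows that a 20% test split keeps that ratio in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels for illustration: 2000 positives out of 10000
labels = np.array([1] * 2000 + [0] * 8000)
features = np.arange(len(labels)).reshape(-1, 1)  # placeholder feature column

_, _, y_tr, y_te = train_test_split(
    features, labels, test_size=0.2, random_state=0, stratify=labels
)

print(len(y_te), y_tr.mean(), y_te.mean())
```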

Building the ANN

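model = Sequential()  # sequential model
model.add(Dense(X.shape[1], activation='relu', input_dim=X.shape[1]))  # first layer; X.shape[1] is the number of features, used here as both the input size and the layer width
model.add(Dense(128, activation='relu'))  # hidden layer
model.add(Dense(1, activation='sigmoid'))  # output layer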

WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
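
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand: a Dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases. The sketch assumes 11 input features, which matches the 11 columns of X shown earlier; the helper name dense_params is made up for this example.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

n_features = 11  # assumed: the number of columns in X after encoding
layer_sizes = [n_features, 128, 1]  # units of the three Dense layers

total = 0
n_in = n_features
for n_out in layer_sizes:
    count = dense_params(n_in, n_out)
    print(f"{n_in:>3} -> {n_out:>3}: {count} parameters")
    total += count
    n_in = n_out

print("total:", total)
```

Calling model.summary() on the model is a quick way to cross-check these counts.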

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])  # Adam (a stochastic gradient-based optimizer), binary cross-entropy loss, accuracy as the metric
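
For reference, binary cross-entropy averages -[y*log(p) + (1-y)*log(1-p)] over the samples, where p is the predicted probability of class 1. Below is a minimal pure-Python sketch with made-up predictions; the function name and the clipping constant eps are choices made for this example, not a description of Keras internals.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident and correct predictions give a small loss
good = binary_crossentropy([1, 0], [0.9, 0.1])
# Confident but wrong predictions are penalized heavily
bad = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(good, 4), round(bad, 4))
```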

model.fit(X_train,y_train.to_numpy(),batch_size=10,epochs=10,verbose=1)

WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 1/10
8000/8000 [==============================] - 1s 94us/sample - loss: 0.4515 - acc: 0.8049
Epoch 2/10
8000/8000 [==============================] - 1s 80us/sample - loss: 0.4185 - acc: 0.8202
Epoch 3/10
8000/8000 [==============================] - 1s 80us/sample - loss: 0.4057 - acc: 0.8324
Epoch 4/10
8000/8000 [==============================] - 1s 77us/sample - loss: 0.3752 - acc: 0.8431
Epoch 5/10
8000/8000 [==============================] - 1s 79us/sample - loss: 0.3507 - acc: 0.8571
Epoch 6/10
8000/8000 [==============================] - 1s 78us/sample - loss: 0.3415 - acc: 0.8591
Epoch 7/10
8000/8000 [==============================] - 1s 79us/sample - loss: 0.3363 - acc: 0.8620
Epoch 8/10
8000/8000 [==============================] - 1s 84us/sample - loss: 0.3345 - acc: 0.8619
Epoch 9/10
8000/8000 [==============================] - 1s 74us/sample - loss: 0.3328 - acc: 0.8602
Epoch 10/10
8000/8000 [==============================] - 1s 74us/sample - loss: 0.3302 - acc: 0.8626





<tensorflow.python.keras.callbacks.History at 0x1d77c75d248>
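
With batch_size=10 and the 8000 training samples reported in the log, the number of weight updates per epoch follows from simple arithmetic. A tiny sketch, with the numbers copied from the call and log above:

```python
import math

n_samples = 8000   # training samples, from the log above
batch_size = 10
epochs = 10

steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```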

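y_pred = (model.predict(X_test) > 0.5).astype('int32')  # threshold the sigmoid output at 0.5; same result as model.predict_classes(X_test), which works in TensorFlow 2.2.0 but was removed in later releases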
y_pred

array([[0],
       [0],
       [0],
       ...,
       [0],
       [1],
       [0]])

y_test

1344    1
8167    0
4747    0
5004    1
3124    1
       ..
9107    0
8249    0
8337    0
6279    1
412     0
Name: Exited, Length: 2000, dtype: int64

model.evaluate(X_test, y_test.to_numpy())  # measure the trained model's loss and accuracy on the test set

2000/2000 [==============================] - 0s 34us/sample - loss: 0.3583 - acc: 0.8535





[0.3583366745710373, 0.8535]

# Another way to compute the accuracy
from sklearn.metrics import confusion_matrix, accuracy_score
confusion_matrix(y_test,y_pred)

array([[1525,   68],
       [ 225,  182]], dtype=int64)

accuracy_score(y_test,y_pred)

0.8535
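
The accuracy can also be read straight off the confusion matrix, which additionally shows how the errors split between the two classes. The sketch below reuses the four counts printed above and follows scikit-learn's layout, where rows are true labels and columns are predictions; treating class 1 (Exited) as the positive class is an assumption of the example.

```python
# Counts copied from the confusion matrix above
tn, fp = 1525, 68
fn, tp = 225, 182

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # of the predicted exits, how many were real
recall = tp / (tp + fn)     # of the real exits, how many were caught

print(f"accuracy={accuracy:.4f} precision={precision:.3f} recall={recall:.3f}")
```

On these counts the model catches fewer than half of the customers who actually left, even though overall accuracy is about 85%, so accuracy alone may look better than the model really is on the minority class.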