A Complete Hands-On Guide to Artificial Intelligence: From Theoretical Foundations to Industrial-Grade Applications

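1. Overview of Artificial Intelligence: Definition, History, and the Modern Landscape

1.1 Definition and Classification of Artificial Intelligence

Artificial Intelligence (AI) refers to the theories, methods, technologies, and application systems that humans build to simulate, extend, and expand human intelligence.

1.1.1 The Three Categories of AI

Type	Definition	Characteristics	Current status
Weak AI (Narrow AI)	Performs very well on specific tasks	Focused on a single domain, no general intelligence	Current mainstream (nearly all deployed AI applications)
Strong AI (General AI)	Has human-level general intelligence	Can understand, learn, and reason about any task	Theoretical stage (not yet achieved)
Superintelligent AI (Super AI)	Exceeds human intelligence	Outperforms humans in every domain	Science-fiction concept

1.1.2 The AI Technology Stack at a Glance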

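graph TD
    A[Artificial Intelligence] --> B[Machine Learning]
    A --> C[Deep Learning]
    A --> D[Natural Language Processing]
    A --> E[Computer Vision]
    A --> F[Reinforcement Learning]
    A --> G[Knowledge Graphs]

    B --> B1[Supervised Learning]
    B --> B2[Unsupervised Learning]
    B --> B3[Semi-supervised Learning]
    B --> B4[Reinforcement Learning]

    C --> C1[Convolutional Neural Networks]
    C --> C2[Recurrent Neural Networks]
    C --> C3[Transformer]
    C --> C4[Generative Adversarial Networks]

    D --> D1[Text Classification]
    D --> D2[Sentiment Analysis]
    D --> D3[Machine Translation]
    D --> D4[Question Answering]

    E --> E1[Image Classification]
    E --> E2[Object Detection]
    E --> E3[Image Segmentation]
    E --> E4[Face Recognition]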

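1.2 A Brief History of Artificial Intelligence

1950: Turing proposes the "Turing test", laying the theoretical groundwork for AI
1956: Dartmouth workshop, where the term "artificial intelligence" is formally coined
1980s: Expert systems rise, the first wave of AI commercialization
1997: IBM Deep Blue defeats world chess champion Garry Kasparov
2012: AlexNet achieves a breakthrough result in the ImageNet competition, opening the deep learning era
2016: AlphaGo defeats Go world champion Lee Sedol
2020 to present: The large-model era, with pretrained models such as GPT and BERT driving the AI revolution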

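1.3 The Modern AI Industry Landscape

1.3.1 Main Technical Directions

Field	Core technologies	Application scenarios	Representative companies
Computer vision	CNN, YOLO, Transformer	Face recognition, autonomous driving, medical imaging	SenseTime, Megvii, Baidu
Natural language processing	BERT, GPT, T5	Intelligent customer service, machine translation, content generation	OpenAI, Google, Alibaba
Speech technology	RNN, WaveNet, Whisper	Speech recognition, speech synthesis, smart speakers	iFLYTEK, Apple, Amazon
Recommender systems	Collaborative filtering, deep learning	E-commerce recommendation, content distribution, ad delivery	ByteDance, Tencent, Netflix

1.3.2 AI Talent Demand Analysis

According to 2025 recruitment data, the skill requirements for AI-related roles are:

Role	Required skills	Bonus skills	Average salary
Algorithm engineer	Python, PyTorch/TensorFlow, mathematical foundations	Large models, distributed training	¥35,000/month
Data scientist	Python, SQL, statistics, machine learning	Deep learning, A/B testing	¥28,000/month
AI product manager	Product design, understanding of AI technology, data analysis	Technical background, project management	¥25,000/month
MLOps engineer	Docker, Kubernetes, CI/CD, monitoring	Model deployment, performance optimization	¥32,000/month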

2. Mathematical Foundations of Artificial Intelligence

2.1 Linear Algebra

2.1.1 Vector and Matrix Operations

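import numpy as np
import matplotlib.pyplot as plt

# Vector operations
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Dot (inner) product
dot_product = np.dot(v1, v2)  # 1*4 + 2*5 + 3*6 = 32

# Cross (vector) product of the two 3D vectors
# (np.cross on 2D vectors is deprecated in NumPy 2.0, so the full vectors are used)
cross_product = np.cross(v1, v2)  # [-3, 6, -3]

# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix multiplication
C = A @ B

# Transpose
A_T = A.T

# Inverse (A must be square and non-singular)
A_inv = np.linalg.inv(A)

# Eigenvalues and eigenvectors
eigenvals, eigenvecs = np.linalg.eig(A)

print(f"Dot product: {dot_product}")
print(f"Cross product: {cross_product}")
print(f"Matrix product:\n{C}")
print(f"Eigenvalues: {eigenvals}")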

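2.1.2 Eigenvalue Decomposition and Singular Value Decomposition

Eigenvalue decomposition (EVD), for a diagonalizable square matrix:

$$A = Q \Lambda Q^{-1}$$

where $Q$ is the matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues.

Singular value decomposition (SVD):

$$A = U \Sigma V^T$$

where $U$ and $V$ are orthogonal matrices and $\Sigma$ is the diagonal matrix of singular values.

Applications: PCA dimensionality reduction, recommender systems, image compression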
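To make the compression use case concrete, here is a minimal sketch of a rank-k approximation built from the SVD. The matrix `A`, the seed, and the choice `k = 2` are illustrative assumptions, not values from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))  # hypothetical data matrix

# Thin SVD: U is (6, 4), s holds the 4 singular values, Vt is (4, 4)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Using all singular values reconstructs A (up to rounding)
A_full = U @ np.diag(s) @ Vt

# Keeping only the k largest singular values gives a rank-k approximation
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The Frobenius error equals the norm of the discarded singular values
err = np.linalg.norm(A - A_k)
print(f"rank-{k} approximation error: {err:.4f}")
print(f"norm of discarded singular values: {np.sqrt(np.sum(s[k:] ** 2)):.4f}")
```

PCA and SVD-based image compression rely on the same idea: the leading singular directions carry most of the energy, so the trailing ones can be dropped at a known cost.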

2.2 Probability and Statistics

2.2.1 Bayes' Theorem

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
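As a small worked example of the formula, the sketch below computes the probability that an email is spam given that it contains a certain word. All three input probabilities are made-up illustrative numbers.

```python
# Hypothetical inputs (assumptions for illustration only)
p_spam = 0.2              # P(spam): prior probability of spam
p_word_given_spam = 0.6   # P(word | spam)
p_word_given_ham = 0.05   # P(word | not spam)

# Law of total probability gives the denominator P(word)
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(f"P(spam | word) = {p_spam_given_word:.2f}")
```

Seeing the word raises the spam probability from the prior of 0.2 to 0.12 / 0.16 = 0.75.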

Application example: spam filtering

import numpy as np

# A simple naive Bayes spam classifier
class NaiveBayesSpamFilter:
    def __init__(self):
        self.spam_word_probs = {}
        self.ham_word_probs = {}
        self.p_spam = 0.5  # prior probability of spam

    def train(self, emails, labels):
        """Train the classifier (label 1 = spam, 0 = ham; both classes must be present)."""
        spam_emails = [email for email, label in zip(emails, labels) if label == 1]
        ham_emails = [email for email, label in zip(emails, labels) if label == 0]

        # Estimate the prior from the class frequencies
        self.p_spam = len(spam_emails) / len(emails)

        # Count word frequencies per class
        spam_words = self._extract_words(spam_emails)
        ham_words = self._extract_words(ham_emails)

        total_spam_words = sum(spam_words.values())
        total_ham_words = sum(ham_words.values())

        # Conditional probabilities with Laplace (add-one) smoothing
        vocab = set(spam_words.keys()) | set(ham_words.keys())
        for word in vocab:
            spam_count = spam_words.get(word, 0)
            ham_count = ham_words.get(word, 0)

            self.spam_word_probs[word] = (spam_count + 1) / (total_spam_words + len(vocab))
            self.ham_word_probs[word] = (ham_count + 1) / (total_ham_words + len(vocab))

    def predict(self, email):
        """Return 1 if the email is predicted to be spam, else 0."""
        words = self._tokenize(email)

        # Work with log posteriors to avoid numerical underflow
        log_p_spam = np.log(self.p_spam)
        log_p_ham = np.log(1 - self.p_spam)

        for word in words:
            # Words outside the training vocabulary are skipped
            if word in self.spam_word_probs:
                log_p_spam += np.log(self.spam_word_probs[word])
                log_p_ham += np.log(self.ham_word_probs[word])

        return 1 if log_p_spam > log_p_ham else 0

    def _extract_words(self, emails):
        """Count word occurrences across a list of emails."""
        word_count = {}
        for email in emails:
            words = self._tokenize(email)
            for word in words:
                word_count[word] = word_count.get(word, 0) + 1
        return word_count

    def _tokenize(self, text):
        """Naive tokenization: lowercase and split on whitespace."""
        return text.lower().split()

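2.2.2 Probability Distributions

Distribution	Probability density / mass function	Application scenarios
Normal	$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$	Error analysis, feature standardization
Bernoulli	$P(X=1) = p,\; P(X=0) = 1-p$	Binary classification
Poisson	$P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$	Event counts (e.g. website visits)
Exponential	$f(x) = \lambda e^{-\lambda x}$	Waiting times, reliability analysis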
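A quick way to get a feel for these four distributions is to sample from them and compare the empirical mean with the theoretical one. The parameter values and sample size below are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical parameters
mu, sigma = 2.0, 3.0   # normal
p = 0.3                # Bernoulli
lam_poisson = 4.0      # Poisson rate
lam_exp = 0.5          # exponential rate

samples = {
    "normal": rng.normal(mu, sigma, size=n),
    "bernoulli": rng.binomial(1, p, size=n),                      # Bernoulli = binomial with 1 trial
    "poisson": rng.poisson(lam_poisson, size=n),
    "exponential": rng.exponential(scale=1.0 / lam_exp, size=n),  # NumPy takes scale = 1 / rate
}

# Theoretical means: mu, p, lambda, and 1 / lambda respectively
expected_means = {
    "normal": mu,
    "bernoulli": p,
    "poisson": lam_poisson,
    "exponential": 1.0 / lam_exp,
}

for name, values in samples.items():
    print(f"{name:12s} empirical mean = {values.mean():.3f}, theoretical = {expected_means[name]:.3f}")
```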

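2.3 Calculus and Optimization

2.3.1 Gradient Descent

Gradient: the direction in which a function increases fastest at a given point (gradient descent therefore steps in the opposite direction)

$$\nabla f(x) = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right]$$

Gradient descent update rule:

$$\theta_{t+1} = \theta_t - \alpha \nabla J(\theta_t)$$

where $\alpha$ is the learning rate and $J(\theta)$ is the loss function.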

import numpy as np

# Gradient descent implemented from scratch
def gradient_descent(X, y, learning_rate=0.01, epochs=1000):
    """
    Gradient descent for linear regression
    X: feature matrix (m, n)
    y: target vector (m,)
    """
    m, n = X.shape
    theta = np.random.randn(n)  # random parameter initialization
    cost_history = []

    for epoch in range(epochs):
        # Predictions with the current parameters
        y_pred = X.dot(theta)

        # Loss (mean squared error)
        cost = np.mean((y_pred - y) ** 2)
        cost_history.append(cost)

        # Gradient of the MSE with respect to theta
        gradient = (2/m) * X.T.dot(y_pred - y)

        # Parameter update
        theta -= learning_rate * gradient

        if epoch % 100 == 0:
            print(f"Epoch {epoch}, Cost: {cost:.4f}")

    return theta, cost_history

# Example usage
np.random.seed(42)
X = np.random.randn(100, 2)
y = 3 * X[:, 0] + 2 * X[:, 1] + np.random.randn(100) * 0.1

# Add the bias (intercept) column
X_b = np.c_[np.ones((100, 1)), X]

theta, costs = gradient_descent(X_b, y, learning_rate=0.1, epochs=1000)
print(f"Learned parameters: {theta}")

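2.3.2 Evolution of Optimization Algorithms

Algorithm	Characteristics	Typical use
SGD	Simple, memory-efficient	Large datasets
Momentum	Adds a momentum term, reduces oscillation	Deep networks
RMSprop	Adaptive learning rate	Non-stationary objectives
Adam	Combines Momentum and RMSprop	General-purpose default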
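The update rules behind two of these optimizers can be sketched in a few lines of NumPy. This is a minimal illustration on a toy quadratic loss, not a replacement for a library optimizer; the loss function, starting point, and hyperparameters are all assumptions chosen for the example.

```python
import numpy as np

def loss(theta):
    # Toy ill-conditioned quadratic: J(theta) = 0.5 * (10 * theta_0^2 + theta_1^2)
    return 0.5 * (10.0 * theta[0] ** 2 + theta[1] ** 2)

def grad(theta):
    return np.array([10.0 * theta[0], theta[1]])

def sgd_momentum(theta, lr=0.05, beta=0.9, steps=300):
    """Heavy-ball momentum: the velocity accumulates past gradients."""
    velocity = np.zeros_like(theta)
    for _ in range(steps):
        velocity = beta * velocity + grad(theta)
        theta = theta - lr * velocity
    return theta

def adam(theta, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: momentum on the gradient plus a per-parameter RMS scaling."""
    m = np.zeros_like(theta)  # first-moment estimate (Momentum part)
    v = np.zeros_like(theta)  # second-moment estimate (RMSprop part)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

theta0 = np.array([1.0, 1.0])
theta_momentum = sgd_momentum(theta0)
theta_adam = adam(theta0)
print(f"start loss:    {loss(theta0):.4f}")
print(f"momentum loss: {loss(theta_momentum):.6f}")
print(f"adam loss:     {loss(theta_adam):.6f}")
```

Both variants reuse the plain gradient; they differ only in how the step is shaped, which is why frameworks expose them as interchangeable optimizer objects.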

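3. Machine Learning Fundamentals

3.1 Basic Concepts of Machine Learning

3.1.1 Supervised vs. Unsupervised Learning

Type	Input	Output	Goal	Examples
Supervised learning	Features + labels	Predicted labels	Minimize prediction error	House price prediction, image classification
Unsupervised learning	Features only	Patterns / structure	Discover the intrinsic structure of the data	Clustering, dimensionality reduction
Reinforcement learning	States	Actions	Maximize cumulative reward	Game AI, robot control

3.1.2 The Machine Learning Workflow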

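flowchart TD
    A[Problem definition] --> B[Data collection]
    B --> C[Data preprocessing]
    C --> D[Feature engineering]
    D --> E[Model selection]
    E --> F[Model training]
    F --> G[Model evaluation]
    G --> H{Performance acceptable?}
    H -->|No| I[Model tuning]
    I --> F
    H -->|Yes| J[Model deployment]
    J --> K[Monitoring and maintenance]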
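The preprocessing, training, tuning, and evaluation stages of this workflow map fairly directly onto a scikit-learn `Pipeline`. The following is a minimal sketch on the iris dataset; the step names (`scale`, `clf`), the choice of logistic regression, and the small grid over `C` are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Data collection
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Preprocessing + model selection bundled into one estimator
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Training and tuning: cross-validated search over the regularization strength
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Evaluation on data that was never used for fitting or tuning
test_accuracy = search.score(X_test, y_test)
print(f"Best parameters: {search.best_params_}")
print(f"Test accuracy: {test_accuracy:.4f}")
```

Keeping the scaler inside the pipeline means it is refit on each training fold, so no statistics leak from validation data into training.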

3.2 Data Preprocessing

3.2.1 Data Cleaning

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Create example data
df = pd.DataFrame({
    'age': [25, 30, np.nan, 35, 40],
    'income': [50000, 60000, 55000, np.nan, 70000],
    'category': ['A', 'B', 'A', 'C', np.nan],
    'target': [0, 1, 0, 1, 1]
})

print("Raw data:")
print(df)
print(f"\nMissing values per column:\n{df.isnull().sum()}")

# Handle missing values
df_cleaned = df.copy()

# Numerical columns: fill with the median
# (plain assignment; chained fillna(..., inplace=True) triggers warnings in recent pandas)
df_cleaned['age'] = df_cleaned['age'].fillna(df_cleaned['age'].median())
df_cleaned['income'] = df_cleaned['income'].fillna(df_cleaned['income'].median())

# Categorical columns: fill with the mode
df_cleaned['category'] = df_cleaned['category'].fillna(df_cleaned['category'].mode()[0])

# Handle outliers (IQR method)
def remove_outliers_iqr(df, column):
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    return df[(df[column] >= lower_bound) & (df[column] <= upper_bound)]

# df_cleaned = remove_outliers_iqr(df_cleaned, 'income')

print("\nCleaned data:")
print(df_cleaned)

3.2.2 Feature Engineering

from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

# Continues from the data-cleaning example above (uses df_cleaned)

# Feature encoding
# (LabelEncoder imposes an arbitrary order; OneHotEncoder is usually safer for nominal features)
le = LabelEncoder()
df_cleaned['category_encoded'] = le.fit_transform(df_cleaned['category'])

# Feature scaling
scaler = StandardScaler()
numerical_features = ['age', 'income']
df_cleaned[numerical_features] = scaler.fit_transform(df_cleaned[numerical_features])

# Feature selection
X = df_cleaned[['age', 'income', 'category_encoded']]
y = df_cleaned['target']

selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

selected_features = X.columns[selector.get_support()]
print(f"Selected features: {selected_features}")

# Dimensionality reduction
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print(f"PCA explained variance ratio: {pca.explained_variance_ratio_}")

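3.3 Classic Machine Learning Algorithms

3.3.1 Linear Regression

Mathematical formulation:

$$y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n + \epsilon$$

Loss function (mean squared error):

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$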
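As a worked step that is not spelled out in the text, the gradient of this loss follows directly once we assume the linear hypothesis $h_\theta(x) = \theta^T x$ and stack the samples into a design matrix $X$ (one row per sample, with a leading column of ones for $\theta_0$):

```latex
\begin{aligned}
J(\theta) &= \frac{1}{2m}\sum_{i=1}^{m}\left(\theta^T x^{(i)} - y^{(i)}\right)^2
           = \frac{1}{2m}\,\lVert X\theta - y \rVert^2 \\
\frac{\partial J}{\partial \theta_j} &= \frac{1}{m}\sum_{i=1}^{m}\left(\theta^T x^{(i)} - y^{(i)}\right) x_j^{(i)} \\
\nabla J(\theta) &= \frac{1}{m}\, X^T (X\theta - y)
\end{aligned}
```

The factor $\frac{1}{2}$ exists only to cancel the 2 produced by differentiation. The gradient descent code in section 2.3.1 uses the plain mean squared error without that factor, which is why its gradient carries $\frac{2}{m}$ instead of $\frac{1}{m}$.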

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Generate example data
np.random.seed(42)
X = np.random.randn(100, 2)
y = 3 * X[:, 0] + 2 * X[:, 1] + np.random.randn(100) * 0.1

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
lr = LinearRegression()
lr.fit(X_train, y_train)

# Predict
y_pred = lr.predict(X_test)

# Evaluate
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Coefficients: {lr.coef_}")
print(f"Intercept: {lr.intercept_:.4f}")
print(f"MSE: {mse:.4f}")
print(f"R²: {r2:.4f}")

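3.3.2 Logistic Regression

Mathematical formulation (sigmoid function):

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$

Loss function (log loss):

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$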
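The two formulas above translate almost symbol for symbol into NumPy. The sketch below evaluates the log loss for a hand-picked parameter vector and checks it against scikit-learn's `log_loss`; the tiny dataset and the values in `theta` are assumptions made up for the example.

```python
import numpy as np
from sklearn.metrics import log_loss

def sigmoid(z):
    """h_theta(x) = 1 / (1 + exp(-theta^T x)), applied elementwise to z = X @ theta."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(theta, X, y):
    """Average log loss J(theta) over all m samples."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Hypothetical data: the first column of ones plays the role of the intercept term
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, -0.3]])
y = np.array([1, 0, 1, 0])
theta = np.array([0.1, 0.8])

loss = logistic_loss(theta, X, y)
print(f"Manual log loss:  {loss:.6f}")
print(f"sklearn log_loss: {log_loss(y, sigmoid(X @ theta)):.6f}")
```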

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Generate binary classification data
X, y = make_classification(n_samples=1000, n_features=4, n_redundant=0, 
                          n_informative=4, random_state=42, n_clusters_per_class=1)

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train logistic regression
lr = LogisticRegression(random_state=42)
lr.fit(X_train, y_train)

# Predict labels and class probabilities
y_pred = lr.predict(X_test)
y_pred_proba = lr.predict_proba(X_test)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification report:")
print(classification_report(y_test, y_pred))

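3.3.3 Decision Trees and Random Forests

Decision trees: recursively split the data on the feature that maximizes information gain (or a related impurity criterion such as Gini)

Random forests: an ensemble of many decision trees that improves generalization through bagging and random feature selection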
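Information gain is the drop in label entropy produced by a split. A minimal sketch of the computation follows; the labels, the single feature, and the two candidate thresholds are invented for illustration.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, mask):
    """Entropy before the split minus the size-weighted entropy of the two children."""
    n = len(labels)
    left, right = labels[mask], labels[~mask]
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - weighted

# Hypothetical data: one numeric feature and a binary label
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
feature = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5])

gain_perfect = information_gain(labels, feature < 2.75)  # separates the classes cleanly
gain_poor = information_gain(labels, feature < 1.25)     # isolates a single sample

print(f"Gain for threshold 2.75: {gain_perfect:.3f} bits")
print(f"Gain for threshold 1.25: {gain_poor:.3f} bits")
```

A tree learner evaluates many such candidate splits and greedily keeps the one with the highest gain, then repeats the process inside each child node.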

import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the data
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision tree
dt = DecisionTreeClassifier(random_state=42)
dt.fit(X_train, y_train)
dt_pred = dt.predict(X_test)
dt_accuracy = accuracy_score(y_test, dt_pred)

# Random forest
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_test)
rf_accuracy = accuracy_score(y_test, rf_pred)

print(f"Decision tree accuracy: {dt_accuracy:.4f}")
print(f"Random forest accuracy: {rf_accuracy:.4f}")

# Feature importances
feature_importance = rf.feature_importances_
feature_names = iris.feature_names

plt.figure(figsize=(10, 6))
plt.bar(feature_names, feature_importance)
plt.title('Random Forest Feature Importances')
plt.xlabel('Feature')
plt.ylabel('Importance')
plt.show()

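3.3.4 Support Vector Machines

Core idea: find the separating hyperplane with the maximum margin

Kernel functions: implicitly map data that is not linearly separable into a higher-dimensional space where a linear separator may exist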
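To show what a kernel actually computes, here is a small sketch of the RBF (Gaussian) kernel, $k(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$, written by hand and compared with scikit-learn's `rbf_kernel`. The random points and `gamma = 0.5` are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf_kernel_manual(X, Y, gamma):
    """Pairwise RBF kernel matrix between the rows of X and the rows of Y."""
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x.y, computed for all pairs at once
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 hypothetical points in 3 dimensions
gamma = 0.5

K_manual = rbf_kernel_manual(X, X, gamma)
K_sklearn = rbf_kernel(X, X, gamma=gamma)
print(np.round(K_manual, 3))
```

The SVM only ever needs these pairwise similarities, never the high-dimensional coordinates themselves, which is what makes the kernel trick computationally cheap.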

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import make_circles
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Generate data that is not linearly separable
X, y = make_circles(n_samples=1000, noise=0.1, factor=0.2, random_state=42)

# Standardize
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# SVMs with different kernels
kernels = ['linear', 'poly', 'rbf', 'sigmoid']
results = {}

for kernel in kernels:
    svm = SVC(kernel=kernel, random_state=42)
    svm.fit(X_train, y_train)
    y_pred = svm.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    results[kernel] = accuracy
    print(f"{kernel} kernel accuracy: {accuracy:.4f}")

# Visualize the RBF kernel result
svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train, y_train)

# Build an evaluation grid in the scaled feature space
xx, yy = np.meshgrid(np.linspace(-3, 3, 500), np.linspace(-3, 3, 500))
Z = svm_rbf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
plt.title('Original data')

plt.subplot(1, 2, 2)
plt.contourf(xx, yy, Z, levels=50, cmap='RdYlBu', alpha=0.7)
# Plot the scaled points: the model was trained in the scaled space
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, cmap='viridis')
plt.title('SVM decision function (RBF kernel)')
plt.show()

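3.4 Model Evaluation and Validation

3.4.1 Evaluation Metrics

Task	Metric	Formula	When to use
Regression	MSE	$\frac{1}{m}\sum_i (y_i - \hat{y}_i)^2$	General-purpose regression evaluation
	R²	$1 - \frac{SS_{res}}{SS_{tot}}$	Explanatory power of the model
Binary classification	Accuracy	$\frac{TP + TN}{TP + TN + FP + FN}$	Balanced datasets
	Precision	$\frac{TP}{TP + FP}$	When false positives matter
	Recall	$\frac{TP}{TP + FN}$	When false negatives matter
	F1-score	$2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$	Balancing precision and recall
Multiclass	Macro-F1	Arithmetic mean of the per-class F1 scores	All classes equally important
	Micro-F1	F1 computed from the global TP, FP, FN	Imbalanced data (every sample counts equally, so frequent classes dominate)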
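The binary classification formulas in the table can be checked by counting the confusion-matrix cells directly. The label vectors below are a made-up example; the manual values are compared with the corresponding scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical ground truth and predictions
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])

# Confusion-matrix cells
tp = int(np.sum((y_true == 1) & (y_pred == 1)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}, f1={f1:.2f}")
```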

3.4.2 Cross-Validation

from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# X, y: the feature matrix and labels from the preceding examples

# Stratified K-fold cross-validation
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Evaluate a random forest
rf = RandomForestClassifier(n_estimators=100, random_state=42)
cv_scores = cross_val_score(rf, X, y, cv=skf, scoring='accuracy')

print(f"Cross-validation accuracy: {cv_scores}")
print(f"Mean accuracy: {cv_scores.mean():.4f} ± {cv_scores.std():.4f}")

# Compare several models with cross-validation
models = {
    'Logistic Regression': LogisticRegression(random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'SVM': SVC(kernel='rbf', random_state=42)
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=skf, scoring='accuracy')
    results[name] = scores
    print(f"{name}: {scores.mean():.4f} ± {scores.std():.4f}")

3.4.3 Learning Curves and Validation Curves

import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve, validation_curve

# Learning curve
def plot_learning_curve(estimator, X, y, title="Learning Curve"):
    # shuffle=True is needed for random_state to have any effect; it also
    # avoids single-class training subsets when the data is sorted by label
    train_sizes, train_scores, val_scores = learning_curve(
        estimator, X, y, cv=5, n_jobs=-1, 
        train_sizes=np.linspace(0.1, 1.0, 10), shuffle=True, random_state=42
    )
    
    train_mean = np.mean(train_scores, axis=1)
    train_std = np.std(train_scores, axis=1)
    val_mean = np.mean(val_scores, axis=1)
    val_std = np.std(val_scores, axis=1)
    
    plt.figure(figsize=(10, 6))
    plt.plot(train_sizes, train_mean, 'o-', color='blue', label='Training Score')
    plt.fill_between(train_sizes, train_mean - train_std, train_mean + train_std, alpha=0.1, color='blue')
    
    plt.plot(train_sizes, val_mean, 'o-', color='red', label='Validation Score')
    plt.fill_between(train_sizes, val_mean - val_std, val_mean + val_std, alpha=0.1, color='red')
    
    plt.xlabel('Training Set Size')
    plt.ylabel('Accuracy')
    plt.title(title)
    plt.legend()
    plt.grid(True)
    plt.show()

# Plot the learning curve of a random forest (X, y from the preceding examples)
plot_learning_curve(RandomForestClassifier(n_estimators=100, random_state=42), X, y)

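4. Deep Learning Fundamentals

4.1 Neural Network Basics

4.1.1 Perceptrons and Multilayer Perceptrons

Perceptron (single-layer neural network):

$$y = f(w^T x + b)$$

where $f$ is the activation function (e.g. a step function).

Multilayer perceptron (MLP):

$$
\begin{aligned}
h^{(1)} &= f\left(W^{(1)} x + b^{(1)}\right) \\
h^{(2)} &= f\left(W^{(2)} h^{(1)} + b^{(2)}\right) \\
&\;\;\vdots \\
y &= f\left(W^{(L)} h^{(L-1)} + b^{(L)}\right)
\end{aligned}
$$

4.1.2 Activation Functions

Activation	Formula	Advantages	Drawbacks
Sigmoid	$\sigma(x) = \frac{1}{1 + e^{-x}}$	Output in (0, 1), probabilistic interpretation	Vanishing gradients, output not zero-centered
Tanh	$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$	Zero-centered, stronger gradients	Vanishing gradients
ReLU	$\mathrm{ReLU}(x) = \max(0, x)$	Cheap to compute, mitigates vanishing gradients	Dead neurons
Leaky ReLU	$\mathrm{LReLU}(x) = \max(0.01x, x)$	Addresses dead neurons	The negative slope needs tuning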
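The four activations in the table are short enough to write out directly. This NumPy sketch mirrors the formulas; the function names and the sample input `x` are illustrative choices, and in practice the framework's built-in versions would be used.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Written out from the definition; np.tanh is the numerically safer choice
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # For negative inputs slope * x is larger than x, so a small gradient survives
    return np.maximum(slope * x, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for fn in (sigmoid, tanh, relu, leaky_relu):
    print(f"{fn.__name__:10s} {np.round(fn(x), 4)}")
```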

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# An MLP implemented in PyTorch
class MLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, output_size)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)
    
    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)  # raw logits; CrossEntropyLoss applies softmax internally
        return x

# Train the MLP
def train_mlp(X_train, y_train, X_test, y_test, epochs=100):
    # Convert to PyTorch tensors
    X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
    y_train_tensor = torch.tensor(y_train, dtype=torch.long)
    X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
    y_test_tensor = torch.tensor(y_test, dtype=torch.long)
    
    # Create the model, loss, and optimizer
    model = MLP(input_size=X_train.shape[1], hidden_size=64, output_size=len(np.unique(y_train)))
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Training loop (full-batch)
    train_losses = []
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        
        outputs = model(X_train_tensor)
        loss = criterion(outputs, y_train_tensor)
        loss.backward()
        optimizer.step()
        
        train_losses.append(loss.item())
        
        if epoch % 20 == 0:
            model.eval()  # disables dropout for evaluation
            with torch.no_grad():
                test_outputs = model(X_test_tensor)
                _, predicted = torch.max(test_outputs, 1)
                accuracy = (predicted == y_test_tensor).sum().item() / len(y_test_tensor)
                print(f"Epoch {epoch}, Loss: {loss.item():.4f}, Test Accuracy: {accuracy:.4f}")
    
    return model, train_losses

# Use the iris dataset
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model, losses = train_mlp(X_train, y_train, X_test, y_test, epochs=200)

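4.2 Convolutional Neural Networks

4.2.1 Basic Principles of CNNs

Convolutional layers: extract local features

$$(I * K)(i, j) = \sum_{m}\sum_{n} I(i + m,\, j + n)\, K(m, n)$$

Pooling layers: reduce spatial dimensions and add some invariance to small shifts

Max pooling: keeps the most salient feature
Average pooling: smooths the features

Fully connected layers: make the classification decision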
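The convolution formula and max pooling can be sketched without a deep learning framework. The code below is a deliberately naive, loop-based version for a single-channel image with no padding and stride 1; the image, the kernel, and the helper names are assumptions for illustration. The formula as written is technically a cross-correlation (the kernel is not flipped), so the result is checked against SciPy's `correlate2d`.

```python
import numpy as np
from scipy.signal import correlate2d

def conv2d_valid(image, kernel):
    """Slide the kernel over the image: out[i, j] = sum_m sum_n I[i+m, j+n] * K[m, n]."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling (odd trailing rows/columns are dropped)."""
    h, w = x.shape
    trimmed = x[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(25, dtype=float).reshape(5, 5)  # hypothetical 5x5 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])      # simple diagonal-difference filter

feature_map = conv2d_valid(image, kernel)  # shape (4, 4)
pooled = max_pool_2x2(feature_map)         # shape (2, 2)
print(feature_map)
print(pooled)
```

Framework layers such as `nn.Conv2d` compute the same sliding-window sum, only vectorized, batched, and across many channels and filters at once.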

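4.2.2 Evolution of CNN Architectures

Architecture	Key innovation	Year
LeNet-5	One of the earliest practical CNN architectures	1998
AlexNet	ReLU, Dropout, GPU training	2012
VGGNet	Stacks of small convolution kernels	2014
GoogLeNet	Inception modules	2014
ResNet	Residual connections	2015
EfficientNet	Compound scaling	2019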

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.models as models
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader

# Data preprocessing
# (ImageNet mean/std would match the pretrained weights more closely)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load the CIFAR-10 dataset
train_dataset = CIFAR10(root='./data', train=True, download=True, transform=transform)
test_dataset = CIFAR10(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Use a pretrained ResNet18 (transfer learning)
# The weights= argument replaces the deprecated pretrained=True (torchvision >= 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace the final fully connected layer to fit CIFAR-10's 10 classes
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)

# Training configuration
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop (simplified)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(10):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    
    for i, (inputs, labels) in enumerate(train_loader):
        inputs, labels = inputs.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
        # Report and reset the running statistics every 100 batches
        if i % 100 == 99:
            print(f'Epoch {epoch+1}, Batch {i+1}, Loss: {running_loss/100:.4f}, Accuracy: {100*correct/total:.2f}%')
            running_loss = 0.0
            correct = 0
            total = 0

# Evaluate on the test set
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Test Accuracy: {100 * correct / total:.2f}%')