A Complete Hands-On AI Tutorial: From Theoretical Foundations to Industrial-Grade Applications

Part 1. Artificial Intelligence Overview: Definition, History, and the Modern Landscape

1.1 Definition and Classification of AI

Artificial Intelligence (AI) refers to the theories, methods, technologies, and application systems created by humans to simulate, extend, and augment human intelligence.

1.1.1 Three Categories of AI

| Type | Definition | Characteristics | Current Status |
|------|------------|-----------------|----------------|
| Narrow AI | Excels at specific tasks | Focused on a single domain; no general intelligence | The current mainstream (99% of AI applications) |
| General AI (AGI) | Human-level general intelligence | Can understand, learn, and reason across any task | Theoretical stage (not yet achieved) |
| Super AI | Intelligence beyond the human level | Outperforms humans in every domain | Science-fiction concept |
1.1.2 The AI Technology Stack at a Glance
graph TD
    A[Artificial Intelligence] --> B[Machine Learning]
    A --> C[Deep Learning]
    A --> D[Natural Language Processing]
    A --> E[Computer Vision]
    A --> F[Reinforcement Learning]
    A --> G[Knowledge Graphs]

    B --> B1[Supervised Learning]
    B --> B2[Unsupervised Learning]
    B --> B3[Semi-Supervised Learning]
    B --> B4[Reinforcement Learning]

    C --> C1[Convolutional Neural Networks]
    C --> C2[Recurrent Neural Networks]
    C --> C3[Transformer]
    C --> C4[Generative Adversarial Networks]

    D --> D1[Text Classification]
    D --> D2[Sentiment Analysis]
    D --> D3[Machine Translation]
    D --> D4[Question Answering]

    E --> E1[Image Classification]
    E --> E2[Object Detection]
    E --> E3[Image Segmentation]
    E --> E4[Face Recognition]

1.2 A Brief History of AI

  • 1950: Alan Turing proposes the "Turing Test," laying the theoretical foundation for AI
  • 1956: The Dartmouth Conference formally coins the term "Artificial Intelligence"
  • 1980s: Expert systems flourish in AI's first wave of commercialization
  • 1997: IBM's Deep Blue defeats world chess champion Garry Kasparov
  • 2012: AlexNet achieves a breakthrough in the ImageNet competition, opening the deep-learning era
  • 2016: AlphaGo defeats world Go champion Lee Sedol
  • 2020 to present: the era of large models, with pretrained models such as GPT and BERT leading the AI revolution

1.3 The Modern AI Industry Landscape

1.3.1 Major Technical Directions

| Field | Core Technologies | Application Scenarios | Representative Companies |
|-------|-------------------|-----------------------|--------------------------|
| Computer vision | CNN, YOLO, Transformer | Face recognition, autonomous driving, medical imaging | SenseTime, Megvii, Baidu |
| Natural language processing | BERT, GPT, T5 | Intelligent customer service, machine translation, content generation | OpenAI, Google, Alibaba |
| Speech | RNN, WaveNet, Whisper | Speech recognition, speech synthesis, smart speakers | iFLYTEK, Apple, Amazon |
| Recommender systems | Collaborative filtering, deep learning | E-commerce recommendations, content distribution, ad targeting | ByteDance, Tencent, Netflix |
1.3.2 AI Talent Demand

According to 2025 hiring data, AI-related roles require the following skills:

| Role | Required Skills | Bonus Skills | Average Salary |
|------|-----------------|--------------|----------------|
| Algorithm Engineer | Python, PyTorch/TensorFlow, math fundamentals | Large models, distributed training | ¥35,000/month |
| Data Scientist | Python, SQL, statistics, machine learning | Deep learning, A/B testing | ¥28,000/month |
| AI Product Manager | Product design, understanding of AI technology, data analysis | Technical background, project management | ¥25,000/month |
| MLOps Engineer | Docker, Kubernetes, CI/CD, monitoring | Model deployment, performance optimization | ¥32,000/month |

Part 2. Mathematical Foundations for AI

2.1 Linear Algebra

2.1.1 Vector and Matrix Operations
import numpy as np
import matplotlib.pyplot as plt

# Vector operations
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Dot (inner) product
dot_product = np.dot(v1, v2)  # 1*4 + 2*5 + 3*6 = 32

# Cross product of the 2D components (returns the scalar z-component)
cross_product = np.cross(v1[:2], v2[:2])

# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix multiplication
C = np.dot(A, B)

# Transpose
A_T = A.T

# Inverse
A_inv = np.linalg.inv(A)

# Eigenvalues and eigenvectors
eigenvals, eigenvecs = np.linalg.eig(A)

print(f"Dot product: {dot_product}")
print(f"Matrix product:\n{C}")
print(f"Eigenvalues: {eigenvals}")
2.1.2 Eigenvalue Decomposition and Singular Value Decomposition

Eigenvalue decomposition (EVD):

$A = Q \Lambda Q^{-1}$

where $Q$ is the matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues.

Singular value decomposition (SVD):

$A = U \Sigma V^T$

where $U$ and $V$ are orthogonal matrices and $\Sigma$ is the diagonal matrix of singular values.

Applications: PCA dimensionality reduction, recommender systems, image compression.
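
To make this concrete, here is a minimal NumPy sketch (illustrative data, not from the text) that computes an SVD and forms the best rank-1 approximation, the same mechanism that underlies PCA and image compression:

import numpy as np

# A small data matrix (rows = samples, columns = features)
A = np.array([[3.0, 1.0], [1.0, 3.0], [1.0, 1.0]])

# Thin SVD: A = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-1 approximation: keep only the largest singular value
A_rank1 = S[0] * np.outer(U[:, 0], Vt[0, :])

print("Singular values:", S)
print("Rank-1 approximation:\n", A_rank1)
# By the Eckart-Young theorem, the Frobenius error equals the dropped singular value
print("Reconstruction error:", np.linalg.norm(A - A_rank1))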

2.2 Probability and Statistics

2.2.1 Bayes' Theorem

$P(A \mid B) = \dfrac{P(B \mid A)\, P(A)}{P(B)}$

Worked example: spam filtering

import numpy as np

# A simple naive Bayes spam classifier
class NaiveBayesSpamFilter:
    def __init__(self):
        self.spam_word_probs = {}
        self.ham_word_probs = {}
        self.p_spam = 0.5  # prior probability
    
    def train(self, emails, labels):
        """Train the naive Bayes classifier."""
        spam_emails = [email for email, label in zip(emails, labels) if label == 1]
        ham_emails = [email for email, label in zip(emails, labels) if label == 0]
        
        # Prior probability
        self.p_spam = len(spam_emails) / len(emails)
        
        # Word counts
        spam_words = self._extract_words(spam_emails)
        ham_words = self._extract_words(ham_emails)
        
        total_spam_words = sum(spam_words.values())
        total_ham_words = sum(ham_words.values())
        
        # Conditional probabilities (with Laplace smoothing)
        vocab = set(spam_words.keys()) | set(ham_words.keys())
        for word in vocab:
            spam_count = spam_words.get(word, 0)
            ham_count = ham_words.get(word, 0)
            
            self.spam_word_probs[word] = (spam_count + 1) / (total_spam_words + len(vocab))
            self.ham_word_probs[word] = (ham_count + 1) / (total_ham_words + len(vocab))
    
    def predict(self, email):
        """Predict whether an email is spam."""
        words = self._tokenize(email)
        
        # Work with log posteriors to avoid numerical underflow
        log_p_spam = np.log(self.p_spam)
        log_p_ham = np.log(1 - self.p_spam)
        
        for word in words:
            if word in self.spam_word_probs:
                log_p_spam += np.log(self.spam_word_probs[word])
                log_p_ham += np.log(self.ham_word_probs[word])
        
        return 1 if log_p_spam > log_p_ham else 0
    
    def _extract_words(self, emails):
        """Count word frequencies across a list of emails."""
        word_count = {}
        for email in emails:
            words = self._tokenize(email)
            for word in words:
                word_count[word] = word_count.get(word, 0) + 1
        return word_count
    
    def _tokenize(self, text):
        """Naive whitespace tokenization."""
        return text.lower().split()
2.2.2 Probability Distributions

| Distribution | Probability (density/mass) function | Typical use |
|--------------|-------------------------------------|-------------|
| Normal | $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | Error analysis, feature standardization |
| Bernoulli | $P(X=1) = p,\ P(X=0) = 1-p$ | Binary classification |
| Poisson | $P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$ | Event counts (e.g., website traffic) |
| Exponential | $f(x) = \lambda e^{-\lambda x}$ | Waiting times, reliability analysis |
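
The table above maps directly onto scipy.stats. A minimal sketch (assuming SciPy is installed; the parameter values are arbitrary):

from scipy import stats

# Normal: density at the mean is 1/(sigma*sqrt(2*pi))
print(stats.norm(loc=0, scale=1).pdf(0))

# Bernoulli: P(X = 1) when p = 0.3
print(stats.bernoulli(p=0.3).pmf(1))

# Poisson: probability of k = 2 events when lambda = 1.5
print(stats.poisson(mu=1.5).pmf(2))

# Exponential: P(X <= 1) when lambda = 2 (SciPy parameterizes by scale = 1/lambda)
print(stats.expon(scale=1/2).cdf(1.0))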

2.3 Calculus and Optimization

2.3.1 Gradient Descent

The gradient is the direction of steepest ascent of a function at a point:

$\nabla f(x) = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right]$

Gradient descent update rule:

$\theta_{t+1} = \theta_t - \alpha \nabla J(\theta_t)$

where $\alpha$ is the learning rate and $J(\theta)$ is the loss function.

# Gradient descent from scratch
def gradient_descent(X, y, learning_rate=0.01, epochs=1000):
    """
    Gradient descent for linear regression.
    X: feature matrix (m, n)
    y: target vector (m,)
    """
    m, n = X.shape
    theta = np.random.randn(n)  # initialize parameters
    cost_history = []
    
    for epoch in range(epochs):
        # Forward pass
        y_pred = X.dot(theta)
        
        # Loss (mean squared error)
        cost = np.mean((y_pred - y) ** 2)
        cost_history.append(cost)
        
        # Gradient of the MSE loss
        gradient = (2/m) * X.T.dot(y_pred - y)
        
        # Parameter update
        theta -= learning_rate * gradient
        
        if epoch % 100 == 0:
            print(f"Epoch {epoch}, Cost: {cost:.4f}")
    
    return theta, cost_history

# Example usage
np.random.seed(42)
X = np.random.randn(100, 2)
y = 3 * X[:, 0] + 2 * X[:, 1] + np.random.randn(100) * 0.1

# Add a bias column
X_b = np.c_[np.ones((100, 1)), X]

theta, costs = gradient_descent(X_b, y, learning_rate=0.1, epochs=1000)
print(f"Learned parameters: {theta}")
2.3.2 Evolution of Optimization Algorithms

| Algorithm | Characteristics | Typical use |
|-----------|-----------------|-------------|
| SGD | Simple, memory-efficient | Large datasets |
| Momentum | Adds a momentum term to damp oscillations | Deep networks |
| RMSprop | Adaptive learning rates | Non-stationary objectives |
| Adam | Combines Momentum and RMSprop | General-purpose default |
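
For reference, a minimal sketch of how these four optimizers are instantiated in PyTorch; the learning rates shown are common defaults, not tuned values:

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)  # any model's parameters work here

# The four optimizers from the table above
sgd = optim.SGD(model.parameters(), lr=0.01)
momentum = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
rmsprop = optim.RMSprop(model.parameters(), lr=0.001)
adam = optim.Adam(model.parameters(), lr=0.001)

# A single training step looks the same regardless of the optimizer
x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
adam.zero_grad()
loss.backward()
adam.step()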

Part 3. Machine Learning Fundamentals

3.1 Basic Concepts

3.1.1 Supervised vs. Unsupervised Learning

| Type | Input | Output | Objective | Examples |
|------|-------|--------|-----------|----------|
| Supervised learning | Features + labels | Predicted labels | Minimize prediction error | House-price prediction, image classification |
| Unsupervised learning | Features only | Patterns/structure | Discover the data's internal structure | Clustering, dimensionality reduction |
| Reinforcement learning | States | Actions | Maximize cumulative reward | Game AI, robot control |
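
To make the contrast concrete, a small scikit-learn sketch (synthetic two-cluster data, illustrative only): the supervised model consumes labels, while the unsupervised one discovers structure without them.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Supervised: features + labels -> a label predictor
clf = LogisticRegression().fit(X, y)
print("Supervised prediction:", clf.predict([[4.0, 4.0]]))

# Unsupervised: features only -> cluster assignments
km = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
print("Cluster assignment:", km.predict([[4.0, 4.0]]))
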
3.1.2 The Machine Learning Workflow
flowchart TD
    A[Problem definition] --> B[Data collection]
    B --> C[Data preprocessing]
    C --> D[Feature engineering]
    D --> E[Model selection]
    E --> F[Model training]
    F --> G[Model evaluation]
    G --> H{Meets targets?}
    H -->|No| I[Model tuning]
    I --> F
    H -->|Yes| J[Model deployment]
    J --> K[Monitoring & maintenance]

3.2 Data Preprocessing

3.2.1 Data Cleaning
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Build a toy dataset
df = pd.DataFrame({
    'age': [25, 30, np.nan, 35, 40],
    'income': [50000, 60000, 55000, np.nan, 70000],
    'category': ['A', 'B', 'A', 'C', np.nan],
    'target': [0, 1, 0, 1, 1]
})

print("Raw data:")
print(df)
print(f"\nMissing values:\n{df.isnull().sum()}")

# Handle missing values
df_cleaned = df.copy()

# Numerical columns: fill with the median
# (assignment instead of inplace=True avoids pandas chained-assignment warnings)
df_cleaned['age'] = df_cleaned['age'].fillna(df_cleaned['age'].median())
df_cleaned['income'] = df_cleaned['income'].fillna(df_cleaned['income'].median())

# Categorical columns: fill with the mode
df_cleaned['category'] = df_cleaned['category'].fillna(df_cleaned['category'].mode()[0])

# Outlier removal (IQR method)
def remove_outliers_iqr(df, column):
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    return df[(df[column] >= lower_bound) & (df[column] <= upper_bound)]

# df_cleaned = remove_outliers_iqr(df_cleaned, 'income')

print("\nCleaned data:")
print(df_cleaned)
3.2.2 Feature Engineering
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

# Categorical encoding
le = LabelEncoder()
df_cleaned['category_encoded'] = le.fit_transform(df_cleaned['category'])

# Feature scaling
scaler = StandardScaler()
numerical_features = ['age', 'income']
df_cleaned[numerical_features] = scaler.fit_transform(df_cleaned[numerical_features])

# Feature selection
X = df_cleaned[['age', 'income', 'category_encoded']]
y = df_cleaned['target']

selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

selected_features = X.columns[selector.get_support()]
print(f"Selected features: {selected_features}")

# Dimensionality reduction
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print(f"PCA explained variance ratio: {pca.explained_variance_ratio_}")

3.3 Classical Machine Learning Algorithms

3.3.1 Linear Regression

Model:

$y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n + \epsilon$

Loss function (mean squared error):

$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Generate synthetic data
np.random.seed(42)
X = np.random.randn(100, 2)
y = 3 * X[:, 0] + 2 * X[:, 1] + np.random.randn(100) * 0.1

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model
lr = LinearRegression()
lr.fit(X_train, y_train)

# Predict
y_pred = lr.predict(X_test)

# Evaluate
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Coefficients: {lr.coef_}")
print(f"Intercept: {lr.intercept_:.4f}")
print(f"MSE: {mse:.4f}")
print(f"R²: {r2:.4f}")
3.3.2 Logistic Regression

Model (sigmoid function):

$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$

Loss function (log loss):

$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=4, n_redundant=0, 
                          n_informative=4, random_state=42, n_clusters_per_class=1)

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the logistic regression
lr = LogisticRegression(random_state=42)
lr.fit(X_train, y_train)

# Predict labels and class probabilities
y_pred = lr.predict(X_test)
y_pred_proba = lr.predict_proba(X_test)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification report:")
print(classification_report(y_test, y_pred))
3.3.3 Decision Trees and Random Forests

Decision trees split on features so as to maximize information gain.

Random forests ensemble many decision trees; bagging plus random feature selection improves generalization.

from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

# Load the data
iris = load_iris()
X, y = iris.data, iris.target

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision tree
dt = DecisionTreeClassifier(random_state=42)
dt.fit(X_train, y_train)
dt_pred = dt.predict(X_test)
dt_accuracy = accuracy_score(y_test, dt_pred)

# Random forest
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_test)
rf_accuracy = accuracy_score(y_test, rf_pred)

print(f"Decision tree accuracy: {dt_accuracy:.4f}")
print(f"Random forest accuracy: {rf_accuracy:.4f}")

# Feature importances
feature_importance = rf.feature_importances_
feature_names = iris.feature_names

plt.figure(figsize=(10, 6))
plt.bar(feature_names, feature_importance)
plt.title('Random Forest Feature Importances')
plt.xlabel('Feature')
plt.ylabel('Importance')
plt.show()
3.3.4 Support Vector Machines

Core idea: find the separating hyperplane with the maximum margin.

Kernel trick: map linearly inseparable data into a higher-dimensional space where it becomes separable.

from sklearn.svm import SVC
from sklearn.datasets import make_circles
from sklearn.preprocessing import StandardScaler

# Generate data that is not linearly separable
X, y = make_circles(n_samples=1000, noise=0.1, factor=0.2, random_state=42)

# Standardize
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# SVMs with different kernels
kernels = ['linear', 'poly', 'rbf', 'sigmoid']
results = {}

for kernel in kernels:
    svm = SVC(kernel=kernel, random_state=42)
    svm.fit(X_train, y_train)
    y_pred = svm.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    results[kernel] = accuracy
    print(f"{kernel} kernel accuracy: {accuracy:.4f}")

# Visualize the RBF-kernel result
svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train, y_train)

# Build an evaluation grid (in the standardized coordinate system)
xx, yy = np.meshgrid(np.linspace(-3, 3, 500), np.linspace(-3, 3, 500))
Z = svm_rbf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, cmap='viridis')
plt.title('Original data (standardized)')

plt.subplot(1, 2, 2)
plt.contourf(xx, yy, Z, levels=50, cmap='RdYlBu', alpha=0.7)
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, cmap='viridis')  # same coordinates as the grid
plt.title('SVM decision boundary (RBF kernel)')
plt.show()

3.4 Model Evaluation and Validation

3.4.1 Evaluation Metrics

| Task | Metric | Formula | When to use |
|------|--------|---------|-------------|
| Regression | MSE | $\frac{1}{m} \sum_i (y_i - \hat{y}_i)^2$ | General regression evaluation |
| Regression | R² | $1 - \frac{SS_{res}}{SS_{tot}}$ | Explanatory power of the model |
| Binary classification | Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ | Balanced datasets |
| Binary classification | Precision | $\frac{TP}{TP + FP}$ | When false positives matter |
| Binary classification | Recall | $\frac{TP}{TP + FN}$ | When false negatives matter |
| Binary classification | F1-score | $\frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$ | Balances precision and recall |
| Multi-class | Macro-F1 | Arithmetic mean of per-class F1 | All classes equally important |
| Multi-class | Micro-F1 | F1 from global TP, FP, FN counts | Imbalanced data |
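
These metrics map one-to-one onto sklearn.metrics; a quick sketch on toy predictions (the values are invented for illustration):

from sklearn.metrics import (mean_squared_error, r2_score, accuracy_score,
                             precision_score, recall_score, f1_score)

# Regression
y_true_reg, y_pred_reg = [3.0, 2.5, 4.0], [2.8, 2.7, 3.6]
print("MSE:", mean_squared_error(y_true_reg, y_pred_reg))
print("R²:", r2_score(y_true_reg, y_pred_reg))

# Binary classification
y_true, y_pred = [0, 1, 1, 0, 1, 1], [0, 1, 0, 0, 1, 1]
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))

# Multi-class averaging
y_true_mc, y_pred_mc = [0, 1, 2, 2, 1], [0, 2, 2, 2, 1]
print("Macro-F1:", f1_score(y_true_mc, y_pred_mc, average='macro'))
print("Micro-F1:", f1_score(y_true_mc, y_pred_mc, average='micro'))
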
3.4.2 Cross-Validation
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.ensemble import RandomForestClassifier

# Stratified K-fold cross-validation
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Evaluate a random forest
rf = RandomForestClassifier(n_estimators=100, random_state=42)
cv_scores = cross_val_score(rf, X, y, cv=skf, scoring='accuracy')

print(f"Cross-validation accuracies: {cv_scores}")
print(f"Mean accuracy: {cv_scores.mean():.4f} ± {cv_scores.std():.4f}")

# Compare several models under the same CV scheme
models = {
    'Logistic Regression': LogisticRegression(random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'SVM': SVC(kernel='rbf', random_state=42)
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=skf, scoring='accuracy')
    results[name] = scores
    print(f"{name}: {scores.mean():.4f} ± {scores.std():.4f}")
3.4.3 Learning Curves and Validation Curves
from sklearn.model_selection import learning_curve, validation_curve

# Learning curve
def plot_learning_curve(estimator, X, y, title="Learning Curve"):
    train_sizes, train_scores, val_scores = learning_curve(
        estimator, X, y, cv=5, n_jobs=-1, 
        train_sizes=np.linspace(0.1, 1.0, 10), random_state=42
    )
    
    train_mean = np.mean(train_scores, axis=1)
    train_std = np.std(train_scores, axis=1)
    val_mean = np.mean(val_scores, axis=1)
    val_std = np.std(val_scores, axis=1)
    
    plt.figure(figsize=(10, 6))
    plt.plot(train_sizes, train_mean, 'o-', color='blue', label='Training Score')
    plt.fill_between(train_sizes, train_mean - train_std, train_mean + train_std, alpha=0.1, color='blue')
    
    plt.plot(train_sizes, val_mean, 'o-', color='red', label='Validation Score')
    plt.fill_between(train_sizes, val_mean - val_std, val_mean + val_std, alpha=0.1, color='red')
    
    plt.xlabel('Training Set Size')
    plt.ylabel('Accuracy')
    plt.title(title)
    plt.legend()
    plt.grid(True)
    plt.show()

# Plot the learning curve for a random forest
plot_learning_curve(RandomForestClassifier(n_estimators=100, random_state=42), X, y)

Part 4. Deep Learning Fundamentals

4.1 Neural Network Basics

4.1.1 The Perceptron and the Multi-Layer Perceptron

Perceptron (a single-layer network):

$y = f(w^T x + b)$

where $f$ is an activation function (e.g., a step function).

Multi-layer perceptron (MLP):

$h^{(1)} = f(W^{(1)} x + b^{(1)}), \quad h^{(2)} = f(W^{(2)} h^{(1)} + b^{(2)}), \quad \ldots, \quad y = f(W^{(L)} h^{(L-1)} + b^{(L)})$
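
Before moving to full networks, a few lines of NumPy implement the single perceptron above with a step activation; the weights are hand-picked (purely for illustration) to realize a logical AND:

import numpy as np

def step(z):
    return int(z > 0)  # step activation

# Hand-picked weights: the neuron fires only when both inputs are 1
w = np.array([1.0, 1.0])
b = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = step(w @ np.array(x) + b)  # y = f(w^T x + b)
    print(x, "->", y)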

4.1.2 Activation Functions

| Activation | Formula | Pros | Cons |
|------------|---------|------|------|
| Sigmoid | $\sigma(x) = \frac{1}{1 + e^{-x}}$ | Output in (0, 1); probabilistic interpretation | Vanishing gradients; not zero-centered |
| Tanh | $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ | Zero-centered; stronger gradients | Vanishing gradients |
| ReLU | $\mathrm{ReLU}(x) = \max(0, x)$ | Cheap to compute; mitigates vanishing gradients | Dead neurons |
| Leaky ReLU | $\mathrm{LReLU}(x) = \max(0.01x, x)$ | Avoids dead neurons | Slope is a hyperparameter |
import torch
import torch.nn as nn
import torch.optim as optim

# An MLP in PyTorch
class MLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, output_size)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)
    
    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        return x

# Train the MLP
def train_mlp(X_train, y_train, X_test, y_test, epochs=100):
    # Convert to PyTorch tensors
    X_train_tensor = torch.FloatTensor(X_train)
    y_train_tensor = torch.LongTensor(y_train)
    X_test_tensor = torch.FloatTensor(X_test)
    y_test_tensor = torch.LongTensor(y_test)
    
    # Build the model
    model = MLP(input_size=X_train.shape[1], hidden_size=64, output_size=len(np.unique(y_train)))
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Training loop
    train_losses = []
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        
        outputs = model(X_train_tensor)
        loss = criterion(outputs, y_train_tensor)
        loss.backward()
        optimizer.step()
        
        train_losses.append(loss.item())
        
        if epoch % 20 == 0:
            model.eval()
            with torch.no_grad():
                test_outputs = model(X_test_tensor)
                _, predicted = torch.max(test_outputs.data, 1)
                accuracy = (predicted == y_test_tensor).sum().item() / len(y_test_tensor)
                print(f"Epoch {epoch}, Loss: {loss.item():.4f}, Test Accuracy: {accuracy:.4f}")
    
    return model, train_losses

# Train on the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model, losses = train_mlp(X_train, y_train, X_test, y_test, epochs=200)

4.2 Convolutional Neural Networks

4.2.1 CNN Fundamentals

Convolutional layers extract local features:

$(I * K)(i, j) = \sum_m \sum_n I(i+m, j+n)\, K(m, n)$

Pooling layers reduce dimensionality and add a degree of translation invariance:

  • Max pooling: keeps the most salient activation
  • Average pooling: smooths the features

Fully connected layers make the final classification decision (see the sketch below).
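
A minimal PyTorch sketch of these three layer types on a dummy image batch, just to make the tensor shapes concrete (the sizes are arbitrary):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
pool = nn.MaxPool2d(kernel_size=2)    # halves each spatial dimension
fc = nn.Linear(16 * 16 * 16, 10)      # classifier over the flattened features

h = torch.relu(conv(x))     # -> (1, 16, 32, 32): local feature extraction
h = pool(h)                 # -> (1, 16, 16, 16): downsampling
logits = fc(h.flatten(1))   # -> (1, 10): class scores
print(logits.shape)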

4.2.2 Evolution of CNN Architectures

| Architecture | Key innovation | Year |
|--------------|----------------|------|
| LeNet-5 | Pioneering CNN architecture | 1998 |
| AlexNet | ReLU, Dropout, GPU training | 2012 |
| VGGNet | Stacks of small convolution kernels | 2014 |
| GoogLeNet | Inception modules | 2014 |
| ResNet | Residual connections | 2015 |
| EfficientNet | Compound scaling | 2019 |
import torchvision.models as models
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load the CIFAR-10 dataset
train_dataset = CIFAR10(root='./data', train=True, download=True, transform=transform)
test_dataset = CIFAR10(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Transfer learning with a pretrained ResNet18
# (newer torchvision releases prefer weights=models.ResNet18_Weights.DEFAULT over pretrained=True)
model = models.resnet18(pretrained=True)

# Replace the final fully connected layer for CIFAR-10's 10 classes
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)

# Training setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop (simplified)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(10):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    
    for i, (inputs, labels) in enumerate(train_loader):
        inputs, labels = inputs.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
        if i % 100 == 99:
            print(f'Epoch {epoch+1}, Batch {i+1}, Loss: {running_loss/100:.4f}, Accuracy: {100*correct/total:.2f}%')
            running_loss = 0.0
            correct = 0
            total = 0

# Evaluation on the test set
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Test Accuracy: {100 * correct / total:.2f}%')

4.3 Recurrent Neural Networks

4.3.1 RNN Fundamentals

The basic RNN cell:

$h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t + b_h), \quad y_t = W_{hy} h_t + b_y$

Limitation: vanishing/exploding gradients make long sequences difficult to model.
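
The equations translate almost line-for-line into NumPy. A minimal sketch of an RNN unrolled over a short sequence (the sizes and random parameters are illustrative):

import numpy as np

hidden, inp = 4, 3
rng = np.random.default_rng(0)

# Parameters matching the equations above
W_hh = rng.normal(size=(hidden, hidden)) * 0.1
W_xh = rng.normal(size=(hidden, inp)) * 0.1
W_hy = rng.normal(size=(2, hidden)) * 0.1
b_h, b_y = np.zeros(hidden), np.zeros(2)

# Unroll over a sequence of 5 input vectors
h = np.zeros(hidden)
for t, x_t in enumerate(rng.normal(size=(5, inp))):
    h = np.tanh(W_hh @ h + W_xh @ x_t + b_h)  # h_t depends on h_{t-1}
    y = W_hy @ h + b_y                        # per-step output
    print(f"step {t}: y = {y}")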

4.3.2 LSTM and GRU

LSTM (Long Short-Term Memory):

  • Forget gate: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
  • Input gate: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
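
In practice the gate arithmetic is handled internally by framework layers. A minimal PyTorch sketch (the shapes are illustrative) of running an LSTM over a batch of sequences:

import torch
import torch.nn as nn

x = torch.randn(8, 20, 10)  # batch of 8 sequences, 20 steps, 10 features per step

lstm = nn.LSTM(input_size=10, hidden_size=32, num_layers=1, batch_first=True)

# output: hidden state at every step; (h_n, c_n): final hidden and cell states
output, (h_n, c_n) = lstm(x)
print(output.shape)  # torch.Size([8, 20, 32])
print(h_n.shape)     # torch.Size([1, 8, 32])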
