Machine Learning "Watermelon Book" Study Notes (2): Model Evaluation and Selection

Latest recommended article published on 2025-01-17 18:40:35

luminous_y


License: CC 4.0 BY-SA

Category: ML

Article link: https://blog.youkuaiyun.com/qq_43528771/article/details/99708629

Empirical Error and Overfitting

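[Error rate] $E = \frac{a}{m}$ (m: total number of samples; a: number of misclassified samples)
[Error] The difference between the learner's actual predicted output and the true output of a sample
[Training error / empirical error] The learner's error on the training set
[Generalization error] The learner's error on new samples
[Underfitting] [Overfitting] ($\because$ P $\neq$ NP $\therefore$ overfitting cannot be avoided, only mitigated. If it could be avoided completely, minimizing the empirical error would yield the optimal solution, which would amount to a constructive proof that P = NP.)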
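As a rough illustration of the gap between training error and generalization error, the sketch below fits polynomials of increasing degree to simulated data. The data-generating function, the sample sizes, and names such as `train.mse` and `test.mse` are assumptions made for this example and do not come from the book.

```r
# Simulated data (assumption: the true relation is sin(2*pi*x) plus Gaussian noise)
set.seed(1)
n <- 30
train.df <- data.frame(x = runif(n))
train.df$y <- sin(2 * pi * train.df$x) + rnorm(n, sd = 0.3)
test.df <- data.frame(x = runif(1000))
test.df$y <- sin(2 * pi * test.df$x) + rnorm(1000, sd = 0.3)

degrees <- 1:8
train.mse <- numeric(length(degrees))
test.mse <- numeric(length(degrees))
for (d in degrees) {
  # fit a degree-d polynomial on the training set only
  fit <- lm(y ~ poly(x, d), data = train.df)
  # training (empirical) error
  train.mse[d] <- mean((train.df$y - predict(fit, train.df))^2)
  # error on fresh samples, used here as a stand-in for the generalization error
  test.mse[d] <- mean((test.df$y - predict(fit, test.df))^2)
}
round(cbind(degree = degrees, train.mse, test.mse), 3)
```

Because the polynomial models are nested, the training MSE can only decrease as the degree grows; the test MSE is the column to watch for signs of overfitting.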

Evaluation Methods (Experimental Estimation)

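1. Hold-out / validation set approach
Split the data set D directly into two mutually exclusive sets, S (training set) and T (test set)
* Keep the data distribution consistent between S and T (stratified sampling)
* For a fixed S-to-T ratio, different splits give different results, so report the average over n random splits
* S usually takes about $\frac23$ to $\frac45$ of the samples
* With a large S and a small T, the evaluation result has high variance; with a small S and a large T, it has high bias
* Pros: simple and easy to implement. Cons: the estimated MSE varies considerably from one random split to the next, and only part of the data is used for training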

# hold-out / validation set approach

library(ISLR)  # provides the Auto dataset

set.seed(1)
# randomly select 196 of the 392 observations as the training set
train <- sample(392, 196)

# linear regression, fitted on the training subset only
lm.fit <- lm(mpg ~ horsepower, data = Auto, subset = train)
# test MSE: the -train index keeps only the observations outside the training set
mean((Auto$mpg - predict(lm.fit, Auto))[-train]^2)

# quadratic regression
lm.fit2 <- lm(mpg ~ poly(horsepower, 2), data = Auto, subset = train)
mean((Auto$mpg - predict(lm.fit2, Auto))[-train]^2)

# cubic regression
lm.fit3 <- lm(mpg ~ poly(horsepower, 3), data = Auto, subset = train)
mean((Auto$mpg - predict(lm.fit3, Auto))[-train]^2)
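The hold-out notes on stratified sampling and on averaging over n random splits can be sketched in base R as follows. This is only an illustration under stated assumptions: `stratified.split` and `nc.classify` are hypothetical helpers written for this example, the built-in `iris` data stands in for D, and a nearest-centroid rule stands in for the learner.

```r
# Stratified split: draw the same fraction of indices from every class
stratified.split <- function(labels, train.frac) {
  idx.by.class <- split(seq_along(labels), labels)
  train <- unlist(lapply(idx.by.class, function(idx) {
    idx[sample.int(length(idx), round(train.frac * length(idx)))]
  }))
  sort(unname(train))
}

# Stand-in learner: assign each test row to the class with the nearest centroid
nc.classify <- function(train.x, train.y, test.x) {
  classes <- levels(train.y)
  # squared distance from every test row to each class centroid
  dist2 <- sapply(classes, function(cl) {
    centroid <- colMeans(train.x[train.y == cl, , drop = FALSE])
    rowSums(sweep(test.x, 2, centroid)^2)
  })
  classes[apply(dist2, 1, which.min)]
}

set.seed(1)
x <- as.matrix(iris[, 1:4])
y <- iris$Species
n.splits <- 10
err <- numeric(n.splits)
for (r in seq_len(n.splits)) {
  train <- stratified.split(y, 2 / 3)           # S: about 2/3 of every class
  pred <- nc.classify(x[train, ], y[train], x[-train, ])
  err[r] <- mean(pred != y[-train])             # error rate on T
}
mean(err)  # hold-out estimate averaged over the n random splits
```

Sampling within each class keeps the class proportions of S and T equal to those of D, which is the point of stratification.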

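2. Cross-validation (CV) (p times k-fold cross-validation)
First split the data set D into k mutually exclusive subsets $D_1, D_2, \dots, D_k$ of similar size. Then, each time, use the union of k-1 subsets as the training set and the remaining subset as the test set, which gives k rounds of training and testing whose results are averaged
* Keep the data distribution consistent in every subset (stratified sampling)
* k is usually 5, 10, or 20
* For a fixed k, different partitions give different results, so report the average over p random partitions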

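# k-fold cross-validation

library(ISLR)  # provides the Auto dataset
library(boot)  # provides cv.glm()

set.seed(1)
cv.error.10 <- rep(0, 5)
# 10-fold CV error for polynomial fits of degree 1 to 5
for (i in 1:5) {
  glm.fit <- glm(mpg ~ poly(horsepower, i), data = Auto)
  # K = 10 folds; delta[1] is the raw cross-validation estimate of prediction error
  cv.error.10[i] <- cv.glm(Auto, glm.fit, K = 10)$delta[1]
}
cv.error.10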
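To make the partitioning step explicit, here is a minimal base-R sketch of p times k-fold cross-validation that does not rely on `cv.glm`. The helper name `kfold.mse`, the use of the built-in `mtcars` data, and the choices k = 5 and p = 10 are assumptions for illustration only.

```r
# One run of k-fold CV for a linear model; returns the estimated test MSE
kfold.mse <- function(formula, data, k) {
  m <- nrow(data)
  # shuffle the fold labels so the m rows fall into k folds of near-equal size
  fold <- sample(rep(seq_len(k), length.out = m))
  response <- all.vars(formula)[1]
  fold.mse <- numeric(k)
  for (j in seq_len(k)) {
    test <- fold == j
    # train on the union of the other k-1 folds, test on fold j
    fit <- lm(formula, data = data[!test, ])
    fold.mse[j] <- mean((data[[response]][test] - predict(fit, data[test, ]))^2)
  }
  # weight each fold by its size so that every sample counts equally
  sum(fold.mse * tabulate(fold, k)) / m
}

set.seed(1)
p <- 10
# repeat the random partition p times and average the p estimates
estimates <- replicate(p, kfold.mse(mpg ~ hp, mtcars, k = 5))
mean(estimates)
```

The spread of `estimates` shows how much a single random partition can move the result, which is why the average over p partitions is reported.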

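- Leave-one-out cross-validation, LOOCV (the special case k = m of k-fold CV)
m samples can be partitioned into m subsets of one sample each in only one way, so no repeated random partitioning is needed
* Pros: the evaluation result is fairly accurate (very low bias). Cons: the computational cost is high, and the variance is larger than that of k-fold CV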

# LOOCV

library(ISLR)  # provides the Auto dataset
library(boot)  # provides cv.glm()

# linear regression; cv.glm() defaults to K = n, i.e. leave-one-out
glm.fit <- glm(mpg ~ horsepower, data = Auto)
cv.err <- cv.glm(Auto, glm.fit)
cv.err$delta

# LOOCV errors of linear and higher-order polynomial regression (degree 1 to 5)
cv.error <- rep(0, 5)
for (i in 1:5) {
  glm.fit <- glm(mpg ~ poly(horsepower, i), data = Auto)
  cv.error[i] <- cv.glm(Auto, glm.fit)$delta[1]
}
cv.error
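What LOOCV does under the hood can be written as an explicit loop. The sketch below uses the built-in `mtcars` data and a simple linear model as assumed stand-ins; it is meant to show why the cost grows with m (one model fit per sample), not to replace `cv.glm`.

```r
m <- nrow(mtcars)
sq.err <- numeric(m)
for (i in seq_len(m)) {
  # train on all samples except the i-th
  fit <- lm(mpg ~ hp, data = mtcars[-i, ])
  # test on the single held-out sample
  sq.err[i] <- (mtcars$mpg[i] - predict(fit, mtcars[i, ]))^2
}
loocv.mse <- mean(sq.err)
loocv.mse
```

No random seed is needed here: the partition into m single-sample subsets is unique, so the result is deterministic.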

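3. Bootstrapping
[Bootstrap sampling] Given a data set D containing m samples, randomly pick one sample from D, copy it into D', and then put it back into D so that it can be picked again in later draws. Repeating this m times yields a data set D' containing m samples. The probability that a given sample is never picked in m draws is $(1-\frac1m)^m$, and $\lim_{m\to\infty}(1-\frac1m)^m=\frac1e\approx0.368$. With bootstrap sampling, about 36.8% of the samples in D therefore do not appear in D', so D' can be used as the training set and D \ D' as the test set. The test result is called the [out-of-bag estimate]
* Useful when the data set is small and hard to split effectively into a training set and a test set (and also in ensemble learning)
* However, it changes the distribution of the initial data set, which introduces estimation bias (when enough data is available, this method is not commonly used)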
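The 36.8% figure can be checked numerically. The following sketch evaluates $(1-\frac1m)^m$ for a few values of m and then draws one bootstrap sample; the index vector `1:m` standing in for D and the variable names are assumptions for this example.

```r
# Probability that a given sample is never drawn in m draws with replacement
m.values <- c(10, 100, 1000, 10000)
p.never <- (1 - 1 / m.values)^m.values
round(rbind(m = m.values, p.never = p.never), 4)
exp(-1)  # the limiting value 1/e

# One bootstrap sample D' from D = 1..m; D \ D' is the out-of-bag test set
set.seed(1)
m <- 10000
d.prime <- sample.int(m, m, replace = TRUE)
oob <- setdiff(seq_len(m), d.prime)
oob.fraction <- length(oob) / m
oob.fraction
```

The simulated out-of-bag fraction should land close to 1/e, while D' itself still has m entries because repeated draws are kept.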

# bootstrapping

library(ISLR)  # provides the Portfolio and Auto datasets
library(boot)  # provides boot()

# Estimating the accuracy of a statistic of interest
# alpha.fn computes the statistic from the observations selected by index
alpha.fn <- function(data, index) {
  X <- data$X[index]
  Y <- data$Y[index]
  (var(Y) - cov(X, Y)) / (var(X) + var(Y) - 2 * cov(X, Y))
}
# boot() repeatedly resamples the observations with replacement (R = 1000 replicates)
boot(Portfolio, alpha.fn, R = 1000)

# Estimating the accuracy of a linear regression model
# boot.fn returns the intercept and slope fitted on the observations selected by index
boot.fn <- function(data, index) {
  coef(lm(mpg ~ horsepower, data = data, subset = index))
}
boot.fn(Auto, 1:392)  # coefficients on the full data set
boot(Auto, boot.fn, R = 1000)

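4. Parameter tuning
Parameter types in ML: (1) algorithm parameters (hyperparameters): typically fewer than 10; several candidate values are set by hand, and each one produces a model. (2) model parameters: can be very numerous (e.g. in deep learning); multiple candidate models are produced through learning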
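Choosing a hyperparameter from hand-picked candidate values might look like the sketch below, where the polynomial degree plays the role of the hyperparameter and the regression coefficients are the learned model parameters. The `mtcars` data, the candidate set `1:4`, and the 2/3 hold-out split are assumptions for illustration.

```r
set.seed(1)
m <- nrow(mtcars)
train <- sample.int(m, round(2 / 3 * m))

# hand-picked candidate values for the hyperparameter (polynomial degree)
candidates <- 1:4
val.mse <- sapply(candidates, function(d) {
  # the coefficients (model parameters) are learned from the training set
  fit <- lm(mpg ~ poly(hp, d), data = mtcars[train, ])
  # each candidate is scored on the held-out samples
  mean((mtcars$mpg[-train] - predict(fit, mtcars[-train, ]))^2)
})
best.degree <- candidates[which.min(val.mse)]
rbind(degree = candidates, val.mse = round(val.mse, 2))
best.degree
```

With a data set this small the selected value can change from one split to the next, so in practice the scores would usually come from cross-validation instead of a single split.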

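Performance Measures (Evaluation Criteria)

Regression tasks:

Mean squared error (MSE), for a sample set D or for a data distribution $\mathcal{D}$ with density $p$: $\qquad E(f;D)=\frac1m\sum_{i=1}^{m}(f(\bm{x}_i)-y_i)^2 \quad \text{or} \quad E(f;\mathcal{D})=\int_{\bm{x} \sim \mathcal{D}}(f(\bm{x})-y)^2 p(\bm{x})\,d\bm{x}$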

Classification tasks:

Error rate $E(f;D)=\frac1m \sum_{i=1}^{m}\mathbb{I}(f(\bm{x}_i) \ne y_i)\quad \text{or} \quad E(f;\mathcal{D})=\int_{\bm{x} \sim \mathcal{D}}\mathbb{I}(f(\bm{x})\ne y)\, p(\bm{x})\,d\bm{x}$
Accuracy $acc(f;D)=\frac1m \sum_{i=1}^{m}\mathbb{I}(f(\bm{x}_i) = y_i)=1-E(f;D) \quad \text{or} \quad acc(f;\mathcal{D})=\int_{\bm{x} \sim \mathcal{D}}\mathbb{I}(f(\bm{x})= y)\, p(\bm{x})\,d\bm{x}=1-E(f;\mathcal{D})$
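Confusion matrix of binary classification results (TP: true positives, FP: false positives, FN: false negatives, TN: true negatives)
Precision $P=\frac{TP}{TP+FP}$
Recall $R=\frac{TP}{TP+FN}$
True positive rate (TPR), identical to recall: $TPR=R=\frac{TP}{TP+FN}$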
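The classification measures above can be computed directly from the four confusion-matrix counts. The label vectors in this sketch are made-up toy values (1 = positive, 0 = negative), chosen only so the arithmetic is easy to follow.

```r
# Toy labels (assumed values for illustration): 1 = positive, 0 = negative
truth <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
pred  <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# confusion-matrix counts
TP <- sum(pred == 1 & truth == 1)  # 3
FP <- sum(pred == 1 & truth == 0)  # 1
FN <- sum(pred == 0 & truth == 1)  # 1
TN <- sum(pred == 0 & truth == 0)  # 5

error.rate <- mean(pred != truth)  # 2 / 10 = 0.2
accuracy   <- mean(pred == truth)  # 1 - error.rate = 0.8
precision  <- TP / (TP + FP)       # 3 / 4 = 0.75
recall     <- TP / (TP + FN)       # 3 / 4 = 0.75, also the true positive rate

c(error.rate = error.rate, accuracy = accuracy,
  precision = precision, recall = recall)
```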