03-Data Resampling

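This article introduces three commonly used statistical learning methods: the bootstrap, which estimates the variability of the sampling process and confidence intervals; permutation tests, which test a specific null hypothesis by recombining the data sets; and cross-validation, which evaluates a model's predictive performance by removing sample points and fitting to the remaining data.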



1. Bootstrap

Draw a "bootstrap sample" by sampling n times with replacement from the original sample of size n.

The bootstrap estimates the variability of the sampling process and works well for estimating confidence intervals.

A confidence interval provides a range of values which is likely to contain the population parameter of interest.

e.g. Given a 95% confidence interval (x1, x2) for the mean, I am 95% confident that the population mean lies between x1 and x2.
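The resampling step and the confidence interval above can be sketched together as a percentile bootstrap. This is a minimal illustration using only the standard library, not a definitive implementation: the function name `bootstrap_ci`, the sample values in `data`, and the defaults (`n_boot=2000`, `alpha=0.05`, `seed=0`) are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Each bootstrap sample draws n values WITH replacement from the data,
    # and we record the statistic computed on that resample.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    # Read off the empirical alpha/2 and 1 - alpha/2 quantiles.
    lo_idx = round((alpha / 2) * n_boot)
    hi_idx = round((1 - alpha / 2) * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]

# Hypothetical sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
low, high = bootstrap_ci(data)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The percentile method is only one of several ways to turn bootstrap estimates into an interval; it is used here because it maps most directly onto the description above.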




2. Permutation

Concatenate two datasets A and B, randomly shuffle the pooled observations, then split them (without replacement) into a new A and a new B with the same sizes as the originals.

Permutation tests test a specific null hypothesis of exchangeability.
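A minimal sketch of a two-sample permutation test, assuming the statistic of interest is the absolute difference in group means. That choice of statistic, the function name `permutation_test`, the example groups, and the defaults (`n_perm=5000`, `seed=0`) are illustrative assumptions rather than anything prescribed by the text.

```python
import random
import statistics

def permutation_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)          # concatenate A and B
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)             # random relabelling, no replacement
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(statistics.mean(new_a) - statistics.mean(new_b))
        if diff >= observed:
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical groups, for illustration only.
group_a = [1, 2, 3, 4, 5, 6, 7, 8]
group_b = [11, 12, 13, 14, 15, 16, 17, 18]
p_value = permutation_test(group_a, group_b)
print(f"p-value: {p_value:.4f}")
```

Under the null hypothesis of exchangeability the group labels carry no information, so a small p-value means the observed difference is rarely matched by random relabellings.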


3. Cross-validation

In its leave-one-out form, cross-validation removes one point at a time, fits the model to the remaining points, then checks how well the removed point is predicted.

Cross-validation is primarily a way of measuring the predictive performance of a statistical model.

Cross-validation is used to assess the predictive performance of a model and to judge how it performs outside the sample, on a new data set (also known as test data).
The motivation for cross-validation is that when we fit a model, we fit it to a training data set. Without cross-validation, we only know how the model performs on the in-sample data. Ideally, we would like to know how accurate its predictions are on new data. In science, theories are judged by their predictive performance.
Two common types of cross-validation are leave-one-out and k-fold.
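Both variants can be sketched with one function, since leave-one-out is k-fold with k equal to the number of points. The example below is a rough illustration under stated assumptions: the model is a simple least-squares line, the error measure is mean squared error, and the names `fit_line`, `k_fold_mse` and the data in `xs` and `ys` are made up for the example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def k_fold_mse(xs, ys, k, seed=0):
    """Mean squared prediction error on held-out points over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k disjoint groups of indices
    sq_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the remaining points only, then predict the removed ones.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors += [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
    return sum(sq_errors) / len(sq_errors)

# Hypothetical data lying roughly on y = 2x, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.2, 3.9, 6.1, 8.0, 9.8, 12.1, 14.2, 15.9, 18.1, 19.9]
loo_mse = k_fold_mse(xs, ys, k=len(xs))   # leave-one-out: one point per fold
five_fold_mse = k_fold_mse(xs, ys, k=5)   # 5-fold: two points per fold
print(f"LOO MSE: {loo_mse:.4f}, 5-fold MSE: {five_fold_mse:.4f}")
```

Leave-one-out refits the model once per data point, which can be costly for large data sets; k-fold with a small k trades some of that cost for a slightly coarser estimate.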
