5.1 TRAINING AND TESTING

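This post introduces the three datasets commonly used in machine learning: the training set is used to build classifiers; the validation set is used to optimize classifier parameters or to select a classifier; and the test set is used to estimate the error rate of the final classifier. For the evaluation to be reliable, the three sets must be kept independent.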
   People often talk about three datasets:
   The training data
The training data is used by one or more learning schemes to
come up with classifiers.
   The validation data
The validation data is used to optimize the parameters of those
classifiers, or to select a particular one.
   The test data
The test data is then used to calculate the error rate of the
final, optimized method.
   Each of the three sets must be chosen independently:
the validation set must be different from the training set to obtain
good performance in the optimization or selection stage, and the
test set must be different from both to obtain a reliable estimate of
the true error rate.
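
One simple way to keep the three sets independent is to shuffle the data once and cut it into three non-overlapping pieces. The sketch below is only an illustration under assumptions that are not in the original notes: the function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are all hypothetical choices.

```python
import random


def three_way_split(items, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut into disjoint training, validation, and test lists.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, so the split is reproducible

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    # Consecutive, non-overlapping slices: no instance lands in two sets.
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    training = shuffled[n_test + n_val:]
    return training, validation, test


# Instances are stood in for by the integers 0..99.
training, validation, test = three_way_split(range(100))
print(len(training), len(validation), len(test))
```

In use, classifiers would be built on `training`, tuned or chosen on `validation`, and `test` would be used only once, at the very end, to estimate the error rate of the final method.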