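This tool uses cross-validation to select the parameters C and gamma; whether better parameter values exist remains to be verified. The roles of the two parameters are as follows: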
An SVM with a Gaussian kernel function has two training parameters: C, which controls overfitting of the model, and gamma (γ), which controls the degree of nonlinearity of the model. Gamma is inversely related to sigma, which in statistics is a measure of spread around the mean: the higher the value of gamma, the lower the value of sigma, and thus the narrower the spread and the more nonlinear the behavior of the kernel. The values of C and gamma are determined by grid search and cross-validation: the model with the highest estimated performance determines the selected training parameters. Then, the performance of the constructed model is estimated using 5-fold cross-validation on the training data. Finally, the constructed model is validated by predicting the validation data and comparing these predictions with the real observations by means of ROC curves.
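To make the gamma and sigma relationship concrete, here is a minimal sketch. It assumes the common convention in which the Gaussian kernel is written as exp(-||x - y||² / (2σ²)), so that γ = 1 / (2σ²); other texts scale sigma differently. The function names `gamma_from_sigma` and `rbf_kernel` are illustrative and not part of libSVM.

```python
import math


def gamma_from_sigma(sigma):
    """Convert sigma to gamma, assuming K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    return 1.0 / (2.0 * sigma ** 2)


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


if __name__ == "__main__":
    x, y = (0.0, 0.0), (1.0, 1.0)
    # A smaller sigma gives a larger gamma, so the similarity between
    # two fixed points decays faster (a more local, more nonlinear kernel).
    for sigma in (2.0, 1.0, 0.5):
        g = gamma_from_sigma(sigma)
        print(f"sigma={sigma} gamma={g} K(x, y)={rbf_kernel(x, y, g):.6f}")
```

With a large gamma, each training point influences only its immediate neighborhood, which is why a high gamma combined with a high C tends to overfit.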
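Epsilon (ε): the width of the ε-insensitive loss used in SVM regression. It is a separate parameter from gamma, not another name for it.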
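As a small illustration of the ε-insensitive loss mentioned above, the sketch below uses the standard definition max(0, |y - f(x)| - ε): prediction errors smaller than ε cost nothing, and larger errors are penalized linearly. The function name is hypothetical and is not a libSVM API.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Return max(0, |y_true - y_pred| - epsilon)."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


if __name__ == "__main__":
    # Errors inside the epsilon tube are ignored; errors outside grow linearly.
    for pred in (1.05, 1.5, 0.2):
        print(pred, epsilon_insensitive_loss(1.0, pred, epsilon=0.1))
```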

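This article discusses how the grid.py tool in libSVM selects the C and gamma parameters through cross-validation, balancing overfitting against the degree of nonlinearity. A high gamma corresponds to a low sigma and makes the model behave more nonlinearly. Model performance is assessed with grid search and 5-fold cross-validation, and the final model is checked on the validation set using ROC curves.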
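The grid search with k-fold cross-validation described above can be sketched roughly as follows. This is not the source of grid.py, only a minimal outline of the idea under stated assumptions: the grid is taken over powers of two, `train_and_score` is a hypothetical user-supplied callable that trains an SVM on the training indices and returns a score (for example accuracy) on the held-out indices, and the exponent ranges in the usage section are illustrative.

```python
import itertools
import math


def kfold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_score(train_and_score, n_samples, C, gamma, k=5):
    """Average the held-out score over k folds for one (C, gamma) pair."""
    folds = kfold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, then score on the i-th.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(C, gamma, train_idx, test_idx))
    return sum(scores) / len(scores)


def grid_search(train_and_score, n_samples, log2c_values, log2g_values, k=5):
    """Return (best_score, best_C, best_gamma) over a grid of powers of two."""
    best = None
    for log2c, log2g in itertools.product(log2c_values, log2g_values):
        C, gamma = 2.0 ** log2c, 2.0 ** log2g
        score = cross_val_score(train_and_score, n_samples, C, gamma, k)
        if best is None or score > best[0]:
            best = (score, C, gamma)
    return best


if __name__ == "__main__":
    # Toy stand-in for real SVM training (an assumption for illustration):
    # the score peaks at C = 2**3 and gamma = 2**-5 and ignores the data.
    def toy_train_and_score(C, gamma, train_idx, test_idx):
        return -((math.log2(C) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)

    result = grid_search(toy_train_and_score, n_samples=100,
                         log2c_values=range(-5, 16, 2),
                         log2g_values=range(3, -16, -2))
    print("best score, C, gamma:", result)
```

In practice the toy scorer would be replaced by a function that fits an SVM with the given C and gamma. A coarse grid over exponents followed by a finer grid around the best cell is a common refinement.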



