A Brief Introduction to Correlation-based Feature Selection (CFS) in Weka

Latest recommended article published on 2019-08-06 22:26:30

蛐蛐蛐

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/qysh123/article/details/77540532

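This post introduces the basic idea behind Correlation-based Feature Selection (CFS) and its use in machine learning. CFS selects the best feature subset by evaluating the predictive ability of each feature and the redundancy among features, preferring combinations that are highly correlated with the class and have low intercorrelation.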

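Several papers I read recently all used Correlation-based Feature Selection (CFS) as the feature selection method in their machine learning work. I really knew nothing about this method, so I took a quick look today. Concretely, it is the first method under "Select Attributes" in the Explorer interface, "CfsSubsetEval". Its description is as follows: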

========================================

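In short: an attribute subset is scored by the predictive ability of each individual feature together with the correlation among the features. Subsets whose features are individually strong predictors and only weakly correlated with each other score well.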

Evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them. Subsets of features that are highly correlated with the class while having low intercorrelation are preferred.

For more information see:

M. A. Hall (1998). Correlation-based Feature Subset Selection for Machine Learning. Hamilton, New Zealand.

To my surprise, this reference turns out to be a PhD thesis. Link:

http://www.cs.waikato.ac.nz/~mhall/thesis.pdf

=======================================
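
To make the description above a bit more concrete, here is a minimal sketch of the merit heuristic from Hall's thesis, Merit_S = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the number of features in the subset, r_cf is the mean feature-class correlation and r_ff is the mean feature-feature intercorrelation. This is not Weka's code: the class name `CfsMeritSketch`, the method `merit`, and the toy correlation values are all made up for illustration, and the correlations are assumed to be precomputed and non-negative (CfsSubsetEval works them out from the data itself).

```java
/** Hypothetical helper (not part of Weka): CFS merit from precomputed correlations. */
final class CfsMeritSketch {

    /**
     * classCorr[i]      = correlation between feature i and the class (assumed in [0, 1])
     * featureCorr[i][j] = correlation between features i and j (assumed symmetric, in [0, 1])
     * subset            = indices of the features in the candidate subset
     */
    static double merit(double[] classCorr, double[][] featureCorr, int[] subset) {
        int k = subset.length;
        if (k == 0) {
            return 0.0; // an empty subset has no predictive value
        }

        // Mean feature-class correlation over the subset (r_cf).
        double sumClass = 0.0;
        for (int i : subset) {
            sumClass += classCorr[i];
        }
        double meanClass = sumClass / k;

        // Mean feature-feature correlation over all distinct pairs (r_ff).
        double sumPairs = 0.0;
        int pairs = 0;
        for (int a = 0; a < k; a++) {
            for (int b = a + 1; b < k; b++) {
                sumPairs += featureCorr[subset[a]][subset[b]];
                pairs++;
            }
        }
        double meanInter = (pairs == 0) ? 0.0 : sumPairs / pairs;

        // Numerator rewards relevance to the class; denominator penalizes redundancy.
        return (k * meanClass) / Math.sqrt(k + k * (k - 1) * meanInter);
    }

    public static void main(String[] args) {
        // Toy values: features 0 and 1 are both relevant but nearly duplicates of each other;
        // feature 2 is a little less relevant but almost independent of the other two.
        double[] classCorr = {0.8, 0.8, 0.6};
        double[][] featureCorr = {
            {1.0, 0.9, 0.1},
            {0.9, 1.0, 0.1},
            {0.1, 0.1, 1.0}
        };
        System.out.println("merit{0,1} = " + merit(classCorr, featureCorr, new int[] {0, 1}));
        System.out.println("merit{0,2} = " + merit(classCorr, featureCorr, new int[] {0, 2}));
    }
}
```

With these toy numbers the subset {0, 2} gets a higher merit than {0, 1}, even though feature 2 is individually weaker than feature 1. That is exactly the "highly correlated with the class while having low intercorrelation" preference in the description. A real run would still need a search strategy over subsets, which this sketch leaves out.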

This page has a fairly detailed introduction to all of the feature selection methods: http://www.cnblogs.com/lutaitou/p/5818027.html That is all for this short summary.