Pros and cons of classifier algorithms

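At RailsConf I came across a book called 'Programming Collective Intelligence' at the O'Reilly booth. It is really about data mining, and it is much easier to follow than the usual textbooks. Below are some useful excerpts: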

=knn=

+ new data can be added at any time--does not require any computation at all; the data is simply added to the set.

- it requires all the training data to be present in order to make predictions. In a dataset with millions of examples, this is not just a space issue but also a time issue.
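
Both points can be seen in a minimal sketch. This is not code from the book; the class name `KNNClassifier`, the choice of Euclidean distance, and the toy data are assumptions made for illustration. Adding data is a plain append, while every prediction has to scan the full stored set.

```python
import math
from collections import Counter


class KNNClassifier:
    """Minimal k-nearest-neighbours classifier (hypothetical, for illustration)."""

    def __init__(self, k=3):
        self.k = k
        self.points = []  # list of (feature_tuple, label)

    def add(self, features, label):
        # "Training" is only storage: no computation is needed.
        self.points.append((tuple(features), label))

    def predict(self, features):
        # Every prediction sorts the whole stored dataset by distance,
        # so cost grows with the number of stored examples.
        nearest = sorted(self.points, key=lambda p: math.dist(p[0], features))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


knn = KNNClassifier(k=3)
for features, label in [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]:
    knn.add(features, label)

before = knn.predict((9.1, 9.0))  # only classes "a" and "b" are known so far

# New data, including a new class, can be added at any time.
for features in [(9.0, 9.0), (9.2, 8.8), (8.9, 9.1)]:
    knn.add(features, "c")

after = knn.predict((9.1, 9.0))
```

The same query point changes its answer after three appends, with no retraining step in between. The price is that `predict` touches every stored point.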

=svm=

+ after training they are very fast to classify new observations.

- black box technique. An SVM may give great answers, but you will never really know why.

- requires retraining if the data changes


=neural network=

+ allows incremental training and generally doesn't require a lot of space to store the trained models.

- black box technique
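
The incremental-training point can be illustrated with the simplest possible case, a single perceptron. This is a sketch under assumptions: the class name `Perceptron`, the step activation, and the OR dataset are illustrative choices, and a real multi-layer network would be trained with backpropagation instead. What carries over is that each example updates the weights in place, and the stored model is just those weights.

```python
class Perceptron:
    """Single-neuron network with online updates (hypothetical, for illustration)."""

    def __init__(self, n_inputs, learning_rate=1):
        self.weights = [0] * n_inputs
        self.bias = 0
        self.learning_rate = learning_rate

    def predict(self, x):
        activation = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1 if activation > 0 else 0

    def train_one(self, x, target):
        # One example at a time: nudge the weights by the prediction error.
        error = target - self.predict(x)
        self.weights = [
            w + self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias += self.learning_rate * error


# Logical OR is linearly separable, so a single perceptron can learn it.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

net = Perceptron(n_inputs=2)
for _ in range(10):  # a few passes over the data
    for x, target in or_data:
        net.train_one(x, target)

predictions = [net.predict(x) for x, _ in or_data]
```

The whole trained model here is two weights and a bias, regardless of how many examples were used to train it.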

=decision tree=

+ easy to interpret trained model, brings important factors to the top of the tree.

- has to start from scratch each time the data changes (decision trees that support incremental training are an active area of research)

- the tree can become extremely large and complex, which makes classification slow.
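
To make the interpretability point concrete, here is a small sketch of a tree builder for categorical features. It is not the book's implementation; the function names, the use of Gini impurity as the split criterion, and the toy data are assumptions. The feature that best separates the labels ends up at the root.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels (0.0 means all labels are equal)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Index of the feature that lowers weighted impurity the most, or None."""
    best_index, best_score = None, gini(labels)
    for i in range(len(rows[0])):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[i], []).append(label)
        score = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        if score < best_score:
            best_index, best_score = i, score
    return best_index


def build_tree(rows, labels):
    """Return a leaf label, or (feature_index, {value: subtree})."""
    index = best_split(rows, labels)
    if index is None:
        return Counter(labels).most_common(1)[0][0]
    branches = {}
    for value in {row[index] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[index] == value]
        branches[value] = build_tree([r for r, _ in subset], [l for _, l in subset])
    return (index, branches)


def classify(tree, row):
    # Walk down until a leaf (a plain label) is reached.
    # A feature value never seen during building raises KeyError.
    while isinstance(tree, tuple):
        index, branches = tree
        tree = branches[row[index]]
    return tree


# Feature 0 (colour) is noise; feature 1 (read the FAQ?) decides the outcome.
rows = [("red", "yes"), ("blue", "yes"), ("red", "no"), ("blue", "no")]
labels = ["buy", "buy", "skip", "skip"]

tree = build_tree(rows, labels)
```

Printing `tree` shows a split on feature 1 at the root with one leaf per value, which reads directly as a rule. Note that `build_tree` takes the whole dataset as input, so new data means rebuilding from scratch.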

=naive bayesian=



+ speed is good for training and querying, even with large data sets

+ incremental

+ easy to interpret what the classifier has actually learned

- unable to deal with outcomes that change based on combinations of features.
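
The points above can be checked with a small counting-based sketch. The class name `NaiveBayes`, the add-one smoothing for binary features, and the toy data are assumptions for illustration, not the book's code. Training only increments counters, which is why it is fast and incremental, and the counters themselves show what was learned. The XOR data shows the limitation: each feature on its own says nothing about the label, so both classes get the same score.

```python
from collections import defaultdict


class NaiveBayes:
    """Counting-based naive Bayes for binary features (hypothetical, for illustration)."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # (index, value, label) -> count

    def train(self, features, label):
        # Incremental: each new example only bumps a few counters.
        self.class_counts[label] += 1
        for i, value in enumerate(features):
            self.feature_counts[(i, value, label)] += 1

    def score(self, features, label):
        total = sum(self.class_counts.values())
        n = self.class_counts.get(label, 0)
        p = n / total  # class prior
        for i, value in enumerate(features):
            # Add-one smoothing; the "+ 2" assumes each feature has two values.
            p *= (self.feature_counts.get((i, value, label), 0) + 1) / (n + 2)
        return p

    def predict(self, features):
        return max(self.class_counts, key=lambda c: self.score(features, c))


# Case 1: the label depends on one feature alone. Naive Bayes handles this.
simple = NaiveBayes()
for a in (0, 1):
    for b in (0, 1):
        simple.train((a, b), a)

# Case 2: the label depends on the combination of features (XOR).
xor = NaiveBayes()
for a in (0, 1):
    for b in (0, 1):
        xor.train((a, b), a ^ b)

xor_scores = [
    (xor.score((a, b), 0), xor.score((a, b), 1)) for a in (0, 1) for b in (0, 1)
]
```

In the XOR case every pair in `xor_scores` holds two identical values, so the classifier has no basis for preferring either class.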