This post is based on a blog post by @fghdvbgt: http://m.blog.youkuaiyun.com/article/details?id=51277716
I found that the code there does contain two problems:
(1):
def binSplitDataSet(dataSet, feature, value):
    mat0 = dataSet[nonzero(dataSet[:,feature] > value)[0],:][0]
    mat1 = dataSet[nonzero(dataSet[:,feature] <= value)[0],:][0]
    return mat0,mat1
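The trailing [0] should be removed from both mat0 and mat1. With it, each result keeps only the first matching row; without it, the original dataSet is correctly split into two subsets according to the value of feature.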
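As a minimal sketch of the corrected function, assuming dataSet is a NumPy matrix (np.matrix) as in the original post; the sample data testMat is made up purely for illustration:

```python
import numpy as np

def binSplitDataSet(dataSet, feature, value):
    # Rows whose entry in column `feature` is greater than `value`
    mat0 = dataSet[np.nonzero(dataSet[:, feature] > value)[0], :]
    # Rows whose entry in column `feature` is less than or equal to `value`
    mat1 = dataSet[np.nonzero(dataSet[:, feature] <= value)[0], :]
    return mat0, mat1

# Hypothetical 4x2 sample data, not taken from the original post
testMat = np.matrix([[1.0, 10.0],
                     [2.0, 20.0],
                     [3.0, 30.0],
                     [4.0, 40.0]])

mat0, mat1 = binSplitDataSet(testMat, 0, 2.0)
print(mat0.shape, mat1.shape)
```

Both halves should keep all of their matching rows, so the two row counts add up to the row count of the input.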
(2): In the chooseBestSplit function, the line:
for splitVal in set(dataSet[:,featIndex]):
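should be changed to:
for splitVal in set(i[0] for i in dataSet[:,featIndex].tolist()):
With this change the loop runs correctly: tolist() turns the column slice into a list of one-element lists, and i[0] extracts each entry as a plain float that set() can hash.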
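The following sketch shows what the corrected expression produces, again assuming dataSet is an np.matrix; the data and the alternative spelling via the matrix attribute A1 are my own additions for illustration, not part of the original post:

```python
import numpy as np

# Hypothetical sample data with a repeated value in column 0
dataSet = np.matrix([[1.0, 10.0],
                     [2.0, 20.0],
                     [2.0, 30.0],
                     [3.0, 40.0]])
featIndex = 0

# Column slice of a matrix is still 2-D, with shape (4, 1)
column = dataSet[:, featIndex]

# tolist() gives [[1.0], [2.0], [2.0], [3.0]]; i[0] unwraps each float
splitVals = set(i[0] for i in column.tolist())

# Equivalent alternative: A1 flattens the matrix to a 1-D ndarray
altVals = set(column.A1.tolist())

for splitVal in sorted(splitVals):
    print(splitVal)
```

Either form yields the distinct candidate split values for the feature, with duplicates collapsed by the set.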