Data format:
1,1.52101,13.64,4.49,1.10,71.78,0.06,8.75,0.00,0.00,1
2,1.51761,13.89,3.60,1.36,72.73,0.48,7.83,0.00,0.00,1
3,1.51618,13.53,3.55,1.54,72.99,0.39,7.78,0.00,0.00,1
4,1.51766,13.21,3.69,1.29,72.61,0.57,8.22,0.00,0.00,1
5,1.51742,13.27,3.62,1.24,73.08,0.55,8.07,0.00,0.00,1
6,1.51596,12.79,3.61,1.62,72.97,0.64,8.07,0.00,0.26,1
7,1.51743,13.30,3.60,1.14,73.09,0.58,8.17,0.00,0.00,1
8,1.51756,13.15,3.61,1.05,73.24,0.57,8.24,0.00,0.00,1
9,1.51918,14.04,3.58,1.37,72.08,0.56,8.30,0.00,0.00,1
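The rows above can be read as a leading ID, nine numeric attributes, and a trailing class label. That reading is an assumption based on the eleven comma-separated fields shown; the source does not name the columns. A minimal Python sketch of that split, with `parse_rows` as a hypothetical helper name:

```python
import csv
from io import StringIO

# Two rows copied from the sample above.
SAMPLE = """1,1.52101,13.64,4.49,1.10,71.78,0.06,8.75,0.00,0.00,1
2,1.51761,13.89,3.60,1.36,72.73,0.48,7.83,0.00,0.00,1
"""

def parse_rows(text):
    """Split each CSV row into (id, numeric features, label).

    Assumes the first field is a row ID and the last field is the class label.
    """
    rows = []
    for fields in csv.reader(StringIO(text)):
        if not fields:
            continue  # skip blank lines
        row_id = int(fields[0])
        features = [float(v) for v in fields[1:-1]]
        label = fields[-1]
        rows.append((row_id, features, label))
    return rows

rows = parse_rows(SAMPLE)
for row_id, features, label in rows:
    print(row_id, len(features), label)
```

Under this layout, the ID column would be ignored during training and the last column would serve as the label.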
1. Generating the descriptor file
Command: hadoop jar mahout-examples-0.9-job.jar org.apache.mahout.classifier.df.tools.Describe
--path(-p) Input path of the job; required
--file(-f)

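This article describes how to build a random forest model with Hadoop and the Mahout library. The data is first described with the `Describe` command, the forest is then built with `BuildForest`, and the model is finally tested and evaluated with `TestForest`. The main parameters covered include the data input path, the descriptor file, the number of randomly selected attributes, and the number of decision trees.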



