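I. Testing Mahout with a simple example
For Mahout installation and configuration, see: Mahout installation and configuration
1. Source of the test data for the k-means clustering algorithm: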
URL: http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
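2. Download the data and store it on HDFS (Hadoop 2.6.1 is already running)
Create the test directory testdata and import the data into it (the directory must be named testdata, because the example job, when run without arguments, reads its input from that directory)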
$ hdfs dfs -mkdir testdata
$ hdfs dfs -put /home/lin/hadoop/mahout-distribution-0.10.0/test.data testdata
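Before running the job, it can help to confirm that the downloaded file is intact. The sketch below is only an illustration: `data_shape` is a hypothetical helper (not part of Mahout or Hadoop) that prints the number of rows and columns of a whitespace-separated file and returns a non-zero status if the rows differ in width.

```shell
#!/bin/sh
# Hypothetical helper, not part of Mahout or Hadoop.
# Prints "<rows> <cols>" for a whitespace-separated data file and
# exits non-zero if any row has a different number of columns.
data_shape() {
    awk '
        NF == 0    { next }          # skip blank lines
        cols == 0  { cols = NF }     # remember the width of the first data row
        NF != cols { bad = 1 }       # flag rows whose width differs
                   { rows++ }        # count every non-blank row
        END        { print rows + 0, cols + 0; exit bad + 0 }
    ' "$1"
}

# Usage (path taken from the step above):
#   data_shape /home/lin/hadoop/mahout-distribution-0.10.0/test.data
```

According to the UCI description, the synthetic control dataset contains 600 time series of 60 values each, so an intact copy should be reported as `600 60`.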
3. Run the k-means algorithm and wait for it to finish
$ hadoop jar /home/lin/hadoop/mahout-distribution-0.10.0/mahout-examples-0.10.0-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
4. After the run completes, check the results
$ hdfs dfs -ls output
Output like the following indicates that the run succeeded:
lin@lin162:~/hadoop/hadoop-2.6.1/etc/hadoop$ hdfs dfs -ls output
Found 15 items
-rw-r--r-- 2 lin supergroup 194 2015-12-01 12:27 output/_policy
drwxr-xr-x - lin supergroup 0 2015-12-01 12:27 output/clusteredPoints
drwxr-xr-x - lin supergroup 0 2015-12-01 12:22 output/clusters-0
drwxr-xr-x - lin supergroup 0 2015-12-01 12:23 output/clusters-1
drwxr-xr-x - lin supergroup 0 2015-12-01 12:27 output/clusters-10-final
drwxr-xr-x - lin supergroup 0 2015-12-01 12:23 output/clusters-2
drwxr-xr-x - lin supergroup 0 2015-12-01 12:24 output/clusters-3
drwxr-xr-x - lin supergroup 0 2015-12-01 12:24 output/clusters-4
drwxr-xr-x - lin supergroup 0 2015-12-01 12:25 output/clusters-5
drwxr-xr-x - lin supergroup 0 2015-12-01 12:25 output/clusters-6
drwxr-xr-x - lin supergroup 0 2015-12-01 12:25 output/clusters-7
drwxr-xr-x - lin supergroup 0 2015-12-01 12:26 output/clusters-8
drwxr-xr-x - lin supergroup 0 2015-12-01 12:26 output/clusters-9
drwxr-xr-x - lin supergroup 0 2015-12-01 12:22 output/data
drwxr-xr-x - lin supergroup 0 2015-12-01 12:22 output/random-seeds
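In this listing, the directory for the last iteration carries a `-final` suffix (here `output/clusters-10-final`). As a rough sketch, that directory can be picked out of the listing automatically. `final_clusters_dir` is a hypothetical helper name, and the filter assumes the path is the last field of each listing line, as it is in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads `hdfs dfs -ls` output on stdin and prints
# the path of any entry whose name matches clusters-<N>-final.
# Assumes the path is the last whitespace-separated field on each line.
final_clusters_dir() {
    awk '$NF ~ /clusters-[0-9]+-final$/ { print $NF }'
}

# Usage (assumes hdfs is on PATH and the job has finished):
#   hdfs dfs -ls output | final_clusters_dir
```

If nothing is printed, no `-final` directory exists yet, which would suggest that the job did not run to completion.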



