Running Hadoop WordCount
1. Start Hadoop
/root/hadoop/hadoop-2.6.0/sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Or use:
/root/hadoop/hadoop-2.6.0/sbin/start-dfs.sh
/root/hadoop/hadoop-2.6.0/sbin/start-yarn.sh
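To verify that the daemons came up, run jps; on a healthy single-node setup it typically lists NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager:
[root@localhost /]# jps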
2. Prepare test files: create a couple of small text files in a local directory
[root@localhost /]# mkdir /root/testFile
[root@localhost /]# echo "Hello Hadoop" > /root/testFile/hello.txt
[root@localhost /]# echo "Hello Java" > /root/testFile/hello2.txt
3. Create the input directory /input on HDFS
[root@localhost /]# cd /root/hadoop/hadoop-2.6.0/bin
[root@localhost bin]# hadoop fs -mkdir /input
- Upload the files created on the local disk into /input
[root@localhost bin]# hadoop fs -put /root/testFile/hello*.txt /input
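To confirm the upload, list the directory:
[root@localhost bin]# hadoop fs -ls /input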
- Location of the wordcount example jar that ships with Hadoop (the WordCount class inside it is sketched right after the path)
/root/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar
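For reference, the WordCount class bundled in this jar follows the canonical example from the Apache Hadoop MapReduce tutorial. The sketch below is that canonical version; the exact 2.6.0 source may differ in minor details:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: splits each input line into tokens and emits (word, 1).
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also registered as combiner): sums the counts for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // map-side partial sums
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Because the same IntSumReducer is registered as a combiner, partial sums are computed on the map side, which is what the Combine input/output records counters in the job log below reflect.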
- Run wordcount:
[root@localhost bin]# hadoop jar /root/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input/ /output/wordcount1
17/02/05 19:48:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/02/05 19:48:39 INFO input.FileInputFormat: Total input paths to process : 2
17/02/05 19:48:39 INFO mapreduce.JobSubmitter: number of splits:2
17/02/05 19:48:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1486108015974_0001
17/02/05 19:48:43 INFO impl.YarnClientImpl: Submitted application application_1486108015974_0001
17/02/05 19:48:44 INFO mapreduce.Job: The url to track the job: http://localhost:8099/proxy/application_1486108015974_0001/
17/02/05 19:48:44 INFO mapreduce.Job: Running job: job_1486108015974_0001
17/02/05 19:49:20 INFO mapreduce.Job: Job job_1486108015974_0001 running in uber mode : false
17/02/05 19:49:20 INFO mapreduce.Job: map 0% reduce 0%
17/02/05 19:49:47 INFO mapreduce.Job: map 50% reduce 0%
17/02/05 19:49:49 INFO mapreduce.Job: map 100% reduce 0%
17/02/05 19:49:58 INFO mapreduce.Job: map 100% reduce 100%
17/02/05 19:49:59 INFO mapreduce.Job: Job job_1486108015974_0001 completed successfully
17/02/05 19:49:59 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=54
        FILE: Number of bytes written=316700
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=229
        HDFS: Number of bytes written=24
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=52251
        Total time spent by all reduces in occupied slots (ms)=6032
        Total time spent by all map tasks (ms)=52251
        Total time spent by all reduce tasks (ms)=6032
        Total vcore-seconds taken by all map tasks=52251
        Total vcore-seconds taken by all reduce tasks=6032
        Total megabyte-seconds taken by all map tasks=53505024
        Total megabyte-seconds taken by all reduce tasks=6176768
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=40
        Map output materialized bytes=60
        Input split bytes=205
        Combine input records=4
        Combine output records=4
        Reduce input groups=3
        Reduce shuffle bytes=60
        Reduce input records=4
        Reduce output records=3
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=679
        CPU time spent (ms)=9280
        Physical memory (bytes) snapshot=707444736
        Virtual memory (bytes) snapshot=2677784576
        Total committed heap usage (bytes)=516423680
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=24
    File Output Format Counters
        Bytes Written=24
[root@localhost bin]#
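Before printing the result, it can help to list the output directory; a successfully completed job leaves an empty _SUCCESS marker plus one part file per reducer (a single part-r-00000 here, since one reduce task ran):
[root@localhost bin]# hdfs dfs -ls /output/wordcount1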
- View the results:
[root@localhost bin]# hdfs dfs -cat /output/wordcount1/*
Hadoop 1
Hello 2
Java 1
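Note that MapReduce will not overwrite an existing output directory; to rerun the job, either pass a fresh output path or delete the old one first:
[root@localhost bin]# hdfs dfs -rm -r /output/wordcount1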
Reference: http://www.itnose.net/detail/6197823.html