Running Hadoop WordCount

This article describes how to run the WordCount program in a Hadoop environment: starting the Hadoop cluster, creating test files, uploading them to HDFS, running the WordCount job, and viewing the results.


1. Start Hadoop

/root/hadoop/hadoop-2.6.0/sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Or, equivalently:
/root/hadoop/hadoop-2.6.0/sbin/start-dfs.sh
/root/hadoop/hadoop-2.6.0/sbin/start-yarn.sh

2. Prepare test files: create them in a local directory

[root@localhost /]# mkdir /root/testFile
[root@localhost /]# echo "Hello Hadoop" > /root/testFile/hello.txt
[root@localhost /]# echo "Hello Java" > /root/testFile/hello2.txt

3. Create the input directory /input on HDFS

The following commands are run from /root/hadoop/hadoop-2.6.0/bin:
[root@localhost bin]# hadoop fs -mkdir /input
4. Upload the files created on the local disk into /input
[root@localhost bin]# hadoop fs -put /root/testFile/hello*.txt /input
5. Location of the WordCount example jar that ships with Hadoop (this jar contains the WordCount class):
/root/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar
6. Run wordcount
[root@localhost bin]# hadoop jar /root/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input/ /output/wordcount1
17/02/05 19:48:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/02/05 19:48:39 INFO input.FileInputFormat: Total input paths to process : 2
17/02/05 19:48:39 INFO mapreduce.JobSubmitter: number of splits:2
17/02/05 19:48:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1486108015974_0001
17/02/05 19:48:43 INFO impl.YarnClientImpl: Submitted application application_1486108015974_0001
17/02/05 19:48:44 INFO mapreduce.Job: The url to track the job: http://localhost:8099/proxy/application_1486108015974_0001/
17/02/05 19:48:44 INFO mapreduce.Job: Running job: job_1486108015974_0001
17/02/05 19:49:20 INFO mapreduce.Job: Job job_1486108015974_0001 running in uber mode : false
17/02/05 19:49:20 INFO mapreduce.Job:  map 0% reduce 0%
17/02/05 19:49:47 INFO mapreduce.Job:  map 50% reduce 0%
17/02/05 19:49:49 INFO mapreduce.Job:  map 100% reduce 0%
17/02/05 19:49:58 INFO mapreduce.Job:  map 100% reduce 100%
17/02/05 19:49:59 INFO mapreduce.Job: Job job_1486108015974_0001 completed successfully
17/02/05 19:49:59 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=54
		FILE: Number of bytes written=316700
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=229
		HDFS: Number of bytes written=24
		HDFS: Number of read operations=9
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=2
		Launched reduce tasks=1
		Data-local map tasks=2
		Total time spent by all maps in occupied slots (ms)=52251
		Total time spent by all reduces in occupied slots (ms)=6032
		Total time spent by all map tasks (ms)=52251
		Total time spent by all reduce tasks (ms)=6032
		Total vcore-seconds taken by all map tasks=52251
		Total vcore-seconds taken by all reduce tasks=6032
		Total megabyte-seconds taken by all map tasks=53505024
		Total megabyte-seconds taken by all reduce tasks=6176768
	Map-Reduce Framework
		Map input records=2
		Map output records=4
		Map output bytes=40
		Map output materialized bytes=60
		Input split bytes=205
		Combine input records=4
		Combine output records=4
		Reduce input groups=3
		Reduce shuffle bytes=60
		Reduce input records=4
		Reduce output records=3
		Spilled Records=8
		Shuffled Maps =2
		Failed Shuffles=0
		Merged Map outputs=2
		GC time elapsed (ms)=679
		CPU time spent (ms)=9280
		Physical memory (bytes) snapshot=707444736
		Virtual memory (bytes) snapshot=2677784576
		Total committed heap usage (bytes)=516423680
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=24
	File Output Format Counters 
		Bytes Written=24
[root@localhost bin]#
7. View the results
[root@localhost bin]# hdfs dfs -cat /output/wordcount1/*
Hadoop	1
Hello	2
Java	1
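The output above can be reproduced with a minimal local sketch of what the example jar's WordCount does in its map and reduce phases (plain Python, no Hadoop required; the input lines are hardcoded to match the two test files created in step 2):

```python
from collections import Counter

# Contents of the two test files created earlier (hello.txt and hello2.txt)
inputs = ["Hello Hadoop", "Hello Java"]

def mapper(line):
    # Emit a (word, 1) pair for each whitespace-separated token,
    # analogous to WordCount's tokenizing map phase
    for word in line.split():
        yield word, 1

def reducer(pairs):
    # Sum the counts for each word, analogous to the reduce phase
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

pairs = [kv for line in inputs for kv in mapper(line)]
result = reducer(pairs)
for word in sorted(result):
    print(f"{word}\t{result[word]}")
# Hadoop	1
# Hello	2
# Java	1
```

Note how this matches the job counters above: Map output records=4 (four (word, 1) pairs emitted) and Reduce output records=3 (three distinct words).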

Reference: http://www.itnose.net/detail/6197823.html

Reposted from: https://my.oschina.net/himrliu/blog/832425
