Hadoop examples: the wordcount (wordcount.class) example

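This article walks through running a WordCount job with Hadoop MapReduce: setting up the environment, preparing a file, running the job, and viewing the results. It is aimed at beginners who want hands-on practice.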

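Today we looked at the examples that ship with Hadoop, packaged as hadoop-mapreduce-examples-2.8.0.jar.
The jar bundles a number of sample programs, each implementing a different task. One of them is wordcount, which counts how many times each word occurs in the input, where words are separated by whitespace.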

Example

Start Hadoop

[root@Tyler01 ~]# start-all.sh
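
Before going further, it can help to confirm that the daemons actually came up. The sketch below assumes a single-node (pseudo-distributed) setup in which NameNode, DataNode, ResourceManager and NodeManager all run on this machine; on a multi-node cluster the DataNode and NodeManager processes live on the worker hosts instead. The function name check_hadoop_daemons is made up for illustration. It relies on jps, the JDK tool that lists running Java processes.

```shell
# Report which of the expected Hadoop daemons appear in the jps listing.
# Assumption: all four daemons run on this host (single-node setup).
check_hadoop_daemons() {
    listing=$(jps)
    missing=0
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; require a space before the name so that
        # SecondaryNameNode is not mistaken for NameNode
        if printf '%s\n' "$listing" | grep -q " ${daemon}\$"; then
            echo "$daemon: running"
        else
            echo "$daemon: NOT running"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_hadoop_daemons
```

The function returns a non-zero status when any daemon is missing, so it can be used as a guard before submitting a job.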

Create a file with some content, word.txt

[root@Tyler01 ~]# vi word.txt

hello tyler
hello kopmgkomg
hello tylerhjghjghjghjgjh
hello as
hello pp
hello as
hello pp
hello as
hello daniu
hello daniu
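
If you would rather not type the lines into vi, the same file can be written non-interactively. This is only a convenience sketch: it writes word.txt into the current directory and overwrites any file of that name.

```shell
# Write the ten sample lines to word.txt in the current directory.
cat > word.txt <<'EOF'
hello tyler
hello kopmgkomg
hello tylerhjghjghjghjgjh
hello as
hello pp
hello as
hello pp
hello as
hello daniu
hello daniu
EOF

# Sanity check: the file should contain 10 lines.
wc -l < word.txt
```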

Create a directory on HDFS

[root@Tyler01 ~]# hadoop fs -mkdir /wc/

Upload the file to HDFS

hadoop fs -put ./word.txt /wc/

Run the wordcount example from the official Hadoop MapReduce examples
First change into the directory /home/tyler/apps/hadoop-2.8.0/share/hadoop/mapreduce

[root@Tyler01 mapreduce]# hadoop jar hadoop-mapreduce-examples-2.8.0.jar wordcount /wc/word.txt /wc/output/

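Command breakdown:
hadoop jar: run a jar file through Hadoop
hadoop-mapreduce-examples-2.8.0.jar: the jar file to run
wordcount: the program inside the jar to run
/wc/word.txt: the first argument to wordcount, the input path on HDFS
/wc/output/: the second argument to wordcount, the output directory on HDFS; it must not exist before the job starts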
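
One practical consequence of the output argument: if the output directory already exists, for example from an earlier run, the job refuses to start and fails with a FileAlreadyExistsException. A minimal sketch of a wrapper that clears a stale output directory first is shown below. The helper name run_wordcount is hypothetical, and the jar location simply reuses the directory from this article. Note that it deletes whatever the previous run produced, so use it only when those results are disposable.

```shell
# Run the bundled wordcount program, clearing a stale output directory first.
# Assumption: the jar lives in the directory used in this article.
EXAMPLES_JAR=/home/tyler/apps/hadoop-2.8.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar

run_wordcount() {
    input=$1
    output=$2
    # "hadoop fs -test -e" exits with status 0 when the path exists in HDFS
    if hadoop fs -test -e "$output"; then
        echo "Removing existing output directory $output"
        hadoop fs -rm -r "$output"
    fi
    hadoop jar "$EXAMPLES_JAR" wordcount "$input" "$output"
}

# Usage: run_wordcount /wc/word.txt /wc/output
```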

After wordcount finishes, look at the output directory on HDFS. The final result is as follows:

hadoop fs -ls /wc/output
hadoop fs -cat /wc/output/part-r-00000
as	3
daniu	2
hello	10
kopmgkomg	1
pp	2
tyler	1
tylerhjghjghjghjgjh	1
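
For a file this small, the counts can be cross-checked locally with standard Unix tools. The function below is a rough stand-in and not how Hadoop computes the result: it splits on any whitespace and sorts in byte order, which should agree with the job's output for plain ASCII input like the sample file. The name local_wordcount is hypothetical.

```shell
# Count whitespace-separated words in a local file and print "word<TAB>count",
# sorted by word, mimicking the layout of part-r-00000.
# local_wordcount is a hypothetical helper, not part of Hadoop.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" |
        grep -v '^$' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage: local_wordcount word.txt
```

The pipeline puts one word on each line, drops any empty line, sorts so that identical words are adjacent, counts each run with uniq -c, and finally swaps the columns so the word comes first. Run against the local word.txt, it should print the same seven lines as the listing above.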

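After the job finishes, check its status in the YARN ResourceManager web UI at 192.168.72.110:8088. If the application is listed there with state FINISHED and final status SUCCEEDED, the job has completed successfully.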
