Steps for Running Hadoop's Built-in WordCount

Article link: https://blog.youkuaiyun.com/rain_qingtian/article/details/69791799

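This article describes how to process text data with the WordCount program that ships with Hadoop. First, data is uploaded to HDFS with the hadoop fs command, placing test1.txt in the target directory. Next, the WordCount example is run with the hadoop jar command, specifying the input and output paths. Finally, the results are viewed with the hadoop dfs command.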


Command to run a WordCount job: bin/hadoop jar /usr/hddemo/wordcount.jar <package-name>.WordCount input output

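Note: input is the directory that holds the source data read by the map tasks; output is the directory where the results are written once the reduce tasks have finished.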

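Once the data directory has been set in the configuration file, it does not need to be created by hand.
The name directory is generated only when hadoop namenode -format is run; it is not created manually either.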

Basic WordCount workflow

Create two files in an input directory on Linux:
echo "Hello world Hello me! cwq solo" >test1.txt
echo " Hello world Hello you! solo" >test2.txt
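Before uploading anything, it can help to preview locally what WordCount should report for these two files. The pipeline below is only a rough sketch: it assumes the example job splits lines on whitespace (so "me!" and "you!" count as words, punctuation included) and sorts keys in byte order. The temporary directory and variable names are made up for illustration.

```shell
#!/bin/sh
# Recreate the two sample files in a throwaway directory (hypothetical location).
workdir=$(mktemp -d)
echo "Hello world Hello me! cwq solo" > "$workdir/test1.txt"
echo " Hello world Hello you! solo" > "$workdir/test2.txt"

# Split on whitespace, drop empty lines, count identical words,
# then print each line as word<TAB>count.
result=$(cat "$workdir"/test*.txt \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{print $2 "\t" $1}')

echo "$result"
rm -rf "$workdir"
```

If the real job prints something different for the same input, the tokenization assumption above is the first thing to check.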

hadoop fs -put /input/ /input

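bin/hadoop jar /usr/hddemo/wordcount.jar <package-name>.WordCount input output
From version 2.6 onward the class name does not have to be given (more generally, hadoop jar takes the main class from the jar's manifest when one is declared there):
bin/hadoop jar /usr/hddemo/wordcount.jar input output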

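The jar that contains Hadoop's built-in wordcount example is located, relative to the Hadoop installation directory, at
share/hadoop/mapreduce/hadoop-mapreduce-examples-<version>.jar
With this jar the command looks like this: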

hadoop jar /home/yanzefeng/apps/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /input /output/wordcount1
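A MapReduce job refuses to start when its output directory already exists, so rerunning the command above against the same path fails. The wrapper below is a cautious sketch of one way to handle that: it looks up the examples jar without hard-coding the version and checks the output path first. The HADOOP_HOME default and the status variable are assumptions for illustration; the paths are the ones used in this post.

```shell
#!/bin/sh
# Assumed locations; adjust them to your own installation.
HADOOP_HOME="${HADOOP_HOME:-/home/yanzefeng/apps/hadoop-2.6.4}"
INPUT_DIR=/input
OUTPUT_DIR=/output/wordcount1

# Pick the first examples jar, whatever its version suffix is.
EXAMPLES_JAR=$(ls "$HADOOP_HOME"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar 2>/dev/null | head -n 1)

if ! command -v hadoop >/dev/null 2>&1 || [ -z "$EXAMPLES_JAR" ]; then
    status="skipped: hadoop command or examples jar not found"
elif hadoop fs -test -d "$OUTPUT_DIR"; then
    # The job would abort here, so stop early instead of submitting it.
    status="skipped: $OUTPUT_DIR already exists"
elif hadoop jar "$EXAMPLES_JAR" wordcount "$INPUT_DIR" "$OUTPUT_DIR"; then
    status="finished"
else
    status="failed"
fi
echo "$status"
```

The script deliberately does not delete an existing output directory; choosing a new output path or removing the old one is left to the user.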

View the output files:
hdfs dfs -cat /output/wordcount1/*
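By default the result files contain one word<TAB>count pair per line, ordered by word, so finding the most frequent words takes one extra sort. The snippet below sketches that step on a hard-coded sample so it can be tried without a cluster. The sample lines are an assumption: they are what the job would be expected to print for the two test files above, not captured output.

```shell
#!/bin/sh
TAB=$(printf '\t')

# Stand-in for: hdfs dfs -cat /output/wordcount1/part-r-*
sample_output=$(printf 'Hello\t4\ncwq\t1\nme!\t1\nsolo\t2\nworld\t2\nyou!\t1\n')

# Sort by the count column (numeric, descending), break ties by word,
# and keep the three most frequent entries.
top3=$(printf '%s\n' "$sample_output" \
    | LC_ALL=C sort -t "$TAB" -k2,2nr -k1,1 \
    | head -n 3)

printf '%s\n' "$top3"
```

On a real cluster the same sort can be applied by piping the hdfs dfs -cat command into it instead of the sample variable.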

create database shizhan05;
create table t_sz01(id int,name string) row format delimited fields terminated by ',';

Data: upload the data file into the table's directory in the Hive warehouse

hadoop fs -put sz.dat /user/hive/warehouse/shizhan05.db/t_sz01

----------------------------------------------

hadoop dfs -put test1.txt /

cd /home/hadoop/bigdater/hadoop-2.5.0-cdh5.3.6/share/hadoop/mapreduce

hadoop jar hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /test1.txt /output/1234

hadoop dfs -cat /output/1234/part-r-00000