Running on the cluster
JAR package
- First, package the FileExist file from the previous step to produce a .jar file.
- Copy the .jar file to the cluster and run it with the hadoop jar command.
WordCount
Adding dependencies
- Create a new WordCount project, then add the following Hadoop JARs as dependencies:
  - From /usr/local/hadoop/share/hadoop/common: hadoop-common-xxx.jar and hadoop-nfs-xxx.jar
  - All JARs under /usr/local/hadoop/share/hadoop/common/lib
  - All JARs under /usr/local/hadoop/share/hadoop/mapreduce
  - All JARs under /usr/local/hadoop/share/hadoop/mapreduce/lib
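
One way to confirm that the JARs listed above actually ended up on the project's build path is a small classpath check like the sketch below. It is not part of the original tutorial: the class name ClasspathCheck and the method isOnClasspath are hypothetical, and the class names it probes are simply those imported by the WordCount program.

```java
// Hypothetical helper (not from the original tutorial): reports whether the
// Hadoop classes that WordCount imports can be loaded from the classpath.
final class ClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the imports of the WordCount program.
        String[] required = {
            "org.apache.hadoop.conf.Configuration",
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.mapreduce.Job",
            "org.apache.hadoop.util.GenericOptionsParser"
        };
        for (String name : required) {
            String status = isOnClasspath(name) ? "found   " : "MISSING ";
            System.out.println(status + name);
        }
    }
}
```

If any line is reported as MISSING when this is run inside the WordCount project, the corresponding directory from the list above has probably not been added to the build path yet.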

Writing the program
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;