Because the job runs in local mode, you must make the Windows builds of hadoop.dll and winutils.exe available; otherwise the job fails with
java.lang.UnsatisfiedLinkError. It is also best to set a local HADOOP_HOME environment variable
and add %HADOOP_HOME%\bin;%HADOOP_HOME%\sbin; to PATH.
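If you prefer not to change the system environment variables, a common workaround (a sketch, not something required by Hadoop itself) is to set the hadoop.home.dir system property in the driver before any Hadoop class runs; Hadoop's Shell utility reads this property when looking for winutils.exe. Note that hadoop.dll must still be reachable through PATH or java.library.path. The directory below is only an example.

// Example only: point Hadoop at the local directory that contains bin\winutils.exe
static {
    System.setProperty("hadoop.home.dir", "D:\\hadoop-2.7.3"); // example path, adjust to your install
}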
The Driver (startup) class needs the following configuration:
Configuration configuration = new Configuration();
configuration.set("mapreduce.framework.name", "local"); #必须要设置
configuration.set("mapreduce.app-submission.cross-platform", "true");#必须要设置
1. Input data on HDFS:
configuration.set("fs.defaultFS", "hdfs://bigdata131:9000/");
args = new String[]{"/input/input.txt", "/out1"};
2. Input data on the local file system:
args = new String[]{"C:\\Users\\Administrator\\Desktop\\input.txt", "C:\\Users\\Administrator\\Desktop\\output3"};
configuration.set("fs.defaultFS", "file:///"); #可以不设置
You may also want log output in the console; add resources/log4j.properties:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
Add the following dependencies to pom.xml:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.3</version>
</dependency>