1. Set up the environment by following the guide at this URL:
https://blog.youkuaiyun.com/YL0701/article/details/86589538
2. To enable log monitoring through the JobHistory server, edit mapred-site.xml and yarn-site.xml
Add the following settings to mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/history/done</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/history/done_intermediate</value>
</property>
</configuration>
Add the following settings to yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>127.0.0.1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>127.0.0.1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>127.0.0.1:8031</value>
</property>
<!-- Enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- HDFS directory for aggregated logs -->
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/history/container/logs</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://localhost:19888/jobhistory/logs</value>
</property>
</configuration>
After editing both files, restart the Hadoop services, then run
sbin/mr-jobhistory-daemon.sh start historyserver
to start the history server.
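
Once the history server has been started, it helps to confirm that it is actually answering. The following is a minimal sketch, assuming curl is installed and that the web UI address is the localhost:19888 value configured in mapred-site.xml above. The function names history_info_url and check_history_server are made up for this example; /ws/v1/history/info is the History Server's REST endpoint for basic status.

```shell
#!/bin/sh
# Sketch: check that the JobHistory server is answering.
# Host and port default to the values set in mapred-site.xml above.

history_info_url() {
    host="${1:-localhost}"
    port="${2:-19888}"
    # /ws/v1/history/info is the History Server REST endpoint for basic status.
    echo "http://${host}:${port}/ws/v1/history/info"
}

check_history_server() {
    # -f makes curl return non-zero on HTTP errors; -s silences progress output.
    if curl -sf "$(history_info_url "$@")" >/dev/null; then
        echo "JobHistory server is up"
    else
        echo "JobHistory server is not reachable" >&2
        return 1
    fi
}
```

Calling check_history_server with no arguments probes localhost:19888. Running jps should also list a JobHistoryServer process, and once a job has finished, yarn logs -applicationId followed by the application ID prints its aggregated logs.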
3. Compile the code and package the jar file from the command line
First edit hadoop-env.sh and add
export HADOOP_CLASSPATH=/usr/java/jdk1.8.0_11/lib/tools.jar
After that, compiling and packaging from the command line works.
Compile command (several .java files can be compiled together):
bin/hadoop com.sun.tools.javac.Main WordCount.java
Command to package the jar file (several .class files can be packaged together):
jar cf wc.jar WordCount*.class
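
The compile and package steps can be chained with a final step that submits the job. The following is a rough sketch, assuming it is run from the Hadoop install directory, that WordCount.java declares no package, and that WordCount takes an input and an output path as arguments. The helper names run and build_and_run_wordcount, the DRY_RUN switch, and the HDFS paths passed in are all made up for this example.

```shell
#!/bin/sh
# Sketch: compile, package, and run WordCount from the Hadoop install directory.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_and_run_wordcount() {
    input_dir="$1"    # hypothetical HDFS input path
    output_dir="$2"   # hypothetical HDFS output path; must not already exist
    # Compile with the JDK compiler made available through HADOOP_CLASSPATH.
    run bin/hadoop com.sun.tools.javac.Main WordCount.java
    # Package the class files, including any WordCount$... nested classes.
    run jar cf wc.jar WordCount*.class
    # Submit the job; WordCount is the main class (no package assumed).
    run bin/hadoop jar wc.jar WordCount "$input_dir" "$output_dir"
}
```

With the default DRY_RUN=1, calling build_and_run_wordcount /user/joe/wordcount/input /user/joe/wordcount/output only prints the three commands, so they can be reviewed before running them for real with DRY_RUN=0.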