Lab environment: a cloud server running CentOS 7.4
JDK 1.8
Hadoop 2.7.5
I. Install and Configure the JDK
1. Download the JDK
[root@localhost ~]# wget -O jdk-8u161-linux-x64.tar.gz "http://download.oracle.com/otn-pub/java/jdk/8u161-b12/2f38c3b165be4555a1fa6e98c45e0808/jdk-8u161-linux-x64.tar.gz?AuthParam=1520664270_b0eda8a9778a082600265e6b879ccc72"
(Note: the AuthParam token in Oracle's download URL is time-limited, so this exact link will have expired; fetch a fresh one from Oracle's download page.)
[root@localhost ~]# tar xf jdk-8u161-linux-x64.tar.gz -C /usr/local
[root@localhost ~]# cd /usr/local
[root@localhost local]# ln -sv jdk1.8.0_161 jdk
2. Configure environment variables
[root@localhost local]# vi /etc/profile
Add the following to /etc/profile:
JAVA_HOME=/usr/local/jdk
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME PATH
After saving:
[root@localhost local]# source /etc/profile    # reload the profile so the change takes effect
Verify that the JDK was installed successfully:
[root@localhost local]# java -version
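If the installation succeeded, the output should look roughly like the following (build details may vary):
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)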
II. Download and Install Hadoop
1. Download Hadoop
[root@localhost ~]# wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.7.5/hadoop-2.7.5.tar.gz
[root@localhost ~]# mkdir /usr/local/hadoop
[root@localhost ~]# tar xf hadoop-2.7.5.tar.gz -C /usr/local/hadoop
[root@localhost ~]# mkdir /usr/local/hadoop/tmp
[root@localhost ~]# mkdir /usr/local/hadoop/hdfs
[root@localhost ~]# mkdir /usr/local/hadoop/hdfs/data
[root@localhost ~]# mkdir /usr/local/hadoop/hdfs/name
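Equivalently, the last four mkdir commands can be collapsed into one with -p and brace expansion:
[root@localhost ~]# mkdir -p /usr/local/hadoop/{tmp,hdfs/name,hdfs/data}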
2. Configure environment variables
[root@localhost ~]# vi /etc/profile
Add the following to /etc/profile:
# set hadoop path
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.5
export PATH=$PATH:$HADOOP_HOME/bin
After saving:
[root@localhost local]# source /etc/profile    # reload the profile so the change takes effect
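To confirm that the hadoop command is now on the PATH, print its version; the first line of output should read "Hadoop 2.7.5":
[root@localhost local]# hadoop version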
3. Configure Hadoop
Go to the /usr/local/hadoop/hadoop-2.7.5/etc/hadoop/ directory and edit the following six configuration files:
hadoop-env.sh yarn-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml
(1) Configure hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
(2) Configure yarn-env.sh
export JAVA_HOME=/usr/local/jdk
(3) Configure core-site.xml
Add the following (fs.defaultFS is the current name of the deprecated fs.default.name key):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>HDFS URI, in the form filesystem://namenode-host:port</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
    <description>Local Hadoop temporary directory on the namenode</description>
  </property>
</configuration>
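As a quick sanity check that the file is being picked up (hdfs is on the PATH via $HADOOP_HOME/bin from step 2), you can query the value back; it should print hdfs://localhost:9000:
[root@localhost hadoop]# hdfs getconf -confKey fs.defaultFS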
(4) Configure hdfs-site.xml
Add the following (dfs.namenode.name.dir and dfs.datanode.data.dir replace the deprecated dfs.name.dir and dfs.data.dir keys):
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/hdfs/name</value>
    <description>Where the namenode stores the HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/hdfs/data</value>
    <description>Physical location of data blocks on the datanode</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Replica count; the default is 3, and it should not exceed the number of datanodes</description>
  </property>
</configuration>
(5) Configure mapred-site.xml
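The Hadoop 2.7.5 distribution ships only mapred-site.xml.template in this directory, so create the file from the template first:
[root@localhost hadoop]# cp mapred-site.xml.template mapred-site.xml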
Add the following:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
(6) Configure yarn-site.xml
Add the following:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>localhost:8099</value>
  </property>
</configuration>
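One prerequisite before starting: start-dfs.sh and start-yarn.sh launch their daemons over SSH, even on a single node. If they prompt for a password, set up passwordless SSH to localhost first (a minimal sketch):
[root@localhost ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[root@localhost ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@localhost ~]# chmod 600 ~/.ssh/authorized_keys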
4. Start Hadoop (working directory: /usr/local/hadoop/)
1) Format the namenode
[root@localhost hadoop]# source hadoop-2.7.5/etc/hadoop/hadoop-env.sh
[root@localhost hadoop]# hdfs namenode -format
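A successful format ends with a log line similar to:
INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.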
2) Start the NameNode and DataNode daemons
[root@localhost hadoop]# ./hadoop-2.7.5/sbin/start-dfs.sh
3) Start the ResourceManager and NodeManager daemons
[root@localhost hadoop]# ./hadoop-2.7.5/sbin/start-yarn.sh
5. Verify the Startup
1) Run jps. Hadoop has started correctly when all the daemons launched above are present: NameNode, DataNode, and SecondaryNameNode from start-dfs.sh, plus ResourceManager and NodeManager from start-yarn.sh. Sample (abridged) output:
[root@localhost hadoop]# jps
20167 Jps
19387 NodeManager
2414 Bootstrap
17903 ResourceManager
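2) As a further smoke test (the /test path is just an arbitrary example), create a directory in HDFS and list the filesystem root:
[root@localhost hadoop]# hadoop fs -mkdir /test
[root@localhost hadoop]# hadoop fs -ls /
You can also browse the NameNode web UI on port 50070 (the Hadoop 2.x default) and the ResourceManager web UI on port 8099 as configured in yarn-site.xml; note that the latter was bound to localhost above, so either browse from the server itself or change that address.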