Installation Notes
OS: **CentOS 7**
Hadoop: **hadoop-2.7.4**
Tooling: **Xshell** (can send commands to multiple machines at once)
Basic Environment Setup
-
All nodes: edit hosts
[root@hadoop-01 ~]# vi /etc/hosts
# Add:
192.168.74.139 hadoop-01
192.168.74.140 hadoop-02
192.168.74.141 hadoop-03
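The entries above can be staged and checked with a small sketch before touching the live file; the `/tmp` path here is a scratch copy, not `/etc/hosts` itself:

```shell
# Sketch: stage the three entries in a scratch file, then verify each
# hostname appears at the end of a line before editing the real /etc/hosts.
HOSTS_FILE=/tmp/hosts.staged   # scratch path, not the live file
cat > "$HOSTS_FILE" <<'EOF'
192.168.74.139 hadoop-01
192.168.74.140 hadoop-02
192.168.74.141 hadoop-03
EOF
for h in hadoop-01 hadoop-02 hadoop-03; do
  grep -q " $h$" "$HOSTS_FILE" && echo "$h ok"
done
```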
-
All nodes: firewall configuration
[root@hadoop-01 ~]# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-07-21 19:44:31 +08; 19min ago
     Docs: man:firewalld(1)
 Main PID: 717 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─717 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid

Jul 21 19:44:29 hadoop-01 systemd[1]: Starting firewalld - dynamic firewall daemon...
Jul 21 19:44:31 hadoop-01 systemd[1]: Started firewalld - dynamic firewall daemon.
Jul 21 19:44:31 hadoop-01 firewalld[717]: WARNING: AllowZoneDrifting is enabled. This is considered an insecure configuration option. It will be removed in a future release. Please consider disabling it now.
[root@hadoop-01 ~]# systemctl stop firewalld.service
[root@hadoop-01 ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Note: on a virtual machine you can simply disable the firewall; on Alibaba Cloud, configure firewall (security group) rules instead.
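Since the same two commands run on every node, a helper script can be generated and reviewed first (hostnames taken from the hosts table above; the sketch writes the script to a file rather than executing anything remotely):

```shell
# Sketch: generate (not run) the firewall-disable commands for every node.
OUT=/tmp/disable_firewalld.sh
{
  echo '#!/bin/sh'
  for node in hadoop-01 hadoop-02 hadoop-03; do
    echo "ssh root@$node 'systemctl stop firewalld.service && systemctl disable firewalld.service'"
  done
} > "$OUT"
chmod +x "$OUT"
cat "$OUT"   # review before running
```

Running the generated script requires the passwordless SSH set up in the next steps.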
-
All nodes: generate SSH keys
[root@hadoop-01 ~]# cd
[root@hadoop-01 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:C4IRep0I9fGDxkMFgcDS/450hjrSsA6xp4rDu23oZLw root@hadoop-01
The key's randomart image is:
+---[DSA 1024]----+
|++o.=+.          |
|.+oB =           |
|o +.O o          |
| . +.. .         |
|. . .o. S        |
|oo o.+. .        |
|+Boo = .         |
|O=*.. .          |
|BE+o             |
+----[SHA256]-----+
[root@hadoop-01 ~]# cd /root/.ssh/
[root@hadoop-01 .ssh]# cat id_dsa.pub >> authorized_keys
Note: if keys already exist, delete them first.
-
Distribute keys
# Run on hadoop-01
[root@hadoop-01 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
# Run on hadoop-02
[root@hadoop-02 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
# Run on hadoop-03
[root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-01:/root/.ssh/
[root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-02:/root/.ssh/
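The steps above follow a fan-in/fan-out pattern: every node's key is collected on hadoop-03, whose merged `authorized_keys` is then pushed back out, so each node ends up trusting all three keys. A sketch that generates the command list (written to a file for review, nothing is executed remotely):

```shell
# Sketch: generate the fan-in/fan-out key-exchange command list.
# hadoop-03 acts as the collector node, as in the steps above.
COLLECTOR=hadoop-03
OUT=/tmp/key_exchange.txt
{
  for node in hadoop-01 hadoop-02; do
    echo "on $node: ssh-copy-id -i /root/.ssh/id_dsa.pub $COLLECTOR"
  done
  for node in hadoop-01 hadoop-02; do
    echo "on $COLLECTOR: scp /root/.ssh/authorized_keys $node:/root/.ssh/"
  done
} > "$OUT"
cat "$OUT"
```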
-
Test SSH login
[root@hadoop-01 ~]# ssh hadoop-03
Last login: Wed Jul 21 23:13:22 2021 from 192.168.74.1
[root@hadoop-03 ~]#
JDK Setup
-
All nodes: check for a preinstalled system Java
[root@hadoop-01 ~]# rpm -qa | grep java
[root@hadoop-01 ~]#
Note: this setup uses CentOS 7 Minimal, so no Java packages are present. If any are found, remove them with `rpm -e --nodeps <package-name>`.
-
All nodes: create a /software directory at the root for installing software, then copy the JDK into it
[root@hadoop-01 ~]# mkdir /software
[root@hadoop-01 ~]# cd /software/
[root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop-02:/software
[root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop-03:/software
[root@hadoop-01 software]#
-
All nodes: extract the archive and rename it
[root@hadoop-01 software]# tar -zxvf jdk-8u181-linux-x64.tar.gz
[root@hadoop-01 software]# mv jdk1.8.0_181/ jdk
-
All nodes: update the environment configuration
# Back up the file
[root@hadoop-01 software]# cp /etc/profile /etc/profile_back
[root@hadoop-01 ~]# vi /etc/profile
# Add:
export JAVA_HOME=/software/jdk
export PATH=.:$PATH:$JAVA_HOME/bin
[root@hadoop-01 ~]# source /etc/profile
[root@hadoop-01 ~]# java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Hadoop Installation
Note: install on one machine first, then copy to the other nodes.
-
hadoop-01: extract and rename Hadoop
[root@hadoop-01 software]# tar -xzvf hadoop-2.7.4.tar.gz
[root@hadoop-01 software]# mv hadoop-2.7.4 hadoop
-
All nodes: configure the PATH
[root@hadoop-01 ~]# vi /etc/profile
# Add:
export HADOOP_HOME=/software/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[root@hadoop-01 ~]# source /etc/profile
-
hadoop-01: configure hadoop-env.sh
[root@hadoop-01 hadoop]# cd /software/hadoop/etc/hadoop/
[root@hadoop-01 hadoop]# vi hadoop-env.sh
# Add:
export JAVA_HOME=/software/jdk
export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
export HADOOP_PID_DIR=/software/hadoop/pids
-
hadoop-01: configure hdfs-site.xml
[root@hadoop-01 hadoop]# vi hdfs-site.xml
Add inside `<configuration></configuration>`:
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///software/hadoop/data/datanode</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///software/hadoop/data/namenode</value>
</property>
<property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop-01:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-02:50090</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
-
hadoop-01: configure yarn-site.xml
[root@hadoop-01 hadoop]# vi yarn-site.xml
Add inside `<configuration></configuration>`:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-01:8025</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-01:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-01:8050</value>
</property>
-
hadoop-01: configure core-site.xml
[root@hadoop-01 hadoop]# vi core-site.xml
Add inside `<configuration></configuration>`:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-01/</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-01:2181,hadoop-02:2181,hadoop-03:2181</value>
</property>
-
hadoop-01: configure slaves
# The slaves file lists the nodes that run the DataNode process
[root@hadoop-01 hadoop]# vi slaves
# Replace the contents with:
hadoop-02
hadoop-03
-
hadoop-01: configure yarn-env.sh
[root@hadoop-01 hadoop]# vi yarn-env.sh
# Add:
export YARN_PID_DIR=/software/hadoop/pids
-
hadoop-01: copy Hadoop to the other nodes
[root@hadoop-01 software]# scp -r /software/hadoop hadoop-02:/software/
[root@hadoop-01 software]# scp -r /software/hadoop hadoop-03:/software/
-
All nodes: reload the environment
[root@hadoop-01 hadoop]# source /etc/profile
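After reloading, the environment on every node can be spot-checked. A sketch that generates per-node verification commands (written to a file for review; running it relies on the passwordless SSH set up earlier):

```shell
# Sketch: generate per-node commands confirming java and hadoop are on PATH
# once /etc/profile has been sourced on that node.
OUT=/tmp/check_env.sh
{
  echo '#!/bin/sh'
  for node in hadoop-01 hadoop-02 hadoop-03; do
    echo "ssh root@$node 'source /etc/profile; java -version; hadoop version'"
  done
} > "$OUT"
chmod +x "$OUT"
cat "$OUT"
```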
Starting/Stopping the Hadoop Cluster
-
hadoop-01: format the filesystem
[root@hadoop-01 ~]# hdfs namenode -format
If the output contains "successfully", the initialization succeeded.
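This check can also be scripted by grepping the format output for the success marker. The log line below is only illustrative of what 2.7.x typically prints, not captured from this run:

```shell
# Sketch: detect the success marker non-interactively. LOG stands in for the
# real output of `hdfs namenode -format 2>&1`; the sample text is assumed.
LOG='INFO common.Storage: Storage directory /software/hadoop/data/namenode has been successfully formatted.'
echo "$LOG" | grep -q 'successfully' && echo 'format OK'
```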
-
hadoop-01: start-dfs.sh startup error
[root@hadoop-01 sbin]# cd /software/hadoop/sbin/
[root@hadoop-01 sbin]# start-dfs.sh
Starting namenodes on [hadoop-01]
The authenticity of host 'hadoop-01 (192.168.74.139)' can't be established.
ECDSA key fingerprint is SHA256:GYHhhWIAfdzHvTH54Vq36wY0IBckonbF6oPFb4k0ALc.
ECDSA key fingerprint is MD5:9d:0f:94:a1:f8:98:ab:1c:c9:54:0f:87:88:91:57:ec.
Are you sure you want to continue connecting (yes/no)? yes
hadoop-01: Warning: Permanently added 'hadoop-01,192.168.74.139' (ECDSA) to the list of known hosts.
hadoop-01: Error: JAVA_HOME is not set and could not be found.
hadoop-03: Error: JAVA_HOME is not set and could not be found.
hadoop-02: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [hadoop-02]
hadoop-02: Error: JAVA_HOME is not set and could not be found.
Fix: on all nodes, set JAVA_HOME in /software/hadoop/etc/hadoop/hadoop-env.sh
[root@hadoop-01 sbin]# vi /software/hadoop/etc/hadoop/hadoop-env.sh
# Add:
export JAVA_HOME=/software/jdk
# Restart
[root@hadoop-01 sbin]# start-dfs.sh
Starting namenodes on [hadoop-01]
hadoop-01: starting namenode, logging to /software/hadoop/logs/hadoop-root-namenode-hadoop-01.out
hadoop-03: starting datanode, logging to /software/hadoop/logs/hadoop-root-datanode-hadoop-03.out
hadoop-02: datanode running as process 2078. Stop it first.
Starting secondary namenodes [hadoop-02]
hadoop-02: starting secondarynamenode, logging to /software/hadoop/logs/hadoop-root-secondarynamenode-hadoop-02.out
-
hadoop-01: start with start-yarn.sh
[root@hadoop-01 sbin]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /software/hadoop/logs/yarn-root-resourcemanager-hadoop-01.out
hadoop-03: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-03.out
hadoop-02: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-02.out
-
Processes on each node
[root@hadoop-01 ~]# jps
1889 ResourceManager
1611 NameNode
2142 Jps

[root@hadoop-02 ~]# jps
1712 NodeManager
1534 DataNode
1631 SecondaryNameNode
1839 Jps

[root@hadoop-03 ~]# jps
1640 NodeManager
1531 DataNode
1772 Jps
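The expected daemon layout per node follows from the configuration above (NameNode and ResourceManager on hadoop-01, SecondaryNameNode on hadoop-02, DataNode and NodeManager on the slaves). A tiny sketch that prints this as a reference to compare jps output against:

```shell
# Sketch: expected daemon layout per node, derived from the configs above.
expected() { printf '%s: %s\n' "$1" "$2"; }
expected hadoop-01 'NameNode, ResourceManager'
expected hadoop-02 'DataNode, SecondaryNameNode, NodeManager'
expected hadoop-03 'DataNode, NodeManager'
```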
-
Open in a browser: http://192.168.74.139:50070 (NameNode web UI)
-
Open in a browser: http://192.168.74.139:8088 (ResourceManager web UI)
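Both UIs can also be checked headlessly with curl. The sketch below generates a check script (IP assumed from the hosts table; `-s` silences progress, `-o /dev/null` discards the body, `-w` prints only the HTTP status, so a 200 from each URL means the UI is up):

```shell
# Sketch: generate a headless web-UI check script; review, then run it from
# any machine that can reach the cluster.
OUT=/tmp/check_web_ui.sh
cat > "$OUT" <<'EOF'
#!/bin/sh
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.74.139:50070
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.74.139:8088
EOF
chmod +x "$OUT"
cat "$OUT"
```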