001 Hadoop Distributed Cluster Setup

This document walks through installing and configuring a Hadoop 2.7.4 cluster on CentOS 7: base environment setup (hosts, firewall, SSH keys), JDK installation, and downloading, extracting, configuring, and starting Hadoop. Configuration involves editing hadoop-env.sh, hdfs-site.xml, yarn-site.xml, core-site.xml, and the slaves file. Finally, the jps command is used to verify that the Hadoop processes are running, and the Hadoop web UIs are opened in a browser.

Installation Notes

OS: **CentOS 7** 
Hadoop: **hadoop-2.7.4**
Tooling: **Xshell** (can broadcast commands to multiple machines at once)

Base Environment Configuration

  1. All nodes: edit hosts

    [root@hadoop-01 ~]# vi /etc/hosts
    
    # append
    192.168.74.139 hadoop-01
    192.168.74.140 hadoop-02
    192.168.74.141 hadoop-03
    
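The three mappings above can also be scripted so the edit is safe to re-run. This is a sketch of my own (the `HOSTS_FILE` variable and the temp-file dry run are not part of the original steps; on the real nodes set `HOSTS_FILE=/etc/hosts`):

```shell
#!/bin/sh
# Sketch: append the cluster mappings to a hosts file, skipping entries
# that already exist, so re-running the script never duplicates lines.
# Demonstrated against a temp file; use HOSTS_FILE=/etc/hosts on real nodes.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"
for entry in \
    "192.168.74.139 hadoop-01" \
    "192.168.74.140 hadoop-02" \
    "192.168.74.141 hadoop-03"
do
    # grep -qF: quiet, fixed-string match -- append only when missing
    grep -qF "$entry" "$HOSTS_FILE" || printf '%s\n' "$entry" >> "$HOSTS_FILE"
done
cat "$HOSTS_FILE"
```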
  2. All nodes: firewall configuration

    [root@hadoop-01 ~]# systemctl status firewalld.service
    
    
    ● firewalld.service - firewalld - dynamic firewall daemon
       Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
       Active: active (running) since Wed 2021-07-21 19:44:31 +08; 19min ago
         Docs: man:firewalld(1)
     Main PID: 717 (firewalld)
       CGroup: /system.slice/firewalld.service
               └─717 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid
    
    Jul 21 19:44:29 hadoop-01 systemd[1]: Starting firewalld - dynamic firewall daemon...
    Jul 21 19:44:31 hadoop-01 systemd[1]: Started firewalld - dynamic firewall daemon.
    Jul 21 19:44:31 hadoop-01 firewalld[717]: WARNING: AllowZoneDrifting is enabled. This is considered an insecure configuration option. It will be removed in a future release. Please consider disabling it now.
    
    [root@hadoop-01 ~]# systemctl stop firewalld.service
    [root@hadoop-01 ~]# systemctl disable firewalld.service
    
    Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
    Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service. 
    

    Note: on a local VM you can simply disable the firewall; on Alibaba Cloud, configure firewall/security-group rules instead.

  3. All nodes: generate SSH keys

    [root@hadoop-01 ~]# cd
    [root@hadoop-01 ~]# ssh-keygen -t dsa
    
    Generating public/private dsa key pair.
    Enter file in which to save the key (/root/.ssh/id_dsa): 
    Created directory '/root/.ssh'.
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
    Your identification has been saved in /root/.ssh/id_dsa.
    Your public key has been saved in /root/.ssh/id_dsa.pub.
    The key fingerprint is:
    SHA256:C4IRep0I9fGDxkMFgcDS/450hjrSsA6xp4rDu23oZLw root@hadoop-01
    The key's randomart image is:
    +---[DSA 1024]----+
    |++o.=+.          |
    |.+oB =           |
    |o +.O o          |
    | . +.. .         |
    |. . .o. S        |
    |oo  o.+. .       |
    |+Boo =  .        |
    |O=*.. .          |
    |BE+o             |
    +----[SHA256]-----+
    
    [root@hadoop-01 ~]# cd /root/.ssh/
    [root@hadoop-01 .ssh]# cat id_dsa.pub >> authorized_keys
    

    Note: if a key pair already exists, delete it first.

  4. Distribute keys

    # run on hadoop-01
    [root@hadoop-01 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
    
    # run on hadoop-02
    [root@hadoop-02 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
    
    # run on hadoop-03
    [root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-01:/root/.ssh/
    [root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-02:/root/.ssh/
    
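An alternative to the per-host copy dance above is to run `ssh-copy-id` against every node from every node. The sketch below is a dry run of my own (it only prints the commands and hard-codes the node list; drop the `echo`-style collection and run the commands for real on each node):

```shell
#!/bin/sh
# Dry-run sketch: build one ssh-copy-id command per node.
# NODES is an assumption matching the hosts file above.
NODES="hadoop-01 hadoop-02 hadoop-03"
CMDS=""
for node in $NODES; do
    # Collect the command instead of executing it (dry run)
    CMDS="${CMDS}ssh-copy-id -i /root/.ssh/id_dsa.pub ${node}
"
done
printf '%s' "$CMDS"
```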
  5. Test SSH login

    [root@hadoop-01 ~]# ssh hadoop-03
    Last login: Wed Jul 21 23:13:22 2021 from 192.168.74.1
    [root@hadoop-03 ~]#
    

JDK Configuration

  1. All nodes: check for a preinstalled Java

    [root@hadoop-01 ~]#  rpm -qa |grep java
    [root@hadoop-01 ~]# 
    

    Note: this setup uses CentOS 7 Minimal, so no Java packages are preinstalled. If any are present, remove them with 'rpm -e --nodeps <package>'.

  2. All nodes: create a /software directory for software installs, then copy the JDK archive into /software on each node

    [root@hadoop-01 ~]# mkdir /software
    [root@hadoop-01 ~]# cd /software/
    [root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop-02:/software	
    [root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop-03:/software
    [root@hadoop-01 software]# 
    
  3. All nodes: extract and rename

    [root@hadoop-01 software]# tar -zxvf jdk-8u181-linux-x64.tar.gz
    [root@hadoop-01 software]# mv jdk1.8.0_181/ jdk
    
  4. All nodes: update environment variables

    # back up the file
    [root@hadoop-01 software]# cp /etc/profile /etc/profile_back  
    [root@hadoop-01 ~]# vi /etc/profile
    
    # append
    export JAVA_HOME=/software/jdk
    export PATH=.:$PATH:$JAVA_HOME/bin
    
    [root@hadoop-01 ~]# source /etc/profile
    [root@hadoop-01 ~]# java -version
    
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
    
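Before sourcing /etc/profile it can help to check that JAVA_HOME actually points at a JDK. A minimal sketch, demonstrated against a mock directory of my own making (on the real nodes run the same check against /software/jdk):

```shell
#!/bin/sh
# Build a mock JDK layout, then run the same sanity check you would
# run against /software/jdk on a real node.
JAVA_HOME="$(mktemp -d)"
mkdir -p "$JAVA_HOME/bin"
printf '#!/bin/sh\n' > "$JAVA_HOME/bin/java"
chmod +x "$JAVA_HOME/bin/java"

# The actual check: JAVA_HOME must contain an executable bin/java
if [ -x "$JAVA_HOME/bin/java" ]; then
    RESULT="ok"
else
    RESULT="missing"
fi
echo "JAVA_HOME check: $RESULT"
```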

Hadoop Installation

Note: install on one machine first, then copy the directory to the other machines.

  1. hadoop-01: extract and rename Hadoop

    [root@hadoop-01 software]# tar -xzvf hadoop-2.7.4.tar.gz
    [root@hadoop-01 software]# mv hadoop-2.7.4 hadoop
    
  2. All nodes: configure PATH

    [root@hadoop-01 ~]# vi /etc/profile
    
    export HADOOP_HOME=/software/hadoop
    export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    [root@hadoop-01 ~]# source /etc/profile
    
  3. hadoop-01: configure hadoop-env.sh

    [root@hadoop-01 hadoop]# cd /software/hadoop/etc/hadoop/
    [root@hadoop-01 hadoop]# vi hadoop-env.sh
    
    # append
    export JAVA_HOME=/software/jdk
    export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
    export HADOOP_PID_DIR=/software/hadoop/pids
    
  4. hadoop-01: configure hdfs-site.xml

    [root@hadoop-01 hadoop]# vi  hdfs-site.xml
    

    Add inside <configuration></configuration>:

    <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:///software/hadoop/data/datanode</value>
    </property>
    <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:///software/hadoop/data/namenode</value>
    </property>
    <property>
     <name>dfs.namenode.http-address</name>
     <value>hadoop-01:50070</value>
    </property>
    <property>
     <name>dfs.namenode.secondary.http-address</name>
     <value>hadoop-02:50090</value>
    </property>
    <property>
     <name>dfs.replication</name>
     <value>1</value>
    </property>
    
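Hand-editing the XML is error-prone; the `<property>` blocks can also be generated. A sketch of my own (the `hadoop_prop` helper is my naming, not a Hadoop tool), shown for two of the properties above:

```shell
#!/bin/sh
# Emit Hadoop <property> blocks from name/value pairs, so config
# fragments are generated rather than hand-typed.
hadoop_prop() {
    printf '<property>\n <name>%s</name>\n <value>%s</value>\n</property>\n' "$1" "$2"
}

OUT="$(mktemp)"
{
    hadoop_prop dfs.replication 1
    hadoop_prop dfs.namenode.http-address hadoop-01:50070
} > "$OUT"
cat "$OUT"
```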
  5. hadoop-01: configure yarn-site.xml

    [root@hadoop-01 hadoop]# vi yarn-site.xml
    

    Add inside <configuration></configuration>:

    <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
    </property>
    <property>
     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
     <name>yarn.resourcemanager.resource-tracker.address</name>
     <value>hadoop-01:8025</value>
    </property>
    <property>
     <name>yarn.resourcemanager.scheduler.address</name>
     <value>hadoop-01:8030</value>
    </property>
    <property>
     <name>yarn.resourcemanager.address</name>
     <value>hadoop-01:8050</value>
    </property>
    
  6. hadoop-01: configure core-site.xml

    [root@hadoop-01 hadoop]# vi core-site.xml
    

    Add inside <configuration></configuration>:

    <property>
     <name>fs.defaultFS</name>
     <value>hdfs://hadoop-01/</value>
    </property>
    <property>
     <name>ha.zookeeper.quorum</name>
     <value>hadoop-01:2181,hadoop-02:2181,hadoop-03:2181</value>
    </property>
    
  7. hadoop-01: configure slaves

    # the slaves file lists the nodes that run the DataNode process
    
    [root@hadoop-01 hadoop]# vi slaves
    
    # replace the contents with
    hadoop-02
    hadoop-03
    
  8. hadoop-01: configure yarn-env.sh

    [root@hadoop-01 hadoop]# vi yarn-env.sh
    
    # append
    export YARN_PID_DIR=/software/hadoop/pids
    
  9. hadoop-01: copy Hadoop to the other nodes

    [root@hadoop-01 software]# scp -r /software/hadoop hadoop-02:/software/
    [root@hadoop-01 software]# scp -r /software/hadoop hadoop-03:/software/
    
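The two scp commands can be driven from the slaves file, so the worker list lives in one place. A dry-run sketch of my own (it prints the commands instead of running them, and a temp file stands in for etc/hadoop/slaves):

```shell
#!/bin/sh
# Dry run: build one scp command per worker listed in a slaves-style file.
SLAVES="$(mktemp)"
printf 'hadoop-02\nhadoop-03\n' > "$SLAVES"   # stand-in for etc/hadoop/slaves

CMDS=""
while read -r node; do
    [ -n "$node" ] || continue                # skip blank lines
    CMDS="${CMDS}scp -r /software/hadoop ${node}:/software/
"
done < "$SLAVES"
printf '%s' "$CMDS"
```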
  10. All nodes: reload the environment

    [root@hadoop-01 hadoop]# source /etc/profile
    

Starting/Stopping the Hadoop Cluster

  1. hadoop-01: format the filesystem

    [root@hadoop-01 ~]# hdfs namenode -format
    

    If the output contains "successfully", the format succeeded.

  2. hadoop-01: start-dfs.sh errors on first start

    [root@hadoop-01 sbin]# cd /software/hadoop/sbin/
    [root@hadoop-01 sbin]# start-dfs.sh
    
    Starting namenodes on [hadoop-01]
    The authenticity of host 'hadoop-01 (192.168.74.139)' can't be established.
    ECDSA key fingerprint is SHA256:GYHhhWIAfdzHvTH54Vq36wY0IBckonbF6oPFb4k0ALc.
    ECDSA key fingerprint is MD5:9d:0f:94:a1:f8:98:ab:1c:c9:54:0f:87:88:91:57:ec.
    Are you sure you want to continue connecting (yes/no)? yes
    hadoop-01: Warning: Permanently added 'hadoop-01,192.168.74.139' (ECDSA) to the list of known hosts.
    hadoop-01: Error: JAVA_HOME is not set and could not be found.
    hadoop-03: Error: JAVA_HOME is not set and could not be found.
    hadoop-02: Error: JAVA_HOME is not set and could not be found.
    Starting secondary namenodes [hadoop-02]
    hadoop-02: Error: JAVA_HOME is not set and could not be found.
    
    

    Fix: on every node, set JAVA_HOME in /software/hadoop/etc/hadoop/hadoop-env.sh

    [root@hadoop-01 sbin]# vi /software/hadoop/etc/hadoop/hadoop-env.sh
    
    # append
    export JAVA_HOME=/software/jdk
    
    # restart
    [root@hadoop-01 sbin]# start-dfs.sh
    
    Starting namenodes on [hadoop-01]
    hadoop-01: starting namenode, logging to /software/hadoop/logs/hadoop-root-namenode-hadoop-01.out
    hadoop-03: starting datanode, logging to /software/hadoop/logs/hadoop-root-datanode-hadoop-03.out
    hadoop-02: datanode running as process 2078. Stop it first.
    Starting secondary namenodes [hadoop-02]
    hadoop-02: starting secondarynamenode, logging to /software/hadoop/logs/hadoop-root-secondarynamenode-hadoop-02.out
    
  3. hadoop-01: run start-yarn.sh

    [root@hadoop-01 sbin]#  start-yarn.sh
    
    starting yarn daemons
    starting resourcemanager, logging to /software/hadoop/logs/yarn-root-resourcemanager-hadoop-01.out
    hadoop-03: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-03.out
    hadoop-02: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-02.out
    
  4. Processes on each node

    [root@hadoop-01 ~]# jps
    
    1889 ResourceManager
    1611 NameNode
    2142 Jps
    
    [root@hadoop-02 ~]# jps
    
    1712 NodeManager
    1534 DataNode
    1631 SecondaryNameNode
    1839 Jps
    
    [root@hadoop-03 ~]# jps
    
    1640 NodeManager
    1531 DataNode
    1772 Jps
    
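The expected process layout can be encoded as a small lookup, which is handy when checking jps output across nodes. A sketch (`expected_daemons` is my naming; the daemon lists come from the jps output above):

```shell
#!/bin/sh
# Map a hostname to the daemons jps should show on it, for this layout:
# hadoop-01 runs NameNode + ResourceManager, hadoop-02 additionally
# hosts the SecondaryNameNode, and workers run DataNode + NodeManager.
expected_daemons() {
    case "$1" in
        hadoop-01) echo "NameNode ResourceManager" ;;
        hadoop-02) echo "DataNode SecondaryNameNode NodeManager" ;;
        hadoop-03) echo "DataNode NodeManager" ;;
        *)         echo "unknown host" ;;
    esac
}

expected_daemons hadoop-02
```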
  5. Browse to http://192.168.74.139:50070 (NameNode web UI)


  6. Browse to http://192.168.74.139:8088 (ResourceManager web UI)

Cluster installation complete!
