Hadoop High-Availability Cluster Setup Experiment

This article walks through building a Hadoop high-availability cluster with Docker, covering node planning, Docker image design, and starting and stopping the cluster.





Node Planning

Role              h1    h2    h3    h4    h5
namenode          yes   yes   no    no    no
datanode          no    no    yes   yes   yes
resourcemanager   yes   yes   no    no    no
journalnode       yes   yes   yes   yes   yes
zookeeper         yes   yes   yes   yes   yes

Docker Image Design

To make the images reusable, the layering is designed as follows:
    1. base image
        FROM centos:6
        Base environment: time zone, passwordless SSH login
    2. java image
        FROM base
        Java runtime: JDK installation, Java environment variables
    3. zoo image
        FROM java
        ZooKeeper cluster: ZooKeeper installation, environment variables, basic configuration
    4. hadoop image
        FROM zoo
        Hadoop cluster: Hadoop installation, environment variables, basic configuration

Since each image builds FROM the previous one, they must be built in this order.

Steps

Base Dockerfile

FROM centos:6
MAINTAINER Charles Yu
# Set the time zone
RUN cat /usr/share/zoneinfo/Asia/Shanghai > /etc/localtime
# Install the SSH server and client
RUN yum install -y openssh-server openssh-clients && yum clean all
# Configure the SSH server
RUN sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
RUN ssh-keygen -t dsa -N "" -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -t rsa -N "" -f /etc/ssh/ssh_host_rsa_key
RUN echo "root:root" | chpasswd
# Set up passwordless SSH login
COPY ssh_config /root/.ssh/config
COPY id_rsa /root/.ssh/id_rsa
COPY id_rsa.pub /root/.ssh/authorized_keys
RUN chmod 600 /root/.ssh/*
RUN chown root:root /root/.ssh/config
# Run sshd at container start
RUN mkdir /var/run/sshd
EXPOSE 22
CMD ["/usr/sbin/sshd","-D"]
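The ssh_config and the id_rsa/id_rsa.pub key pair copied above are prepared on the host beforehand (e.g. with ssh-keygen -t rsa -N "" -f id_rsa). The original does not show the ssh_config file; to make inter-container logins prompt-free it presumably looks roughly like this (an assumption, not the author's exact file):

```
Host *
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
```

Disabling host-key checking is acceptable here only because the containers live on a private Docker network.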

java Dockerfile

FROM base
ADD jdk-7u80-linux-x64.tar.gz /usr/local
RUN ln -s /usr/local/jdk1.7.0_80 /usr/local/jdk
ENV JAVA_HOME /usr/local/jdk
ENV CLASSPATH .:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
ENV PATH $JAVA_HOME/bin:$PATH
RUN echo "export JAVA_HOME=$JAVA_HOME" >> /etc/profile
RUN echo "export CLASSPATH=$CLASSPATH" >> /etc/profile
RUN echo "export PATH=$PATH" >> /etc/profile

zookeeper Dockerfile

FROM java

ADD zookeeper-3.4.6.tar.gz /usr/local
RUN ln -s /usr/local/zookeeper-3.4.6 /usr/local/zookeeper
COPY zoo.cfg /usr/local/zookeeper/conf/
COPY tools/* /usr/local/tools/
RUN chmod a+x /usr/local/tools/*
RUN mkdir -p /root/data/zookeeper/zkdata
RUN mkdir -p /root/data/zookeeper/zkdatalog
ENV ZOOKEEPER_HOME /usr/local/zookeeper
ENV PATH /usr/local/tools:$ZOOKEEPER_HOME/bin:$PATH
RUN echo "export ZOOKEEPER_HOME=$ZOOKEEPER_HOME" >> /etc/profile
RUN echo "export PATH=$PATH" >> /etc/profile
EXPOSE 2181 2888 3888
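The zoo.cfg copied above is not listed in the original. A minimal version consistent with the data directories created here, the exposed ports, and the five myid values assigned during first startup would look roughly like this (a sketch, not the author's exact file):

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/root/data/zookeeper/zkdata
dataLogDir=/root/data/zookeeper/zkdatalog
clientPort=2181
server.1=h1:2888:3888
server.2=h2:2888:3888
server.3=h3:2888:3888
server.4=h4:2888:3888
server.5=h5:2888:3888
```

Each server.N line must match the myid file written on that host; 2888 is the quorum port and 3888 the leader-election port, matching the EXPOSE line above.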

hadoop Dockerfile

FROM zoo
ADD hadoop-2.6.0.tar.gz /usr/local/
RUN ln -s /usr/local/hadoop-2.6.0 /usr/local/hadoop
ENV HADOOP_HOME /usr/local/hadoop
COPY core-site.xml $HADOOP_HOME/etc/hadoop/
COPY hdfs-site.xml $HADOOP_HOME/etc/hadoop/
COPY hadoop-env.sh $HADOOP_HOME/etc/hadoop/
COPY slaves $HADOOP_HOME/etc/hadoop/
COPY mapred-site.xml $HADOOP_HOME/etc/hadoop/
COPY yarn-site.xml $HADOOP_HOME/etc/hadoop/
RUN mkdir /root/data/journaldata

ENV PATH $HADOOP_HOME/bin:$PATH
RUN echo "export HADOOP_HOME=$HADOOP_HOME" >> /etc/profile
RUN echo "export PATH=$PATH" >> /etc/profile

EXPOSE 9000 50070 50020
EXPOSE 8485 2181 8032 8034 8088
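The HA behavior comes from the copied configuration files, which the original does not list. An illustrative hdfs-site.xml fragment consistent with this layout (the nameservice name mycluster and the nn1/nn2 IDs are assumptions):

```xml
<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>h1:9000</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>h2:9000</value></property>
  <!-- edit log shared via the journalnodes on all five hosts (port 8485) -->
  <property><name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://h1:8485;h2:8485;h3:8485;h4:8485;h5:8485/mycluster</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/root/data/journaldata</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <!-- fencing reuses the passwordless SSH key baked into the base image -->
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/root/.ssh/id_rsa</value></property>
</configuration>
```

core-site.xml would pair this with fs.defaultFS=hdfs://mycluster and ha.zookeeper.quorum=h1:2181,h2:2181,h3:2181,h4:2181,h5:2181 so the ZKFCs can coordinate failover.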

Starting the Containers

Start the containers as usual; note that each container's hostname must match the hostnames used in the Hadoop and ZooKeeper configuration. The commands below also assume a user-defined Docker network named hadoop already exists (create it once with docker network create hadoop).

sudo docker run -it --net=hadoop --name h1 --hostname h1  -p 50070:50070 -p 8088:8088 -p 50020:50020 ymt/hadoop
sudo docker run -it --net=hadoop --name h2 --hostname h2 ymt/hadoop
sudo docker run -it --net=hadoop --name h3 --hostname h3 ymt/hadoop
sudo docker run -it --net=hadoop --name h4 --hostname h4 ymt/hadoop
sudo docker run -it --net=hadoop --name h5 --hostname h5 ymt/hadoop

First Cluster Startup

# Log in to h1, then set each node's myid over ssh
ssh root@h1 "echo 1 > data/zookeeper/zkdata/myid"
ssh root@h2 "echo 2 > data/zookeeper/zkdata/myid"
ssh root@h3 "echo 3 > data/zookeeper/zkdata/myid"
ssh root@h4 "echo 4 > data/zookeeper/zkdata/myid"
ssh root@h5 "echo 5 > data/zookeeper/zkdata/myid"

# Start ZooKeeper and check its status
runRemoteCmd.sh "/usr/local/zookeeper/bin/zkServer.sh start" zookeeper
runRemoteCmd.sh "/usr/local/zookeeper/bin/zkServer.sh status" zookeeper

Normally you should see one leader and four followers.
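runRemoteCmd.sh is one of the scripts copied into /usr/local/tools in the zoo image. The original does not show it; a minimal sketch of the idea (the host-group definitions here are assumptions):

```shell
#!/bin/bash
# runRemoteCmd.sh -- run a command over ssh on every host in a named group.
# Usage: runRemoteCmd.sh "<command>" <group>

# Map a group name to its hosts; in this five-node layout both
# "zookeeper" and "all" cover every container.
hosts_for_group() {
  case "$1" in
    zookeeper|all) echo "h1 h2 h3 h4 h5" ;;
    *)             echo "" ;;
  esac
}

if [ "$#" -eq 2 ]; then
  for host in $(hosts_for_group "$2"); do
    echo "*** $host ***"
    ssh "root@$host" "$1"
  done
fi
```

The passwordless SSH keys baked into the base image are what allow this loop to run without prompts.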

# Start the journalnodes
runRemoteCmd.sh "/usr/local/hadoop/sbin/hadoop-daemon.sh start journalnode" all
# Enter the Hadoop installation directory
cd /usr/local/hadoop
# Format the namenode
bin/hdfs namenode -format
# Initialize the HA state in ZooKeeper
bin/hdfs zkfc -formatZK
# Start the namenode (in the foreground)
bin/hdfs namenode

# Open another terminal, log in to h2,
# and sync the namenode metadata from h1 (run from /usr/local/hadoop)
bin/hdfs namenode -bootstrapStandby

# When the sync finishes, return to the h1 terminal
# and stop the foreground namenode with Ctrl-C
# Stop the journalnodes (start-dfs.sh will restart them)
runRemoteCmd.sh "/usr/local/hadoop/sbin/hadoop-daemon.sh stop journalnode" all
# Start the HDFS cluster
sbin/start-dfs.sh
# Check the running processes on every node
runRemoteCmd.sh "jps" all


# In the h1 terminal, start YARN
sbin/start-yarn.sh
# In the h2 terminal, start the second resourcemanager
sbin/yarn-daemon.sh start resourcemanager
# Check the YARN HA state
bin/yarn rmadmin -getServiceState rm1
bin/yarn rmadmin -getServiceState rm2
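The rm1 and rm2 IDs queried above are defined in yarn-site.xml, which the original does not list. An illustrative HA fragment (the cluster-id value is an assumption):

```xml
<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>yarncluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <!-- rm1/rm2 map to the two namenode hosts, matching the node plan -->
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>h1</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>h2</value></property>
  <property><name>yarn.resourcemanager.zk-address</name>
    <value>h1:2181,h2:2181,h3:2181,h4:2181,h5:2181</value></property>
</configuration>
```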

One ResourceManager should be active and the other standby.
The cluster setup is now complete.
Open http://<host-IP>:50070 to browse HDFS.
Open http://<host-IP>:8088 to see the running jobs.

# Run a test job (this assumes a local file a.txt to use as input)
hdfs dfs -mkdir -p /test
hdfs dfs -put a.txt /test/
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /test/a.txt /test/out/
# When it finishes, check the result
hdfs dfs -cat /test/out/*

Stopping the Cluster

# In the h2 terminal
sbin/yarn-daemon.sh stop resourcemanager
# In the h1 terminal
sbin/stop-yarn.sh 
sbin/stop-dfs.sh
runRemoteCmd.sh "/usr/local/zookeeper/bin/zkServer.sh stop" zookeeper

# Stop the containers
sudo docker stop h1
sudo docker stop h2
sudo docker stop h3
sudo docker stop h4
sudo docker stop h5

Restarting the Cluster

# Start the containers
sudo docker start h1
sudo docker start h2
sudo docker start h3
sudo docker start h4
sudo docker start h5
# In the h1 terminal
cd /usr/local/hadoop
runRemoteCmd.sh "/usr/local/zookeeper/bin/zkServer.sh start" zookeeper
sbin/start-dfs.sh
sbin/start-yarn.sh
# In the h2 terminal
cd /usr/local/hadoop
sbin/yarn-daemon.sh start resourcemanager