Hadoop 2.x HA Configuration

1. Install ZooKeeper

 

1. Extract the tarball: tar -zxf zookeeper-3.4.5.tar.gz

2. In the conf directory, rename zoo_sample.cfg to zoo.cfg:  # mv zoo_sample.cfg zoo.cfg

3. Edit it to contain the following (the file is identical on every host; the per-host myid file it implies is covered just after this list):

# tickTime is required in quorum mode; initLimit and syncLimit are counted in ticks
tickTime=2000
dataDir=/export/crawlspace/mahadev/zookeeper/server1/data
clientPort=2181
initLimit=5
syncLimit=2
server.0=172.17.138.67:4888:5888
server.1=172.17.138.68:4888:5888
server.2=172.17.138.69:4888:5888
server.3=172.17.138.70:4888:5888

4. Start the service by running bin/zkServer.sh start on each machine in turn; do not leave too long a gap between machines. If no errors occur, the ensemble has started successfully.

5. Run a test. Execute the following on one machine and make sure it connects successfully; otherwise the Hadoop NameNodes brought up later may all be stuck in standby mode.

#bin/zkCli.sh -server 127.0.0.1:2181
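
One step the list above glosses over: each ZooKeeper host must also have a myid file in its dataDir, containing that host's id from its server.N line in zoo.cfg. A minimal sketch, assuming the dataDir above and a host listed as server.0:

# mkdir -p /export/crawlspace/mahadev/zookeeper/server1/data
# echo 0 > /export/crawlspace/mahadev/zookeeper/server1/data/myid

Once every node is up, bin/zkServer.sh status should report "Mode: leader" on exactly one host and "Mode: follower" on the rest.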

2. Modify the Hadoop Configuration

#vi core-site.xml
<configuration>
    
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoopData/tmp</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
 
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>EBPPTEST01:2181,EBPPTEST02:2181,EBPPTEST03:2181</value>
    </property>
  
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>172.17.138.67</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>
 
</configuration>
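
Note that hdfs://mycluster is a logical URI rather than a hostname: it must match the dfs.nameservices value defined in hdfs-site.xml below, and clients resolve it to the currently active NameNode through the failover proxy provider. Once the files are in place, a quick sanity check with the standard getconf utility:

# hdfs getconf -confKey fs.defaultFS
# hdfs getconf -confKey ha.zookeeper.quorum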
 
 
#vi mapred-site.xml
<configuration>
    
    <property>  
        <name>mapreduce.framework.name</name>  
        <value>yarn</value>   
    </property>
 
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>EBPPTEST01:10020</value>
    </property>
 
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>EBPPTEST01:19888</value>
    </property>
 
    <!-- Note: the two tasktracker settings below are MRv1 (TaskTracker) knobs;
         with mapreduce.framework.name set to yarn they have no effect and can be removed. -->
    <property>
        <name>mapreduce.tasktracker.map.tasks.maximum</name>
        <value>3</value>
    </property>
    
    <property>
        <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
        <value>3</value>
    </property>
 
</configuration>
 
 
#vi hdfs-site.xml
<configuration>
 
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
 
    <property>  
        <name>dfs.namenode.name.dir</name>  
        <value>/home/hadoop/hadoopData/filesystem/name</value>  
    </property>
 
    <property>  
        <name>dfs.datanode.data.dir</name>  
        <value>/data/data,/home/hadoop/hadoopData/filesystem/data</value>  
    </property>
 
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
 
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
 
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>EBPPTEST01:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>EBPPTEST01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>EBPPTEST02:9000</value>
    </property>     
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>EBPPTEST02:50070</value>
    </property>
    
    <property>
        <name>dfs.namenode.shared.edits.dir</name>  
        <value>qjournal://EBPPTEST01:8485;EBPPTEST02:8485;EBPPTEST03:8485/mycluster</value>  
    </property>
 
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
 
 
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
 
    <property>
       <name>dfs.ha.fencing.ssh.private-key-files</name>
       <value>/home/hadoop/.ssh/id_dsa</value>
    </property>
 
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/hadoopData/journalData</value>
    </property>
 
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
 
</configuration>
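
The sshfence method above only works if the user running HDFS can SSH from each NameNode to the other without a password, using the private key named in dfs.ha.fencing.ssh.private-key-files. A hedged setup sketch, assuming the hadoop user and the DSA key from the config (run on EBPPTEST01, then repeat in the opposite direction on EBPPTEST02):

# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
# ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@EBPPTEST02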
 
 
#vi yarn-site.xml
<configuration>
    <property>  
         <name>yarn.nodemanager.aux-services</name>  
         <value>mapreduce_shuffle</value>   
    </property>  
            
    <property>  
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>  
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>  
    </property>  
</configuration>

 

3. Deployment

1. Start the JournalNode cluster

On each node that should run a JournalNode, execute: hadoop-daemon.sh start journalnode
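
Since dfs.namenode.shared.edits.dir lists three JournalNodes, the daemon must be running on all three hosts before the next step. A convenience loop, assuming passwordless SSH and the same HADOOP_HOME on every node:

# for h in EBPPTEST01 EBPPTEST02 EBPPTEST03; do ssh $h "$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode"; done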

2. Format the QJM shared-edits storage

./hdfs namenode -initializeSharedEdits
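
Per the QJM guide referenced at the end, -initializeSharedEdits is for converting an existing, already-formatted NameNode to HA: it copies the local edit log into the JournalNodes. If instead you are setting up a brand-new cluster, format the first NameNode before this step:

# ./hdfs namenode -format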

3. Start the existing (pre-HA) NameNode

hadoop-daemon.sh start namenode

4. Bootstrap the other NameNode; this must be executed on the other machine

./hdfs namenode -bootstrapStandby

5. Stop HDFS

stop-dfs.sh

6. Initialize the HA state in ZooKeeper

hdfs zkfc -formatZK
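
This creates the znode under which the two failover controllers coordinate (by default /hadoop-ha/<nameservice>). It can be confirmed with the ZooKeeper CLI from section 1:

# bin/zkCli.sh -server 127.0.0.1:2181
ls /hadoop-ha          (should list [mycluster])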

7. Start the cluster

start-dfs.sh

start-yarn.sh
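
With dfs.ha.automatic-failover.enabled set to true, start-dfs.sh also launches a DFSZKFailoverController next to each NameNode. A quick health check, using the standard haadmin tool and the nn1/nn2 ids configured above:

# hdfs haadmin -getServiceState nn1
# hdfs haadmin -getServiceState nn2

One should report active and the other standby. jps on each NameNode host should show NameNode and DFSZKFailoverController; the JournalNode hosts should show JournalNode.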

 

 

References: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html

http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
