1. Introduction to the YARN HA architecture

2. Configure yarn-site.xml
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>10000</value>
</property>
<!-- Non-HA setting (single ResourceManager host); not used when HA is enabled -->
<!--
<property>
<name>yarn.resourcemanager.hostname</name>
<value>bigdata-pro01.kfk.com</value>
</property>
-->
<!-- HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Cluster id, changed from cluster1 to rs -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rs</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata-pro01.kfk.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata-pro02.kfk.com</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
</property>
<!-- Enable ResourceManager state recovery (default is false) -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- State store implementation (default is FileSystemRMStateStore) -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
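Before distributing the file, it can help to confirm that the HA-related properties are actually present. The following is a minimal sketch, not part of the original procedure: the function name check_ha_props is hypothetical, and it only checks that each property name appears in the file, not that its value is valid.

```shell
# Report which of the core HA properties are missing from a yarn-site.xml.
# check_ha_props is a hypothetical helper; it only looks for the <name> entries.
check_ha_props() {
  file="$1"
  missing=0
  for prop in \
    yarn.resourcemanager.ha.enabled \
    yarn.resourcemanager.cluster-id \
    yarn.resourcemanager.ha.rm-ids \
    yarn.resourcemanager.zk-address
  do
    # -F treats the pattern as a literal string, so dots are not wildcards
    if ! grep -qF "<name>$prop</name>" "$file"; then
      echo "missing: $prop"
      missing=1
    fi
  done
  return $missing
}

# Usage (from the Hadoop install directory):
#   check_ha_props etc/hadoop/yarn-site.xml && echo "HA properties present"
```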
3. Distribute the configuration
scp -r etc/hadoop/yarn-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp -r etc/hadoop/yarn-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
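The two scp commands above can be folded into a loop, which is convenient if more nodes are added later. This is a sketch under assumptions: the variable names HOSTS, REMOTE_HADOOP_DIR, and DRY_RUN and the function name distribute are illustrative, and passwordless SSH to each host is assumed. Setting DRY_RUN to a non-empty value prints the commands instead of running them.

```shell
# Hosts and target directory taken from the scp commands in this step.
HOSTS="bigdata-pro02.kfk.com bigdata-pro03.kfk.com"
REMOTE_HADOOP_DIR="/opt/modules/hadoop-2.5.0"

# Copy yarn-site.xml to every host in HOSTS.
# With DRY_RUN set (non-empty), each command is echoed rather than executed.
distribute() {
  run="${DRY_RUN:+echo}"
  for host in $HOSTS; do
    $run scp etc/hadoop/yarn-site.xml "$host:$REMOTE_HADOOP_DIR/etc/hadoop/"
  done
}

# Usage (from the Hadoop install directory on machine 1):
#   DRY_RUN=1; distribute    # preview the commands
#   DRY_RUN=;  distribute    # actually copy
```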
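4. Start the services
Start the ResourceManager on machines 1 and 2, and the NodeManager on machine 3.
In practice, if sbin/start-all.sh was run first, the ResourceManager on machine 1 is already running and the NodeManagers on machines 1, 2, and 3 have all been started as well.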
[kfk@bigdata-pro01 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
[kfk@bigdata-pro02 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
[kfk@bigdata-pro03 hadoop-2.5.0]$ sbin/yarn-daemon.sh start nodemanager
5. Check the result in the web UI
http://bigdata-pro01.kfk.com:8088/cluster/cluster
One ResourceManager is active and the other is standby; the active one is elected automatically through ZooKeeper.
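The same state can also be read from the command line with yarn rmadmin -getServiceState, which takes one of the ids configured in yarn.resourcemanager.ha.rm-ids. The wrapper below is only a sketch: the function name rm_states and the YARN_CMD override are illustrative, and the "unreachable" fallback is printed by this script, not by YARN.

```shell
# Path to the yarn launcher, relative to the Hadoop install directory.
YARN_CMD="${YARN_CMD:-bin/yarn}"

# Print the HA state (active or standby) of each configured ResourceManager.
# If the query fails, e.g. the RM is down, print "unreachable" instead.
rm_states() {
  for id in rm1 rm2; do
    state="$($YARN_CMD rmadmin -getServiceState "$id" 2>/dev/null || echo unreachable)"
    printf '%s: %s\n' "$id" "$state"
  done
}

# Usage (from the Hadoop install directory):
#   rm_states
```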


6. Test automatic failover
[kfk@bigdata-pro01 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
resourcemanager running as process 13247. Stop it first.
[kfk@bigdata-pro01 hadoop-2.5.0]$ jps
13360 NodeManager
13072 JournalNode
11156 QuorumPeerMain
11780 DFSZKFailoverController
12884 DataNode
12777 NameNode
13693 Jps
13247 ResourceManager
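The plan: once the job reaches the map phase, kill the active ResourceManager process (13247).
After the input file and directories have been created, run the MapReduce job: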
[kfk@bigdata-pro01 hadoop-2.5.0]$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/kfk/data/wc.input /user/kfk/data/output/1

[kfk@bigdata-pro01 hadoop-2.5.0]$ kill -9 13247
The client can no longer connect to rm1 and fails over to rm2.

The job eventually completes successfully.
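Instead of reading the PID off the jps listing by hand, it can be extracted with awk. This is a small sketch; the function name rm_pid is hypothetical, and it assumes the two-column "PID Name" output format shown in the jps listing above.

```shell
# Read `jps` output on stdin and print the PID of the ResourceManager, if any.
rm_pid() {
  awk '$2 == "ResourceManager" { print $1 }'
}

# Usage on the host running the active ResourceManager
# (sends SIGKILL, as in the walkthrough above):
#   kill -9 "$(jps | rm_pid)"
```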


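This article covers YARN HA (high availability): the configuration steps for making the ResourceManager highly available through yarn-site.xml, failover through ZooKeeper, distributing the configuration and starting the services, and finally verifying automatic failover.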