ZooKeeper-based HDFS HA is configured mainly in two files: core-site.xml and hdfs-site.xml.
The test environment has three machines:
hadoop.master
hadoop.slave1
hadoop.slave2
hadoop.master runs: NameNode, JournalNode, ZooKeeper, DFSZKFailoverController
hadoop.slave1 runs: Standby NameNode, DataNode, JournalNode, DFSZKFailoverController
hadoop.slave2 runs: DataNode, JournalNode
1. core-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdfsHA</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/data/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop.master:2181</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value></value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value></value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>true</value>
<description>Should native hadoop libraries, if present, be used.</description>
</property>
</configuration>
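Note that hdfs://hdfsHA in fs.defaultFS is a logical nameservice name rather than a host, so it has to match dfs.nameservices in hdfs-site.xml. As a quick sanity check, a small helper can read a property back out of the file. This is only a sketch: get_prop is a hypothetical helper (not a Hadoop tool), the etc/hadoop path is an assumption, and it relies on each <name> and <value> sitting on its own line, as in the listing above. On a node with Hadoop installed, bin/hdfs getconf -confKey fs.defaultFS should report the effective value as well.

```shell
#!/bin/sh
# get_prop <config-file> <property-name>
# Hypothetical helper: prints the <value> that follows the matching <name>.
# Assumes one <name> or <value> element per line; it is not a real XML parser.
get_prop() {
    awk -v key="$2" '
        /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
        /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                    if (n == key) { print v; exit } }
    ' "$1"
}

# Assumed location of the file, relative to the Hadoop install directory.
CONF=${CONF:-etc/hadoop/core-site.xml}
if [ -f "$CONF" ]; then
    get_prop "$CONF" fs.defaultFS
fi
```

The same helper can be pointed at hdfs-site.xml to read dfs.nameservices and compare the two values by eye.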
2. hdfs-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.nameservices</name>
<value>hdfsHA</value>
</property>
<property>
<name>dfs.ha.namenodes.hdfsHA</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdfsHA.nn1</name>
<value>hadoop.master:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdfsHA.nn2</name>
<value>hadoop.slave1:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfsHA.nn1</name>
<value>hadoop.master:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfsHA.nn2</name>
<value>hadoop.slave1:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop.master:8485;hadoop.slave1:8485;hadoop.slave2:8485/hdfsHA</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.hdfsHA</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hdfsHA</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/data/dfs/journal</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/data/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop.master:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
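The value of dfs.namenode.shared.edits.dir follows a fixed pattern: qjournal://, then the JournalNode host:port pairs separated by semicolons, then a slash and the journal ID (here the nameservice name). A minimal sketch that assembles it from the three JournalNode hosts, where qjournal_uri is a hypothetical helper written only for illustration:

```shell
#!/bin/sh
# qjournal_uri <journal-id> <port> <host>...
# Hypothetical helper, not part of Hadoop: builds the shared edits URI.
qjournal_uri() {
    ns=$1
    port=$2
    shift 2
    list=""
    for host in "$@"; do
        # Join host:port pairs with semicolons (no leading separator).
        list="${list:+$list;}$host:$port"
    done
    printf 'qjournal://%s/%s\n' "$list" "$ns"
}

# Prints qjournal://hadoop.master:8485;hadoop.slave1:8485;hadoop.slave2:8485/hdfsHA
qjournal_uri hdfsHA 8485 hadoop.master hadoop.slave1 hadoop.slave2
```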
3. Startup procedure
3.1 Distribute the two configuration files to hadoop.slave1 and hadoop.slave2
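How the files are copied is not stated here. One possible approach, assuming passwordless SSH for the hadoop user and the same install path on every node, is to generate the scp commands first and review them before running anything. The HADOOP_CONF_DIR default below is a placeholder path, and print_sync_commands is an illustrative name:

```shell
#!/bin/sh
# Placeholder default; point this at the real configuration directory.
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/home/hadoop/hadoop/etc/hadoop}

# Print (do not run) one scp command per file and target host.
print_sync_commands() {
    for host in hadoop.slave1 hadoop.slave2; do
        for f in core-site.xml hdfs-site.xml; do
            echo "scp $HADOOP_CONF_DIR/$f $host:$HADOOP_CONF_DIR/$f"
        done
    done
}

print_sync_commands
```

Once the printed commands look right, piping the output to sh (print_sync_commands | sh) would execute them.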
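3.2 Start a JournalNode on all three machines
sbin/hadoop-daemon.sh start journalnode
The started process is 6725 org.apache.hadoop.hdfs.qjournal.server.JournalNode
3.3 Format the HA state in ZooKeeper from hadoop.master (any of the three machines would do); ZooKeeper must already be running
bin/hdfs zkfc -formatZK
On success the log shows: ha.ActiveStandbyElector: Successfully created /hadoop-ha/hdfsHA in ZK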
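3.4 Format the NameNode on hadoop.master and start it
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
3.5 Initialize the standby NameNode on hadoop.slave1 and start it. Use -bootstrapStandby here rather than a second -format: it copies the namespace from the NameNode started in 3.4, whereas formatting again would create a new namespace that does not match the first NameNode
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
At this point both NameNodes are in the standby state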
3.6 Start zkfc on hadoop.master and hadoop.slave1
sbin/hadoop-daemon.sh start zkfc
The started process is DFSZKFailoverController
At this point one NameNode is active and the other is standby
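To confirm the states described in 3.5 and 3.6, each NameNode can be queried with hdfs haadmin -getServiceState, using the IDs nn1 and nn2 from dfs.ha.namenodes.hdfsHA. A small sketch, assuming it is run from the Hadoop install directory on a node that has the HA configuration (HDFS and show_ha_states are illustrative names):

```shell
#!/bin/sh
# Path to the hdfs launcher; the default assumes the Hadoop install directory.
HDFS=${HDFS:-bin/hdfs}

# Print "<id>: <state>" for both NameNode IDs from dfs.ha.namenodes.hdfsHA.
show_ha_states() {
    for nn in nn1 nn2; do
        printf '%s: ' "$nn"
        "$HDFS" haadmin -getServiceState "$nn"
    done
}

# Only query when the launcher is actually present.
if [ -x "$HDFS" ]; then
    show_ha_states
fi
```

Before zkfc is started both IDs should report standby; afterwards exactly one should report active.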
3.7 Start the DataNodes from hadoop.master; the DataNodes on hadoop.slave1 and hadoop.slave2 then start
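The command for this step is not shown. In Hadoop 2.x, one common choice is sbin/hadoop-daemons.sh (plural), which runs the given command over SSH on every host listed in the slaves file. The sketch below assumes etc/hadoop/slaves contains hadoop.slave1 and hadoop.slave2; DAEMONS and start_datanodes are illustrative names:

```shell
#!/bin/sh
# Path to the multi-host launcher; the default assumes the Hadoop install directory.
DAEMONS=${DAEMONS:-sbin/hadoop-daemons.sh}

# Start a DataNode on every host listed in the slaves file.
start_datanodes() {
    "$DAEMONS" start datanode
}

# Only run when the launcher is actually present.
if [ -x "$DAEMONS" ]; then
    start_datanodes
fi
```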