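1. Background
The previous post covered Hadoop's core configuration and the basic ZooKeeper configuration. This post records my own configuration, along with a summary of the startup procedure. After building this simple distributed environment four times I have picked up at least the basics, and the cluster finally starts. I will not describe my runtime environment again here. One reminder: everything below was done as the root user.
2. Hadoop configuration
2.1 The etc/hadoop directory
First change into this directory: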
- root@note1:~/hadoop-2.6/etc/hadoop#
(1) hadoop-env.sh
Set the Java runtime environment, i.e. JAVA_HOME:
- root@note1:~/hadoop-2.6/etc/hadoop# vi hadoop-env.sh
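For reference, the line to edit in hadoop-env.sh looks roughly like the following. The JDK path is only an assumed example; use whatever path the JDK is installed under on your machines.

```shell
# In etc/hadoop/hadoop-env.sh: point Hadoop at the JDK explicitly.
# The path below is a placeholder assumption, not a value from this setup.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
```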
(2) core-site.xml
- root@note1:~/hadoop-2.6/etc/hadoop# more core-site.xml
The full configuration is as follows:
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
-
-
- <configuration>
-
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://yuannews</value>
- </property>
-
- <property>
- <name>ha.zookeeper.quorum</name>
- <value>note1:2181,note3:2181,note4:2181</value>
- </property>
-
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/opt/hadoop2</value>
- </property>
-
- </configuration>
(3) hdfs-site.xml
- root@note1:~/hadoop-2.6/etc/hadoop# cat hdfs-site.xml
The configuration is as follows:
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
-
-
- <configuration>
-
- <property>
- <name>dfs.nameservices</name>
- <value>yuannews</value>
- </property>
-
- <property>
- <name>dfs.ha.namenodes.yuannews</name>
- <value>nn1,nn2</value>
- </property>
-
- <property>
- <name>dfs.namenode.rpc-address.yuannews.nn1</name>
- <value>note1:8020</value>
- </property>
- <property>
- <name>dfs.namenode.rpc-address.yuannews.nn2</name>
- <value>note3:8020</value>
- </property>
-
- <property>
- <name>dfs.namenode.http-address.yuannews.nn1</name>
- <value>note1:50070</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.yuannews.nn2</name>
- <value>note3:50070</value>
- </property>
-
- <property>
- <name>dfs.namenode.shared.edits.dir</name>
- <value>qjournal://note3:8485;note4:8485;note5:8485/yuannews</value>
- </property>
-
- <property>
- <name>dfs.client.failover.proxy.provider.yuannews</name>
- <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
- </property>
-
- <property>
- <name>dfs.ha.fencing.methods</name>
- <value>sshfence</value>
- </property>
-
- <property>
- <name>dfs.ha.fencing.ssh.private-key-files</name>
- <value>/root/.ssh/id_rsa</value>
- </property>
-
- <property>
- <name>dfs.journalnode.edits.dir</name>
- <value>/opt/hadoop/jn/data/</value>
- </property>
-
- <property>
- <name>dfs.ha.automatic-failover.enabled</name>
- <value>true</value>
- </property>
-
-
- </configuration>
(4) mapred-site.xml
Rename mapred-site.xml.template to mapred-site.xml:
- root@note1:~/hadoop-2.6/etc/hadoop# mv mapred-site.xml.template mapred-site.xml
The configuration is as follows:
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
-
-
- <configuration>
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- </configuration>
(5) yarn-site.xml
- root@note1:~/hadoop-2.6/etc/hadoop# more yarn-site.xml
The configuration is as follows. It sets the main node that runs the ResourceManager, which is note1 in my case:
- <?xml version="1.0"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <configuration>
-
-
-
- <property>
- <name>yarn.resourcemanager.hostname</name>
- <value>note1</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
-
- <property>
- <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
- <value>org.apache.hadoop.mapred.ShuffleHandler</value>
- </property>
-
- </configuration>
(6) slaves
List the addresses of the other machines in the cluster, i.e. the hosts where the DataNodes run.
The configuration is as follows:
- 192.168.56.3
- 192.168.56.4
- 192.168.56.5
(7) Configuration summary
The directory given for dfs.journalnode.edits.dir above has to be created by hand. The rest is mostly service names, which must match exactly.
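Creating the directory is a one-liner, sketched below as a small helper. The function name ensure_edits_dir is made up for illustration; the path you pass should be the value of dfs.journalnode.edits.dir.

```shell
# Create the JournalNode edits directory if it is missing and
# confirm that the current user can write to it.
#   $1 = value of dfs.journalnode.edits.dir
ensure_edits_dir() {
    mkdir -p "$1" && [ -w "$1" ]
}
```

On each JournalNode machine this would be called as `ensure_edits_dir /opt/hadoop/jn/data`.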
3. ZooKeeper configuration
(1) zoo.cfg
- root@note1:~/zookeeper-3.4.6/conf# more zoo.cfg
Rename zoo_sample.cfg to zoo.cfg and configure it as follows:
- # The number of milliseconds of each tick
- tickTime=2000
- # The number of ticks that the initial
- # synchronization phase can take
- initLimit=10
- # The number of ticks that can pass between
- # sending a request and getting an acknowledgement
- syncLimit=5
- # the directory where the snapshot is stored.
- # do not use /tmp for storage, /tmp here is just
- # example sakes.
- dataDir=/opt/zookeeper
- # the port at which the clients will connect
- clientPort=2181
- # the maximum number of client connections.
- # increase this if you need to handle more clients
- # maxClientCnxns=60
- #
- # Be sure to read the maintenance section of the
- # administrator guide before turning on autopurge.
- #
- # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
-
- # The number of snapshots to retain in dataDir
- # autopurge.snapRetainCount=3
- # Purge task interval in hours
- # Set to "0" to disable auto purge feature
- # autopurge.purgeInterval=1
- server.1=note1:2888:3888
- server.2=note3:2888:3888
- server.3=note4:2888:3888
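Notes:
(1) Create the dataDir directory by hand; mine is /opt/zookeeper.
(2) In that directory (/opt/zookeeper), create a file named myid whose content is the x of the matching server.x line at the end of zoo.cfg above. The rules are:
1) Every machine that runs ZooKeeper needs its own copy of the ZooKeeper program (the unpacked distribution), with identical configuration.
2) On every node, create the dataDir directory and a myid file inside it.
3) The content of myid corresponds to the server.x entries at the end of zoo.cfg. For example, given server.2=note3:2888:3888, the myid file on note3 contains 2, just the single digit 2, and so on for the other nodes.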
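The myid rules above can be sketched as a small shell helper. The function name write_myid is hypothetical; the directory and id come from zoo.cfg.

```shell
# Write the myid file for one ZooKeeper node.
#   $1 = dataDir from zoo.cfg
#   $2 = the x of this node's server.x line
write_myid() {
    mkdir -p "$1" && printf '%s\n' "$2" > "$1/myid"
}
```

With the zoo.cfg shown earlier, that means `write_myid /opt/zookeeper 1` on note1, `write_myid /opt/zookeeper 2` on note3, and `write_myid /opt/zookeeper 3` on note4.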
(2) Global configuration
Add ZooKeeper's bin directory to PATH in /etc/profile. Mine looks like this:
- export PATH=$PATH:/root/zookeeper-3.4.6/bin
Do not forget to run source /etc/profile afterwards.
(3) Test-starting ZooKeeper
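A minimal sketch of a test start, assuming the PATH change above is in effect. zkServer.sh start and zkServer.sh status are the standard ZooKeeper control commands; the wrapper name zk_test_start is made up for illustration.

```shell
# Start the local ZooKeeper server and then ask for its status.
zk_test_start() {
    if ! command -v zkServer.sh >/dev/null 2>&1; then
        echo "zkServer.sh not found on PATH" >&2
        return 1
    fi
    zkServer.sh start && zkServer.sh status
}
```

Run it on each of the three ZooKeeper nodes. The status command is only expected to report a leader or follower role once a majority of the nodes are up, so an error on the first node alone is not necessarily a problem.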
4. Initialization
(1) Test-start the JournalNodes
Change into the hadoop/sbin directory on each JournalNode machine:
- ./hadoop-daemon.sh start journalnode
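(2) Format one NameNode
I have two NameNodes, so the NameNode is formatted on only one machine, referred to here as namenode1. The other one is not formatted, but it still needs the steps that follow.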
- root@note1:~/hadoop-2.6/bin# ./hdfs namenode -format
(3) Initialize the other NameNode
namenode1 has been formatted; now initialize namenode2. To do so, first start the freshly formatted namenode1:
- root@note1:~/hadoop-2.6/sbin# ./hadoop-daemon.sh start namenode
Then run the initialization on the namenode2 machine:
- root@note3:~/hadoop-2.6/bin# ./hdfs namenode -bootstrapStandby
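(4) Initialize zkfc
Prerequisite: ZooKeeper (ZK) must already be running on the machines you configured for it. Only then can zkfc be formatted; otherwise the command reports an error.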
- root@note1:~/hadoop-2.6/bin# ./hdfs zkfc -formatZK
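Steps (1) to (4) run on different machines, so the order is easy to get wrong. Below is a rough sketch of the whole sequence as one script. It assumes passwordless SSH as root between the nodes and Hadoop unpacked at /root/hadoop-2.6 on every node; the step and init_ha helpers are hypothetical names, not part of Hadoop. By default it is a dry run that only prints the commands.

```shell
#!/bin/sh
HADOOP=/root/hadoop-2.6   # install path as seen in the prompts above

# Print the command in dry-run mode (the default), otherwise execute it.
step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

init_ha() {
    # (1) JournalNodes must be up before the NameNode is formatted.
    for host in note3 note4 note5; do
        step ssh "$host" "$HADOOP/sbin/hadoop-daemon.sh" start journalnode
    done
    # (2) Format only the first NameNode, then start it.
    step ssh note1 "$HADOOP/bin/hdfs" namenode -format
    step ssh note1 "$HADOOP/sbin/hadoop-daemon.sh" start namenode
    # (3) The second NameNode copies its metadata from the first.
    step ssh note3 "$HADOOP/bin/hdfs" namenode -bootstrapStandby
    # (4) ZooKeeper must already be running for this step.
    step ssh note1 "$HADOOP/bin/hdfs" zkfc -formatZK
}

init_ha
```

Setting DRY_RUN=0 before running the script executes the commands instead of printing them. Formatting wipes existing NameNode metadata, so reviewing the dry-run output first is the safer habit.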
(5) Starting and stopping
- start-dfs.sh and stop-dfs.sh
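(6) Notes
If something fails to come up at startup, check two things: whether the nodes' IP addresses can be pinged, and (sometimes) whether the firewall on the nodes is turned off.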
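The ping check can be scripted so that it covers every node in one go. This is a small sketch; check_hosts is a made-up helper name, and the -W timeout flag assumes the Linux ping.

```shell
# Report which cluster hosts answer a single ping (2 second timeout).
check_hosts() {
    for host in "$@"; do
        if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
        fi
    done
}
```

For this cluster it would be called as `check_hosts note1 note3 note4 note5`. A host that answers ping can still have its Hadoop or ZooKeeper ports blocked, so the firewall needs a separate look.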
5. Startup procedure
Start ZooKeeper first, then Hadoop HDFS, and finally Hadoop YARN.
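The same order can be written down as a script, sketched here under a few assumptions: it runs on note1, passwordless SSH as root works between the nodes, and the install paths match the ones used earlier. The step and start_cluster helpers are hypothetical names. As written it only prints the commands.

```shell
#!/bin/sh
ZK_BIN=/root/zookeeper-3.4.6/bin     # from the PATH entry in /etc/profile
HADOOP_SBIN=/root/hadoop-2.6/sbin    # install path as seen in the prompts

# Print the command in dry-run mode (the default), otherwise execute it.
step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_cluster() {
    # 1. ZooKeeper on every node listed in zoo.cfg.
    for host in note1 note3 note4; do
        step ssh "$host" "$ZK_BIN/zkServer.sh" start
    done
    # 2. The HDFS daemons.
    step "$HADOOP_SBIN/start-dfs.sh"
    # 3. The YARN daemons.
    step "$HADOOP_SBIN/start-yarn.sh"
}

start_cluster
```

Set DRY_RUN=0 to execute the commands. Shutdown would go in the reverse order: YARN, then HDFS, then ZooKeeper.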