Environment:
Three Alibaba Cloud hosts running CentOS 7.2
hadoop-2.6.0-cdh5.15.1.tar.gz
jdk-8u45-linux-x64.gz
zookeeper-3.4.6.tar.gz
Configure the NAT network interface:
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="static"
IPADDR=192.168.137.10
NETMASK=255.255.255.0
GATEWAY=192.168.137.2
DNS1=192.168.124.1
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
#IPV6INIT="yes"
#IPV6_AUTOCONF="yes"
#IPV6_DEFROUTE="yes"
#IPV6_FAILURE_FATAL="no"
#IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="ecead710-5507-457d-8736-281abf08ede0"
DEVICE="ens33"
ONBOOT="yes"
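An edit like this only takes effect after the interface is restarted. A minimal sketch (assuming the file above is /etc/sysconfig/network-scripts/ifcfg-ens33 on a local VM rather than one of the cloud hosts):
[root@localhost ~]# systemctl restart network       # reload the network configuration, or bounce just this interface:
[root@localhost ~]# ifdown ens33 && ifup ens33
[root@localhost ~]# ip addr show ens33               # confirm the new address is applied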
-- Change the hostname; each of the three machines sets its own
The default hostname is localhost:
[root@localhost ~]#
Change it with:
hostnamectl set-hostname hadoop001
hostnamectl set-hostname hadoop002
hostnamectl set-hostname hadoop003
First create a hadoop user on each of the three machines, then switch to that user
[root@hadoop001 ~]# useradd hadoop
[root@hadoop002 ~]# useradd hadoop
[root@hadoop003 ~]# useradd hadoop
Switch to the hadoop user with su - hadoop (do you know the difference between su - and plain su?)
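As a quick illustration of the difference (su - starts a login shell, while plain su does not), a minimal check run as root:
[root@hadoop001 ~]# su hadoop -c 'pwd'       # non-login shell: stays in root's current directory, keeps most of root's environment
[root@hadoop001 ~]# su - hadoop -c 'pwd'     # login shell: switches to /home/hadoop and sources hadoop's .bash_profile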
Create our directory layout
[hadoop@hadoop001 ~]$ mkdir app software source data lib script tmp maven_repos
[hadoop@hadoop002 ~]$ mkdir app software source data lib script tmp maven_repos
[hadoop@hadoop003 ~]$ mkdir app software source data lib script tmp maven_repos
Upload the Hadoop, ZooKeeper, and JDK packages to hadoop001
[hadoop@hadoop001 ~]$ cd software/
...upload the installation packages...
[hadoop@hadoop001 software]$ ll
total 433192
-rw-r--r-- 1 hadoop hadoop 252606214 Aug 19 06:43 hadoop-2.6.0-cdh5.15.1.tar.gz
-rw-r--r-- 1 hadoop hadoop 173271626 Aug 19 06:43 jdk-8u45-linux-x64.gz
-rw-r--r-- 1 hadoop hadoop 17699306 Aug 19 06:44 zookeeper-3.4.6.tar.gz
Then, as root, configure the hostname-to-IP mapping on all three machines (use the Alibaba Cloud internal/private IPs here)
[root@hadoop001 software]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.17.33.6 hadoop001
172.17.33.5 hadoop002
172.17.33.7 hadoop003
The other two machines get the same configuration
Log in as the hadoop user on all three machines and set up passwordless SSH
[root@hadoop001 ~]# su - hadoop
[root@hadoop002 ~]# su - hadoop
[root@hadoop003 ~]# su - hadoop
First generate a public/private key pair on each of the three machines (it is stored in the hidden .ssh directory)
[hadoop@hadoop001 ~]$ ssh-keygen
...just keep pressing Enter...
Do the same on the other two machines
[hadoop@hadoop001 ~]$ cd .ssh/
[hadoop@hadoop001 .ssh]$ ll
total 8
-rw------- 1 hadoop hadoop 1675 Aug 20 21:27 id_rsa
-rw-r--r-- 1 hadoop hadoop 398 Aug 20 21:27 id_rsa.pub
Then, on hadoop001, create the authorized_keys file that collects the public keys
[hadoop@hadoop001 .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop001 .ssh]$ ll
total 12
-rw-rw-r-- 1 hadoop hadoop 398 Aug 20 21:34 authorized_keys
-rw------- 1 hadoop hadoop 1675 Aug 20 21:27 id_rsa
-rw-r--r-- 1 hadoop hadoop 398 Aug 20 21:27 id_rsa.pub
Because the hadoop user has no password at this point, scp cannot be used. So download the
id_rsa.pub public key files from the other two machines to the local Windows box, upload them
to hadoop001, and append their contents to authorized_keys. Then download authorized_keys and
upload it to the other two machines. Finally set all three authorized_keys files to mode 600 (see the Hadoop documentation for why).
Set authorized_keys to mode 600:
[hadoop@hadoop001 .ssh]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@hadoop002 .ssh]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@hadoop003 .ssh]$ chmod 600 ~/.ssh/authorized_keys
Then run the following on each machine to populate known_hosts (which records the trust relationships)
[hadoop@hadoop001 .ssh]$ ssh hadoop001 date
[hadoop@hadoop001 .ssh]$ ssh hadoop002 date
[hadoop@hadoop001 .ssh]$ ssh hadoop003 date
Do the same on the other two machines; passwordless login now works
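As an aside, if temporarily giving the hadoop user a password is acceptable (passwd hadoop as root on each machine), the manual download/upload round-trip can be replaced with ssh-copy-id; a hedged sketch, run as hadoop on every node:
[hadoop@hadoop001 ~]$ for host in hadoop001 hadoop002 hadoop003; do
>   ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$host    # appends the key to the remote authorized_keys, creating it with safe permissions if needed
> done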
-------------------------
Install the JDK
Since the JDK is needed by every user, create the JDK directory as root. For CDH the JDK
location is fixed; it must live at this path
[root@hadoop001 ~]# mkdir /usr/java
[root@hadoop002 ~]# mkdir /usr/java
[root@hadoop003 ~]# mkdir /usr/java
Then, as root, extract the JDK into /usr/java and configure the environment variables
[root@hadoop001 ~]# tar -zxvf /home/hadoop/software/jdk-8u45-linux-x64.gz -C /usr/java/
[root@hadoop002 ~]# tar -zxvf /home/hadoop/software/jdk-8u45-linux-x64.gz -C /usr/java/
[root@hadoop003 ~]# tar -zxvf /home/hadoop/software/jdk-8u45-linux-x64.gz -C /usr/java/
[root@hadoop001 jdk1.8.0_45]# vi /etc/profile
....append at the end....
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH
Do the same on the other two machines. Then be sure to change the JDK ownership to user root, group root
[root@hadoop001 java]# ll    # note: the JDK's owner and group are wrong and must be fixed
total 4
drwxr-xr-x 8 10 143 4096 Apr 11 2015 jdk1.8.0_45
[root@hadoop001 java]# chown -R root:root /usr/java/*
[root@hadoop002 java]# chown -R root:root /usr/java/*
[root@hadoop003 java]# chown -R root:root /usr/java/*
Finally source the profile so the variables take effect; the JDK installation is done
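A quick sanity check on each machine; assuming the profile edits above, the output should point at the new JDK, roughly:
[root@hadoop001 ~]# source /etc/profile
[root@hadoop001 ~]# which java
/usr/java/jdk1.8.0_45/bin/java
[root@hadoop001 ~]# java -version
java version "1.8.0_45"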
---------------------------
Switch to the hadoop user and install ZooKeeper
Extract ZooKeeper into the app directory, then create a symlink
[hadoop@hadoop001 app]$ tar -zxvf ~/software/zookeeper-3.4.6.tar.gz -C ~/app/
[hadoop@hadoop002 app]$ tar -zxvf ~/software/zookeeper-3.4.6.tar.gz -C ~/app/
[hadoop@hadoop003 app]$ tar -zxvf ~/software/zookeeper-3.4.6.tar.gz -C ~/app/
[hadoop@hadoop001 app]$ ln -s zookeeper-3.4.6/ zookeeper
[hadoop@hadoop002 app]$ ln -s zookeeper-3.4.6/ zookeeper
[hadoop@hadoop003 app]$ ln -s zookeeper-3.4.6/ zookeeper
Then go into the conf directory and edit the ZooKeeper configuration
[hadoop@hadoop001 conf]$ cp zoo_sample.cfg zoo.cfg
[hadoop@hadoop002 conf]$ cp zoo_sample.cfg zoo.cfg
[hadoop@hadoop003 conf]$ cp zoo_sample.cfg zoo.cfg
In zoo.cfg, change the dataDir location and add the server entries
dataDir=/home/hadoop/data/zookeeper
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
Do this on all three machines
Then create the dataDir directory by hand
[hadoop@hadoop001 conf]$ mkdir ~/data/zookeeper
[hadoop@hadoop002 conf]$ mkdir ~/data/zookeeper
[hadoop@hadoop003 conf]$ mkdir ~/data/zookeeper
Then create a myid file in the zookeeper directory just created, writing 1, 2, and 3 respectively
[hadoop@hadoop001 zookeeper]$ echo 1 > ~/data/zookeeper/myid
[hadoop@hadoop002 zookeeper]$ echo 2 > ~/data/zookeeper/myid
[hadoop@hadoop003 zookeeper]$ echo 3 > ~/data/zookeeper/myid
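Since passwordless SSH is already in place, the three myid values can be double-checked from one node; a small sketch:
[hadoop@hadoop001 ~]$ for host in hadoop001 hadoop002 hadoop003; do
>   echo -n "$host: "; ssh $host cat /home/hadoop/data/zookeeper/myid    # expect 1, 2, 3
> done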
Then add the ZooKeeper variables to the hadoop user's personal environment file .bash_profile
[hadoop@hadoop001 ~]$ vi .bash_profile
......
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH
Do the same on the other two machines
Then source .bash_profile and use which to confirm it took effect
[hadoop@hadoop001 ~]$ which zkServer.sh
~/app/zookeeper/bin/zkServer.sh
[hadoop@hadoop002 ~]$ which zkServer.sh
~/app/zookeeper/bin/zkServer.sh
[hadoop@hadoop003 ~]$ which zkServer.sh
~/app/zookeeper/bin/zkServer.sh
That completes the basic ZooKeeper setup
--------------------
Install Hadoop: extract it into the app directory and create a symlink
[hadoop@hadoop001 ~]$ tar -zxvf software/hadoop-2.6.0-cdh5.15.1.tar.gz -C app/
[hadoop@hadoop002 ~]$ tar -zxvf software/hadoop-2.6.0-cdh5.15.1.tar.gz -C app/
[hadoop@hadoop003 ~]$ tar -zxvf software/hadoop-2.6.0-cdh5.15.1.tar.gz -C app/
[hadoop@hadoop001 app]$ ln -s ~/app/hadoop-2.6.0-cdh5.15.1/ ~/app/hadoop
[hadoop@hadoop002 app]$ ln -s ~/app/hadoop-2.6.0-cdh5.15.1/ ~/app/hadoop
[hadoop@hadoop003 app]$ ln -s ~/app/hadoop-2.6.0-cdh5.15.1/ ~/app/hadoop
Edit the Hadoop configuration file hadoop-env.sh and set the JDK path explicitly as an absolute path
[hadoop@hadoop001 hadoop]$ vi hadoop-env.sh
......
export JAVA_HOME=/usr/java/jdk1.8.0_45
......
Do the same on the other two machines
Edit core-site.xml: delete or empty the original core-site.xml and simply upload a prepared copy; the configuration files are attached at the end of the article
First create the directory that hadoop.tmp.dir in core-site.xml points to; it must be created by hand and given 777 permissions
[hadoop@hadoop001 ~]$ mkdir -p /home/hadoop/tmp/hadoop
[hadoop@hadoop002 ~]$ mkdir -p /home/hadoop/tmp/hadoop
[hadoop@hadoop003 ~]$ mkdir -p /home/hadoop/tmp/hadoop
[hadoop@hadoop001 ~]$ chmod 777 /home/hadoop/tmp/hadoop
[hadoop@hadoop002 ~]$ chmod 777 /home/hadoop/tmp/hadoop
[hadoop@hadoop003 ~]$ chmod 777 /home/hadoop/tmp/hadoop
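For reference, a minimal sketch of the HA-related core-site.xml entries (these are standard Hadoop property names; the nameservice ID yoohhwz is inferred from the /hadoop-ha/yoohhwz znode in the ZKFC log further down, and the full file attached at the end of the article remains authoritative):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://yoohhwz</value>            <!-- the logical nameservice, not a single host -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp/hadoop</value>   <!-- the directory created above -->
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
  </property>
</configuration>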
Upload the core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml configuration files to hadoop001, then cat them to double-check. Once they look right,
edit the slaves file to list the DataNodes
[hadoop@hadoop001 hadoop]$ vi slaves
hadoop001
hadoop002
hadoop003
Then use scp to push the edited configuration files to the other two machines
[hadoop@hadoop001 hadoop]$ scp *.xml slaves hadoop002:/home/hadoop/app/hadoop/etc/hadoop/
[hadoop@hadoop001 hadoop]$ scp *.xml slaves hadoop003:/home/hadoop/app/hadoop/etc/hadoop/
Configure the Hadoop environment variables for the hadoop user
[hadoop@hadoop001 ~]$ vi .bash_profile
.......
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$PATH
Do the same on the other two machines, and remember to source the file
That completes the basic Hadoop setup
------------------------
Start the cluster. Start ZooKeeper first and check its status
[hadoop@hadoop001 hadoop]$ zkServer.sh start
[hadoop@hadoop002 hadoop]$ zkServer.sh start
[hadoop@hadoop003 hadoop]$ zkServer.sh start
[hadoop@hadoop001 hadoop]$ zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[hadoop@hadoop002 tmp]$ zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[hadoop@hadoop003 ~]$ zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[hadoop@hadoop001 hadoop]$ jps
19125 QuorumPeerMain
19182 Jps
[hadoop@hadoop002 tmp]$ jps
19174 QuorumPeerMain
19231 Jps
[hadoop@hadoop003 ~]$ jps
19235 Jps
19171 QuorumPeerMain
Start Hadoop. The first time HDFS starts it must be formatted:
Before formatting, start the JournalNode process on the JournalNode machines
[hadoop@hadoop001 hadoop]$ hadoop-daemon.sh start journalnode
[hadoop@hadoop002 hadoop]$ hadoop-daemon.sh start journalnode
[hadoop@hadoop003 hadoop]$ hadoop-daemon.sh start journalnode
[hadoop@hadoop001 hadoop]$ jps
19125 QuorumPeerMain
19207 JournalNode
19256 Jps
[hadoop@hadoop002 tmp]$ jps
19174 QuorumPeerMain
19256 JournalNode
19305 Jps
[hadoop@hadoop003 ~]$ jps
19171 QuorumPeerMain
19260 JournalNode
19309 Jps
Format the NameNode:
Run the format on hadoop001
[hadoop@hadoop001 hadoop]$ hadoop namenode -format
......the log line "name has been successfully formatted" means the format succeeded.....
Copy the metadata to the second NameNode so both start from the same state. (Why scp nn1's name directory into nn2's dfs directory? Because both NameNodes use exactly the same directory layout and store their data in the same locations, as specified in hdfs-site.xml via dfs.namenode.name.dir and dfs.namenode.edits.dir. You should also make sure the shared edits directory
(dfs.namenode.shared.edits.dir) contains all of the NameNode metadata.)
[hadoop@hadoop001 dfs]$ scp -r name/ hadoop002:/home/hadoop/data/dfs/
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
VERSION 100% 203 0.2KB/s 00:00
fsimage_0000000000000000000 100% 308 0.3KB/s 00:00
seen_txid
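An equivalent way to seed the standby NameNode that avoids copying directories by hand is Hadoop's built-in bootstrap command; a hedged sketch, run on hadoop002 once the freshly formatted NameNode on hadoop001 is running:
[hadoop@hadoop002 ~]$ hdfs namenode -bootstrapStandby    # pulls the latest fsimage from the active NameNode into nn2's name directory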
Initialize ZKFC. Since ZooKeeper is a cluster, formatting from one master node is enough
[hadoop@hadoop001 ~]$ hdfs zkfc -formatZK
.....log output.....
19/08/21 01:17:00 INFO ha.ActiveStandbyElector: Session connected.
19/08/21 01:17:00 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/yoohhwz in ZK.
19/08/21 01:17:00 INFO zookeeper.ZooKeeper: Session: 0x16caff2e2f00000 closed
19/08/21 01:17:00 INFO zookeeper.ClientCnxn: EventThread shut down
19/08/21 01:17:00 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG:
........
Start the HDFS distributed storage system:
[hadoop@hadoop001 ~]$ start-dfs.sh
[hadoop@hadoop001 ~]$ jps
19652 DataNode
20004 Jps
19125 QuorumPeerMain
19525 NameNode
19207 JournalNode
19934 DFSZKFailoverController
[hadoop@hadoop002 ~]$ jps
19681 DFSZKFailoverController
19729 Jps
19460 NameNode
19557 DataNode
19174 QuorumPeerMain
19256 JournalNode
[hadoop@hadoop003 ~]$ jps
19424 DataNode
19171 QuorumPeerMain
19525 Jps
19260 JournalNode
HDFS can now be reached at the public IP of nn1 or nn2 on port 50070
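Which NameNode is currently active can also be checked from the shell (nn1 and nn2 are the NameNode IDs assumed to be defined in hdfs-site.xml):
[hadoop@hadoop001 ~]$ hdfs haadmin -getServiceState nn1
active
[hadoop@hadoop001 ~]$ hdfs haadmin -getServiceState nn2
standby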
Start YARN:
[hadoop@hadoop001 ~]$ start-yarn.sh
[hadoop@hadoop001 ~]$ jps
20080 ResourceManager
20484 Jps
19652 DataNode
19125 QuorumPeerMain
19525 NameNode
20182 NodeManager
19207 JournalNode
19934 DFSZKFailoverController
[hadoop@hadoop002 ~]$ jps
19968 Jps
19681 DFSZKFailoverController
19460 NameNode
19557 DataNode
19174 QuorumPeerMain
19862 NodeManager
19256 JournalNode
[hadoop@hadoop003 ~]$ jps
19424 DataNode
19571 NodeManager
19171 QuorumPeerMain
19260 JournalNode
19695 Jps
Note that the ResourceManager on the second node did not start; it has to be started separately by hand:
[hadoop@hadoop002 ~]$ yarn-daemon.sh start resourcemanager
[hadoop@hadoop002 ~]$ jps
19681 DFSZKFailoverController
20051 ResourceManager
19460 NameNode
19557 DataNode
19174 QuorumPeerMain
19862 NodeManager
19256 JournalNode
20106 Jps
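The ResourceManager HA state can be checked the same way (rm1 and rm2 are the RM IDs assumed to be defined in yarn-site.xml):
[hadoop@hadoop001 ~]$ yarn rmadmin -getServiceState rm1
active
[hadoop@hadoop002 ~]$ yarn rmadmin -getServiceState rm2
standby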
Then start the YARN job history server:
[hadoop@hadoop001 ~]$ /home/hadoop/app/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver
[hadoop@hadoop001 ~]$ jps
3680 Jps
3152 ResourceManager
2880 JournalNode
2690 DataNode
3607 JobHistoryServer
3256 NodeManager
3066 DFSZKFailoverController
2587 NameNode
2444 QuorumPeerMain
Everything started cleanly
Web UIs: remember to open ports 50070, 8088 and 19888 in the cloud security group, otherwise the pages are unreachable
HDFS pages:
Node 1: 112.126.100.90:50070
Node 2: 112.125.24.49:50070
YARN pages: the standby ResourceManager (node 2) needs /cluster/cluster appended to be reachable
Node 1: 112.126.100.90:8088
Node 2: 112.125.24.49:8088/cluster/cluster
YARN JobHistory: http://hadoop001:19888/jobhistory
Starting the cluster:
Start ZooKeeper
[hadoop@hadoop001 ~]$ zkServer.sh start
[hadoop@hadoop002 ~]$ zkServer.sh start
[hadoop@hadoop003 ~]$ zkServer.sh start
Start YARN
[hadoop@hadoop001 ~]$ start-yarn.sh
[hadoop@hadoop001 ~]$ mr-jobhistory-daemon.sh start historyserver
Start the standby ResourceManager
[hadoop@hadoop002 ~]$ yarn-daemon.sh start resourcemanager
Start HDFS
[hadoop@hadoop001 ~]$ start-dfs.sh
Shutting down the cluster:
Stop Hadoop
[hadoop@hadoop001 ~]$ mr-jobhistory-daemon.sh stop historyserver
[hadoop@hadoop002 ~]$ yarn-daemon.sh stop resourcemanager
[hadoop@hadoop001 ~]$ stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
hadoop003: stopping nodemanager
hadoop002: stopping nodemanager
hadoop001: stopping nodemanager
[hadoop@hadoop001 ~]$ stop-dfs.sh
Stopping namenodes on [hadoop001 hadoop002]
hadoop002: stopping namenode
hadoop001: stopping namenode
hadoop003: stopping datanode
hadoop001: stopping datanode
hadoop002: stopping datanode
Stopping journal nodes [hadoop001 hadoop002 hadoop003]
hadoop003: stopping journalnode
hadoop002: stopping journalnode
hadoop001: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: stopping zkfc
hadoop001: stopping zkfc
Stop ZooKeeper
[hadoop@hadoop001 ~]$ zkServer.sh stop
[hadoop@hadoop002 ~]$ zkServer.sh stop
[hadoop@hadoop003 ~]$ zkServer.sh stop
To test HA, use the hdfs haadmin command to switch NameNode states
Commonly used:
hdfs haadmin -failover nn1 nn2    --- hand the active role from nn1 over to nn2, making nn2 active
hdfs haadmin -transitionToStandby -forcemanual nn1    --- force nn1 into the standby state
Automatic failover adds two new components to an HDFS deployment:
a ZooKeeper quorum and a ZKFailoverController process (ZKFC for short).
Apache ZooKeeper is a highly available service for coordinating small amounts of data, notifying clients when that data changes, and monitoring clients for failures. Automatic failover relies on ZooKeeper for the following:
Failure detection - every NameNode machine in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the session expires, notifying the other NameNode that a failover should be triggered.
Active NameNode election - ZooKeeper provides a simple mechanism to elect exactly one node as active. If the current active NameNode crashes, another node may take a special exclusive lock in ZooKeeper indicating that it should become the next active.
The ZKFC is a new component: a ZooKeeper client that also monitors and manages the state of the NameNode. Every machine that runs a NameNode also runs a ZKFC, which is responsible for:
Health monitoring - the ZKFC periodically pings its local NameNode with a health-check command. As long as the NameNode responds promptly with a healthy status, the ZKFC considers it healthy. If the node has crashed, frozen, or otherwise entered an unhealthy state, the health monitor marks it as unhealthy.
ZooKeeper session management - while the local NameNode is healthy, the ZKFC keeps a session open in ZooKeeper. If the local NameNode is active, it also holds a special lock znode. The lock uses ZooKeeper's support for "ephemeral" nodes: if the session expires, the lock node is deleted automatically.
ZooKeeper-based election - if the local NameNode is healthy and the ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has won the election and is responsible for running a failover to make its local NameNode active. The failover process is similar to the manual failover described above: first the previous active is fenced if necessary, then the local NameNode transitions to the active state.
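A simple way to exercise this automatic failover: kill the active NameNode process and watch the standby take over (a rough sketch, assuming nn1 on hadoop001 is currently active; the pid comes from jps):
[hadoop@hadoop001 ~]$ jps | grep NameNode                  # note the NameNode pid
[hadoop@hadoop001 ~]$ kill -9 <NameNode pid>               # simulate a crash of the active NameNode
[hadoop@hadoop002 ~]$ hdfs haadmin -getServiceState nn2    # should report active within a few seconds
[hadoop@hadoop001 ~]$ hadoop-daemon.sh start namenode      # bring the killed NameNode back; it rejoins as standby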
-------------------------------------------------------------------------------
Kafka deployment: the version used here is 5.15.1
ZooKeeper must be running before Kafka is deployed
[hadoop@hadoop001 hadoop]$ zkServer.sh start
[hadoop@hadoop002 hadoop]$ zkServer.sh start
[hadoop@hadoop003 hadoop]$ zkServer.sh start
[hadoop@hadoop001 hadoop]$ zkServer.sh status
Upload the tarball, extract it, and create a symlink
Edit the server.properties file on all three Kafka brokers
hadoop001:
broker.id=0
host.name=hadoop001
port=9092
log.dirs=/home/hadoop/log/kafka-logs
zookeeper.connect=hadoop001:2181,hadoop002:2181,hadoop003:2181/kafka
hadoop002:
broker.id=1
host.name=hadoop002
port=9092
log.dirs=/home/hadoop/log/kafka-logs
zookeeper.connect=hadoop001:2181,hadoop002:2181,hadoop003:2181/kafka
hadoop003:
broker.id=2
host.name=hadoop003
port=9092
log.dirs=/home/hadoop/log/kafka-logs
zookeeper.connect=hadoop001:2181,hadoop002:2181,hadoop003:2181/kafka
Start command:
The start script is bin/kafka-server-start.sh; check its usage
USAGE: ./kafka-server-start.sh [-daemon] server.properties [--override property=value]*
Start Kafka on all three machines:
[hadoop@hadoop001 bin]$ ./kafka-server-start.sh -daemon ../config/server.properties
[hadoop@hadoop002 bin]$ ./kafka-server-start.sh -daemon ../config/server.properties
[hadoop@hadoop003 bin]$ ./kafka-server-start.sh -daemon ../config/server.properties
[hadoop@hadoop001 bin]$ jps
3904 Jps
2163 QuorumPeerMain
3838 Kafka
Done
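As a quick smoke test, a topic can be created and listed against the /kafka chroot from server.properties above (the topic name test is just a placeholder; the --zookeeper flags match this older Kafka generation):
[hadoop@hadoop001 bin]$ ./kafka-topics.sh --create --zookeeper hadoop001:2181,hadoop002:2181,hadoop003:2181/kafka --replication-factor 3 --partitions 3 --topic test
[hadoop@hadoop001 bin]$ ./kafka-topics.sh --list --zookeeper hadoop001:2181,hadoop002:2181,hadoop003:2181/kafka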