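ZooKeeper can run on a single machine not only in standalone mode but also as a simulated cluster, with the different nodes all running on the same machine. Hadoop behaves quite differently in pseudo-distributed mode than in fully distributed mode, but operating ZooKeeper in pseudo-cluster mode is essentially the same as operating a real cluster. This makes pseudo-cluster mode very convenient for trying out ZooKeeper and running exploratory experiments. For example, you can first test with a small amount of data in pseudo-cluster mode and, once the test works, move the data to a real cluster for the actual experiment. This both confirms that the approach is feasible and makes experimenting much more efficient. The setup is simple and cheap, which suits testing and learning: if you are short on machines, you can deploy 3 servers on a single machine.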
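Notes
When deploying 3 servers on one machine, keep in mind that in pseudo-cluster mode each configuration file stands in for one machine; in other words, a single machine runs several ZooKeeper instances. The port numbers in the configuration files must not conflict: each instance needs its own clientPort and its own dataDir. In addition, a myid file must be created in the directory that dataDir points to, identifying the corresponding ZooKeeper server instance.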
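(1) clientPort: when several servers are deployed on one machine, each server needs a different clientPort, for example 2181 for server1, 2182 for server2, and 2183 for server3.
(2) dataDir and dataLogDir: these must also be kept distinct. Store data files and log files separately, and give each server its own path for both settings.
(3) server.X and myid: the number X in server.X corresponds to the number in data/myid. If the myid files of the 3 servers contain 0, 1, and 2, then the zoo.cfg of every server lists server.0, server.1, and server.2. Because all servers share one machine, the two ports that follow the host name must differ across the 3 servers, otherwise the ports conflict.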
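The three rules above can be sketched as a small generator that derives every per-instance value from the instance index. This is only an illustration: the function name `render_zoo_cfg`, the 0-based index, and the `data_N`/`logs_N` naming (taken from Listing 1.1) are assumptions, and the port numbers simply follow the listings in this post.

```python
def render_zoo_cfg(index, base_dir="/home/hadoop/zookeeper", n_servers=3):
    """Return zoo.cfg text for the instance whose myid is `index` (0-based).

    Hypothetical helper: the naming scheme and port offsets are assumptions
    chosen to match the listings in this post.
    """
    if not 0 <= index < n_servers:
        raise ValueError("index must be in range(n_servers)")
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        # Rule (2): each instance gets its own data and log directories.
        f"dataDir={base_dir}/data_{index + 1}",
        f"dataLogDir={base_dir}/logs_{index + 1}",
        # Rule (1): each instance gets its own client port.
        f"clientPort={2181 + index}",
    ]
    # Rule (3): every file lists the whole ensemble, and because all
    # instances share one host, each server.X line uses its own port pair.
    for sid in range(n_servers):
        lines.append(f"server.{sid}=localhost:{2287 + sid}:{3387 + sid}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for i in range(3):
        print(f"# zoo{i + 1}.cfg")
        print(render_zoo_cfg(i))
```

Only dataDir, dataLogDir, and clientPort change from file to file; the server.X block is identical in all three.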
Configure ZooKeeper starting from the file zoo_sample.cfg under /home/hadoop/zookeeper/conf:
cp zoo_sample.cfg zoo1.cfg
Listing 1.1 zoo1.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# changed from the default dataDir=/tmp/zookeeper
dataDir=/home/hadoop/zookeeper/data_1
# the port at which the clients will connect
clientPort=2181
# the location of the log file (line added; not present in zoo_sample.cfg)
dataLogDir=/home/hadoop/zookeeper/logs_1
server.0=localhost:2287:3387
server.1=localhost:2288:3388
server.2=localhost:2289:3389
Listing 1.2 zoo2.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/usr/local/zk/data_2
# the port at which the clients will connect
clientPort=2182
#the location of the log file
dataLogDir=/usr/local/zk/logs_2
server.0=localhost:2287:3387
server.1=localhost:2288:3388
server.2=localhost:2289:3389
Listing 1.3 zoo3.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/usr/local/zk/data_3
# the port at which the clients will connect
clientPort=2183
#the location of the log file
dataLogDir=/usr/local/zk/logs_3
server.0=localhost:2287:3387
server.1=localhost:2288:3388
server.2=localhost:2289:3389
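Before starting the instances, it can help to check a set of configuration files like the three above for clashes. The sketch below is a hypothetical checker, not part of ZooKeeper: `parse_cfg` and `find_conflicts` are made-up names, and it assumes the simple `host:port:port` form of the server.X lines used in this post.

```python
def parse_cfg(text):
    """Parse zoo.cfg-style text into a dict, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def find_conflicts(configs):
    """configs maps a file name to its cfg text; returns problem descriptions."""
    problems = []
    parsed = {name: parse_cfg(text) for name, text in configs.items()}

    # Per-instance settings must differ from file to file.
    for key in ("clientPort", "dataDir", "dataLogDir"):
        seen = {}
        for name, props in parsed.items():
            value = props.get(key)
            if value is None:
                continue
            if value in seen:
                problems.append(
                    f"{key}={value} used by both {seen[value]} and {name}")
            else:
                seen[value] = name

    # On a single host, the ports in the server.X lines must be distinct
    # from each other and from every client port.
    client_ports = {props.get("clientPort") for props in parsed.values()}
    for name, props in parsed.items():
        used = set()
        for key, value in props.items():
            if not key.startswith("server."):
                continue
            _host, quorum_port, election_port = value.split(":")[:3]
            for port in (quorum_port, election_port):
                if port in used or port in client_ports:
                    problems.append(f"{name}: port {port} in {key} is reused")
                used.add(port)
    return problems


if __name__ == "__main__":
    demo = {
        "zoo1.cfg": "clientPort=2181\ndataDir=/tmp/zk_demo/data_1\n",
        "zoo2.cfg": "clientPort=2181\ndataDir=/tmp/zk_demo/data_2\n",
    }
    for problem in find_conflicts(demo):
        print(problem)
```

To check real files, read each one and pass the contents in, for example `find_conflicts({p: open(p).read() for p in paths})`. An empty list means the checker found nothing to complain about; it does not prove the ports are free on the machine.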
1. The zoo.cfg configuration file is as follows:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
#dataDir=/usr/local/var/run/zookeeper/data
dataDir=/Users/userName/Documents/zookeeper/data
dataLogDir=/Users/userName/Documents/zookeeper/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=localhost:2881:3881
server.2=localhost:2882:3882
server.3=localhost:2883:3883
2. Copy zoo.cfg to a new file named zoo2.cfg and change the three parameters dataDir, dataLogDir, and clientPort:
dataDir=/Users/userName/Documents/zookeeper/data2
dataLogDir=/Users/userName/Documents/zookeeper/logs2
clientPort=2182
3. Copy zoo.cfg again to a new file named zoo3.cfg and change the same three parameters:
dataDir=/Users/userName/Documents/zookeeper/data3
dataLogDir=/Users/userName/Documents/zookeeper/logs3
clientPort=2183
4. Under the configured /Users/userName/Documents/zookeeper/ directory, create the directories data, data2, data3, logs, logs2, and logs3.
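5. In the data folder, create a file named myid containing 1 (matching the 1 in server.1=localhost:2881:3881).
In the data2 folder, create a file named myid containing 2.
In the data3 folder, create a file named myid containing 3.
Note: the content of each myid file must match the X of that instance's server.X line in the configuration.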
6. Start the three servers:
$ zkServer start zoo.cfg
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo.cfg
Starting zookeeper ... STARTED
$ zkServer start zoo2.cfg
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo2.cfg
Starting zookeeper ... STARTED
$ zkServer start zoo3.cfg
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo3.cfg
Starting zookeeper ... STARTED
7. Use the zkServer status command to check the role of each server:
$ zkServer status zoo.cfg
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo.cfg
Mode: follower
$ zkServer status zoo2.cfg
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo2.cfg
Mode: leader
$ zkServer status zoo3.cfg
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo3.cfg
Mode: follower
As the output shows, the election made server2 the leader (master), while server1 and server3 are followers (slaves).
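The same role information can also be read over the client port with ZooKeeper's `srvr` four-letter command, whose response includes a `Mode:` line. The sketch below assumes that `srvr` is allowed by the server's four-letter-word whitelist and that the three instances listen on the client ports configured above; `parse_mode` and `query_mode` are illustrative names, not ZooKeeper APIs.

```python
import socket


def parse_mode(srvr_output):
    """Return the value of the 'Mode:' line in an srvr response, or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_mode(host, port, timeout=2.0):
    """Send 'srvr' to a ZooKeeper client port and return the reported mode."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            # The server closes the connection after replying,
            # so read until recv() returns an empty result.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", errors="replace"))


if __name__ == "__main__":
    # Client ports assumed from zoo.cfg, zoo2.cfg, and zoo3.cfg above.
    for port in (2181, 2182, 2183):
        try:
            print(port, query_mode("localhost", port))
        except OSError as exc:
            print(port, "unreachable:", exc)
```

In a healthy three-node ensemble, exactly one port should report leader and the other two follower. Which instance wins the election can differ from run to run.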
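Notes:
1. The directories from step 4 must be created beforehand, otherwise startup may fail with an error.
2. Double-check the server entries in the cfg files (server.1=localhost:2881:3881): make sure localhost is spelled correctly and that port 2881 is not mistyped as 2181.
3. For other errors, see: https://blog.youkuaiyun.com/xiewendong93/article/details/50500471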
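Since forgetting a directory or a myid file is the most common slip, steps 4 and 5 can be scripted. The following is a minimal sketch, assuming the data/data2/data3 and logs/logs2/logs3 layout from the steps above; `prepare_instance_dirs` and `LAYOUT` are hypothetical names, and the demo writes into a temporary directory rather than a real ZooKeeper location.

```python
import tempfile
from pathlib import Path

# (data directory, log directory, server id) for each instance,
# mirroring the layout used in steps 2 to 5.
LAYOUT = [
    ("data", "logs", 1),
    ("data2", "logs2", 2),
    ("data3", "logs3", 3),
]


def prepare_instance_dirs(base_dir, instances=LAYOUT):
    """Create the data and log directories and write each myid file."""
    base = Path(base_dir)
    for data_name, log_name, server_id in instances:
        data_dir = base / data_name
        data_dir.mkdir(parents=True, exist_ok=True)
        (base / log_name).mkdir(parents=True, exist_ok=True)
        # myid holds only the number X from the matching server.X line.
        (data_dir / "myid").write_text(f"{server_id}\n")


if __name__ == "__main__":
    # Demo in a throwaway directory; pass the real base directory
    # (for example the one configured in zoo.cfg) to use it for real.
    with tempfile.TemporaryDirectory() as tmp:
        prepare_instance_dirs(tmp)
        for path in sorted(Path(tmp).rglob("myid")):
            print(path.relative_to(tmp), path.read_text().strip())
```

The function is safe to run repeatedly: existing directories are left alone and each myid file is simply rewritten with the same number.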