ZooKeeper Fundamentals and Distributed Master/Standby Nodes in Practice

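This article shows how to use ZooKeeper to switch between master and standby nodes so that a system stays highly available. Node watching is implemented with the Java API, so that a standby node can quickly take over when the master node goes down.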


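When we run real-time worker tasks, we want a standby node to carry on if a single node goes down, so that the system stays highly available. In this scenario we need the following:
1. The standby node must not work at the same time as the master node; at any moment only one node may be running the task.
2. The standby node must always know whether the master node is working normally. As soon as it detects that the master is down, it competes for the master role and takes over the work.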

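ZooKeeper, a distributed coordination service that began as a subproject of Apache Hadoop and is now a top-level Apache project, handles this scenario well, although its uses go well beyond master/standby election. For reference: https://blog.youkuaiyun.com/duke370503/article/details/52623192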

ZooKeeper fundamentals:

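ZooKeeper nodes (znodes) come in two types: persistent and ephemeral. An ephemeral node has one special property: if the machine that registered it loses its connection (usually because it crashed) and its session ends, ZooKeeper deletes the node. Master election relies on this property. For example:
(1) When servers A, B and C start, they race to register an ephemeral node with the same name under the same ZooKeeper directory, say /ha/master.
(2) If A is the first to create the ephemeral node /ha/master, A becomes the master. B and C find that /ha/master has already been registered by A, so they become standbys and keep watching /ha/master for changes.
(3) If A crashes or loses its network connection to ZooKeeper, the ephemeral node /ha/master is deleted. B and C are notified of the change and race to register again; whichever one registers /ha/master successfully becomes the new master.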

Master election is therefore just a race to register an ephemeral node in ZooKeeper: whoever registers the agreed-upon ephemeral node is the master.
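The three steps above can be sketched without a ZooKeeper server at all. The following is a minimal in-memory simulation, not the ZooKeeper API: FakeEphemeralRegistry, tryCreate, sessionLost and masterOf are hypothetical names introduced only for illustration, and a ConcurrentHashMap stands in for the "first create wins, node vanishes with its owner" behaviour of ephemeral znodes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for ephemeral znodes; this is NOT the ZooKeeper API. */
class FakeEphemeralRegistry {
    // path -> id of the server (session) that created the node
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    /** Atomically creates the node; returns true only for the first caller. */
    boolean tryCreate(String path, String serverId) {
        return nodes.putIfAbsent(path, serverId) == null;
    }

    /** Simulates a lost session: every node created by that server is deleted. */
    void sessionLost(String serverId) {
        nodes.values().removeIf(owner -> owner.equals(serverId));
    }

    /** Returns the current master for the path, or null if the node does not exist. */
    String masterOf(String path) {
        return nodes.get(path);
    }
}

class ElectionDemo {
    public static void main(String[] args) {
        FakeEphemeralRegistry registry = new FakeEphemeralRegistry();
        String path = "/ha/master";

        // Steps 1 and 2: A, B and C race; only the first create succeeds.
        System.out.println("A elected: " + registry.tryCreate(path, "A"));
        System.out.println("B elected: " + registry.tryCreate(path, "B"));
        System.out.println("C elected: " + registry.tryCreate(path, "C"));

        // Step 3: A crashes, its node disappears, and the standbys race again.
        registry.sessionLost("A");
        System.out.println("B elected: " + registry.tryCreate(path, "B"));
        System.out.println("master is now " + registry.masterOf(path));
    }
}
```

In the real system the atomic "create if absent" and the deletion on session loss are both provided by the ZooKeeper ensemble, which is why the clients need no coordination among themselves.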

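The rest of this article shows how to implement the master/standby scenario above with the ZooKeeper Java API. It assumes that ZooKeeper is already installed and running on your server; installation is not covered here, only the Java client implementation.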

1. Add the following Maven dependency to pom.xml

 <dependency>
			<groupId>org.apache.zookeeper</groupId>
			<artifactId>zookeeper</artifactId>
			<version>3.4.8</version>
			<exclusions>
				<exclusion>
					<groupId>com.sun.jmx</groupId>
					<artifactId>jmxri</artifactId>
				</exclusion>
				<exclusion>
					<groupId>com.sun.jdmk</groupId>
					<artifactId>jmxtools</artifactId>
				</exclusion>
				<exclusion>
					<groupId>javax.jms</groupId>
					<artifactId>jms</artifactId>
				</exclusion>
			</exclusions>
  </dependency>

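2. Log in to the ZooKeeper server, cd into the bin directory of the installation, start the command-line client, and create the parent node.
After running create /stock_mot "stockmotnode" you can see that a new parent node /stock_mot has been created: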

cd /opt/applications/zookeeper-3.4.10/bin
sh zkCli.sh
[zk: localhost:2181(CONNECTED) 3] ls /
[zookeeper, zk_demo, worker]
[zk: localhost:2181(CONNECTED) 4] create /stock_mot "stockmotnode"
Created /stock_mot
[zk: localhost:2181(CONNECTED) 5] ls /                            
[stock_mot, zookeeper, zk_demo, worker]
[zk: localhost:2181(CONNECTED) 6] 

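There are two ways to learn from ZooKeeper whether the master node was created and what state it is in:
1. Synchronously, by querying the state over and over in a while loop.
2. Asynchronously, by registering watchers and callbacks and doing the work inside the callback functions.

Applications are often driven by asynchronous change notifications, and the synchronous approach would block the application itself. The code here therefore uses asynchronous callbacks.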
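The difference between the two styles can be illustrated with plain JDK classes. This is only a sketch under stated assumptions: SyncVsAsyncDemo, pollUntilKnown and onStatus are hypothetical names, and an AtomicReference and a CompletableFuture stand in for the state that ZooKeeper would report; no ZooKeeper call is made.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo contrasting polling with callbacks; it does not talk to ZooKeeper. */
class SyncVsAsyncDemo {

    /** Synchronous style: the calling thread loops until a status is available. */
    static String pollUntilKnown(AtomicReference<String> status) throws InterruptedException {
        while (status.get() == null) {
            Thread.sleep(10); // the caller is stuck here and can do nothing else
        }
        return status.get();
    }

    /** Asynchronous style: register a callback and return immediately. */
    static void onStatus(CompletableFuture<String> status, StringBuilder sink) {
        status.thenAccept(s -> sink.append("callback got ").append(s));
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicReference<String> polled = new AtomicReference<>("MASTER");
        System.out.println("polled: " + pollUntilKnown(polled));

        CompletableFuture<String> future = new CompletableFuture<>();
        StringBuilder sink = new StringBuilder();
        onStatus(future, sink);      // returns at once; nothing has happened yet
        future.complete("STANDBY"); // the notification arrives and the callback runs
        System.out.println(sink);
    }
}
```

With the callback style the registering thread is free as soon as onStatus returns, which is the property the ZooKeeper-based code relies on.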

3. To make this easier for applications to use, we wrap it as follows:
(1) Create a configuration class

import lombok.Getter;

@Getter
public class ZkConfig {
    private final String connectString; // ZooKeeper connect string (host:port)
    private final int sessionTimeout; // session timeout in milliseconds
    private final String znodeName; // path of the znode used for the election

    public ZkConfig(String connectString, int sessionTimeout, String znodeName) {
        this.connectString = connectString;
        this.sessionTimeout = sessionTimeout;
        this.znodeName = znodeName;
    }
}

(2) Create a callback interface so that the application can receive master/standby state changes asynchronously

public interface LeaderSelectorListener {
    void isLeader();

    void notLeader();
}
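One way an application might implement this interface is to record the current role in a thread-safe flag and check it before each unit of work, which addresses the requirement that only one node runs the task at a time. The sketch below is an assumption, not part of the original code: LeadershipFlag and runIfLeader are hypothetical names, and the interface is redeclared only so that the example compiles on its own.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Redeclared here only so the sketch compiles on its own.
interface LeaderSelectorListener {
    void isLeader();

    void notLeader();
}

/** Hypothetical listener that records leadership in a thread-safe flag. */
class LeadershipFlag implements LeaderSelectorListener {
    private final AtomicBoolean leader = new AtomicBoolean(false);

    @Override
    public void isLeader() {
        leader.set(true);
    }

    @Override
    public void notLeader() {
        leader.set(false);
    }

    boolean get() {
        return leader.get();
    }

    /** Runs the task only while this node is the leader; returns whether it ran. */
    boolean runIfLeader(Runnable task) {
        if (!leader.get()) {
            return false;
        }
        task.run();
        return true;
    }
}

class LeadershipFlagDemo {
    public static void main(String[] args) {
        LeadershipFlag flag = new LeadershipFlag();
        System.out.println("ran as standby: " + flag.runIfLeader(() -> System.out.println("working")));
        flag.isLeader(); // in the real system this is invoked from a ZooKeeper callback
        System.out.println("ran as leader: " + flag.runIfLeader(() -> System.out.println("working")));
    }
}
```

The flag is only as fresh as the last notification, so a long-running task should re-check it between steps rather than once at the start.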

(3) The class that manages the ZooKeeper API

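import java.io.IOException;
import java.util.Random;

import lombok.Setter;
import lombok.extern.slf4j.Slf4j;
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

@Slf4j
public class LeaderSelector {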
    private final ZkConfig zkConfig;
    private volatile ConnectionState state = ConnectionState.NONE;
    private ZooKeeper zk;
    @Setter
    private LeaderSelectorListener listener;

    // Not seeded with currentTimeMillis(): two selectors created in the same millisecond would get the same server id
    private final Random random = new Random();
    private final String serverId = Integer.toHexString(random.nextInt());


    enum ConnectionState {NONE, CONNECTED, DISCONNECTED, EXPIRED}

    public LeaderSelector(ZkConfig zkConfig) {
        this.zkConfig = zkConfig;
    }

    private final Watcher ZkSessionWatcher = new Watcher() {
        @Override
        public void process(WatchedEvent watchedEvent) {
            if (watchedEvent.getType() == Event.EventType.None) {
                switch (watchedEvent.getState()) {
                    case SyncConnected:
                        state = ConnectionState.CONNECTED;
                        handleConnectionState(state);
                        break;
                    case Disconnected:
                        state = ConnectionState.DISCONNECTED;
                        handleConnectionState(state);
                        break;
                    case Expired:
                        state = ConnectionState.EXPIRED;
                        handleConnectionState(state); // without this call the EXPIRED branch below is never reached
                        break;
                    default:
                        break;
                }
            }
        }
    };

    private void handleConnectionState(ConnectionState state) {
        switch (state) {
            case CONNECTED:
                log.info("connection state: {}", state);
                enroll();
                break;
            case DISCONNECTED:
                log.info("connection state: {}", state);
                handleSelectorState(false);
                break;
            case EXPIRED:
                log.info("connection state: {}", state);
                handleSelectorState(false);
                // TODO: the session has expired, so a new ZooKeeper instance must be created to reconnect
                break;
            default:
                log.info("unknown connection state: {}", state);
                break;
        }
    }

    private AsyncCallback.StringCallback masterCreateCallback = new AsyncCallback.StringCallback() {
        @Override
        public void processResult(int rc, String path, Object ctx, String name) {
            switch (KeeperException.Code.get(rc)) {
                case CONNECTIONLOSS:
                    checkMaster();
                    break;
                case OK:
                    handleSelectorState(true);
                    break;
                case NODEEXISTS:
                    log.info("leader node exists, add watcher.");
                    handleSelectorState(false);
                    addMasterWatcher();
                    break;
                default:
                    break;
            }
        }
    };

    private void addMasterWatcher() {
        zk.exists(zkConfig.getZnodeName(), masterExistWatcher, masterExistCallback, null);
    }

    private Watcher masterExistWatcher = new Watcher() {
        @Override
        public void process(WatchedEvent watchedEvent) {
            if (watchedEvent.getType() == Event.EventType.NodeDeleted) {
                if (zkConfig.getZnodeName().equals(watchedEvent.getPath())) {
                    enroll();
                }
            }
        }
    };

    private AsyncCallback.StatCallback masterExistCallback = new AsyncCallback.StatCallback() {
        @Override
        public void processResult(int rc, String path, Object ctx, Stat stat) {
            switch (KeeperException.Code.get(rc)) {
                case CONNECTIONLOSS:
                    addMasterWatcher();
                    break;
                case OK:
                    break;
                case NONODE:
                    enroll();
                    break;
                default:
                    checkMaster();
                    break;
            }
        }
    };

    private AsyncCallback.DataCallback masterCheckCallback = new AsyncCallback.DataCallback() {
        @Override
        public void processResult(int rc, String path, Object ctx, byte[] data, Stat stat) {
            switch (KeeperException.Code.get(rc)) {
                case CONNECTIONLOSS:
                    checkMaster();
                    return;
                case NONODE:
                    enroll();
                    return;
                case OK:
                    log.info("current leader node server id is [{}]", new String(data));
                    if (serverId.equals(new String(data))) {
                        handleSelectorState(true);
                    } else {
                        log.info("current leader node server id [{}] does not match my server id [{}]", new String(data), serverId);
                        handleSelectorState(false);
                        addMasterWatcher();
                    }
                    break;
                default:
                    break;
            }
        }
    };

    private void checkMaster() {
        zk.getData(zkConfig.getZnodeName(), false, masterCheckCallback, null);
    }

    private void handleSelectorState(boolean selected) {
        if (listener != null) {
            if (selected) {
                listener.isLeader();
            } else {
                listener.notLeader();
            }
        }
    }

    private void enroll() {
        log.info("enroll leader node by server id [{}]", serverId);
        zk.create(zkConfig.getZnodeName(), serverId.getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL, masterCreateCallback, null);
    }

    public void start() {
        if (state == ConnectionState.NONE) {
            try {
                log.info("start leader select");
                zk = new ZooKeeper(zkConfig.getConnectString(), zkConfig.getSessionTimeout(), ZkSessionWatcher);
            } catch (IOException e) {
                log.error(e.getMessage(), e);
            }
        } else {
            log.error("leader selector cannot start when state is [{}]", state);
        }
    }

    public void stop() {
        state = ConnectionState.NONE;
        if (zk != null) {
            try {
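                zk.close();
                handleSelectorState(false); // null-safe, unlike calling the listener directly
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                log.error(e.getMessage(), e);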
            }
        }
    }
}

4. Use and test it in an application.
The test demo class is as follows:

import lombok.extern.slf4j.Slf4j;
import org.junit.Before;
import org.junit.Test;

@Slf4j
public class ZooKeeperTest {
    private ZkConfig zkConfig;
    // volatile: written on ZooKeeper's event thread, read by the checker thread
    public static volatile boolean server1Started = false;
    public static volatile boolean server2Started = false;

    @Before
    public void init_zk_config() {
        zkConfig = new ZkConfig("28.163.0.65:2181", 5000, "/stock_mot/myFirst");
    }


    @Test
    public void testSelectMaster() throws InterruptedException {
        LeaderSelector server1 = new LeaderSelector(zkConfig);
        LeaderSelector server2 = new LeaderSelector(zkConfig);

        server1.setListener(new LeaderSelectorListener() {
            @Override
            public void isLeader() {
                log.info("server1 is the master");
                server1Started = true;
            }

            @Override
            public void notLeader() {
                log.info("server1 not the master");
                server1Started = false;
            }
        });

        server2.setListener(new LeaderSelectorListener() {
            @Override
            public void isLeader() {
                log.info("server2 is the master");
                server2Started = true;
            }

            @Override
            public void notLeader() {
                log.info("server2 not the master");
                server2Started = false;
            }
        });

        server1.start();
        Thread.sleep(500);
        server2.start();
        Thread.sleep(500);

        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                while (server1Started || server2Started) {
                    if (server1Started) {
                        log.info("check server1 is the master");
                    }

                    if (server2Started) {
                        log.info("check server2 is the master");
                    }

                    if (server1Started && server2Started){
                        log.error("duplicate master detected: both servers report being the master");
                    }
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        });
        t.start();
        Thread.sleep(5000);
        server1.stop();
        Thread.sleep(5000);
        server2.stop();

    }
}

Running testSelectMaster() produces the following output:

[2018-04-27 22:49:44 INFO ] [main] (org.apache.zookeeper.ZooKeeper:438) - Initiating client connection, connectString=28.163.0.65:2181 sessionTimeout=5000 watcher=com.ounersc.ic.stock.mot.masterselect.LeaderSelector$1@49e4cb85
[2018-04-27 22:49:44 INFO ] [main-SendThread(28.163.0.65:2181)] (org.apache.zookeeper.ClientCnxn:1032) - Opening socket connection to server 28.163.0.65/28.163.0.65:2181. Will not attempt to authenticate using SASL (unknown error)
[2018-04-27 22:49:45 INFO ] [main-SendThread(28.163.0.65:2181)] (org.apache.zookeeper.ClientCnxn:876) - Socket connection established to 28.163.0.65/28.163.0.65:2181, initiating session
[2018-04-27 22:49:45 INFO ] [main-SendThread(28.163.0.65:2181)] (org.apache.zookeeper.ClientCnxn:1299) - Session establishment complete on server 28.163.0.65/28.163.0.65:2181, sessionid = 0x161d689c49f018f, negotiated timeout = 5000
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:58) - connection state: CONNECTED
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:172) - enroll leader node by server id [b38c96a7]
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:34) - server1 is the master
[2018-04-27 22:49:45 INFO ] [main] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:179) - start leader select
[2018-04-27 22:49:45 INFO ] [main] (org.apache.zookeeper.ZooKeeper:438) - Initiating client connection, connectString=28.163.0.65:2181 sessionTimeout=5000 watcher=com.ounersc.ic.stock.mot.masterselect.LeaderSelector$1@edf4efb
[2018-04-27 22:49:45 INFO ] [main-SendThread(28.163.0.65:2181)] (org.apache.zookeeper.ClientCnxn:1032) - Opening socket connection to server 28.163.0.65/28.163.0.65:2181. Will not attempt to authenticate using SASL (unknown error)
[2018-04-27 22:49:45 INFO ] [main-SendThread(28.163.0.65:2181)] (org.apache.zookeeper.ClientCnxn:876) - Socket connection established to 28.163.0.65/28.163.0.65:2181, initiating session
[2018-04-27 22:49:45 INFO ] [main-SendThread(28.163.0.65:2181)] (org.apache.zookeeper.ClientCnxn:1299) - Session establishment complete on server 28.163.0.65/28.163.0.65:2181, sessionid = 0x161d689c49f0190, negotiated timeout = 5000
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:58) - connection state: CONNECTED
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:172) - enroll leader node by server id [b380d8cd]
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:87) - leader node exists, add watcher.
[2018-04-27 22:49:45 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:54) - server2 not the master
[2018-04-27 22:49:45 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:69) - check server1 is the master
[2018-04-27 22:49:46 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:69) - check server1 is the master
[2018-04-27 22:49:47 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:69) - check server1 is the master
[2018-04-27 22:49:48 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:69) - check server1 is the master
[2018-04-27 22:49:49 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:69) - check server1 is the master
[2018-04-27 22:49:50 INFO ] [main] (org.apache.zookeeper.ZooKeeper:684) - Session: 0x161d689c49f018f closed
[2018-04-27 22:49:50 INFO ] [main] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:40) - server1 not the master
[2018-04-27 22:49:50 INFO ] [main-EventThread] (org.apache.zookeeper.ClientCnxn:519) - EventThread shut down for session: 0x161d689c49f018f
[2018-04-27 22:49:50 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.masterselect.LeaderSelector:172) - enroll leader node by server id [b380d8cd]
[2018-04-27 22:49:50 INFO ] [main-EventThread] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:48) - server2 is the master
[2018-04-27 22:49:50 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:73) - check server2 is the master
[2018-04-27 22:49:51 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:73) - check server2 is the master
[2018-04-27 22:49:52 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:73) - check server2 is the master
[2018-04-27 22:49:53 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:73) - check server2 is the master
[2018-04-27 22:49:54 INFO ] [Thread-0] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:73) - check server2 is the master
[2018-04-27 22:49:55 INFO ] [main] (org.apache.zookeeper.ZooKeeper:684) - Session: 0x161d689c49f0190 closed
[2018-04-27 22:49:55 INFO ] [main] (com.ounersc.ic.stock.mot.calc.ZooKeeperTest:54) - server2 not the master
[2018-04-27 22:49:55 INFO ] [main-EventThread] (org.apache.zookeeper.ClientCnxn:519) - EventThread shut down for session: 0x161d689c49f0190

Process finished with exit code 0

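server1 is started first. No master exists at that point, so server1 becomes the master.
server2 is started next. server1 is already the master, so server2 becomes a standby and watches the state of the master node. Output: check server1 is the master
After server1 has run as master for 5 seconds, it is stopped, and server2 takes over as master.
Output: check server2 is the master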
