2. How the ZK Client Establishes a Connection with the Server (NIO-Based)

YolynHou

Category: ZooKeeper source code analysis. Tags: zookeeper, backend

Article link: https://blog.youkuaiyun.com/qq_34088913/article/details/109079596

How the ZK Client Establishes a Connection with the Server

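The previous article, "Client Startup Source Code Analysis", explained that the client uses two threads (SendThread and EventThread) to handle communication with the server and the callbacks for watcher events. I originally planned to use this article to analyze how those two threads work together, but the connection process alone took up so much space that I retitled this article "How the ZK Client Establishes a Connection with the Server" and will cover SendThread and EventThread in the next one. This article still describes the role SendThread plays while the client establishes its connection.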

Introductory Example

We start again with the Test from the first article as the example:

public class ZooKeeperTestClient extends ZKTestCase implements Watcher {
    protected String hostPort = "127.0.0.1:22801";
    protected static final String dirOnZK = "/test_dir";
    protected String testDirOnZK = dirOnZK + "/" + Time.currentElapsedTime();


    private void create_get_stat_test() throws IOException, InterruptedException, KeeperException {
        ZooKeeper zk = new ZooKeeper(hostPort, 10000, this);
        String parentName = testDirOnZK;
        String nodeName = parentName + "/create_with_stat_tmp";
        deleteNodeIfExists(zk, nodeName);
        deleteNodeIfExists(zk, nodeName + "_2");
        Stat stat = new Stat();
        // create a persistent node
        zk.create(nodeName, null, Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT, stat);
        assertNotNull(stat);
        assertTrue(stat.getCzxid() > 0);
        assertTrue(stat.getCtime() > 0);
        zk.close();
    }


    @Override
    public synchronized void process(WatchedEvent event) {
        // println throws no checked exception, so no try/catch is needed here
        System.out.println("Got an event " + event.toString());
    }

}

Here is the class diagram of the classes involved, for reference while reading the rest of the article.

Class diagram:

1. Starting SendThread

The previous article ended with the client calling ClientCnxn#start() at startup, which in turn starts SendThread:

    public void start() {
        // handles communication between the client and the server
        sendThread.start();
        // mainly handles invoking the Watchers registered on the client
        eventThread.start();
    }

sendThread is a thread and an inner class of ClientCnxn, so it must have a run method. Here it is:

        @Override
        public void run() {
            // part of the code omitted
            while (state.isAlive()) {
                // part of the code omitted
            }
        }

States#isAlive()

        public boolean isAlive() {
            return this != CLOSED && this != AUTH_FAILED;
        }

2. State Initialization

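As shown, the run method keeps checking the connection state. The state is held in a shared field, and the loop keeps running as long as the state is neither CLOSED nor AUTH_FAILED. So when is the state initialized? For that we go back to the creation of the ZooKeeper instance: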

ClientCnxn#changeZkState()

   
   
    volatile States state = States.NOT_CONNECTED;

    synchronized void changeZkState(ZooKeeper.States newState) throws IOException {
        if (!state.isAlive() && newState == States.CONNECTING) {
            throw new IOException(
                    "Connection has already been closed and reconnection is not allowed");
        }
        // It's safer to place state modification at the end.
        state = newState;
    }
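To make the guard in changeZkState easier to see in isolation, here is a minimal sketch. SketchState and StateHolderSketch are hypothetical stand-ins that I made up for illustration (they are not ZooKeeper classes); only the isAlive rule and the "no reconnect after close" check are taken from the code above.

```java
import java.io.IOException;

// Hypothetical stand-in for ZooKeeper.States, reduced to the values discussed here.
enum SketchState {
    NOT_CONNECTED, CONNECTING, CONNECTED, CONNECTEDREADONLY, CLOSED, AUTH_FAILED;

    boolean isAlive() {
        return this != CLOSED && this != AUTH_FAILED;
    }
}

// Hypothetical holder that mirrors the guard in changeZkState.
class StateHolderSketch {
    private volatile SketchState state = SketchState.NOT_CONNECTED;

    synchronized void changeState(SketchState newState) throws IOException {
        // Once the state is dead (CLOSED or AUTH_FAILED), going back to CONNECTING is refused.
        if (!state.isAlive() && newState == SketchState.CONNECTING) {
            throw new IOException("Connection has already been closed and reconnection is not allowed");
        }
        state = newState;
    }

    SketchState current() {
        return state;
    }

    // Convenience wrapper: returns false instead of throwing when the transition is refused.
    static boolean tryChange(StateHolderSketch holder, SketchState newState) {
        try {
            holder.changeState(newState);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        StateHolderSketch holder = new StateHolderSketch();
        System.out.println("to CONNECTING: " + tryChange(holder, SketchState.CONNECTING));
        System.out.println("to CLOSED: " + tryChange(holder, SketchState.CLOSED));
        System.out.println("back to CONNECTING: " + tryChange(holder, SketchState.CONNECTING));
    }
}
```

The last call is refused, which is the same reason the while (state.isAlive()) loop in run ends for good once the client is closed.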

From the flow above, the state defaults to NOT_CONNECTED, but it is already set to CONNECTING when the ZooKeeper instance is constructed. Now we can look at SendThread's run method:

public void run() {
    while (state.isAlive()) {
        try {
            if (!clientCnxnSocket.isConnected()) {
                // don't re-establish connection if we are closing
                if (closing) {
                    break;
                }
                if (rwServerAddress != null) {
                    serverAddress = rwServerAddress;
                    rwServerAddress = null;
                } else {
                    serverAddress = hostProvider.next(1000);
                }
                onConnecting(serverAddress);
                // start connecting to the server
                startConnect(serverAddress);
                clientCnxnSocket.updateLastSendAndHeard();
            }
            // other checks omitted
        } catch (Throwable e) {
            // error handling omitted
        }
    }
}

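Since the initial state is CONNECTING (so the loop is alive) and the socket is not connected yet, execution first enters the first branch and connects to the server: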

3. Starting the Connection

Note: from here on we jump back and forth between ClientCnxn and ClientCnxnSocketNIO, so hold on tight!

ClientCnxn#startConnect()

    private void startConnect(InetSocketAddress addr) throws IOException {
        // initializing it for new connection
        changeZkState(States.CONNECTING);
        logStartConnect(addr);
        // part of the code omitted
        // connect to the server
        clientCnxnSocket.connect(addr);
    }

connect is an abstract method of ClientCnxnSocket, and the subclass ClientCnxnSocketNIO implements it:

ClientCnxnSocketNIO#connect()

   @Override
    void connect(InetSocketAddress addr) throws IOException {
        SocketChannel sock = createSock();
        try {
            registerAndConnect(sock, addr);
        } catch (UnresolvedAddressException | UnsupportedAddressTypeException | SecurityException | IOException e) {
            LOG.error("Unable to open socket to {}", addr);
            sock.close();
            throw e;
        }
        // whether initialization has completed (i.e. whether the session is established)
        initialized = false;

        /*
         * Reset incomingBuffer
         */
        lenBuffer.clear();
        incomingBuffer = lenBuffer;
    }


    void registerAndConnect(SocketChannel sock, InetSocketAddress addr) throws IOException {
        sockKey = sock.register(selector, SelectionKey.OP_CONNECT);
        // initiate the socket connection
        boolean immediateConnect = sock.connect(addr);
        if (immediateConnect) {
            sendThread.primeConnection();
        }
    }
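The register-then-connect pattern above can be tried out with plain JDK NIO. The sketch below is my own illustration, not ZooKeeper code: the class name, the loopback test server and the 5-second deadline are all assumptions. It registers for OP_CONNECT, starts a non-blocking connect, finishes it when the selector reports the key, and then switches the key to OP_READ | OP_WRITE.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class NonBlockingConnectSketch {

    // Connects to a throwaway loopback server and returns the key's interest ops afterwards.
    static int connectAndPrime() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open();
             SocketChannel sock = SocketChannel.open()) {
            // Port 0 lets the OS pick a free port for the test server.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            sock.configureBlocking(false);
            SelectionKey key = sock.register(selector, SelectionKey.OP_CONNECT);
            // In non-blocking mode connect() may return false; the connect then completes later.
            boolean connected = sock.connect(server.getLocalAddress());

            long deadline = System.currentTimeMillis() + 5000;
            while (!connected && System.currentTimeMillis() < deadline) {
                selector.select(1000);
                if (selector.selectedKeys().remove(key) && key.isConnectable()) {
                    connected = sock.finishConnect();
                }
            }
            if (!connected) {
                throw new IOException("connect did not complete in time");
            }
            // Connection established: from now on we care about reads and writes.
            key.interestOps(SelectionKey.OP_READ | SelectionKey.OP_WRITE);
            return key.interestOps();
        }
    }

    public static void main(String[] args) throws IOException {
        int ops = connectAndPrime();
        System.out.println("read interest: " + ((ops & SelectionKey.OP_READ) != 0));
        System.out.println("write interest: " + ((ops & SelectionKey.OP_WRITE) != 0));
    }
}
```

Both paths end the same way, whether connect() returns true right away or the selector reports OP_CONNECT later, which is also why ZooKeeper calls primeConnection from two places.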

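If the connection completes immediately, SendThread#primeConnection() is called right here; otherwise it is called later from doTransport, once the OP_CONNECT event fires and finishConnect() succeeds: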

SendThread#primeConnection()

        void primeConnection() throws IOException {
            LOG.info(
                "Socket connection established, initiating session, client: {}, server: {}",
                clientCnxnSocket.getLocalSocketAddress(),
                clientCnxnSocket.getRemoteSocketAddress());
            isFirstConnect = false;
            long sessId = (seenRwServerBefore) ? sessionId : 0;
            // build the connect request
            ConnectRequest conReq = new ConnectRequest(0, lastZxid, sessionTimeout, sessId, sessionPasswd);
            // add the request packet to the head of the outgoingQueue
            outgoingQueue.addFirst(new Packet(null, null, conReq, null, null, readOnly));
            // tell ClientCnxnSocket that the connect request has been queued
            clientCnxnSocket.connectionPrimed();
            LOG.debug("Session establishment request sent on {}", clientCnxnSocket.getRemoteSocketAddress());
        }

ClientCnxnSocketNIO#connectionPrimed():

   void connectionPrimed() {
        sockKey.interestOps(SelectionKey.OP_READ | SelectionKey.OP_WRITE);
    }

Let's pause here and summarize what the steps above did:

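Initialize the state to CONNECTING
Establish the socket connection
Build the connect request Packet
Put the request packet at the head of outgoingQueue so that it is the first one sent
Set the interest ops of sockKey, a field of ClientCnxnSocketNIO, to SelectionKey.OP_READ | SelectionKey.OP_WRITE, i.e. register for read and write events. This is needed to pick up the server's reply later, and it affects the rest of the logic in SendThread's run method.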
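Why addFirst rather than a plain add? A small sketch makes the point. Here I assume the queue behaves like a java.util.Deque and use strings in place of Packet objects; the class name and the request strings are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ConnectFirstQueueSketch {

    // Returns the order in which the queued items would be sent.
    static String sendOrder() {
        Deque<String> outgoingQueue = new ArrayDeque<>();
        // Requests the application queued before the session existed.
        outgoingQueue.add("create /test_dir");
        outgoingQueue.add("getData /test_dir");
        // The connect request jumps the queue, as primeConnection does with addFirst.
        outgoingQueue.addFirst("ConnectRequest");
        return String.join(" -> ", outgoingQueue);
    }

    public static void main(String[] args) {
        // prints "ConnectRequest -> create /test_dir -> getData /test_dir"
        System.out.println(sendOrder());
    }
}
```

Whatever was queued earlier, the connect request goes out first, so the server sees the session handshake before any other request.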

4. Handling the Server's Connect Response

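The above covers only part of SendThread#run(). At this point only the socket connection has been established, and read/write requests cannot be sent yet. Let's continue with the rest of the run method:
SendThread#run()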

public void run() {
    // part of the code omitted: some was analyzed above, the rest can be ignored for this article
    clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this);
}

That takes us to ClientCnxnSocketNIO#doTransport():


   @Override
    void doTransport(
        int waitTimeOut,
        Queue<Packet> pendingQueue,
        ClientCnxn cnxn) throws IOException, InterruptedException {
        // wait until a channel is ready (e.g. the server has replied) or the timeout expires
        selector.select(waitTimeOut);
        Set<SelectionKey> selected;
        synchronized (this) {
            selected = selector.selectedKeys();
        }
        // Everything below and until we get back to the select is
        // non blocking, so time is effectively a constant. That is
        // Why we just have to do this once, here
        updateNow();
        for (SelectionKey k : selected) {
            SocketChannel sc = ((SocketChannel) k.channel());
            if ((k.readyOps() & SelectionKey.OP_CONNECT) != 0) {
                if (sc.finishConnect()) {
                    updateLastSendAndHeard();
                    updateSocketAddresses();
                    sendThread.primeConnection();
                }
            } else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
                doIO(pendingQueue, cnxn);
            }
        }
        if (sendThread.getZkState().isConnected()) {
            if (findSendablePacket(outgoingQueue, sendThread.tunnelAuthInProgress()) != null) {
                enableWrite();
            }
        }
        selected.clear();
    }
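The branching on readyOps() in the loop above is plain bit testing. As a minimal sketch (the class and the returned labels are mine, only the bit checks follow the code above):

```java
import java.nio.channels.SelectionKey;

class ReadyOpsDispatchSketch {

    // Mirrors the if / else-if in doTransport and names the branch that would run.
    static String dispatch(int readyOps) {
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            return "finishConnect + primeConnection";
        } else if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
            return "doIO";
        }
        return "nothing";
    }

    public static void main(String[] args) {
        System.out.println(dispatch(SelectionKey.OP_CONNECT));
        System.out.println(dispatch(SelectionKey.OP_READ));
        System.out.println(dispatch(SelectionKey.OP_READ | SelectionKey.OP_WRITE));
    }
}
```

Since primeConnection switches the key from OP_CONNECT to OP_READ | OP_WRITE, the first branch is taken while connecting and the second one afterwards.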

It is easy to see that once the server responds, execution reaches:

doIO(pendingQueue, cnxn);

Let's see what this method does:

    void doIO(Queue<Packet> pendingQueue, ClientCnxn cnxn) throws InterruptedException, IOException {
        SocketChannel sock = (SocketChannel) sockKey.channel();
        if (sock == null) {
            throw new IOException("Socket is null!");
        }
        if (sockKey.isReadable()) {
            int rc = sock.read(incomingBuffer);
            if (rc < 0) {
                throw new EndOfStreamException("Unable to read additional data from server sessionid 0x"
                                               + Long.toHexString(sessionId)
                                               + ", likely server has closed socket");
            }
            if (!incomingBuffer.hasRemaining()) {
                incomingBuffer.flip();
                if (incomingBuffer == lenBuffer) {
                    recvCount.getAndIncrement();
                    readLength();
                    // after the length prefix, the first server response body always lands in this else-if
                } else if (!initialized) {
                    // read the result returned by the server
                    readConnectResult();
                    enableRead();
                    if (findSendablePacket(outgoingQueue, sendThread.tunnelAuthInProgress()) != null) {
                        // Since SASL authentication has completed (if client is configured to do so),
                        // outgoing packets waiting in the outgoingQueue can now be sent.
                        enableWrite();
                    }
                    // part of the code omitted
                    initialized = true;
                }
                // part of the code omitted
            }
        }

    }
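The switching between lenBuffer and incomingBuffer in doIO is a classic length-prefixed read: first fill a 4-byte buffer to learn the body length, then fill a buffer of exactly that size. Below is a simplified sketch of the idea. It is my own reduction, not the ZooKeeper implementation; the class name, the feed and decode methods, and the rejection of non-positive lengths are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LengthPrefixedReaderSketch {
    private final ByteBuffer lenBuffer = ByteBuffer.allocate(4);
    private ByteBuffer incomingBuffer = lenBuffer;
    private final List<String> frames = new ArrayList<>();

    // Feed bytes as they "arrive"; a chunk may cut a frame at any position.
    void feed(byte[] chunk) {
        ByteBuffer in = ByteBuffer.wrap(chunk);
        while (in.hasRemaining()) {
            int n = Math.min(in.remaining(), incomingBuffer.remaining());
            for (int i = 0; i < n; i++) {
                incomingBuffer.put(in.get());
            }
            if (!incomingBuffer.hasRemaining()) {
                incomingBuffer.flip();
                if (incomingBuffer == lenBuffer) {
                    // Phase 1 done: we know how long the body is.
                    int len = incomingBuffer.getInt();
                    if (len <= 0) {
                        throw new IllegalArgumentException("bad frame length " + len);
                    }
                    incomingBuffer = ByteBuffer.allocate(len);
                } else {
                    // Phase 2 done: a whole body is available; go back to reading a length.
                    frames.add(new String(incomingBuffer.array(), StandardCharsets.UTF_8));
                    lenBuffer.clear();
                    incomingBuffer = lenBuffer;
                }
            }
        }
    }

    // Decodes all complete frames contained in the given chunks.
    static List<String> decode(byte[]... chunks) {
        LengthPrefixedReaderSketch reader = new LengthPrefixedReaderSketch();
        for (byte[] chunk : chunks) {
            reader.feed(chunk);
        }
        return reader.frames;
    }

    public static void main(String[] args) {
        List<String> out = decode(
                new byte[] {0, 0},
                new byte[] {0, 5, 'h'},
                new byte[] {'e', 'l', 'l', 'o', 0, 0, 0, 2, 'o', 'k'});
        System.out.println(out); // prints [hello, ok]
    }
}
```

In doIO the same two phases show up as the incomingBuffer == lenBuffer branch (readLength) and the branches that consume the body, the first of which is readConnectResult.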

From the code analyzed earlier we know that initialized starts out as false; if in doubt, look back at ClientCnxnSocketNIO#connect().

So execution goes into readConnectResult(), which handles the server's response:

ClientCnxnSocket#readConnectResult()


    void readConnectResult() throws IOException {
        if (LOG.isTraceEnabled()) {
            StringBuilder buf = new StringBuilder("0x[");
            for (byte b : incomingBuffer.array()) {
                buf.append(Integer.toHexString(b)).append(",");
            }
            buf.append("]");
            if (LOG.isTraceEnabled()) {
                LOG.trace("readConnectResult {} {}", incomingBuffer.remaining(), buf.toString());
            }
        }

        ByteBufferInputStream bbis = new ByteBufferInputStream(incomingBuffer);
        BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
        ConnectResponse conRsp = new ConnectResponse();
        // deserialize
        conRsp.deserialize(bbia, "connect");

        // read "is read-only" flag
        boolean isRO = false;
        try {
            isRO = bbia.readBool("readOnly");
        } catch (IOException e) {
            // this is ok -- just a packet from an old server which
            // doesn't contain readOnly field
            LOG.warn("Connected to an old server; r-o mode will be unavailable");
        }

        this.sessionId = conRsp.getSessionId();
        sendThread.onConnected(conRsp.getTimeOut(), this.sessionId, conRsp.getPasswd(), isRO);
    }
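The try/catch around readBool is the interesting part: the readOnly flag is an optional trailing byte, and a missing byte just means an older server. Here is a minimal sketch of that pattern using plain DataInputStream instead of jute. The field layout I use (protocol version, timeout, session id, length-prefixed password, optional flag) is my assumption about the wire format, and the class and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

class ConnectResponseSketch {
    final int timeOut;
    final long sessionId;
    final boolean readOnly;

    ConnectResponseSketch(int timeOut, long sessionId, boolean readOnly) {
        this.timeOut = timeOut;
        this.sessionId = sessionId;
        this.readOnly = readOnly;
    }

    // Pass readOnly == null to imitate an old server that does not send the flag.
    static byte[] encode(int timeOut, long sessionId, byte[] passwd, Boolean readOnly) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0); // protocol version (assumed layout)
        out.writeInt(timeOut);
        out.writeLong(sessionId);
        out.writeInt(passwd.length);
        out.write(passwd);
        if (readOnly != null) {
            out.writeBoolean(readOnly);
        }
        out.flush();
        return bytes.toByteArray();
    }

    static ConnectResponseSketch parse(byte[] payload) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
        in.readInt(); // protocol version, unused here
        int timeOut = in.readInt();
        long sessionId = in.readLong();
        byte[] passwd = new byte[in.readInt()];
        in.readFully(passwd);
        boolean readOnly = false;
        try {
            readOnly = in.readBoolean();
        } catch (EOFException e) {
            // No trailing byte: treat it as an old server without read-only support.
        }
        return new ConnectResponseSketch(timeOut, sessionId, readOnly);
    }

    // Encodes and parses a response, returning the parsed readOnly flag.
    static boolean roundTripReadOnly(Boolean flag) {
        try {
            return parse(encode(10000, 0x1234L, new byte[16], flag)).readOnly;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag sent as true: " + roundTripReadOnly(Boolean.TRUE));
        System.out.println("flag missing: " + roundTripReadOnly(null));
    }
}
```

A missing flag falls back to false, which matches isRO staying false in readConnectResult when the read fails.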

ClientCnxn#onConnected():

       void onConnected(
            int _negotiatedSessionTimeout,
            long _sessionId,
            byte[] _sessionPasswd,
            boolean isRO) throws IOException {
            negotiatedSessionTimeout = _negotiatedSessionTimeout;
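            // part of the code omitted
            // a read/write client should not end up on a read-only server; here this is only logged as an error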
            if (!readOnly && isRO) {
                LOG.error("Read/write client got connected to read-only server");
            }

            readTimeout = negotiatedSessionTimeout * 2 / 3;
            connectTimeout = negotiatedSessionTimeout / hostProvider.size();
            hostProvider.onConnected();
            sessionId = _sessionId;
            sessionPasswd = _sessionPasswd;
            changeZkState((isRO) ? States.CONNECTEDREADONLY : States.CONNECTED);
            seenRwServerBefore |= !isRO;
            LOG.info(
                "Session establishment complete on server {}, session id = 0x{}, negotiated timeout = {}{}",
                clientCnxnSocket.getRemoteSocketAddress(),
                Long.toHexString(sessionId),
                negotiatedSessionTimeout,
                (isRO ? " (READ-ONLY mode)" : ""));
            KeeperState eventState = (isRO) ? KeeperState.ConnectedReadOnly : KeeperState.SyncConnected;
            eventThread.queueEvent(new WatchedEvent(Watcher.Event.EventType.None, eventState, null));
        }
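The two timeouts computed in onConnected are simple integer arithmetic. As a quick worked example (the class name and the input values, a 10000 ms negotiated timeout and 3 servers in the connect string, are mine):

```java
class TimeoutSketch {

    // Same formula as readTimeout in onConnected: two thirds of the negotiated session timeout.
    static int readTimeout(int negotiatedSessionTimeout) {
        return negotiatedSessionTimeout * 2 / 3;
    }

    // Same formula as connectTimeout in onConnected: the session timeout split across all servers.
    static int connectTimeout(int negotiatedSessionTimeout, int hostCount) {
        return negotiatedSessionTimeout / hostCount;
    }

    public static void main(String[] args) {
        System.out.println(readTimeout(10000));       // prints 6666
        System.out.println(connectTimeout(10000, 3)); // prints 3333
    }
}
```

Both results are truncated by integer division, so with a single server the connect timeout equals the full negotiated session timeout.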