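Netconf Reconnection in ODL

In ODL, Netconf supports timed reconnection after a device goes offline unexpectedly. The related functionality is described below.

Once a node has been added successfully, a Communicator is created for the device. It handles the connection and communication logic between the controller and that device node.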


AbstractNetconfTopology.java

protected NetconfConnectorDTO createDeviceCommunicator(final NodeId nodeId,
                                                           final NetconfNode node) {
        //setup default values since default value is not supported yet in mdsal
        // TODO remove this when mdsal starts supporting default values
        // Read the node configuration parameters
        final Long defaultRequestTimeoutMillis = node.getDefaultRequestTimeoutMillis() == null ? DEFAULT_REQUEST_TIMEOUT_MILLIS : node.getDefaultRequestTimeoutMillis();
        final Long keepaliveDelay = node.getKeepaliveDelay() == null ? DEFAULT_KEEPALIVE_DELAY : node.getKeepaliveDelay(); // keepalive heartbeat interval, 120s
        final Boolean reconnectOnChangedSchema = node.isReconnectOnChangedSchema() == null ? DEFAULT_RECONNECT_ON_CHANGED_SCHEMA : node.isReconnectOnChangedSchema();

        IpAddress ipAddress = node.getHost().getIpAddress();
        InetSocketAddress address = new InetSocketAddress(ipAddress.getIpv4Address() != null ?
                ipAddress.getIpv4Address().getValue() : ipAddress.getIpv6Address().getValue(),
                node.getPort().getValue());
        RemoteDeviceId remoteDeviceId = new RemoteDeviceId(nodeId.getValue(), address);

        RemoteDeviceHandler<NetconfSessionPreferences> salFacade =
                createSalFacade(remoteDeviceId, node, domBroker, bindingAwareBroker);
        // Depends on the keepaliveDelay configured for the node: when it is set to 0, the plain NetconfDeviceSalFacade is used, i.e. there is no keepalive heartbeat
        if (keepaliveDelay > 0) {
            LOG.warn("Adding keepalive facade, for device {}", nodeId);
            salFacade = new KeepaliveSalFacade(remoteDeviceId, salFacade, keepaliveExecutor.getExecutor(), keepaliveDelay, defaultRequestTimeoutMillis);
        }

        final NetconfDevice.SchemaResourcesDTO schemaResourcesDTO = setupSchemaCacheDTO(nodeId, node);

        final NetconfDevice device = new NetconfDevice(schemaResourcesDTO, remoteDeviceId, salFacade,
                processingExecutor.getExecutor(), reconnectOnChangedSchema);

        final Optional<NetconfSessionPreferences> userCapabilities = getUserCapabilities(node);

        NetconfDeviceCommunicator communicator = userCapabilities.isPresent() ?
                new NetconfDeviceCommunicator(
                        remoteDeviceId, device, new UserPreferences(userCapabilities.get(), node.getYangModuleCapabilities().isOverride())):
                new NetconfDeviceCommunicator(remoteDeviceId, device);
        final NetconfConnectorDTO netconfConnectorDTO = new NetconfConnectorDTO(communicator, salFacade);
        salFacade.setListener(communicator);
        setCommunicator(nodeId, netconfConnectorDTO.getCommunicator());
        return netconfConnectorDTO;
    }

netconf-node-topology.yang

            leaf connection-timeout-millis {
                description "Specifies timeout in milliseconds after which connection must be established.";
                type uint32;
                default 20000;
            }

            leaf default-request-timeout-millis {
                description "Timeout for blocking operations within transactions.";
                type uint32;
                default 60000;
            }

            leaf max-connection-attempts {
                description "Maximum number of connection retries. Non positive value or null is interpreted as infinity.";
                type uint32;
                default 0; // retry forever
            }

            leaf between-attempts-timeout-millis {
                description "Initial timeout in milliseconds to wait between connection attempts. Will be multiplied by sleep-factor with every additional attempt";
                type uint16;
                default 2000;
            }

            leaf sleep-factor {
                type decimal64 {
                    fraction-digits 1;
                }
                default 1.5;
            }

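After the session is created, channelActive in AbstractSessionNegotiator runs startNegotiation and sends the Hello message. The Hello message returned by the device is handled in handleMessage of NetconfClientSessionNegotiator.

getSessionForHelloMessage changes the session state to ESTABLISHED.

connection-timeout-millis: the time allowed, once negotiation starts, for the session to move from OPEN_WAIT to ESTABLISHED. If the timer expires while the promise is neither completed nor cancelled, the negotiation fails and the channel is closed.

default-request-timeout-millis: used by invokeRpc of KeepaliveDOMRpcService in the KeepaliveSalFacade class. When an RPC call times out, the call is cancelled.

maxConnectionAttempts, betweenAttemptsTimeoutMillis, sleepFactor: used by the reconnection logic to compute when to reconnect.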
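The connection-timeout-millis behaviour described above can be sketched as a small state model. This is only an illustration under stated assumptions: the class NegotiationTimeoutSketch, its methods, and the use of CompletableFuture as a stand-in for the negotiation promise are hypothetical and are not the ODL classes.

```java
import java.util.concurrent.CompletableFuture;

/** Simplified, hypothetical model of the negotiation timeout; not the ODL negotiator. */
class NegotiationTimeoutSketch {
    enum State { OPEN_WAIT, ESTABLISHED, FAILED }

    private State state = State.OPEN_WAIT;
    private boolean channelOpen = true;
    // Stands in for the negotiation promise.
    private final CompletableFuture<Void> promise = new CompletableFuture<>();

    /** Called when the device's Hello message arrives. */
    void onHelloReceived() {
        if (!promise.isDone()) {
            state = State.ESTABLISHED;
            promise.complete(null);
        }
    }

    /** Called when the connection-timeout-millis timer fires. */
    void onTimeout() {
        // isDone() is also true for a cancelled future, so this covers
        // "completed" and "cancelled" in one check.
        if (promise.isDone()) {
            return;
        }
        state = State.FAILED;
        channelOpen = false; // negotiation failed: close the channel
        promise.completeExceptionally(
                new IllegalStateException("Session was not established in time"));
    }

    State state() {
        return state;
    }

    boolean isChannelOpen() {
        return channelOpen;
    }

    /** Runs one scenario: the Hello either arrives before the timer fires or not at all. */
    static NegotiationTimeoutSketch run(boolean helloBeforeTimeout) {
        NegotiationTimeoutSketch sketch = new NegotiationTimeoutSketch();
        if (helloBeforeTimeout) {
            sketch.onHelloReceived();
        }
        sketch.onTimeout();
        return sketch;
    }
}
```

Calling run(true) leaves the session ESTABLISHED with the channel open, while run(false) ends in FAILED with the channel closed.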


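Keepalive heartbeat mechanism:

As the name suggests, keepalive operates on a node that is already connected (for example, while the session is idle). See KeepaliveSalFacade.java.

For comparison, the session callbacks of an Apache MINA IoHandler are:

sessionCreated(IoSession session): called when a new connection is established.
sessionOpened(IoSession session): called when a new connection is opened. It is called after sessionCreated.
sessionClosed(IoSession session): called when the connection is closed.
sessionIdle(IoSession session, IdleStatus status): called when the connection becomes idle.
exceptionCaught(IoSession session, Throwable cause): called when the I/O handler implementation or MINA itself throws an exception.


Notes:

The difference between sessionCreated and sessionOpened: sessionCreated is called by the I/O processing thread, whereas sessionOpened is called by another thread.

For performance reasons, therefore, do not do much work in sessionCreated.

As for sessionIdle, the idle time is disabled by default, so sessionIdle is never called. It can be enabled with IoSessionConfig.setIdleTime(IdleStatus, int).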

KeepaliveSalFacade.java

    @Override
    public void onDeviceConnected(final SchemaContext remoteSchemaContext, final NetconfSessionPreferences netconfSessionPreferences, final DOMRpcService deviceRpc) {
        this.currentDeviceRpc = deviceRpc;
        final DOMRpcService deviceRpc1 = new KeepaliveDOMRpcService(deviceRpc, resetKeepaliveTask, defaultRequestTimeoutMillis, executor);
        salFacade.onDeviceConnected(remoteSchemaContext, netconfSessionPreferences, deviceRpc1);

        LOG.debug("{}: Netconf session initiated, starting keepalives", id);
        scheduleKeepalive();
    }

After the connection succeeds, scheduleKeepalive is called to start the keepalive heartbeat.


    private void scheduleKeepalive() {
        Preconditions.checkState(currentDeviceRpc != null);
        LOG.trace("{}: Scheduling next keepalive in {} {}", id, keepaliveDelaySeconds, TimeUnit.SECONDS);
        currentKeepalive = executor.schedule(new Keepalive(currentKeepalive), keepaliveDelaySeconds, TimeUnit.SECONDS);
    }

In KeepaliveSalFacade.java, Keepalive implements Runnable and FutureCallback. It invokes an RPC (get-config), and its callbacks trigger a reconnect in every case except a successful response.

        @Override
        public void onSuccess(final DOMRpcResult result) {
            if (result != null && result.getResult() != null) {
                LOG.debug("{}: Keepalive RPC successful with response: {}", id, result.getResult());
                scheduleKeepalive();
            } else {
                LOG.warn("{} Keepalive RPC returned null with response: {}. Reconnecting netconf session", id, result);
                reconnect();
            }
        }

        @Override
        public void onFailure(@Nonnull final Throwable t) {
            LOG.warn("{}: Keepalive RPC failed. Reconnecting netconf session.", id, t);
            reconnect();
        }

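Besides the get-config request, other business RPCs also return data from the node and therefore also prove that the session is alive. For this reason the success callback of invokeRpc in KeepaliveDOMRpcService resets the keepalive timer, so business RPC traffic reduces the keepalive heartbeat load.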
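The timer reset described above can be sketched as follows. This is a minimal model, assuming a plain ScheduledExecutorService; the class KeepaliveResetSketch and its method names are hypothetical and only mirror the idea, not the actual KeepaliveSalFacade code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Simplified, hypothetical model of the keepalive timer reset. */
class KeepaliveResetSketch {
    // Daemon thread so a forgotten shutdown() cannot keep the JVM alive.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "keepalive-sketch");
                t.setDaemon(true);
                return t;
            });
    private final long keepaliveDelaySeconds;
    private ScheduledFuture<?> currentKeepalive;

    KeepaliveResetSketch(long keepaliveDelaySeconds) {
        this.keepaliveDelaySeconds = keepaliveDelaySeconds;
    }

    /** Arms the timer: the keepalive fires only if nothing resets it first. */
    synchronized void scheduleKeepalive() {
        currentKeepalive = executor.schedule(
                this::sendKeepalive, keepaliveDelaySeconds, TimeUnit.SECONDS);
    }

    /** Called from the success callback of any RPC: the session is evidently alive. */
    synchronized void onRpcSuccess() {
        if (currentKeepalive != null) {
            currentKeepalive.cancel(false); // drop the pending keepalive
        }
        scheduleKeepalive();                // and start the interval again
    }

    private void sendKeepalive() {
        // Placeholder: real code would invoke the keepalive RPC here.
    }

    synchronized ScheduledFuture<?> currentKeepalive() {
        return currentKeepalive;
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

With this shape, a device that serves regular business RPCs never receives a dedicated keepalive request, because every successful RPC pushes the next keepalive one full interval into the future.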
<node xmlns="urn:TBD:params:xml:ns:yang:network-topology">
   <node-id>testa</node-id>
   <host xmlns="urn:opendaylight:netconf-node-topology">10.42.94.233</host>
   <port xmlns="urn:opendaylight:netconf-node-topology">17830</port>
   <username xmlns="urn:opendaylight:netconf-node-topology">admin</username>
   <password xmlns="urn:opendaylight:netconf-node-topology">admin</password>
   <tcp-only xmlns="urn:opendaylight:netconf-node-topology">false</tcp-only>
   <keepalive-delay xmlns="urn:opendaylight:netconf-node-topology">0</keepalive-delay>
   <sleep-factor xmlns="urn:opendaylight:netconf-node-topology">1</sleep-factor>
   <reconnect-on-changed-schema xmlns="urn:opendaylight:netconf-node-topology">true</reconnect-on-changed-schema>
 </node>
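These values are set through the node configuration parameters, as in the example above. See the YANG file netconf-node-topology.yang for reference.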
 
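Reconnecting after link loss:
A lost link implies that a connection had been established before. For Netconf to establish a connection, the device node must first be added (written to the config datastore).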
ProtocolSessionPromise.java
synchronized void connect() {
        final Object lock = this;

        try {
            final int timeout = this.strategy.getConnectTimeout();

            LOG.debug("Promise {} attempting connect for {}ms", lock, timeout);

            if(this.address.isUnresolved()) {
                this.address = new InetSocketAddress(this.address.getHostName(), this.address.getPort());
            }
            this.b.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, timeout);
            final ChannelFuture connectFuture = this.b.connect(this.address);
            // Add listener that attempts reconnect by invoking this method again.
            connectFuture.addListener(new BootstrapConnectListener(lock));
            this.pending = connectFuture;
        } catch (final Exception e) {
            LOG.info("Failed to connect to {}", address, e);
            setFailure(e);
        }
    }
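timeout defaults to 2 seconds: if the connection cannot be established within 2 seconds, the attempt times out (the "semaphore timeout period" error). Later values are computed by the backoff strategy.
The key part of the BootstrapConnectListener.java listener is its handling of a failed connection attempt (the reconnect):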
 
   LOG.debug("Attempt to connect to {} failed", ProtocolSessionPromise.this.address, cf.cause());

                final Future<Void> rf = ProtocolSessionPromise.this.strategy.scheduleReconnect(cf.cause());
                rf.addListener(new ReconnectingStrategyListener());
                ProtocolSessionPromise.this.pending = rf;

If the connection attempt times out, the reconnection logic starts. The strategy used is TimedReconnectStrategy.java.

        leaf between-attempts-timeout-millis {
            description "Initial timeout in milliseconds to wait between connection attempts. Will be multiplied by sleep-factor with every additional attempt";
            config true;
            type uint16;
            default 2000;
        }

The wait time between reconnection attempts follows a backoff algorithm (driven by sleep-factor).
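As a worked example of that backoff, the sketch below multiplies the initial wait by sleep-factor on every additional attempt, using the YANG defaults quoted above (2000 ms and 1.5). The class BackoffSketch and the method delaysMillis are hypothetical. The real TimedReconnectStrategy may differ in rounding and may apply upper limits that are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the waits produced by a multiplicative backoff. */
class BackoffSketch {
    /**
     * @param initialMillis wait before the first retry (between-attempts-timeout-millis)
     * @param sleepFactor   multiplier applied after every attempt (sleep-factor)
     * @param attempts      number of waits to compute
     */
    static List<Long> delaysMillis(long initialMillis, double sleepFactor, int attempts) {
        List<Long> delays = new ArrayList<>();
        double current = initialMillis;
        for (int i = 0; i < attempts; i++) {
            delays.add(Math.round(current));
            current *= sleepFactor; // each further attempt waits longer
        }
        return delays;
    }
}
```

For example, delaysMillis(2000, 1.5, 4) yields 2000, 3000, 4500 and 6750 milliseconds. With sleep-factor set to 1, as in the node configuration shown earlier, every wait stays at the initial value.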

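ReconnectingStrategyListener is fairly simple: once the future for the computed reconnect delay completes, it just connects.
The connect flow thus returns to where it started, forming a loop.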
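The loop can be pictured with a synchronous sketch. It is an assumption-laden simplification: ReconnectLoopSketch, connectWithRetry and the tryConnect predicate are hypothetical names, the waits between attempts are left out, and the "non positive means unlimited" rule is taken from the max-connection-attempts description quoted earlier.

```java
import java.util.function.IntPredicate;

/** Hypothetical, synchronous model of the connect / fail / retry loop. */
class ReconnectLoopSketch {
    /**
     * @param tryConnect  stands in for one connection attempt; receives the attempt number
     * @param maxAttempts non positive means "retry forever"
     * @return the attempt number that succeeded, or -1 if the attempts ran out
     */
    static int connectWithRetry(IntPredicate tryConnect, long maxAttempts) {
        int attempt = 0;
        while (maxAttempts <= 0 || attempt < maxAttempts) {
            attempt++;
            if (tryConnect.test(attempt)) {
                return attempt; // connected: the loop ends
            }
            // In ODL the strategy would schedule the next attempt after the
            // backoff delay; this sketch simply retries immediately.
        }
        return -1; // attempts exhausted
    }
}
```

With unlimited attempts and a device that only answers on the third try, connectWithRetry(n -> n == 3, 0) returns 3. With maxAttempts set to 2 the same device is never reached and the result is -1.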


How does reconnection work once an established connection drops?


After the device goes offline, a chain of channelInactive calls is triggered. It ends in ClosedChannelHandler.channelInactive, which in turn triggers connect on the ReconnectPromise.

@Override
public void channelInactive(final ChannelHandlerContext ctx) throws Exception {
    // This is the ultimate channel inactive handler, not forwarding
    if (promise.isCancelled()) {
        return;
    }

    if (promise.isInitialConnectFinished() == false) {
        LOG.debug("Connection to {} was dropped during negotiation, reattempting", promise.address);
    }

    LOG.debug("Reconnecting after connection to {} was dropped", promise.address);
    promise.connect();
}

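The final log statement indicates that a reconnect is under way.

For the subsequent SSH connection:

After the reconnect starts, execution enters AbstractChannelHandlerContext.java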

 
    private void invokeConnect(SocketAddress remoteAddress, SocketAddress localAddress, ChannelPromise promise) {
        if (isAdded()) {
            try {
                ((ChannelOutboundHandler) handler()).connect(this, remoteAddress, localAddress, promise);
            } catch (Throwable t) {
                notifyOutboundHandlerException(t, promise);
            }
        } else {
            connect(remoteAddress, localAddress, promise);
        }
    }

Here the handler() method returns the following, one after another:

DefaultChannelPipeline.java connect

NetconfHelloMessageToXMLEncoder

EOMFramingMechanismEncoder

AsyncSshHandler.java

 
 
 
 
 
