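Nacos Source Code Analysis: Series Index
- Compiling and Running the Nacos Source Code
- (Nacos Source Code Analysis 1) Nacos Instance Registration
- (Nacos Source Code Analysis 2) Nacos Service Discovery
- (Nacos Source Code Analysis 3) Nacos Heartbeat Mechanism
- (Nacos Source Code Analysis 4) Nacos Service Health Check
- (Nacos Source Code Analysis 5) Nacos Service Change Events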
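Nacos Service Health Check Source Code Analysis
Concept
When a service is registered for the first time, the server starts a scheduled task that periodically checks the health of that service's client instances.
Server side
For the flow by which the client calls the server to register, see
(Nacos Source Code Analysis 1) Nacos Instance Registration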
com.alibaba.nacos.naming.controllers.InstanceController#register
com.alibaba.nacos.naming.core.ServiceManager#registerInstance
com.alibaba.nacos.naming.core.ServiceManager#createEmptyService
com.alibaba.nacos.naming.core.ServiceManager#createServiceIfAbsent
com.alibaba.nacos.naming.core.ServiceManager#putServiceAndInit
com.alibaba.nacos.naming.core.Service#init
1. Create the client heartbeat check task
public void init() {
    // GlobalExecutor.scheduleNamingHealth(task, 5000, 5000, TimeUnit.MILLISECONDS)
    // Scheduled every 5s by default
    HealthCheckReactor.scheduleCheck(clientBeatCheckTask);
    for (Map.Entry<String, Cluster> entry : clusterMap.entrySet()) {
        entry.getValue().setService(this);
        entry.getValue().init();
    }
}
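The scheduling step can be illustrated with plain JDK classes. The following is a minimal sketch, not the Nacos implementation: `BeatCheckScheduler` is a hypothetical stand-in for `HealthCheckReactor`, and both the fixed-delay schedule and the one-schedule-per-task-key rule are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for HealthCheckReactor; not the Nacos class itself.
final class BeatCheckScheduler {
    // Single daemon thread so the JVM can exit while checks are still scheduled.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "beat-check");
                t.setDaemon(true);
                return t;
            });

    // One scheduled future per task key (assumption: one check task per service).
    private final Map<String, ScheduledFuture<?>> futures = new ConcurrentHashMap<>();

    /** Schedules the task once per key; returns true only if this call created the schedule. */
    boolean scheduleCheck(String taskKey, Runnable task, long periodMillis) {
        boolean[] created = {false};
        futures.computeIfAbsent(taskKey, k -> {
            created[0] = true;
            return executor.scheduleWithFixedDelay(
                    task, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        });
        return created[0];
    }

    /** Cancels and forgets the schedule for the key; returns true if one was cancelled. */
    boolean cancelCheck(String taskKey) {
        ScheduledFuture<?> future = futures.remove(taskKey);
        return future != null && future.cancel(true);
    }

    int scheduledCount() {
        return futures.size();
    }

    public static void main(String[] args) {
        BeatCheckScheduler scheduler = new BeatCheckScheduler();
        // 5000 ms matches the period shown in the init() comment above.
        scheduler.scheduleCheck("demo-service", () -> System.out.println("checking beats"), 5000);
        System.out.println("scheduled tasks: " + scheduler.scheduledCount());
        scheduler.cancelCheck("demo-service");
    }
}
```

Keying the futures by task means a second registration of the same service does not start a second checker, and the stored future gives a handle for cancelling the check later.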
2. The heartbeat check task
com.alibaba.nacos.naming.healthcheck.ClientBeatCheckTask#run
@Override
public void run() {
    try {
        if (!getDistroMapper().responsible(service.getName())) {
            return;
        }
        if (!getSwitchDomain().isHealthCheckEnabled()) {
            return;
        }
        // Fetch all instances
        List<Instance> instances = service.allIPs(true);
        // first set health status of instances:
        for (Instance instance : instances) {
            // current time - last heartbeat time > heartbeat timeout (15s by default)
            if (System.currentTimeMillis() - instance.getLastBeat() > instance.getInstanceHeartBeatTimeOut()) {
                if (!instance.isMarked()) {
                    if (instance.isHealthy()) {
                        // Mark the instance as unhealthy
                        instance.setHealthy(false);
                        Loggers.EVT_LOG
                                .info("{POS} {IP-DISABLED} valid: {}:{}@{}@{}, region: {}, msg: client timeout after {}, last beat: {}",
                                        instance.getIp(), instance.getPort(), instance.getClusterName(),
                                        service.getName(), UtilsAndCommons.LOCALHOST_SITE,
                                        instance.getInstanceHeartBeatTimeOut(), instance.getLastBeat());
                        getPushService().serviceChanged(service);
                        ApplicationUtils.publishEvent(new InstanceHeartbeatTimeoutEvent(this, instance));
                    }
                }
            }
        }
        if (!getGlobalConfig().isExpireInstance()) {
            return;
        }
        // then remove obsolete instances:
        for (Instance instance : instances) {
            if (instance.isMarked()) {
                continue;
            }
            // current time - last heartbeat time > instance delete timeout (30s by default)
            if (System.currentTimeMillis() - instance.getLastBeat() > instance.getIpDeleteTimeout()) {
                // delete instance
                Loggers.SRV_LOG.info("[AUTO-DELETE-IP] service: {}, ip: {}", service.getName(),
                        JacksonUtils.toJson(instance));
                // Delete the instance
                deleteIp(instance);
            }
        }
    } catch (Exception e) {
        Loggers.SRV_LOG.warn("Exception while processing client beat time out.", e);
    }
}
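The two timeout rules in run() can be restated as a small pure function, which makes the thresholds easy to reason about and test. This is a minimal sketch under stated assumptions: `BeatTimeoutRules` and its `Action` enum are hypothetical names, the 15s and 30s constants are the defaults mentioned in the comments above, and the sketch reports only the strongest applicable action, whereas the real task runs the two passes one after the other.

```java
// Hypothetical helper, not part of Nacos: restates the two timeout rules as a pure function.
final class BeatTimeoutRules {
    enum Action { NONE, MARK_UNHEALTHY, DELETE }

    // Defaults taken from the comments in run(); real instances may override them.
    static final long HEART_BEAT_TIMEOUT_MS = 15_000L;
    static final long IP_DELETE_TIMEOUT_MS = 30_000L;

    /**
     * Returns the strongest action that applies to one instance.
     *
     * @param nowMs         current time in milliseconds
     * @param lastBeatMs    time of the last heartbeat in milliseconds
     * @param marked        marked instances are skipped by both passes
     * @param healthy       current health flag of the instance
     * @param expireEnabled mirrors getGlobalConfig().isExpireInstance()
     */
    static Action decide(long nowMs, long lastBeatMs, boolean marked,
                         boolean healthy, boolean expireEnabled) {
        if (marked) {
            return Action.NONE;
        }
        long silenceMs = nowMs - lastBeatMs;
        if (expireEnabled && silenceMs > IP_DELETE_TIMEOUT_MS) {
            return Action.DELETE;
        }
        if (silenceMs > HEART_BEAT_TIMEOUT_MS && healthy) {
            return Action.MARK_UNHEALTHY;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        // An instance silent for 20s is past the heartbeat timeout but not the delete timeout.
        System.out.println(decide(20_000L, 0L, false, true, true));
    }
}
```

Both comparisons are strict, as in run(): an instance whose silence equals the threshold exactly is left alone until the next check.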
3. Delete the instance
com.alibaba.nacos.naming.healthcheck.ClientBeatCheckTask#deleteIp
private void deleteIp(Instance instance) {
    try {
        NamingProxy.Request request = NamingProxy.Request.newRequest();
        // Assemble the parameters that identify the instance to delete
        request.appendParam("ip", instance.getIp()).appendParam("port", String.valueOf(instance.getPort()))
                .appendParam("ephemeral", "true").appendParam("clusterName", instance.getClusterName())
                .appendParam("serviceName", service.getName()).appendParam("namespaceId", service.getNamespaceId());
        String url = "http://" + IPUtil.localHostIP() + IPUtil.IP_PORT_SPLITER + EnvUtil.getPort() + EnvUtil.getContextPath()
                + UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance?" + request.toUrl();
        ....
    } catch (Exception e) {
        Loggers.SRV_LOG
                .error("[IP-DEAD] failed to delete ip automatically, ip: {}, error: {}", instance.toJson(), e);
    }
}
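The URL assembled in deleteIp is an ordinary query string over the six parameters shown. The sketch below rebuilds it with JDK classes only. `DeleteUrlSketch` is a hypothetical name, the context path and naming context are passed in as arguments instead of being read from `EnvUtil` and `UtilsAndCommons`, and URL-encoding each value is an assumption of this sketch, not something the excerpt confirms about `request.toUrl()`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration of the URL built in deleteIp; not Nacos code.
final class DeleteUrlSketch {

    /** Collects the same six parameters that deleteIp appends, in the same order. */
    static Map<String, String> deleteParams(String ip, int port, String clusterName,
                                            String serviceName, String namespaceId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        params.put("ephemeral", "true");
        params.put("clusterName", clusterName);
        params.put("serviceName", serviceName);
        params.put("namespaceId", namespaceId);
        return params;
    }

    /** Joins key=value pairs with '&', URL-encoding both sides (an assumption of this sketch). */
    static String toQuery(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return joiner.toString();
    }

    /** Builds http://host:port + contextPath + namingContext + /instance?query. */
    static String buildUrl(String host, int serverPort, String contextPath,
                           String namingContext, Map<String, String> params) {
        return "http://" + host + ":" + serverPort + contextPath + namingContext
                + "/instance?" + toQuery(params);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Host, port, "/nacos" and "/v1/ns" are example values chosen for this sketch.
        Map<String, String> params = deleteParams("10.0.0.1", 8080, "DEFAULT", "demo", "public");
        System.out.println(buildUrl("127.0.0.1", 8848, "/nacos", "/v1/ns", params));
    }
}
```

Because the server addresses the request to its own local IP, the automatic deletion goes through the same HTTP endpoint as a client-initiated deregistration.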
4. Call the service deregistration method
com.alibaba.nacos.naming.controllers.InstanceController#deregister
com.alibaba.nacos.naming.core.ServiceManager#removeInstance(java.lang.String, java.lang.String, boolean, com.alibaba.nacos.naming.core.Instance…)
private void removeInstance(String namespaceId, String serviceName, boolean ephemeral, Service service,
        Instance... ips) throws NacosException {
    String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    // Compute the instance list that remains after removing the given instances
    List<Instance> instanceList = substractIpAddresses(service, ephemeral, ips);
    Instances instances = new Instances();
    instances.setInstanceList(instanceList);
    // Write the updated instance list to the in-memory registry and sync it to the other cluster nodes
    consistencyService.put(key, instances);
}
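The subtraction step can be pictured as "current instances minus the ones to remove", with the result written back as a whole list. The following is a minimal sketch of that idea only. `InstanceSubtraction` and `Inst` are hypothetical types, and identifying an instance by `ip:port` is an assumption made for illustration; the key that `substractIpAddresses` actually uses is not shown in the excerpt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of subtracting instances from a service's list; not Nacos code.
final class InstanceSubtraction {

    /** Minimal instance model; the real Instance class carries many more fields. */
    static final class Inst {
        final String ip;
        final int port;

        Inst(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }

        // Assumption: ip:port is enough to identify an instance in this sketch.
        String key() {
            return ip + ":" + port;
        }
    }

    /** Returns a new list holding every current instance except those in toRemove. */
    static List<Inst> subtract(List<Inst> current, Inst... toRemove) {
        Map<String, Inst> byKey = new LinkedHashMap<>();
        for (Inst inst : current) {
            byKey.put(inst.key(), inst);
        }
        for (Inst inst : toRemove) {
            byKey.remove(inst.key());
        }
        // The input list is left untouched; callers publish the returned list as a whole.
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Inst> current = Arrays.asList(new Inst("10.0.0.1", 8080), new Inst("10.0.0.2", 8080));
        List<Inst> remaining = subtract(current, new Inst("10.0.0.1", 8080));
        for (Inst inst : remaining) {
            System.out.println(inst.key());
        }
    }
}
```

Returning a fresh list instead of mutating the current one fits the shape of removeInstance, which hands the complete remaining list to consistencyService.put.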

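This article walks through how Nacos implements service health checks: the heartbeat check task created when an instance is registered, how that task judges instance health and handles timeout and expiry, and the automatic deletion of instances that have stopped responding.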