Eureka Principles and Core Source Code Analysis

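I. Preface

In my view, when reading source code it is enough to follow the main line of logic. There is no need to understand everything unless you really need to. We read source code to learn how a component works and how it is implemented, which helps us track down bugs and learn from its design.

When reading code, don't get stuck on what a single spot means. Look at the big picture first, then the details.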

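II. How the Eureka Client Registers

Normally, registering a service with Eureka takes just four steps:
1. Create a Spring Boot project 2. Add the dependency 3. Configure the yml file 4. Start the project

Notice that nowhere in these steps can we find the entry point where the service registers itself with the Eureka server. So we need to understand how Spring works; without that, it is hard to understand how the Spring Cloud components work.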

1. Lifecycle and SmartLifecycle

Lifecycle and SmartLifecycle belong to Spring, so I won't go into detail here. See:
https://blog.youkuaiyun.com/bronze5/article/details/106558309

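2. Analysis of the registration code at startup

In the final stage of Spring startup, the container calls the start method of every bean that implements SmartLifecycle. The Eureka client relies on this: the registration process is driven by the start method of EurekaAutoServiceRegistration.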

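Overall, the start method in EurekaAutoServiceRegistration does three things:
1. Changes the instance status 2. Publishes an event notification (this step is what actually triggers the Eureka client's registration) 3. Registers the health check mechanism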
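
To make the startup hook concrete, here is a minimal plain-Java model of the idea. It does not use Spring: PhasedLifecycle, MiniContainer and FakeAutoRegistration are names I made up for illustration, and the real SmartLifecycle contract has more methods than shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for Spring's SmartLifecycle: only the parts needed here.
interface PhasedLifecycle {
    void start();
    boolean isRunning();
    int getPhase();
}

// Hypothetical stand-in for EurekaAutoServiceRegistration.
class FakeAutoRegistration implements PhasedLifecycle {
    private boolean running;
    private final List<String> log;

    FakeAutoRegistration(List<String> log) {
        this.log = log;
    }

    @Override
    public void start() {
        if (running) {
            return; // already started, nothing to do
        }
        log.add("start: kick off registration"); // placeholder for the real work
        running = true;
    }

    @Override
    public boolean isRunning() {
        return running;
    }

    @Override
    public int getPhase() {
        return 0;
    }
}

// Hypothetical container: at the end of startup it starts every lifecycle bean
// that is not yet running, lowest phase first.
class MiniContainer {
    private final List<PhasedLifecycle> beans = new ArrayList<>();

    void add(PhasedLifecycle bean) {
        beans.add(bean);
    }

    void finishRefresh() {
        beans.stream()
                .sorted(Comparator.comparingInt(PhasedLifecycle::getPhase))
                .filter(bean -> !bean.isRunning())
                .forEach(PhasedLifecycle::start);
    }
}
```

The point is only that application code never calls start itself: the container does, which is why no explicit registration call shows up in our own project.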

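We would expect the Eureka client to send its own instance information to the Eureka server when registering, but no such code appears in the start method. That part lives in the DiscoveryClient class in the com.netflix.discovery package.

DiscoveryClient has a register() method that registers with the Eureka server through an HTTP request. The code is as follows: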

    /**
     * Register with the eureka service by making the appropriate REST call.
     */
    boolean register() throws Throwable {
        logger.info(PREFIX + "{}: registering service...", appPathIdentifier);
        EurekaHttpResponse<Void> httpResponse;
        try {
            httpResponse = eurekaTransport.registrationClient.register(instanceInfo);
        } catch (Exception e) {
            logger.warn(PREFIX + "{} - registration failed {}", appPathIdentifier, e.getMessage(), e);
            throw e;
        }
        if (logger.isInfoEnabled()) {
            logger.info(PREFIX + "{} - registration status: {}", appPathIdentifier, httpResponse.getStatusCode());
        }
        return httpResponse.getStatusCode() == Status.NO_CONTENT.getStatusCode();
    }

AbstractJerseyEurekaHttpClient#register, where the registrationClient.register call above ends up:

    @Override
    public EurekaHttpResponse<Void> register(InstanceInfo info) {
        String urlPath = "apps/" + info.getAppName();
        ClientResponse response = null;
        try {
            Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
            addExtraHeaders(resourceBuilder);
            response = resourceBuilder
                    .header("Accept-Encoding", "gzip")
                    .type(MediaType.APPLICATION_JSON_TYPE)
                    .accept(MediaType.APPLICATION_JSON)
                    .post(ClientResponse.class, info);
            return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
        } finally {
            if (logger.isDebugEnabled()) {
                logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
                        response == null ? "N/A" : response.getStatus());
            }
            if (response != null) {
                response.close();
            }
        }
    }

Tracing upward from DiscoveryClient to see who calls register(), we find it is called from the run() method of InstanceInfoReplicator, which implements Runnable. The run() method is as follows:

    public void run() {
        try {
            discoveryClient.refreshInstanceInfo();

            Long dirtyTimestamp = instanceInfo.isDirtyWithTime();
            if (dirtyTimestamp != null) {
                discoveryClient.register();
                instanceInfo.unsetIsDirty(dirtyTimestamp);
            }
        } catch (Throwable t) {
            logger.warn("There was a problem with the instance info replicator", t);
        } finally {
            Future next = scheduler.schedule(this, replicationIntervalSeconds, TimeUnit.SECONDS);
            scheduledPeriodicRef.set(next);
        }
    }
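
The dirty-flag check in run() can be reduced to the following sketch. It is a simplified model under my own assumptions, not the Eureka class: DirtyFlagReplicator and its methods are hypothetical, the HTTP call is replaced by a counter, and the self-rescheduling in the finally block is left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of the "register only when dirty" logic.
class DirtyFlagReplicator implements Runnable {
    private boolean dirty;
    private long dirtyTimestamp;

    // Counts how often the (simulated) register call was made.
    final AtomicInteger registerCalls = new AtomicInteger();

    // Mark the local instance info as changed.
    synchronized void markDirty(long now) {
        dirty = true;
        dirtyTimestamp = now;
    }

    // Returns the timestamp of the last change if dirty, otherwise null.
    synchronized Long dirtyWithTime() {
        return dirty ? dirtyTimestamp : null;
    }

    // Clear the flag only if no newer change arrived in the meantime.
    synchronized void unsetDirty(long observedTimestamp) {
        if (dirtyTimestamp <= observedTimestamp) {
            dirty = false;
        }
    }

    @Override
    public void run() {
        Long observed = dirtyWithTime();
        if (observed != null) {
            registerCalls.incrementAndGet(); // stands in for the HTTP register call
            unsetDirty(observed);
        }
    }
}
```

Passing the observed timestamp back into unsetDirty is what keeps a change that arrives during the register call from being lost: the flag stays set and the next run picks it up.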

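InstanceInfoReplicator is used during DiscoveryClient initialization. DiscoveryClient has an initScheduledTasks() method, which mainly starts several scheduled tasks:

  • cacheRefreshTask: pulls the service list from the server, every 30 seconds by default; configurable via eureka.client.registryFetchIntervalSeconds;
  • heartbeatTask: renews the lease with the server (the heartbeat), every 30 seconds by default; configurable via eureka.instance.leaseRenewalIntervalInSeconds;
  • InstanceInfoReplicator: syncs InstanceInfo to the server, and is also triggered whenever instanceStatus changes. The first run is delayed by 40 seconds by default (eureka.client.initialInstanceInfoReplicationIntervalSeconds); after that it repeats every eureka.client.instanceInfoReplicationIntervalSeconds, 30 seconds by default.

Strictly speaking, the fixed intervals above are not exact: for the first two tasks Eureka uses TimedSupervisorTask (a periodic task that adjusts its own interval). It helps to understand that class before reading the code below.

What TimedSupervisorTask does: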
https://blog.youkuaiyun.com/boling_cavalry/article/details/82795825
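
As I understand it, the interval adjustment boils down to this: when a run times out, the delay doubles, capped at the base interval multiplied by expBackOffBound; when a run succeeds, the delay goes back to the base interval. The sketch below models only that rule. BackoffDelay is a hypothetical class of mine, not part of Eureka.

```java
// Models the delay rule only; no threads or executors involved.
class BackoffDelay {
    private final long baseMillis;
    private final long maxMillis;
    private long currentMillis;

    BackoffDelay(long baseMillis, int expBackOffBound) {
        this.baseMillis = baseMillis;
        this.maxMillis = baseMillis * expBackOffBound;
        this.currentMillis = baseMillis;
    }

    // A run timed out: double the delay, but never exceed the cap.
    long onTimeout() {
        currentMillis = Math.min(maxMillis, currentMillis * 2);
        return currentMillis;
    }

    // A run succeeded: fall back to the configured interval.
    long onSuccess() {
        currentMillis = baseMillis;
        return currentMillis;
    }

    long current() {
        return currentMillis;
    }
}
```

With a 30-second interval and a bound of 10, repeated timeouts give 60s, 120s, 240s and then stay at 300s until a run succeeds again.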

    /**
     * Initializes all scheduled tasks.
     */
    private void initScheduledTasks() {
        if (clientConfig.shouldFetchRegistry()) {
            // registry cache refresh timer
            int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
            int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
            cacheRefreshTask = new TimedSupervisorTask(
                    "cacheRefresh",
                    scheduler,
                    cacheRefreshExecutor,
                    registryFetchIntervalSeconds,
                    TimeUnit.SECONDS,
                    expBackOffBound,
                    new CacheRefreshThread()
            );
            //-----------cacheRefreshTask: sync the latest registry from the server-----------
            scheduler.schedule(
                    cacheRefreshTask,
                    registryFetchIntervalSeconds, TimeUnit.SECONDS);
        }

        if (clientConfig.shouldRegisterWithEureka()) {
            int renewalIntervalInSecs = instanceInfo.getLeaseInfo().getRenewalIntervalInSecs();
            int expBackOffBound = clientConfig.getHeartbeatExecutorExponentialBackOffBound();
            logger.info("Starting heartbeat executor: " + "renew interval is: {}", renewalIntervalInSecs);

            // Heartbeat timer
            heartbeatTask = new TimedSupervisorTask(
                    "heartbeat",
                    scheduler,
                    heartbeatExecutor,
                    renewalIntervalInSecs,
                    TimeUnit.SECONDS,
                    expBackOffBound,
                    new HeartbeatThread()
            );
            //-----------heartbeatTask: heartbeat (lease renewal)-----------
            scheduler.schedule(
                    heartbeatTask,
                    renewalIntervalInSecs, TimeUnit.SECONDS);

            // InstanceInfo replicator
            instanceInfoReplicator = new InstanceInfoReplicator(
                    this,
                    instanceInfo,
                    clientConfig.getInstanceInfoReplicationIntervalSeconds(),
                    2); // burstSize

            statusChangeListener = new ApplicationInfoManager.StatusChangeListener() {
                @Override
                public String getId() {
                    return "statusChangeListener";
                }

                @Override
                public void notify(StatusChangeEvent statusChangeEvent) {
                    if (InstanceStatus.DOWN == statusChangeEvent.getStatus() ||
                            InstanceStatus.DOWN == statusChangeEvent.getPreviousStatus()) {
                        // log at warn level if DOWN was involved
                        logger.warn("Saw local status change event {}", statusChangeEvent);
                    } else {
                        logger.info("Saw local status change event {}", statusChangeEvent);
                    }
                    instanceInfoReplicator.onDemandUpdate();
                }
            };

            if (clientConfig.shouldOnDemandUpdateStatusChange()) {
                applicationInfoManager.registerStatusChangeListener(statusChangeListener);
            }

            //-----------start instanceInfoReplicator, whose run method registers this instance with the Eureka server-----------
            instanceInfoReplicator.start(clientConfig.getInitialInstanceInfoReplicationIntervalSeconds());
        } else {
            logger.info("Not registering with Eureka server per configuration");
        }
    }

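3. Flow summary

When the Eureka client registers a service, there are two places where registration is carried out:

  1. During Spring Boot startup, the refresh method eventually leads to StatusChangeListener.notify, which listens for changes in the service status; once this listener receives the event, it performs the registration.
  2. During Spring Boot startup, auto-configuration puts CloudEurekaClient into the container and runs its constructor. The constructor starts a scheduled task that periodically (first after 40s, then every 30s by default) checks whether the instance information has changed and, if so, starts the registration flow.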
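
The first path is an ordinary listener pattern. A minimal sketch, with class names I invented (StatusChangeDemo, InfoManager, Listener) and a list entry standing in for the on-demand registration:

```java
import java.util.ArrayList;
import java.util.List;

class StatusChangeDemo {
    enum Status { STARTING, UP, DOWN }

    // Hypothetical counterpart of a status change listener.
    interface Listener {
        void onChange(Status previous, Status current);
    }

    // Hypothetical counterpart of the component that holds the local status.
    static class InfoManager {
        private Status status = Status.STARTING;
        private final List<Listener> listeners = new ArrayList<>();

        void register(Listener listener) {
            listeners.add(listener);
        }

        void setStatus(Status next) {
            Status previous = status;
            if (previous == next) {
                return; // nothing changed, so no event is fired
            }
            status = next;
            for (Listener listener : listeners) {
                listener.onChange(previous, next);
            }
        }
    }

    public static void main(String[] args) {
        List<String> registrations = new ArrayList<>();
        InfoManager manager = new InfoManager();
        // The listener reacts to a change by triggering an on-demand registration.
        manager.register((prev, cur) -> registrations.add(prev + "->" + cur));
        manager.setStatus(Status.UP); // fires the listener
        manager.setStatus(Status.UP); // same status, ignored
        System.out.println(registrations);
    }
}
```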


III. How the Eureka Server Stores Service Addresses

The client-side code that sends the registration request was covered above:
AbstractJerseyEurekaHttpClient#register:

    @Override
    public EurekaHttpResponse<Void> register(InstanceInfo info) {
        String urlPath = "apps/" + info.getAppName();
        ClientResponse response = null;
        try {
            Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
            addExtraHeaders(resourceBuilder);
            response = resourceBuilder
                    .header("Accept-Encoding", "gzip")
                    .type(MediaType.APPLICATION_JSON_TYPE)
                    .accept(MediaType.APPLICATION_JSON)
                    .post(ClientResponse.class, info);
            return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
        } finally {
            if (logger.isDebugEnabled()) {
                logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
                        response == null ? "N/A" : response.getStatus());
            }
            if (response != null) {
                response.close();
            }
        }
    }
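
Stripped of the Jersey plumbing, the request is a POST to serviceUrl + "apps/" + appName with a JSON body. The sketch below builds an equivalent request with the JDK's java.net.http API (Java 11+) without sending it. The service URL, the application name and the "{}" body are placeholders I chose; the real client serializes the whole InstanceInfo.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class RegisterRequestSketch {
    // Builds (but does not send) a request shaped like the registration call.
    static HttpRequest build(String serviceUrl, String appName, String instanceJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serviceUrl + "apps/" + appName))
                .header("Accept-Encoding", "gzip")
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(instanceJson))
                .build();
    }

    public static void main(String[] args) {
        // Assumed values, for illustration only.
        HttpRequest request = build("http://localhost:8761/eureka/", "DEMO-SERVICE", "{}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```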

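1. What the Eureka server does after receiving the request

The request entry point is: com.netflix.eureka.resources.ApplicationResource.addInstance()

The REST service here is implemented with Jersey. You can think of ApplicationResource as the equivalent of a Spring MVC Controller.

When the Eureka client calls register to start registration, ApplicationResource.addInstance is invoked. In other words, service registration is a POST request carrying the current instance information, handled by the addInstance method of ApplicationResource.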

    @POST
    @Consumes({"application/json", "application/xml"})
    public Response addInstance(InstanceInfo info,
                                @HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication) {
        logger.debug("Registering instance {} (replication={})", info.getId(), isReplication);
        // validate that the instanceinfo contains all the necessary required fields
        if (isBlank(info.getId())) {
            return Response.status(400).entity("Missing instanceId").build();
        } else if (isBlank(info.getHostName())) {
            return Response.status(400).entity("Missing hostname").build();
        } else if (isBlank(info.getIPAddr())) {
            return Response.status(400).entity("Missing ip address").build();
        } else if (isBlank(info.getAppName())) {
            return Response.status(400).entity("Missing appName").build();
        } else if (!appName.equals(info.getAppName())) {
            return Response.status(400).entity("Mismatched appName, expecting " + appName + " but was " + info.getAppName()).build();
        } else if (info.getDataCenterInfo() == null) {
            return Response.status(400).entity("Missing dataCenterInfo").build();
        } else if (info.getDataCenterInfo().getName() == null) {
            return Response.status(400).entity("Missing dataCenterInfo Name").build();
        }

        // handle cases where clients may be registering with bad DataCenterInfo with missing data
        DataCenterInfo dataCenterInfo = info.getDataCenterInfo();
        if (dataCenterInfo instanceof UniqueIdentifier) {
            String dataCenterInfoId = ((UniqueIdentifier) dataCenterInfo).getId();
            if (isBlank(dataCenterInfoId)) {
                boolean experimental = "true".equalsIgnoreCase(serverConfig.getExperimental("registration.validation.dataCenterInfoId"));
                if (experimental) {
                    String entity = "DataCenterInfo of type " + dataCenterInfo.getClass() + " must contain a valid id";
                    return Response.status(400).entity(entity).build();
                } else if (dataCenterInfo instanceof AmazonInfo) {
                    AmazonInfo amazonInfo = (AmazonInfo) dataCenterInfo;
                    String effectiveId = amazonInfo.get(AmazonInfo.MetaDataKey.instanceId);
                    if (effectiveId == null) {
                        amazonInfo.getMetadata().put(AmazonInfo.MetaDataKey.instanceId.getName(), info.getId());
                    }
                } else {
                    logger.warn("Registering DataCenterInfo of type {} without an appropriate id", dataCenterInfo.getClass());
                }
            }
        }

        registry.register(info, "true".equals(isReplication));
        return Response.status(204).build();  // 204 to be backwards compatible
    }

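In addInstance, registry.register ultimately calls PeerAwareInstanceRegistryImpl.register.

  • leaseDuration is the lease expiration time, 90s by default: if the server receives no heartbeat from the client for more than 90s, it evicts that instance
  • super.register performs the actual registration
  • The information is then replicated to the other machines in the Eureka server cluster. Replication is simple: get all the nodes in the cluster and send the registration to each in turn. One point to note: a replication request carries the custom x-netflix-discovery-replication header, and when its value is true the receiving node does not replicate again, which avoids an endless replication loop.

    @Override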
    public void register(final InstanceInfo info, final boolean isReplication) {
        int leaseDuration = Lease.DEFAULT_DURATION_IN_SECS;
        // The lease duration defaults to 90 seconds; if the client specifies its own, use the client's value
        if (info.getLeaseInfo() != null && info.getLeaseInfo().getDurationInSecs() > 0) {
            leaseDuration = info.getLeaseInfo().getDurationInSecs();
        }
        // Register the instance
        super.register(info, leaseDuration, isReplication);
        // Replicate to the other nodes in the Eureka server cluster
        replicateToPeers(Action.Register, info.getAppName(), info.getId(), info, null, isReplication);
    }
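
The lease idea can be modeled in a few lines. This is a deliberately simplified sketch of "use the client's duration if given, otherwise 90s, and expire when no renewal arrives in time"; LeaseSketch is a hypothetical class, and the real Lease differs in its details.

```java
// Simplified lease: time is passed in explicitly so the behavior is easy to follow.
class LeaseSketch {
    static final int DEFAULT_DURATION_SECS = 90;

    private final long durationMs;
    private long lastRenewalMs;

    LeaseSketch(int clientDurationSecs, long nowMs) {
        // Fall back to the default when the client did not supply a positive value.
        int secs = clientDurationSecs > 0 ? clientDurationSecs : DEFAULT_DURATION_SECS;
        this.durationMs = secs * 1000L;
        this.lastRenewalMs = nowMs;
    }

    // A heartbeat arrived: the lease is counted from this moment again.
    void renew(long nowMs) {
        lastRenewalMs = nowMs;
    }

    // Expired once more than the lease duration has passed since the last renewal.
    boolean isExpired(long nowMs) {
        return nowMs > lastRenewalMs + durationMs;
    }
}
```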

AbstractInstanceRegistry#register

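This is the core of the registration logic. The client's address information is stored in this class's field private final ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>> registry = new ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>>()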

    /**
     * Registers a new instance with a given duration.
     *
     * @see com.netflix.eureka.lease.LeaseManager#register(java.lang.Object, int, boolean)
     */
    public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
        try {
            read.lock();
            // Look up the instances for this appName in registry (registry is the container that stores service information)
            Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
            // Count this registration in the monitoring stats
            REGISTER.increment(isReplication);
            if (gMap == null) {
                // First registration for this appName: initialize a ConcurrentHashMap
                final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
                gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
                if (gMap == null) {
                    gMap = gNewMap;
                }
            }
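            // Look up the existing Lease in gMap. A Lease wraps the provider's instance information and manages the lease for that service instance
            Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
            // If the instance already exists, compare it with the one sent by the client; the one with the newer timestamp is the valid instance information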
            if (existingLease != null && (existingLease.getHolder() != null)) {
                Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
                Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
                logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);

                // this is a > instead of a >= because if the timestamps are equal, we still take the remote transmitted
                // InstanceInfo instead of the server local copy.
                if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
                    logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
                            " than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
                    logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
                    registrant = existingLease.getHolder();
                }
            } else {
                // No existing lease: this branch runs
                synchronized (lock) {
                    if (this.expectedNumberOfClientsSendingRenews > 0) {
                        // Since the client wants to register it, increase the number of clients sending renews
                        this.expectedNumberOfClientsSendingRenews = this.expectedNumberOfClientsSendingRenews + 1;
                        updateRenewsPerMinThreshold();
                    }
                }
                logger.debug("No previous lease information found; it is new registration");
            }
            // Build a new lease
            Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
            if (existingLease != null) {
                // A lease already existed: carry over serviceUpTimestamp so the service-up time stays that of the first registration
                lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
            }
            // Store it
            gMap.put(registrant.getId(), lease);
            recentRegisteredQueue.add(new Pair<Long, String>(
                    System.currentTimeMillis(),
                    registrant.getAppName() + "(" + registrant.getId() + ")"));
            // If the registrant carries an overridden status that is not yet recorded, record it in overriddenInstanceStatusMap
            if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
                logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
                                + "overrides", registrant.getOverriddenStatus(), registrant.getId());
                if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
                    logger.info("Not found overridden id {} and hence adding it", registrant.getId());
                    overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
                }
            }
            InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
            if (overriddenStatusFromMap != null) {
                logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
                registrant.setOverriddenStatus(overriddenStatusFromMap);
            }

            // Set the status based on the overridden status rules
            InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
            registrant.setStatusWithoutDirty(overriddenInstanceStatus);

            // If the resulting instance status is UP, mark the lease as up
            if (InstanceStatus.UP.equals(registrant.getStatus())) {
                lease.serviceUp();
            }
            registrant.setActionType(ActionType.ADDED);
            recentlyChangedQueue.add(new RecentlyChangedItem(lease));
            registrant.setLastUpdatedTimestamp();
            // Invalidate the cache
            invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
            logger.info("Registered instance {}/{} with status {} (replication={})",
                    registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
        } finally {
            read.unlock();
        }
    }
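
Leaving out leases, status overrides and metrics, the storage part of this method is a two-level map (appName, then instance id) plus a "newer dirty timestamp wins" rule. A minimal sketch under those assumptions, with hypothetical names (RegistrySketch, Instance):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RegistrySketch {
    // Hypothetical, trimmed-down instance record.
    static final class Instance {
        final String appName;
        final String id;
        final long lastDirtyTimestamp;

        Instance(String appName, String id, long lastDirtyTimestamp) {
            this.appName = appName;
            this.id = id;
            this.lastDirtyTimestamp = lastDirtyTimestamp;
        }
    }

    // appName -> (instanceId -> instance), the same shape as the registry field.
    private final ConcurrentHashMap<String, Map<String, Instance>> registry =
            new ConcurrentHashMap<>();

    // Stores the instance and returns whichever copy ended up in the registry.
    Instance register(Instance incoming) {
        Map<String, Instance> byId =
                registry.computeIfAbsent(incoming.appName, key -> new ConcurrentHashMap<>());
        // Keep the stored copy only if it is strictly newer; on a tie the incoming copy wins.
        return byId.merge(incoming.id, incoming,
                (stored, fresh) -> stored.lastDirtyTimestamp > fresh.lastDirtyTimestamp ? stored : fresh);
    }

    int instanceCount(String appName) {
        Map<String, Instance> byId = registry.get(appName);
        return byId == null ? 0 : byId.size();
    }
}
```

I used computeIfAbsent and merge here for brevity; the original code does the same job with putIfAbsent and an explicit comparison.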

PeerAwareInstanceRegistryImpl#replicateToPeers
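
The loop-prevention rule described earlier can be illustrated with a small in-memory model. It is not the Eureka implementation: PeerNodeSketch is a hypothetical class, direct method calls replace HTTP requests, and a boolean parameter replaces the x-netflix-discovery-replication header.

```java
import java.util.ArrayList;
import java.util.List;

class PeerNodeSketch {
    final String name;
    final List<PeerNodeSketch> peers = new ArrayList<>();
    final List<String> registered = new ArrayList<>();
    int requestsReceived;

    PeerNodeSketch(String name) {
        this.name = name;
    }

    void register(String instanceId, boolean isReplication) {
        requestsReceived++;
        registered.add(instanceId);
        if (isReplication) {
            return; // this request is itself a replica: store it, but do not forward it
        }
        for (PeerNodeSketch peer : peers) {
            peer.register(instanceId, true); // forwarded copies are flagged as replication
        }
    }
}
```

Without the flag, each peer would forward the registration back to the others and the requests would never stop. With it, a three-node cluster handles one client registration with exactly one request per node.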


2. The Eureka server's multi-level cache

References:
https://www.cnblogs.com/shihaiming/p/11590748.html
https://www.shared-code.com/article/53

Summary:
The Eureka server keeps service registration information in three variables: registry, readWriteCacheMap and readOnlyCacheMap.

Class AbstractInstanceRegistry

private final ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>> registry
            = new ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>>();

Class ResponseCacheImpl

private final ConcurrentMap<Key, Value> readOnlyCacheMap = new ConcurrentHashMap<Key, Value>();
private final LoadingCache<Key, Value> readWriteCacheMap;

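With large-scale service registration and updates, modifying only the data in one ConcurrentHashMap would inevitably cause lock contention and hurt performance. Eureka is an AP system that only needs eventual consistency, so it uses a multi-level cache here to separate reads from writes.

registry: updated when a service goes offline, expires, registers, or changes status. The service information shown on the Eureka server web page is read from registry.

readWriteCacheMap:
1. Entries are invalidated whenever a service goes offline, expires, registers, or changes status.
2. Entries expire automatically after 180 seconds by default.
3. Every 30 seconds readOnlyCacheMap syncs from readWriteCacheMap. When readWriteCacheMap has no data for a key, it loads it from registry, so the 30-second task also causes invalidated readWriteCacheMap entries to be reloaded.

readOnlyCacheMap: a read-only cache backed by a plain in-JVM ConcurrentHashMap, used mainly when clients fetch registration information. It is refreshed by a timer (every 30 seconds by default) that compares its values with those in readWriteCacheMap; when they differ, the readWriteCacheMap value wins.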
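
How the three levels interact is easier to see in a toy version. The sketch below uses plain maps and string payloads; TwoLevelCacheSketch and its methods are hypothetical, the timer is replaced by an explicit syncReadOnly() call, and the 180-second expiry is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TwoLevelCacheSketch {
    private final Map<String, String> registry = new ConcurrentHashMap<>();       // source of truth
    private final Map<String, String> readWriteCache = new ConcurrentHashMap<>(); // invalidated on writes
    private final Map<String, String> readOnlyCache = new ConcurrentHashMap<>();  // refreshed by the timer

    // A write goes to the registry and invalidates the read-write cache entry.
    void register(String app, String payload) {
        registry.put(app, payload);
        readWriteCache.remove(app);
    }

    // Read-write cache lookup that loads from the registry on a miss.
    private String loadReadWrite(String app) {
        return readWriteCache.computeIfAbsent(app, registry::get);
    }

    // Client reads hit the read-only cache first and fall through on a miss.
    String get(String app) {
        String value = readOnlyCache.get(app);
        if (value == null) {
            value = loadReadWrite(app);
            if (value != null) {
                readOnlyCache.put(app, value);
            }
        }
        return value;
    }

    // Stand-in for the periodic task: copy over any entry that differs.
    void syncReadOnly() {
        for (String app : readOnlyCache.keySet()) {
            String fresh = loadReadWrite(app);
            if (fresh != null && !fresh.equals(readOnlyCache.get(app))) {
                readOnlyCache.put(app, fresh);
            }
        }
    }
}
```

The trade-off is visible right away: after a change, a client keeps seeing the old value until the next sync, which is the price paid for keeping client reads away from the registry.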

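IV. Summary

When I read the source myself, I was confused in many places, and the more I read the more lost I got. Knowing the general logic and flow is usually good enough; there's no need to understand every detail. After reading the Eureka source I don't feel there is much to summarize, so I'll just sketch it here.

Eureka client
On startup the Eureka client registers itself with the Eureka server, which is really just an HTTP request; when the server receives it, it stores the client's information in a ConcurrentHashMap. The same goes for when the client shuts down.

Besides registering with the server, the other important thing at client startup is that auto-configuration starts several scheduled tasks (TimedSupervisorTask, a periodic task that adjusts its own interval), such as periodic lease renewal (heartbeat), periodically pulling registration information from the server, and periodically pushing its own information to the server.

Eureka server
The key thing on the Eureka server is that it stores every instance's information in a ConcurrentHashMap. To address the performance problems caused by contention, it also introduces a multi-level cache.

There is also how the Eureka server cluster synchronizes registration information, and how the endless loop that synchronization could cause is avoided.

And the server's self-preservation mechanism: https://blog.youkuaiyun.com/qq_35080214/article/details/109443045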
