1. Preparation before reading the source
Configure the data source
Run the nacos-mysql.sql script found in the conf folder of the distribution module, then add the following database settings to application.properties in the console module:
    spring.datasource.platform=mysql
    db.num=1
    ### Connect URL of DB:
    db.url.0=jdbc:mysql://127.0.0.1:3306/nacos?characterEncoding=utf8&connectTimeout=1000&socketTimeout=3000&autoReconnect=true&useUnicode=true&useSSL=false&serverTimezone=UTC
    db.user=root
    db.password=123456
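2. Reading the source
2.1 The startup class com.alibaba.nacos.Nacos
Add the startup parameters -Dnacos.standalone=true -Dnacos.home=D:\nacos and start the project.
Questions:
①: How does the client keep a long-lived connection with the server?
②: How are updates pushed after a configuration changes?
When we add and publish a configuration through the console, the code that actually runs is com.alibaba.nacos.config.server.controller.ConfigController#publishConfig.
It mainly does two things:
Save the configuration to the database: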
persistService.insertOrUpdate(srcIp, srcUser, configInfo, time, configAdvanceInfo, true);
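persistService has two implementations in Nacos, one backed by embedded storage and one backed by external storage.
Each implementation is loaded as a Component, and whether it is loaded is decided by an @Conditional annotation. For the external-storage implementation this is @Conditional(value = ConditionOnExternalStorage.class):
    public class ConditionOnExternalStorage implements Condition {

        @Override
        public boolean matches(ConditionContext context, AnnotatedTypeMetadata metadata) {
            return !PropertyUtil.isEmbeddedStorage();
        }
    }
Publish a configuration change event: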
ConfigChangePublisher.notifyConfigChange(new ConfigDataChangeEvent(false, dataId, group, tenant, time.getTime()));
Since an event is published here, something must be listening for it. Searching the code leads to AsyncNotifyService.
com.alibaba.nacos.config.server.service.ConfigChangePublisher#notifyConfigChange
    public static void notifyConfigChange(ConfigDataChangeEvent event) {
        if (PropertyUtil.isEmbeddedStorage() && !ApplicationUtils.getStandaloneMode()) {
            return;
        }
        NotifyCenter.publishEvent(event);
    }
com.alibaba.nacos.common.notify.NotifyCenter#publishEvent(Event)
    public static boolean publishEvent(final Event event) {
        try {
            return publishEvent(event.getClass(), event);
        } catch (Throwable ex) {
            LOGGER.error("There was an exception to the message publishing : {}", ex);
            return false;
        }
    }

    private static boolean publishEvent(final Class<? extends Event> eventType, final Event event) {
        final String topic = ClassUtils.getCanonicalName(eventType);
        if (ClassUtils.isAssignableFrom(SlowEvent.class, eventType)) {
            return INSTANCE.sharePublisher.publish(event);
        }

        if (INSTANCE.publisherMap.containsKey(topic)) {
            EventPublisher publisher = INSTANCE.publisherMap.get(topic);
            return publisher.publish(event);
        }
        LOGGER.warn("There are no [{}] publishers for this event, please register", topic);
        return false;
    }
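The code above looks up a publisher in an event-to-publisher map, keyed by the event's class name, and then asks that publisher to publish the event. This raises two questions:
When is publisherMap populated?
Searching the code shows that publisherMap is populated in com.alibaba.nacos.common.notify.NotifyCenter#registerToPublisher.
When is registerToPublisher called?
Following the call chain, the init() method invoked from the ServerMemberManager constructor ends up calling NotifyCenter#registerToPublisher.
Once this has run, publisherMap mainly holds three entries.
That answers both questions. ServerMemberManager itself is covered in more detail in section 2.4.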
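To make the lookup concrete, here is a minimal sketch of a "topic to publisher" map keyed by the event's canonical class name. MiniNotifyCenter, RecordingPublisher and the two event classes are hypothetical stand-ins written for illustration. They are not the Nacos classes, and the SlowEvent branch is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical miniature of the "topic -> publisher" lookup; not the Nacos implementation.
class MiniNotifyCenter {
    interface Event {}

    // Stand-in for an event publisher: it only records what it was asked to publish.
    static class RecordingPublisher {
        final List<Event> published = new ArrayList<>();

        boolean publish(Event event) {
            published.add(event);
            return true;
        }
    }

    private final Map<String, RecordingPublisher> publisherMap = new ConcurrentHashMap<>();

    // Register a publisher for an event type; the topic is the canonical class name.
    RecordingPublisher registerToPublisher(Class<? extends Event> eventType) {
        return publisherMap.computeIfAbsent(eventType.getCanonicalName(), k -> new RecordingPublisher());
    }

    // Look up the publisher by topic; return false when nobody registered the type.
    boolean publishEvent(Event event) {
        RecordingPublisher publisher = publisherMap.get(event.getClass().getCanonicalName());
        return publisher != null && publisher.publish(event);
    }

    static class ConfigChanged implements Event {}

    static class Unregistered implements Event {}
}
```

Publishing an event whose type was never registered simply returns false, which mirrors the warning branch in the excerpt above.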
2.2 Publishing a configuration
2.2.1 Method call chain
NacosConfigService#publishConfig
NacosConfigService#publishConfigInner
ConfigController#publishConfig
ConfigChangePublisher#notifyConfigChange
NotifyCenter#publishEvent(Event)
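NacosConfigService#publishConfigInner mainly issues a POST request to /v1/cs/configs to publish the configuration. The controller behind this endpoint is ConfigController, and the method is publishConfig.
NotifyCenter#publishEvent does its work through com.alibaba.nacos.common.notify.EventPublisher#publish, which is ultimately carried out by DefaultPublisher#publish:
    @Override
    public boolean publish(Event event) {
        checkIsStart();
        boolean success = this.queue.offer(event);
        if (!success) {
            receiveEvent(event);
            return true;
        }
        return true;
    }
So the publisher does not deliver the event directly. It first puts the event into a blocking queue, and only if the queue is full does it fall back to calling receiveEvent itself. We have seen events being put into the queue; when are they taken out?
DefaultPublisher#openEventHandler takes events from the queue and then calls receiveEvent, so receiveEvent is where the real work happens.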
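The queue-with-fallback behaviour can be sketched as follows. QueuePublisherSketch is a hypothetical class that uses plain strings as events. The drainOne method stands in for one iteration of the consumer loop, so the example stays single-threaded and deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of "offer to a bounded queue, deliver directly when it is full".
class QueuePublisherSketch {
    private final BlockingQueue<String> queue;
    final List<String> delivered = new ArrayList<>();

    QueuePublisherSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Try to enqueue; when the queue is full, deliver on the caller's thread instead.
    boolean publish(String event) {
        if (!queue.offer(event)) {
            receiveEvent(event);
        }
        return true;
    }

    // One step of what a consumer thread would do in a loop: take a queued event and deliver it.
    boolean drainOne() {
        String event = queue.poll();
        if (event == null) {
            return false;
        }
        receiveEvent(event);
        return true;
    }

    void receiveEvent(String event) {
        delivered.add(event);
    }
}
```

One consequence of this design, at least in the sketch, is that an event delivered through the overflow path can overtake events that are still waiting in the queue.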
DefaultPublisher#receiveEvent:
    void receiveEvent(Event event) {
        final long currentEventSequence = event.sequence();

        for (Subscriber subscriber : subscribers) {
            if (subscriber.ignoreExpireEvent() && lastEventSequence > currentEventSequence) {
                LOGGER.debug("[NotifyCenter] the {} is unacceptable to this subscriber, because had expire",
                        event.getClass());
                continue;
            }
            notifySubscriber(subscriber, event);
        }
    }
This is a classic implementation of the observer pattern: when an event arrives, iterate over all subscribers and notify each one (by calling its onEvent method).
DefaultPublisher#notifySubscriber
    @Override
    public void notifySubscriber(final Subscriber subscriber, final Event event) {
        final Runnable job = new Runnable() {
            @Override
            public void run() {
                subscriber.onEvent(event);
            }
        };

        final Executor executor = subscriber.executor();
        if (executor != null) {
            executor.execute(job);
        } else {
            try {
                job.run();
            } catch (Throwable e) {
                LOGGER.error("Event callback exception : {}", e);
            }
        }
    }
The only twist is that the onEvent call is wrapped in a Runnable: if the subscriber has its own executor, the Runnable is handed to it; otherwise it runs directly on the current thread.
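The "executor if present, otherwise inline" dispatch can be tried in isolation. In this minimal sketch, DispatchSketch is a hypothetical helper, the subscriber is reduced to a java.util.function.Consumer, and logging is replaced by System.err.

```java
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical sketch of dispatching a callback either on an executor or inline.
class DispatchSketch {

    // Run the callback on the given executor if there is one, otherwise on the calling thread.
    static <E> void notifySubscriber(Consumer<E> onEvent, Executor executor, E event) {
        Runnable job = () -> onEvent.accept(event);
        if (executor != null) {
            executor.execute(job);
        } else {
            try {
                job.run();
            } catch (Throwable e) {
                // Inline callbacks must not break the publishing loop, so the error is only reported.
                System.err.println("Event callback exception: " + e);
            }
        }
    }
}
```

Note that only the inline path catches exceptions here. On the executor path, error handling is left to whatever executor the subscriber supplies.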
One question remains: when are the subscribers added to DefaultPublisher?
They are added through NotifyCenter#registerSubscriber, which is called in the AsyncNotifyService constructor. AsyncNotifyService is a Spring bean, so its constructor runs as soon as the container starts:
    public AsyncNotifyService(ServerMemberManager memberManager) {
        this.memberManager = memberManager;
        httpclient.start();

        // Register ConfigDataChangeEvent with NotifyCenter. In essence this adds an
        // "event name -> event publisher" entry to Map<String, EventPublisher> publisherMap.
        NotifyCenter.registerToPublisher(ConfigDataChangeEvent.class, NotifyCenter.ringBufferSize);

        // Register a Subscriber for ConfigDataChangeEvent; onEvent is called whenever config data changes.
        NotifyCenter.registerSubscriber(new Subscriber() {

            @Override
            public void onEvent(Event event) {
                // Generate ConfigDataChangeEvent concurrently
                if (event instanceof ConfigDataChangeEvent) {
                    ConfigDataChangeEvent evt = (ConfigDataChangeEvent) event;
                    long dumpTs = evt.lastModifiedTs;
                    String dataId = evt.dataId;
                    String group = evt.group;
                    String tenant = evt.tenant;
                    String tag = evt.tag;
                    Collection<Member> ipList = memberManager.allMembers();

                    // Wrap the change information in one NotifySingleTask per member and queue it.
                    Queue<NotifySingleTask> queue = new LinkedList<NotifySingleTask>();
                    for (Member member : ipList) {
                        queue.add(new NotifySingleTask(dataId, group, tenant, tag, dumpTs, member.getAddress(),
                                evt.isBeta));
                    }
                    // Run the queued tasks asynchronously.
                    ConfigExecutor.executeAsyncNotify(new AsyncTask(httpclient, queue));
                }
            }

            @Override
            public Class<? extends Event> subscribeType() {
                return ConfigDataChangeEvent.class;
            }
        });
    }
2.2.2 NotifySingleTask:
It is a subclass of NotifyTask and only carries some basic information:
    public NotifySingleTask(String dataId, String group, String tenant, String tag, long lastModified,
            String target, boolean isBeta) {
        super(dataId, group, tenant, lastModified);
        this.target = target;
        this.isBeta = isBeta;
        try {
            dataId = URLEncoder.encode(dataId, Constants.ENCODE);
            group = URLEncoder.encode(group, Constants.ENCODE);
        } catch (UnsupportedEncodingException e) {
            LOGGER.error("URLEncoder encode error", e);
        }
        if (StringUtils.isBlank(tenant)) {
            this.url = MessageFormat.format(URL_PATTERN, target, ApplicationUtils.getContextPath(), dataId, group);
        } else {
            this.url = MessageFormat
                    .format(URL_PATTERN_TENANT, target, ApplicationUtils.getContextPath(), dataId, group, tenant);
        }
        if (StringUtils.isNotEmpty(tag)) {
            url = url + "&tag=" + tag;
        }
        failCount = 0;
    }
One important field here is url. Debugging shows its value to be:
http://192.168.1.6:8848/nacos/v1/cs/communication/dataChange?dataId=dd&group=DEFAULT_GROUP
Looking further, the controller behind this URL is CommunicationController.
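The URL assembly in the constructor can be reproduced with a small standalone sketch. The two pattern constants below are assumptions reverse-engineered from the debugged URL above. They are not copied from the Nacos source, and NotifyUrlSketch is a hypothetical class name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.MessageFormat;

// Hypothetical sketch of building the data-change notification URL.
class NotifyUrlSketch {
    // Assumed patterns, inferred from the debugged URL; the real constants may differ.
    static final String URL_PATTERN = "http://{0}{1}/v1/cs/communication/dataChange?dataId={2}&group={3}";
    static final String URL_PATTERN_TENANT = URL_PATTERN + "&tenant={4}";

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encode dataId and group, pick the pattern by tenant, then append the optional tag.
    static String buildUrl(String target, String contextPath, String dataId, String group,
            String tenant, String tag) {
        String url;
        if (tenant == null || tenant.trim().isEmpty()) {
            url = MessageFormat.format(URL_PATTERN, target, contextPath, encode(dataId), encode(group));
        } else {
            url = MessageFormat.format(URL_PATTERN_TENANT, target, contextPath, encode(dataId),
                    encode(group), tenant);
        }
        if (tag != null && !tag.isEmpty()) {
            url = url + "&tag=" + tag;
        }
        return url;
    }
}
```

Calling buildUrl("192.168.1.6:8848", "/nacos", "dd", "DEFAULT_GROUP", "", null) reproduces the URL seen in the debugger.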
2.2.3 AsyncTask:
AsyncTask implements Runnable, so it is a task waiting to be executed. It holds the queue built earlier and the httpClient:
    class AsyncTask implements Runnable {

        public AsyncTask(CloseableHttpAsyncClient httpclient, Queue<NotifySingleTask> queue) {
            this.httpclient = httpclient;
            this.queue = queue;
        }
    }
The queue contains the NotifySingleTask objects we added, each with a url field. The task takes each NotifySingleTask from the queue and uses httpclient to call its url, which ends up in CommunicationController#notifyConfigInfo:
    @GetMapping("/dataChange")
    public Boolean notifyConfigInfo(HttpServletRequest request, @RequestParam("dataId") String dataId,
            @RequestParam("group") String group,
            @RequestParam(value = "tenant", required = false, defaultValue = StringUtils.EMPTY) String tenant,
            @RequestParam(value = "tag", required = false) String tag) {
        dataId = dataId.trim();
        group = group.trim();
        String lastModified = request.getHeader(NotifyService.NOTIFY_HEADER_LAST_MODIFIED);
        long lastModifiedTs = StringUtils.isEmpty(lastModified) ? -1 : Long.parseLong(lastModified);
        String handleIp = request.getHeader(NotifyService.NOTIFY_HEADER_OP_HANDLE_IP);
        String isBetaStr = request.getHeader("isBeta");
        if (StringUtils.isNotBlank(isBetaStr) && trueStr.equals(isBetaStr)) {
            dumpService.dump(dataId, group, tenant, lastModifiedTs, handleIp, true);
        } else {
            dumpService.dump(dataId, group, tenant, tag, lastModifiedTs, handleIp);
        }
        return true;
    }
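2.3 Fetching a configuration
Nacos wraps fetching and publishing configuration in the ConfigService interface, whose only implementation is NacosConfigService, so that is the class to look at.
Two fields of NacosConfigService deserve attention: HttpAgent and ClientWorker. The latter is the key to the long connection. For fetching configuration, we focus on the variant that also takes a listener callback.
2.3.1 Fetching a configuration and registering a listener
    @Override
    public String getConfigAndSignListener(String dataId, String group, long timeoutMs, Listener listener)
            throws NacosException {
        String content = getConfig(dataId, group, timeoutMs);
        worker.addTenantListenersWithContent(dataId, group, content, Arrays.asList(listener));
        return content;
    }
This method fetches the configuration content and registers a listener at the same time, so that later updates are received. getConfig is covered in detail below.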
ClientWorker#addTenantListeners
    public void addTenantListeners(String dataId, String group, List<? extends Listener> listeners)
            throws NacosException {
        group = null2defaultGroup(group);
        String tenant = agent.getTenant();
        // Build the cache entry from dataId, group and tenant, then attach the listeners to it;
        // they are notified whenever the cached data changes.
        CacheData cache = addCacheDataIfAbsent(dataId, group, tenant);
        for (Listener listener : listeners) {
            cache.addListener(listener);
        }
    }
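The idea of attaching listeners to a per-key cache entry can be sketched without any Nacos types. ListenerCacheSketch, its key format and the use of Consumer as the listener type are all assumptions made for illustration. The real CacheData tracks more state, and this excerpt does not show how it detects changes.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch: one cache entry per (dataId, group, tenant), with listeners attached to it.
class ListenerCacheSketch {

    static class CacheData {
        final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
        private String content;

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }

        // Notify listeners only when the content actually changed.
        void update(String newContent) {
            if (newContent.equals(content)) {
                return;
            }
            content = newContent;
            for (Consumer<String> listener : listeners) {
                listener.accept(newContent);
            }
        }
    }

    private final Map<String, CacheData> cacheMap = new ConcurrentHashMap<>();

    // Assumed key format, for illustration only.
    static String key(String dataId, String group, String tenant) {
        return dataId + "+" + group + "+" + tenant;
    }

    // Create the entry on first use and reuse it afterwards.
    CacheData addCacheDataIfAbsent(String dataId, String group, String tenant) {
        return cacheMap.computeIfAbsent(key(dataId, group, tenant), k -> new CacheData());
    }

    void addListener(String dataId, String group, String tenant, Consumer<String> listener) {
        addCacheDataIfAbsent(dataId, group, tenant).addListener(listener);
    }
}
```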
NacosConfigService#getConfig
    @Override
    public String getConfig(String dataId, String group, long timeoutMs) throws NacosException {
        return getConfigInner(namespace, dataId, group, timeoutMs);
    }
NacosConfigService#getConfigInner
    private String getConfigInner(String tenant, String dataId, String group, long timeoutMs) throws NacosException {
        ConfigResponse cr = new ConfigResponse();

        // Prefer the local failover configuration.
        String content = LocalConfigInfoProcessor.getFailover(agent.getName(), dataId, group, tenant);
        if (content != null) {
            // The content is not always logged in full: content shorter than 100 characters
            // is logged as-is, anything longer is truncated to the first 100.
            LOGGER.warn("[{}] [get-config] get failover ok, dataId={}, group={}, tenant={}, config={}",
                    agent.getName(), dataId, group, tenant, ContentUtils.truncateContent(content));
            cr.setContent(content);
            configFilterChainManager.doFilter(null, cr);
            content = cr.getContent();
            return content;
        }

        try {
            // Reaching this point means there is no local copy (typically the first fetch),
            // so pull from the remote server.
            String[] ct = worker.getServerConfig(dataId, group, tenant, timeoutMs);
            cr.setContent(ct[0]);
            configFilterChainManager.doFilter(null, cr);
            content = cr.getContent();
            return content;
        } catch (NacosException ioe) {
            // error handling is not shown in this excerpt
        }
    }
ClientWorker#getServerConfig
This method mainly uses the HTTP client to call the /v1/cs/configs endpoint of the config service to pull the data, and saves a copy locally at the same time so that the local data can be used directly later.
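The lookup order described above (local failover first, then the remote server, saving a local copy on the way) can be sketched with in-memory maps. LocalFirstConfigSketch is hypothetical: the two maps stand in for local files and the Function stands in for the HTTP call. It does not model error handling or when the saved copy is read back.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of "local failover first, remote second, save a copy of what was pulled".
class LocalFirstConfigSketch {
    final Map<String, String> failover = new HashMap<>(); // stand-in for local failover files
    final Map<String, String> snapshot = new HashMap<>(); // stand-in for the locally saved copy
    private final Function<String, String> remote;        // stand-in for the HTTP pull
    int remoteCalls;

    LocalFirstConfigSketch(Function<String, String> remote) {
        this.remote = remote;
    }

    String getConfig(String key) {
        // 1. A local failover value wins and the server is never contacted.
        String local = failover.get(key);
        if (local != null) {
            return local;
        }
        // 2. Otherwise pull from the server and keep a local copy of the result.
        remoteCalls++;
        String content = remote.apply(key);
        snapshot.put(key, content);
        return content;
    }
}
```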
2.4 ServerMemberManager explained
This class implements ApplicationListener, so it is an event listener. In Nacos it does more than that, though: according to its API documentation, it is mainly responsible for managing cluster members.
    public class ServerMemberManager implements ApplicationListener<WebServerInitializedEvent> {}
    @Override
    public void onApplicationEvent(WebServerInitializedEvent event) {
        getSelf().setState(NodeState.UP);
        if (!ApplicationUtils.getStandaloneMode()) {
            GlobalExecutor.scheduleByCommon(this.infoReportTask, 5_000L);
        }
        ApplicationUtils.setPort(event.getWebServer().getPort());
        ApplicationUtils.setLocalAddress(this.localAddress);
        Loggers.CLUSTER.info("This node is ready to provide external services");
    }
Listening for WebServerInitializedEvent gives access to the web server's port, and also allows calling start() and stop() on the WebServer. In the same spirit, we could implement ServletContextInitializer and read information from the servletContext in its onStartup(ServletContext servletContext) method.
In registerClusterEvent we see a call to NotifyCenter.registerSubscriber. This method is in fact called in many places in the Nacos code.
So whenever a matching event is published, the onEvent method of these subscribers is called.
2.5 LongPollingService
As the name suggests, this is the long-polling service. It mainly reacts to changes in local data: its constructor registers a subscriber for the local data change event:
    public LongPollingService() {
        // allSubs is a Queue<ClientLongPolling> that holds the long-polling clients.
        allSubs = new ConcurrentLinkedQueue<ClientLongPolling>();

        ConfigExecutor.scheduleLongPolling(new StatTask(), 0L, 10L, TimeUnit.SECONDS);

        // Register LocalDataChangeEvent to NotifyCenter.
        NotifyCenter.registerToPublisher(LocalDataChangeEvent.class, NotifyCenter.ringBufferSize);

        // Register A Subscriber to subscribe LocalDataChangeEvent.
        NotifyCenter.registerSubscriber(new Subscriber() {

            @Override
            public void onEvent(Event event) {
                if (isFixedPolling()) {
                    // Do nothing.
                } else {
                    if (event instanceof LocalDataChangeEvent) {
                        LocalDataChangeEvent evt = (LocalDataChangeEvent) event;
                        // Handle the local data change here.
                        ConfigExecutor.executeLongPolling(new DataChangeTask(evt.groupKey, evt.isBeta, evt.betaIps));
                    }
                }
            }

            @Override
            public Class<? extends Event> subscribeType() {
                return LocalDataChangeEvent.class;
            }
        });
    }
isFixedPolling:
    private static boolean isFixedPolling() {
        return SwitchService.getSwitchBoolean(SwitchService.FIXED_POLLING, false);
    }
SwitchService#getSwitchString:
    public static String getSwitchString(String key, String defaultValue) {
        String value = switches.get(key);
        return StringUtils.isBlank(value) ? defaultValue : value;
    }
switches is a map. We have seen values being read from it; when are they put in?
Searching shows that switches is assigned in SwitchService#load.
2.6 DataChangeTask
It is a Runnable. Its run method iterates over all ClientLongPolling objects,
    public void run() {
        try {
            ConfigCacheService.getContentBetaMd5(groupKey);
            for (Iterator<ClientLongPolling> iter = allSubs.iterator(); iter.hasNext(); ) {
                ClientLongPolling clientSub = iter.next();
                if (clientSub.clientMd5Map.containsKey(groupKey)) {
                    // The key step
                    clientSub.sendResponse(Arrays.asList(groupKey));
                }
            }
        } catch (Throwable t) {
            LogUtil.DEFAULT_LOG.error("data change error: {}", ExceptionUtil.getStackTrace(t));
        }
    }
and sends the changed data to every client that is watching the affected key. The main method is:
LongPollingService.ClientLongPolling#generateResponse
    void generateResponse(List<String> changedGroups) {
        if (null == changedGroups) {
            // Tell web container to send http response.
            asyncContext.complete();
            return;
        }

        HttpServletResponse response = (HttpServletResponse) asyncContext.getResponse();

        try {
            final String respString = MD5Util.compareMd5ResultString(changedGroups);

            // Tell the client not to cache this response.
            response.setHeader("Pragma", "no-cache");
            response.setDateHeader("Expires", 0);
            response.setHeader("Cache-Control", "no-cache,no-store");
            response.setStatus(HttpServletResponse.SC_OK);
            response.getWriter().println(respString);
            asyncContext.complete();
        } catch (Exception ex) {
            PULL_LOG.error(ex.toString(), ex);
            asyncContext.complete();
        }
    }
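The server side of long polling boils down to "hold the request, answer it when a watched key changes". Below is a minimal sketch of that idea, using CompletableFuture in place of the servlet AsyncContext. LongPollingSketch and ClientSub are hypothetical names. Removing an answered client from the queue is an assumption of the sketch, and timeouts are not modelled.

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of holding long-polling clients and answering them on a data change.
class LongPollingSketch {

    static class ClientSub {
        final Set<String> watchedKeys;
        final CompletableFuture<String> response = new CompletableFuture<>();

        ClientSub(Set<String> watchedKeys) {
            this.watchedKeys = watchedKeys;
        }
    }

    private final Queue<ClientSub> allSubs = new ConcurrentLinkedQueue<>();

    // Hold the request open: nothing is answered until a watched key changes.
    CompletableFuture<String> addLongPollingClient(Set<String> watchedKeys) {
        ClientSub sub = new ClientSub(watchedKeys);
        allSubs.add(sub);
        return sub.response;
    }

    // On a change, answer every held client that watches the key and stop holding it.
    int onDataChange(String groupKey) {
        int answered = 0;
        for (Iterator<ClientSub> iter = allSubs.iterator(); iter.hasNext(); ) {
            ClientSub sub = iter.next();
            if (sub.watchedKeys.contains(groupKey)) {
                iter.remove();
                sub.response.complete(groupKey);
                answered++;
            }
        }
        return answered;
    }

    int heldClients() {
        return allSubs.size();
    }
}
```

Clients that watch other keys stay in the queue and remain unanswered until one of their own keys changes.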
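2.7 ClientLongPolling
This is also a Runnable. Its run method sends the groups whose data has changed to the client and adds itself to allSubs. Each entry in allSubs represents a client that fetches configuration, such as an application we develop ourselves.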
LongPollingService.ClientLongPolling#run
When is ClientLongPolling's run method called? At the end of LongPollingService#addLongPollingClient there is this statement:
    ConfigExecutor.executeLongPolling(
            new ClientLongPolling(asyncContext, clientMd5Map, ip, probeRequestSize, timeout, appName, tag));
That is how run gets called. All that remains is to find out when LongPollingService#addLongPollingClient is called. Tracing step by step gives:
ConfigController#listener
ConfigServletInner#doPollingConfig
LongPollingService#addLongPollingClient
ClientLongPolling#run
And who calls ConfigController#listener?
LongPollingRunnable#run
ClientWorker#checkUpdateDataIds
ClientWorker#checkUpdateConfigStr (issues the request to /v1/cs/configs/listener)
Putting it together, the full call chain is:
LongPollingRunnable#run
ClientWorker#checkUpdateDataIds
ClientWorker#checkUpdateConfigStr (issues the request to /v1/cs/configs/listener)
ConfigController#listener
ConfigServletInner#doPollingConfig
LongPollingService#addLongPollingClient
ClientLongPolling#run
When is LongPollingRunnable's run method called? LongPollingRunnable implements Runnable, so it runs once it is handed to a thread pool. Looking further:
ClientWorker#checkConfigInfo
    public void checkConfigInfo() {
        // Dispatch tasks.
        int listenerSize = cacheMap.get().size();
        // Round up the longingTaskCount.
        int longingTaskCount = (int) Math.ceil(listenerSize / ParamUtil.getPerTaskConfigSize());
        if (longingTaskCount > currentLongingTaskCount) {
            for (int i = (int) currentLongingTaskCount; i < longingTaskCount; i++) {
                executorService.execute(new LongPollingRunnable(i));
            }
            currentLongingTaskCount = longingTaskCount;
        }
    }
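The rounding step is worth a quick check: Math.ceil only rounds up if the division is done in floating point. The sketch below assumes the per-task size is a double, and the value 3000 used in the example is an assumption as well. LongPollingTaskCountSketch is a hypothetical helper.

```java
// Hypothetical sketch of computing how many long-polling tasks are needed.
class LongPollingTaskCountSketch {

    // One task per started block of perTaskConfigSize listeners; the double divisor
    // keeps the fractional part so that Math.ceil can round up.
    static int tasksNeeded(int listenerSize, double perTaskConfigSize) {
        return (int) Math.ceil(listenerSize / perTaskConfigSize);
    }
}
```

With a per-task size of 3000, one listener already needs one task, 3000 listeners still fit in one, and 3001 need two. No listeners means no task at all.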
checkConfigInfo is called from the ClientWorker constructor:
    public ClientWorker(final HttpAgent agent, final ConfigFilterChainManager configFilterChainManager,
            final Properties properties) {
        this.agent = agent;
        this.configFilterChainManager = configFilterChainManager;

        // Initialize the timeout parameter
        init(properties);

        this.executor = Executors.newScheduledThreadPool(1, new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setName("com.alibaba.nacos.client.Worker." + agent.getName());
                t.setDaemon(true);
                return t;
            }
        });

        this.executorService = Executors
                .newScheduledThreadPool(Runtime.getRuntime().availableProcessors(), new ThreadFactory() {
                    @Override
                    public Thread newThread(Runnable r) {
                        Thread t = new Thread(r);
                        t.setName("com.alibaba.nacos.client.Worker.longPolling." + agent.getName());
                        t.setDaemon(true);
                        return t;
                    }
                });

        this.executor.scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                try {
                    // This is the call we were looking for.
                    checkConfigInfo();
                } catch (Throwable e) {
                    LOGGER.error("[" + agent.getName() + "] [sub-check] rotate check error", e);
                }
            }
        }, 1L, 10L, TimeUnit.MILLISECONDS);
    }
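The try/catch around checkConfigInfo matters: with scheduleWithFixedDelay, an exception that escapes the task suppresses all later runs. Here is a minimal sketch of the same scheduling pattern on a single daemon thread. PeriodicCheckSketch, its thread name and the awaitQuietly helper are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a periodic check that survives exceptions thrown by the check itself.
class PeriodicCheckSketch {

    // Schedule `check` with a fixed delay on one daemon thread and return the executor.
    static ScheduledExecutorService start(Runnable check, long delayMillis) {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r);
            t.setName("sketch.worker");
            t.setDaemon(true);
            return t;
        });
        executor.scheduleWithFixedDelay(() -> {
            try {
                check.run();
            } catch (Throwable e) {
                // Swallow and report: an escaping exception would cancel all further runs.
                System.err.println("check failed: " + e);
            }
        }, 1L, delayMillis, TimeUnit.MILLISECONDS);
        return executor;
    }

    // Wait for the latch without forcing callers to handle InterruptedException.
    static boolean awaitQuietly(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```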
The ClientWorker constructor is in turn called from the NacosConfigService constructor, which completes the picture:
    public NacosConfigService(Properties properties) throws NacosException {
        ValidatorUtils.checkInitParam(properties);
        String encodeTmp = properties.getProperty(PropertyKeyConst.ENCODE);
        if (StringUtils.isBlank(encodeTmp)) {
            this.encode = Constants.ENCODE;
        } else {
            this.encode = encodeTmp.trim();
        }
        initNamespace(properties);

        this.agent = new MetricsHttpAgent(new ServerHttpAgent(properties));
        this.agent.start();
        // Call the ClientWorker constructor.
        this.worker = new ClientWorker(this.agent, this.configFilterChainManager, properties);
    }
new NacosConfigService()
new ClientWorker()
ClientWorker#checkConfigInfo
executorService.execute(new LongPollingRunnable(i));