Configuration Center
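The previous post introduced the nacos architecture and walked through the registry source code. This post continues with the configuration center, focusing on how the client subscribes to configuration changes. As before, the analysis covers the client source first and then the server source.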
Client
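As with the registry, the loading process is not covered in detail here. It eventually reaches configService.getConfig, the same method used when trying out the nacos SDK, which fetches the requested configuration from the nacos server. Internally, getConfig wraps the OpenAPI request /nacos/v1/cs/configs (GET). Nothing in that path appears to subscribe the client to configuration changes, so we turn instead to the constructor of NacosConfigService, the implementation class of configService: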
public NacosConfigService(Properties properties) throws NacosException {
    String encodeTmp = properties.getProperty(PropertyKeyConst.ENCODE);
    if (StringUtils.isBlank(encodeTmp)) {
        encode = Constants.ENCODE;
    } else {
        encode = encodeTmp.trim();
    }
    initNamespace(properties);
    // Create an HttpAgent
    agent = new MetricsHttpAgent(new ServerHttpAgent(properties));
    agent.start();
    // The worker; the agent is passed in
    worker = new ClientWorker(agent, configFilterChainManager, properties);
}
The key piece is the worker, ClientWorker:
@SuppressWarnings("PMD.ThreadPoolCreationRule")
public ClientWorker(final HttpAgent agent, final ConfigFilterChainManager configFilterChainManager, final Properties properties) {
    this.agent = agent;
    this.configFilterChainManager = configFilterChainManager;
    init(properties);
    // Thread pool with a single core thread
    executor = Executors.newScheduledThreadPool(1, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("com.alibaba.nacos.client.Worker." + agent.getName());
            t.setDaemon(true);
            return t;
        }
    });
    // Thread pool that runs the long-polling sync tasks
    executorService = Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors(), new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("com.alibaba.nacos.client.Worker.longPolling." + agent.getName());
            t.setDaemon(true);
            return t;
        }
    });
    // Starts after a 1 ms initial delay, then reruns 10 ms after each run finishes
    // (note that the unit passed below is TimeUnit.MILLISECONDS)
    executor.scheduleWithFixedDelay(new Runnable() {
        @Override
        public void run() {
            try {
                // Check for changes in the set of watched configuration items
                checkConfigInfo();
            } catch (Throwable e) {
                LOGGER.error("[" + agent.getName() + "] [sub-check] rotate check error", e);
            }
        }
    }, 1L, 10L, TimeUnit.MILLISECONDS);
}
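The check is rescheduled with a fixed delay of 10 (the unit passed to scheduleWithFixedDelay here is TimeUnit.MILLISECONDS, so the interval is 10 milliseconds rather than 10 seconds). Note that this periodic check does not fetch configuration from the nacos server. It is a local self-check of the configuration items the client needs to watch: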
public void checkConfigInfo() {
    // Total number of configuration items
    int listenerSize = cacheMap.get().size();
    // Configuration items are processed in batches, 3000 per group by default
    int longingTaskCount = (int) Math.ceil(listenerSize / ParamUtil.getPerTaskConfigSize());
    // If the number of groups has grown, add new tasks
    if (longingTaskCount > currentLongingTaskCount) {
        for (int i = (int) currentLongingTaskCount; i < longingTaskCount; i++) {
            // Submit a LongPollingRunnable long-polling task to the task thread pool
            executorService.execute(new LongPollingRunnable(i));
        }
        currentLongingTaskCount = longingTaskCount;
    }
}
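The batching arithmetic in checkConfigInfo can be tried in isolation. The following is a minimal sketch, not nacos source: the class name, the method name and the constant are hypothetical, and the batch size of 3000 is simply the default mentioned in the comment above. It also assumes the batch size is a floating-point value, since Math.ceil only rounds up when the division is not integer division.

```java
// Hypothetical sketch (not nacos source): how many long-polling tasks are needed
// for a given number of watched configuration items.
final class LongPollingBatchSketch {
    // Assumed batch size, taken from the "3000 per group" default described in the article.
    // Declared as a double so the division below keeps its fractional part.
    static final double PER_TASK_CONFIG_SIZE = 3000;

    // Rounds up: 1..3000 items need one task, 3001..6000 need two, and so on.
    static int taskCount(int listenerSize) {
        return (int) Math.ceil(listenerSize / PER_TASK_CONFIG_SIZE);
    }

    public static void main(String[] args) {
        for (int size : new int[] {0, 1, 3000, 3001, 6500}) {
            System.out.println(size + " configs -> " + taskCount(size) + " task(s)");
        }
    }
}
```

Because the task count only ever grows in checkConfigInfo, a new task is submitted only when the item count crosses a multiple of the batch size.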
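Each group of up to 3000 configuration items gets its own task. As the name LongPollingRunnable suggests, it is a long-polling task, and i is passed in as its task ID.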
@Override
public void run() {
    List<CacheData> cacheDatas = new ArrayList<CacheData>();
    List<String> inInitializingCacheList = new ArrayList<String>();
    try {
        for (CacheData cacheData : cacheMap.get().values()) {
            // taskId corresponds to the batch above; each task only handles its own batch
            if (cacheData.getTaskId() == taskId) {
                cacheDatas.add(cacheData);
                try {
                    // Check the local configuration: compare the in-memory item with the content stored in the local file
                    checkLocalConfig(cacheData);
                    if (cacheData.isUseLocalConfigInfo()) {
                        cacheData.checkListenerMd5();
                    }
                } catch (Exception e) {
                    LOGGER.error("get local config info error", e);
                }
            }
        }
        // Query the server with a long poll to obtain the keys of changed items.
        // In essence this sends a request with a default 30-second timeout to "/nacos/v1/cs/configs/listener" (POST),
        // carrying a "Long-Pulling-Timeout" header whose default value is 30000
        List<String> changedGroupKeys = checkUpdateDataIds(cacheDatas, inInitializingCacheList);
        for (String groupKey : changedGroupKeys) {
            String[] key = GroupKey.parseKey(groupKey);
            String dataId = key[0];
            String group = key[1];
            String tenant = null;
            if (key.length == 3) {
                tenant = key[2];
            }
            try {
                // Read the changed configuration and write it to the local file (${user}\nacos\config)
                // by sending a request to "/nacos/v1/cs/configs" (GET)
                String content = getServerConfig(dataId, group, tenant, 3000L);
                CacheData cache = cacheMap.get().get(GroupKey.getKeyTenant(dataId, group, tenant));
                cache.setContent(content);
                LOGGER.info("[{}] [data-received] dataId={}, group={}, tenant={}, md5={}, content={}",
                    agent.getName(), dataId, group, tenant, cache.getMd5(),
                    ContentUtils.truncateContent(content));
            } catch (NacosException ioe) {
                String message = String.format(
                    "[%s] [get-update] get changed config exception. dataId=%s, group=%s, tenant=%s",
                    agent.getName(), dataId, group, tenant);
                LOGGER.error(message, ioe);
            }
        }
        // Trigger listener notifications
        for (CacheData cacheData : cacheDatas) {
            if (!cacheData.isInitializing() || inInitializingCacheList
                .contains(GroupKey.getKeyTenant(cacheData.dataId, cacheData.group, cacheData.tenant))) {
                cacheData.checkListenerMd5();
                cacheData.setInitializing(false);
            }
        }
        inInitializingCacheList.clear();
        // Resubmit this task so the long poll starts again
        executorService.execute(this);
    } catch (Throwable e) {
        // On error, reschedule the task after a penalty delay
        executorService.schedule(this, taskPenaltyTime, TimeUnit.MILLISECONDS);
    }
}
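The task does three main things. First, it compares the configuration held in memory with the configuration stored in local files and synchronizes the two. Next, it sends a request carrying the Long-Pulling-Timeout header (the header name is spelled this way in the code). Finally, based on the keys returned, it sends further requests for the configuration details and writes the results to the local file and to memory.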
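The run method calls cacheData.checkListenerMd5() without showing what it does. The following is a minimal sketch of one plausible reading, assuming (as the method name suggests) that a listener is notified only when the MD5 of the current content differs from the MD5 of the content last delivered. The class and all of its members are hypothetical and are not the nacos CacheData implementation.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not nacos source): MD5-based change detection for one config item.
final class Md5NotifySketch {
    private String content = "";
    private String md5 = md5Hex(content);
    // MD5 of the content most recently handed to the listener.
    private String lastNotifiedMd5 = md5;
    // Stands in for a real listener callback: records every delivered content.
    final List<String> delivered = new ArrayList<>();

    // Hex-encoded MD5 of the UTF-8 bytes of the content.
    static String md5Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Called when new content arrives from the server or the local file.
    void setContent(String newContent) {
        content = newContent;
        md5 = md5Hex(newContent);
    }

    // Notifies the listener only if the content changed since the last notification.
    boolean checkListenerMd5() {
        if (md5.equals(lastNotifiedMd5)) {
            return false;
        }
        delivered.add(content);
        lastNotifiedMd5 = md5;
        return true;
    }

    public static void main(String[] args) {
        Md5NotifySketch cache = new Md5NotifySketch();
        cache.setContent("timeout=30");
        System.out.println("notified: " + cache.checkListenerMd5());
        System.out.println("notified again: " + cache.checkListenerMd5());
    }
}
```

Comparing digests rather than full content keeps the check cheap and lets the same MD5 value be sent to the server in the long-polling request.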
Server
Receiving the long-polling request:
@PostMapping("/listener")
public void listener(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
    request.setAttribute("org.apache.catalina.ASYNC_SUPPORTED", true);
    String probeModify = request.getParameter("Listening-Configs");
    if (StringUtils.isBlank(probeModify)) {
        throw new IllegalArgumentException("invalid probeModify");
    }
    probeModify = URLDecoder.decode(probeModify, Constants.ENCODE);
    Map<String, String> clientMd5Map;
    try {
        clientMd5Map = MD5Util.getClientMd5Map(probeModify);
    } catch (Throwable e) {
        throw new IllegalArgumentException("invalid probeModify");
    }
    // Start processing
    inner.doPollingConfig(request, response, clientMd5Map, probeModify.length());
}
doPollingConfig checks whether the request is a long poll, based on whether the "Long-Pulling-Timeout" header is present, and if so calls longPollingService.addLongPollingClient:
public void addLongPollingClient(HttpServletRequest req, HttpServletResponse rsp, Map<String, String> clientMd5Map,
                                 int probeRequestSize) {
    String str = req.getHeader(LongPollingService.LONG_POLLING_HEADER);
    String noHangUpFlag = req.getHeader(LongPollingService.LONG_POLLING_NO_HANG_UP_HEADER);
    String appName = req.getHeader(RequestUtil.CLIENT_APPNAME_HEADER);
    String tag = req.getHeader("Vipserver-Tag");
    int delayTime = SwitchService.getSwitchInteger(SwitchService.FIXED_DELAY_TIME, 500);
    // Compute how long to hold the request before responding
    long timeout = Math.max(10000, Long.parseLong(str) - delayTime);
    if (isFixedPolling()) {
        timeout = Math.max(10000, getFixedPollingInterval());
    } else {
        long start = System.currentTimeMillis();
        List<String> changedGroups = MD5Util.compareMd5(req, rsp, clientMd5Map);
        if (changedGroups.size() > 0) {
            // If any configuration item has changed, respond immediately
            generateResponse(req, rsp, changedGroups);
            return;
        } else if (noHangUpFlag != null && noHangUpFlag.equalsIgnoreCase(TRUE_STR)) {
            return;
        }
    }
    String ip = RequestUtil.getRemoteIp(req);
    final AsyncContext asyncContext = req.startAsync();
    asyncContext.setTimeout(0L);
    // Submit the ClientLongPolling task
    scheduler.execute(
        new ClientLongPolling(asyncContext, clientMd5Map, ip, probeRequestSize, timeout, appName, tag));
}
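Unless fixed polling is enabled, the method above first checks whether any configuration the client subscribes to has changed. If so, it responds immediately; otherwise it starts a ClientLongPolling task: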
@Override
public void run() {
    // Schedule a delayed task
    asyncTimeoutFuture = scheduler.schedule(new Runnable() {
        @Override
        public void run() {
            try {
                getRetainIps().put(ClientLongPolling.this.ip, System.currentTimeMillis());
                // Remove the subscription
                allSubs.remove(ClientLongPolling.this);
                if (isFixedPolling()) {
                    // Compare to see whether the data has changed
                    List<String> changedGroups = MD5Util.compareMd5(
                        (HttpServletRequest)asyncContext.getRequest(),
                        (HttpServletResponse)asyncContext.getResponse(), clientMd5Map);
                    if (changedGroups.size() > 0) {
                        // Return the keys of the changed data
                        sendResponse(changedGroups);
                    } else {
                        sendResponse(null);
                    }
                } else {
                    sendResponse(null);
                }
            } catch (Throwable t) {
                LogUtil.defaultLog.error("long polling error:" + t.getMessage(), t.getCause());
            }
        }
    }, timeoutTime, TimeUnit.MILLISECONDS);
    // The delayed task has not run yet; register the subscription first
    allSubs.add(this);
}
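This task schedules a delayed job. With the defaults the delay is 29.5 seconds: the client's 30-second timeout minus 500 ms, so that the server responds before the client times out. At the same time, the object is added to allSubs, a ConcurrentLinkedQueue. When the delayed job runs, it first removes the object from allSubs. In fixed polling mode it then compares the configuration items for changes. In either mode a response is sent, whether or not anything changed.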
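The 29.5-second figure follows directly from the timeout expression in addLongPollingClient. A minimal sketch of that arithmetic, with a hypothetical class and method name and the header value and delay passed in as plain arguments:

```java
// Hypothetical sketch (not nacos source): the hold-time arithmetic from addLongPollingClient.
final class LongPollingTimeoutSketch {
    // Hold time in milliseconds: the client's timeout minus a safety margin,
    // but never less than 10 seconds.
    static long holdTimeMillis(String longPollingTimeoutHeader, int delayTimeMillis) {
        return Math.max(10000, Long.parseLong(longPollingTimeoutHeader) - delayTimeMillis);
    }

    public static void main(String[] args) {
        // Default header value 30000 and default delay 500, as quoted in the article.
        System.out.println(holdTimeMillis("30000", 500));
    }
}
```

The 10-second floor means that a client sending a very small timeout value is still held for 10 seconds on the server side.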
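That is the basic logic of the long-polling subscription, but one question remains: if a configuration changes during the 30-second hold, does the client really have to wait out the full 30 seconds before noticing? Looking at the LongPollingService class again, we find that it extends AbstractEventListener and implements the onEvent method: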
@Override
public void onEvent(Event event) {
    if (isFixedPolling()) {
        // ignore
    } else {
        // Check whether this is a data change event
        if (event instanceof LocalDataChangeEvent) {
            LocalDataChangeEvent evt = (LocalDataChangeEvent)event;
            scheduler.execute(new DataChangeTask(evt.groupKey, evt.isBeta, evt.betaIps));
        }
    }
}
When the event is a LocalDataChangeEvent, a DataChangeTask is submitted:
@Override
public void run() {
    try {
        ConfigService.getContentBetaMd5(groupKey);
        for (Iterator<ClientLongPolling> iter = allSubs.iterator(); iter.hasNext(); ) {
            // Iterate over the subscriptions and match on the key
            ClientLongPolling clientSub = iter.next();
            if (clientSub.clientMd5Map.containsKey(groupKey)) {
                if (isBeta && !betaIps.contains(clientSub.ip)) {
                    continue;
                }
                if (StringUtils.isNotBlank(tag) && !tag.equals(clientSub.tag)) {
                    continue;
                }
                getRetainIps().put(clientSub.ip, System.currentTimeMillis());
                // Remove the subscription
                iter.remove();
                // Send the response
                clientSub.sendResponse(Arrays.asList(groupKey));
            }
        }
    } catch (Throwable t) {
        LogUtil.defaultLog.error("data change error:" + t.getMessage(), t.getCause());
    }
}
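As shown, the task iterates over allSubs and matches each subscription on the key. A matching subscription is removed from the queue and its response is sent to the client, which ends the long poll early.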
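The hold-then-release pattern described in this section can be reproduced without a servlet container. The following is a simplified sketch under stated assumptions, not the nacos implementation: a CompletableFuture stands in for the AsyncContext and HTTP response, the class and method names are hypothetical, and the group key strings are illustrative values only. It keeps the same order as the original (schedule the timeout first, then register the subscription).

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch (not nacos source): hold a request until timeout or data change.
final class LongPollingHoldSketch {
    // One held request: the keys it watches and a future standing in for the HTTP response.
    static final class Subscription {
        final Set<String> groupKeys;
        final CompletableFuture<List<String>> response = new CompletableFuture<>();
        volatile ScheduledFuture<?> timeoutTask;

        Subscription(Set<String> groupKeys) {
            this.groupKeys = groupKeys;
        }
    }

    private final Queue<Subscription> allSubs = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "long-polling-sketch");
            t.setDaemon(true);
            return t;
        });

    // Holds the request: completes with an empty list when the timeout fires.
    CompletableFuture<List<String>> hold(Set<String> groupKeys, long timeoutMillis) {
        Subscription sub = new Subscription(groupKeys);
        sub.timeoutTask = scheduler.schedule(() -> {
            allSubs.remove(sub);
            sub.response.complete(Collections.emptyList());
        }, timeoutMillis, TimeUnit.MILLISECONDS);
        allSubs.add(sub);
        return sub.response;
    }

    // Releases every held request that watches the changed key, before its timeout.
    void onDataChange(String groupKey) {
        for (Iterator<Subscription> it = allSubs.iterator(); it.hasNext(); ) {
            Subscription sub = it.next();
            if (sub.groupKeys.contains(groupKey)) {
                it.remove();
                sub.timeoutTask.cancel(false);
                sub.response.complete(Collections.singletonList(groupKey));
            }
        }
    }

    public static void main(String[] args) {
        LongPollingHoldSketch server = new LongPollingHoldSketch();
        CompletableFuture<List<String>> held =
            server.hold(Collections.singleton("app.yaml+DEFAULT_GROUP"), 29500);
        server.onDataChange("app.yaml+DEFAULT_GROUP");
        System.out.println("changed keys: " + held.join());
    }
}
```

The client only learns which keys changed, not the new content, which matches the flow described earlier: the long poll returns keys, and a separate GET fetches the configuration itself.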