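Basic execution flow of Hystrix
- Create the Command object.
- Check whether the circuit breaker is open.
- If the circuit breaker is open, the real business logic is skipped and the fallback method is executed directly.
  - If the fallback method succeeds, its result is returned.
  - If the fallback method fails, an exception is thrown.
- If the circuit breaker is closed, the normal logic runs. Hystrix first checks whether the thread pool can accept the call: if the pool is not full the call is allowed, otherwise it is rejected.
  - If the thread pool rejects the call, the fallback method is executed.
  - If the thread pool accepts the call, the business code is invoked.
- Check whether the business code failed.
  - If it failed, the fallback method is executed.
  - If it did not fail, check whether it timed out.
- Check whether the business code timed out.
  - If it timed out, the fallback method is executed.
  - If it finished in time, the result is returned to the caller.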
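The flow above can be sketched in plain Java without Hystrix. This is only a rough illustration under my own assumptions: the class `SimpleCommandFlow`, its fields, and the `circuitOpen` flag are hypothetical names, and a real circuit breaker derives its state from metrics rather than from a setter.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical, simplified command runner; not a Hystrix class. */
class SimpleCommandFlow {
    private final ExecutorService pool;
    private final long timeoutMillis;
    private volatile boolean circuitOpen; // a real breaker computes this from metrics

    SimpleCommandFlow(int poolSize, long timeoutMillis) {
        // SynchronousQueue means no queueing: a busy pool rejects immediately.
        this.pool = new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>());
        this.timeoutMillis = timeoutMillis;
    }

    void setCircuitOpen(boolean open) {
        this.circuitOpen = open;
    }

    String execute(Callable<String> run, Supplier<String> fallback) {
        if (circuitOpen) {
            return fallback.get();                 // breaker open: short-circuit
        }
        Future<String> future;
        try {
            future = pool.submit(run);
        } catch (RejectedExecutionException e) {
            return fallback.get();                 // thread pool rejected the call
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                   // interrupt the slow call
            return fallback.get();                 // timed out
        } catch (ExecutionException e) {
            return fallback.get();                 // business code threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Every failure path ends in the same fallback here; Hystrix distinguishes them (rejection, timeout, failure, short circuit) mainly for metrics and events.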
Implementing Hystrix
- In short, first add the dependency to the pom file
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
- Add the @EnableCircuitBreaker annotation to the application's main class
- Add the @HystrixCommand annotation to each method that needs service isolation
package com.xiyou.war.controller;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
/**
* @author 92823
*/
@RestController
public class TestController {
@GetMapping("/test/{id}")
@HystrixCommand(fallbackMethod = "myFallBack")
public String test(@PathVariable("id") String id) {
System.out.println("normal call");
if ("1".equals(id)) {
throw new RuntimeException("exception thrown");
}
return "ok";
}
public String myFallBack(String id) {
System.out.println("myFallBack invoked...");
return "Service is busy, please try again later: " + id;
}
}
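Notes:
- @HystrixCommand has many other attributes; only the fallback method is configured here.
- The attributes of the HystrixCommand annotation are listed below. Thread pool settings are configured here as well. These are bulkhead-isolated thread pools: there is one pool per thread-pool key (threadPoolKey, which falls back to the group key when it is not set), so methods with different thread-pool keys run on separate pools and methods with the same key share one pool. Internally the pools are stored in a ConcurrentHashMap whose key is the thread-pool key name and whose value is the pool.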
/**
* Copyright 2012 Netflix, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.netflix.hystrix.contrib.javanica.annotation;
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
/**
* This annotation used to specify some methods which should be processes as hystrix commands.
*/
@Target({ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@Inherited
@Documented
public @interface HystrixCommand {
/**
* The command group key is used for grouping together commands such as for reporting,
* alerting, dashboards or team/library ownership.
* <p/>
* default => the runtime class name of annotated method
*
* @return group key
*/
String groupKey() default "";
/**
* Hystrix command key.
* <p/>
* default => the name of annotated method. for example:
* <code>
* ...
* @HystrixCommand
* public User getUserById(...)
* ...
* the command name will be: 'getUserById'
* </code>
*
* @return command key
*/
String commandKey() default "";
/**
* The thread-pool key is used to represent a
* HystrixThreadPool for monitoring, metrics publishing, caching and other such uses.
*
* @return thread pool key
*/
String threadPoolKey() default "";
/**
* Specifies a method to process fallback logic.
* A fallback method should be defined in the same class where is HystrixCommand.
* Also a fallback method should have same signature to a method which was invoked as hystrix command.
* for example:
* <code>
* @HystrixCommand(fallbackMethod = "getByIdFallback")
* public String getById(String id) {...}
*
* private String getByIdFallback(String id) {...}
* </code>
* Also a fallback method can be annotated with {@link HystrixCommand}
* <p/>
* default => see {@link com.netflix.hystrix.contrib.javanica.command.GenericCommand#getFallback()}
*
* @return method name
*/
String fallbackMethod() default "";
/**
* Specifies command properties.
*
* @return command properties
*/
HystrixProperty[] commandProperties() default {};
/**
* Specifies thread pool properties.
*
* @return thread pool properties
*/
HystrixProperty[] threadPoolProperties() default {};
/**
* Defines exceptions which should be ignored.
* Optionally these can be wrapped in HystrixRuntimeException if raiseHystrixExceptions contains RUNTIME_EXCEPTION.
*
* @return exceptions to ignore
*/
Class<? extends Throwable>[] ignoreExceptions() default {};
/**
* Specifies the mode that should be used to execute hystrix observable command.
* For more information see {@link ObservableExecutionMode}.
*
* @return observable execution mode
*/
ObservableExecutionMode observableExecutionMode() default ObservableExecutionMode.EAGER;
/**
* When includes RUNTIME_EXCEPTION, any exceptions that are not ignored are wrapped in HystrixRuntimeException.
*
* @return exceptions to wrap
*/
HystrixException[] raiseHystrixExceptions() default {};
/**
* Specifies default fallback method for the command. If both {@link #fallbackMethod} and {@link #defaultFallback}
* methods are specified then specific one is used.
* note: default fallback method cannot have parameters, return type should be compatible with command return type.
*
* @return the name of default fallback method
*/
String defaultFallback() default "";
}
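Diagram of the core steps in the Hystrix source
Hystrix source code walkthrough
- Before reading the source it helps to understand one design pattern, the observer pattern, and reactive programming on the JVM (RxJava). Hystrix uses the observer pattern and relies heavily on reactive programming.
- I recommend reading up on both yourself. What follows is only my own understanding; I have not used either much and am not very familiar with them.
- Reactive programming is somewhat like hook functions: things are defined up front, but the callbacks only fire when they are triggered. When reading the source you can therefore skip over these callbacks at first.
- The observer pattern involves two objects: Observable (the party being observed) and Observer (the party observing). One way to picture them is ZooKeeper: ZooKeeper itself is the observed party and the individual nodes are the observers; when ZooKeeper changes, it actively notifies the nodes of the change. In the same way, when the Observable changes, the Observers receive a notification.
- The walkthrough below relies mainly on the comments in the code and on the flow diagram above
- As we know, using Hystrix requires importing its dependency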
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
- Opening this Maven dependency shows that it also depends on spring-cloud-netflix-core.jar; the /META-INF/spring.factories file in that jar is as follows
- Spring Boot's auto-configuration mechanism imports the classes listed in spring.factories by default:
  - org.springframework.cloud.netflix.hystrix.HystrixAutoConfiguration
  - org.springframework.cloud.netflix.hystrix.security.HystrixSecurityAutoConfiguration
  - org.springframework.cloud.netflix.hystrix.HystrixCircuitBreakerConfiguration (the important one)
- The first two can be ignored; the focus is org.springframework.cloud.netflix.hystrix.HystrixCircuitBreakerConfiguration
- HystrixCircuitBreakerConfiguration
/**
* @author Spencer Gibb
* @author Christian Dupuis
* @author Venil Noronha
*/
@Configuration
public class HystrixCircuitBreakerConfiguration {
@Bean
public HystrixCommandAspect hystrixCommandAspect() {
return new HystrixCommandAspect();
}
........
}
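(1) HystrixCircuitBreakerConfiguration mainly creates a HystrixCommandAspect object; the name alone shows that it uses AOP.
(2) Let's look inside HystrixCommandAspect.
- HystrixCommandAspect
- Only the core code is shown here for analysis
// Clearly AOP: the class is declared with the @Aspect annotation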
@Aspect
public class HystrixCommandAspect {
.......
// Define the pointcut
// Matches methods annotated with @HystrixCommand
@Pointcut("@annotation(com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand)")
public void hystrixCommandAnnotationPointcut() {
}
// Define the pointcut
// Matches methods annotated with @HystrixCollapser
@Pointcut("@annotation(com.netflix.hystrix.contrib.javanica.annotation.HystrixCollapser)")
public void hystrixCollapserAnnotationPointcut() {
}
// Define the around advice
@Around("hystrixCommandAnnotationPointcut() || hystrixCollapserAnnotationPointcut()")
public Object methodsAnnotatedWithHystrixCommand(final ProceedingJoinPoint joinPoint) throws Throwable {
// Get the target method, i.e. the method annotated with @HystrixCommand or @HystrixCollapser
Method method = getMethodFromTarget(joinPoint);
Validate.notNull(method, "failed to get method from joinPoint: %s", joinPoint);
if (method.isAnnotationPresent(HystrixCommand.class) && method.isAnnotationPresent(HystrixCollapser.class)) {
throw new IllegalStateException("method cannot be annotated with HystrixCommand and HystrixCollapser " +
"annotations at the same time");
}
MetaHolderFactory metaHolderFactory = META_HOLDER_FACTORY_MAP.get(HystrixPointcutType.of(method));
MetaHolder metaHolder = metaHolderFactory.create(joinPoint);
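// Key point: once everything is prepared, create the HystrixInvokable object
// What actually gets created is a GenericCommand
// HystrixInvokable is an empty marker interface with no methods; it only marks the object as executable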
HystrixInvokable invokable = HystrixCommandFactory.getInstance().create(metaHolder);
ExecutionType executionType = metaHolder.isCollapserAnnotationPresent() ?
metaHolder.getCollapserExecutionType() : metaHolder.getExecutionType();
Object result;
try {
if (!metaHolder.isObservable()) {
// Stepping through with a debugger shows that this is the branch that runs
// CommandExecutor is the helper that executes the command
result = CommandExecutor.execute(invokable, executionType, metaHolder);
} else {
result = executeObservable(invokable, executionType, metaHolder);
}
} catch (HystrixBadRequestException e) {
throw e.getCause() != null ? e.getCause() : e;
} catch (HystrixRuntimeException e) {
throw hystrixRuntimeExceptionToThrowable(metaHolder, e);
}
return result;
}
}
......
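(1) See the comments in the code above for what each step does.
(2) HystrixInvokable is an empty marker interface with no methods; it only marks an object as executable.
- So how is the HystrixInvokable object created, and what is its concrete implementation class? Keep that question in mind while reading the code below.
- HystrixCommandFactory.getInstance().create()
- This call appears in the around advice; here is its implementation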
public HystrixInvokable create(MetaHolder metaHolder) {
HystrixInvokable executable;
if (metaHolder.isCollapserAnnotationPresent()) {
executable = new CommandCollapser(metaHolder);
} else if (metaHolder.isObservable()) {
executable = new GenericObservableCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder));
} else {
// The debugger shows that this is the line that actually runs
// So the HystrixInvokable object is in fact a GenericCommand
executable = new GenericCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder));
}
return executable;
}
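(1) From the code above and from stepping through with a debugger, the HystrixInvokable object is in fact a GenericCommand.
- new GenericCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder))
- This is the HystrixInvokable that actually gets created
- The class has two crucial methods, run() and getFallback()
- run executes the business code we wrote, that is, the method annotated with @HystrixCommand
- getFallback is the fallback (degradation) method, the one named by the fallbackMethod attribute of @HystrixCommand
- Both run and getFallback reach the concrete method through reflection: the method information configured via @HystrixCommand is looked up and the method is then invoked reflectively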
/**
* Implementation of AbstractHystrixCommand which returns an Object as result.
*/
@ThreadSafe
public class GenericCommand extends AbstractHystrixCommand<Object> {
private static final Logger LOGGER = LoggerFactory.getLogger(GenericCommand.class);
public GenericCommand(HystrixCommandBuilder builder) {
super(builder);
}
/**
* {@inheritDoc}
* This is the method that runs the actual business logic; the target is invoked through reflection
*/
@Override
protected Object run() throws Exception {
LOGGER.debug("execute command: {}", getCommandKey().name());
return process(new Action() {
@Override
Object execute() {
return getCommandAction().execute(getExecutionType());
}
});
}
// This is the fallback method, executed when the command fails
// It also reaches the configured fallback method through reflection
@Override
protected Object getFallback() {
final CommandAction commandAction = getFallbackAction();
if (commandAction != null) {
try {
return process(new Action() {
@Override
Object execute() {
MetaHolder metaHolder = commandAction.getMetaHolder();
Object[] args = createArgsForFallback(metaHolder, getExecutionException());
return commandAction.executeWithArgs(metaHolder.getFallbackExecutionType(), args);
}
});
} catch (Throwable e) {
LOGGER.error(FallbackErrorMessageBuilder.create()
.append(commandAction, e).build());
throw new FallbackInvocationException(unwrapCause(e));
}
} else {
return super.getFallback();
}
}
}
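To make the "invoke through reflection, fall back on failure" idea concrete, here is a minimal sketch in plain Java. It is not how javanica is implemented; `ReflectiveCommand` and `DemoService` are hypothetical names, and the sketch assumes the fallback has the same parameter types as the main method.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical stand-in for a command that calls its target reflectively. */
class ReflectiveCommand {
    private final Object target;
    private final Method method;
    private final Method fallback;

    ReflectiveCommand(Object target, String methodName, String fallbackName, Class<?>... paramTypes) {
        this.target = target;
        try {
            // Look both methods up by name, as an annotation processor would.
            this.method = target.getClass().getDeclaredMethod(methodName, paramTypes);
            this.fallback = target.getClass().getDeclaredMethod(fallbackName, paramTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("method not found", e);
        }
    }

    Object execute(Object... args) {
        try {
            return method.invoke(target, args);          // the "run()" step
        } catch (InvocationTargetException businessFailure) {
            try {
                return fallback.invoke(target, args);    // the "getFallback()" step
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("fallback failed", e);
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

/** Hypothetical service mirroring the controller example earlier in the article. */
class DemoService {
    public String test(String id) {
        if ("1".equals(id)) {
            throw new RuntimeException("boom");
        }
        return "ok";
    }

    public String myFallBack(String id) {
        return "busy: " + id;
    }
}
```

For example, `new ReflectiveCommand(new DemoService(), "test", "myFallBack", String.class).execute("1")` goes through the fallback because `test` throws for that id.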
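- Back in the around advice, look at the core call CommandExecutor.execute(invokable, executionType, metaHolder);
The implementation:
The code above casts the invokable object to the type required by executionType and calls its execute method. Following that execute method
leads to an interface, so continue into HystrixCommand.execute()
- queue().get()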
public Future<R> queue() {
// The most important line
final Future<R> delegate = toObservable().toBlocking().toFuture();
final Future<R> f = new Future<R>() {
@Override
public boolean cancel(boolean mayInterruptIfRunning) {
if (delegate.isCancelled()) {
return false;
}
if (HystrixCommand.this.getProperties().executionIsolationThreadInterruptOnFutureCancel().get()) {
interruptOnFutureCancel.compareAndSet(false, mayInterruptIfRunning);
}
final boolean res = delegate.cancel(interruptOnFutureCancel.get());
if (!isExecutionComplete() && interruptOnFutureCancel.get()) {
final Thread t = executionThread.get();
if (t != null && !t.equals(Thread.currentThread())) {
t.interrupt();
}
}
return res;
}
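The core line is final Future<R> delegate = toObservable().toBlocking().toFuture();
- toObservable()
- The method looks very long, but much of it is callback definitions that can be skipped for now
- Refer mainly to the comments in the code below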
public Observable<R> toObservable() {
final AbstractCommand<R> _cmd = this;
//doOnCompleted handler already did all of the SUCCESS work
//doOnError handler already did all of the FAILURE/TIMEOUT/REJECTION/BAD_REQUEST work
// Cleanup that runs when the command terminates
final Action0 terminateCommandCleanup = new Action0() {...};
//mark the command as CANCELLED and store the latency (in addition to standard cleanup)
// Cleanup that runs on unsubscribe
final Action0 unsubscribeCommandCleanup = new Action0() {...};
// Key point: the core Hystrix logic (circuit breaker, thread isolation)
final Func0<Observable<R>> applyHystrixSemantics = new Func0<Observable<R>>() {...};
// Hook applied when a value is emitted
final Func1<R, R> wrapWithAllOnNextHooks = new Func1<R, R>() {...};
final Action0 fireOnCompletedHook = new Action0() {... };
// Return value
// Observable.defer creates the Observable lazily: call() is invoked on subscription
// A number of callbacks (observers) are attached to that Observable
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
/* this is a stateful object so can only be used once */
if (!commandState.compareAndSet(CommandState.NOT_STARTED, CommandState.OBSERVABLE_CHAIN_CREATED)) {
IllegalStateException ex = new IllegalStateException("This instance can only be executed once. Please instantiate a new instance.");
//TODO make a new error type for this
throw new HystrixRuntimeException(FailureType.BAD_REQUEST_EXCEPTION, _cmd.getClass(), getLogMessagePrefix() + " command executed multiple times - this is not permitted.", ex, null);
}
commandStartTimestamp = System.currentTimeMillis();
if (properties.requestLogEnabled().get()) {
// log this command execution regardless of what happened
if (currentRequestLog != null) {
currentRequestLog.addExecutedCommand(_cmd);
}
}
final boolean requestCacheEnabled = isRequestCachingEnabled();
final String cacheKey = getCacheKey();
/* try from cache first */
if (requestCacheEnabled) {
HystrixCommandResponseFromCache<R> fromCache = (HystrixCommandResponseFromCache<R>) requestCache.get(cacheKey);
if (fromCache != null) {
isResponseFromCache = true;
return handleRequestCacheHitAndEmitValues(fromCache, _cmd);
}
}
// Chain applyHystrixSemantics and wrapWithAllOnNextHooks
Observable<R> hystrixObservable =
Observable.defer(applyHystrixSemantics)
.map(wrapWithAllOnNextHooks);
Observable<R> afterCache;
// put in cache
if (requestCacheEnabled && cacheKey != null) {
// wrap it for caching
HystrixCachedObservable<R> toCache = HystrixCachedObservable.from(hystrixObservable, _cmd);
HystrixCommandResponseFromCache<R> fromCache = (HystrixCommandResponseFromCache<R>) requestCache.putIfAbsent(cacheKey, toCache);
if (fromCache != null) {
// another thread beat us so we'll use the cached value instead
toCache.unsubscribe();
isResponseFromCache = true;
return handleRequestCacheHitAndEmitValues(fromCache, _cmd);
} else {
// we just created an ObservableCommand so we cast and return it
afterCache = toCache.toObservable();
}
} else {
afterCache = hystrixObservable;
}
return afterCache
// Attach the callbacks introduced above
.doOnTerminate(terminateCommandCleanup) // perform cleanup once (either on normal terminal state (this line), or unsubscribe (next line))
.doOnUnsubscribe(unsubscribeCommandCleanup) // perform cleanup once
.doOnCompleted(fireOnCompletedHook);
}
});
}
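(1) toObservable() makes heavy use of reactive programming.
(2) toObservable() finally returns an Observable with a number of callbacks attached:
- (2.1) terminateCommandCleanup: cleanup after the command terminates
- (2.2) unsubscribeCommandCleanup: cleanup on unsubscribe
- (2.3) applyHystrixSemantics: the key one; the core Hystrix logic, circuit breaking and bulkhead isolation, is implemented here and is covered in detail below
- (2.4) wrapWithAllOnNextHooks: the hook applied when a value is emitted (OnNext means emitting a value)
- (2.5) fireOnCompletedHook: the hook fired after the command completes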
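The "define now, run on subscription" behavior of a deferred source with attached callbacks can be imitated in a few lines of plain Java. This is a toy model, not the RxJava API: `LazySource` and its methods are names I made up, and unlike RxJava it handles a single value and runs the terminate hooks after delivery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for a deferred Observable; this is not RxJava. */
class LazySource<T> {
    private final Supplier<T> factory;
    private final List<Runnable> terminateHooks = new ArrayList<>();

    private LazySource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Nothing runs here; the factory is only stored. */
    static <T> LazySource<T> defer(Supplier<T> factory) {
        return new LazySource<>(factory);
    }

    /** Registers a callback to run once the source has terminated. */
    LazySource<T> doOnTerminate(Runnable hook) {
        terminateHooks.add(hook);
        return this;
    }

    /** Only now is the factory invoked and the callbacks triggered. */
    void subscribe(Consumer<T> onNext, Consumer<Throwable> onError) {
        try {
            onNext.accept(factory.get());
        } catch (RuntimeException e) {
            onError.accept(e);
        } finally {
            terminateHooks.forEach(Runnable::run);
        }
    }
}
```

Building the chain has no side effects; the work and the cleanup hook only run inside `subscribe`, which is why the callback definitions in toObservable() can be skipped on a first read.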
- applyHystrixSemantics(): the core code for the circuit breaker and bulkhead isolation
private Observable<R> applyHystrixSemantics(final AbstractCommand<R> _cmd) {
// mark that we're starting execution on the ExecutionHook
// if this hook throws an exception, then a fast-fail occurs with no fallback. No state is left inconsistent
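// The source contains many executionHook and eventNotifier calls. They are extension points: the default implementations do nothing and exist so that developers can plug in their own behavior
executionHook.onStart(_cmd);
/* determine if we're allowed to execute */
// Check whether the circuit breaker allows execution
if (circuitBreaker.attemptExecution()) {
// The breaker is not blocking the call, so the real business logic runs
// Get the execution semaphore
// Not followed further here: with nothing configured (default thread isolation) this returns the no-op semaphore, public static final TryableSemaphore DEFAULT = new TryableSemaphoreNoOp();
final TryableSemaphore executionSemaphore = getExecutionSemaphore();
final AtomicBoolean semaphoreHasBeenReleased = new AtomicBoolean(false);
final Action0 singleSemaphoreRelease = new Action0() {...};
final Action1<Throwable> markExceptionThrown = new Action1<Throwable>() {...};
// Check whether the semaphore rejects the call (no permits left). The no-op semaphore always grants, so this branch is taken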
if (executionSemaphore.tryAcquire()) {
try {
/* used to track userThreadExecutionTime */
executionResult = executionResult.setInvocationStartTime(System.currentTimeMillis());
// Key point: the normal path, which runs the business logic
// Bulkhead isolation and the rest are handled inside
return executeCommandAndObserve(_cmd)
.doOnError(markExceptionThrown)
.doOnTerminate(singleSemaphoreRelease)
.doOnUnsubscribe(singleSemaphoreRelease);
} catch (RuntimeException e) {
return Observable.error(e);
}
} else {
return handleSemaphoreRejectionViaFallback();
}
} else {
// The circuit breaker is open
// so the fallback method is executed directly
return handleShortCircuitViaFallback();
}
}
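- The main logic here is:
- (1) circuitBreaker.attemptExecution(): checks whether the circuit breaker allows execution. If the breaker is open, the fallback runs directly; if it is closed, the real logic runs
- (2) handleShortCircuitViaFallback(): runs the fallback directly when the breaker is open
- (3) executeCommandAndObserve(_cmd): runs the normal business logic, analyzed below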
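The code above calls executionSemaphore.tryAcquire() and notes that the default semaphore is a no-op. For contrast, a counting semaphore that really limits concurrency could look like the sketch below. The class name and fields are hypothetical; I am assuming a simple increment-then-check scheme rather than quoting the Hystrix implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical non-blocking semaphore: callers are rejected instead of waiting. */
class CountingTryableSemaphore {
    private final int maxConcurrent;
    private final AtomicInteger inFlight = new AtomicInteger();

    CountingTryableSemaphore(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    /** Returns true if a permit was taken, false if the limit is already reached. */
    boolean tryAcquire() {
        if (inFlight.incrementAndGet() > maxConcurrent) {
            inFlight.decrementAndGet(); // undo: we went over the limit
            return false;
        }
        return true;
    }

    /** Must be called exactly once for each successful tryAcquire(). */
    void release() {
        inFlight.decrementAndGet();
    }

    int current() {
        return inFlight.get();
    }
}
```

A rejected tryAcquire() corresponds to the handleSemaphoreRejectionViaFallback() branch in the code above.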
executeCommandAndObserve(): the code that runs the normal logic
private Observable<R> executeCommandAndObserve(final AbstractCommand<R> _cmd) {
final HystrixRequestContext currentRequestContext = HystrixRequestContext.getContextForCurrentThread();
final Action1<R> markEmits = new Action1<R>() {...};
final Action0 markOnCompleted = new Action0() {...};
// A Func1 that maps an error to the fallback Observable
final Func1<Throwable, Observable<R>> handleFallback = new Func1<Throwable, Observable<R>>() {
@Override
public Observable<R> call(Throwable t) {
circuitBreaker.markNonSuccess();
Exception e = getExceptionFromThrowable(t);
executionResult = executionResult.setExecutionException(e);
// Fallback for thread pool rejection
if (e instanceof RejectedExecutionException) {
return handleThreadPoolRejectionViaFallback(e);
} else if (t instanceof HystrixTimeoutException) {
// Fallback for timeout
return handleTimeoutViaFallback();
} else if (t instanceof HystrixBadRequestException) {
return handleBadRequestByEmittingError(e);
} else {
/*
* Treat HystrixBadRequestException from ExecutionHook like a plain HystrixBadRequestException.
*/
if (e instanceof HystrixBadRequestException) {
eventNotifier.markEvent(HystrixEventType.BAD_REQUEST, commandKey);
return Observable.error(e);
}
return handleFailureViaFallback(e);
}
}
};
final Action1<Notification<? super R>> setRequestContext = new Action1<Notification<? super R>>() {
@Override
public void call(Notification<? super R> rNotification) {
setRequestContextIfNeeded(currentRequestContext);
}
};
Observable<R> execution;
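// Execute with the configured isolation strategy
if (properties.executionTimeoutEnabled().get()) {
// Timeout enabled: the same isolated execution, plus the timeout operator
execution = executeCommandWithSpecifiedIsolation(_cmd)
.lift(new HystrixObservableTimeoutOperator<R>(_cmd));
} else {
// Key point: bulkhead (isolated) execution, here without a timeout
execution = executeCommandWithSpecifiedIsolation(_cmd);
}
// Attach the callbacks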
return execution.doOnNext(markEmits)
.doOnCompleted(markOnCompleted)
// Attach the fallback handler
.onErrorResumeNext(handleFallback)
.doOnEach(setRequestContext);
}
- Bulkhead isolation handling: executeCommandWithSpecifiedIsolation()
private Observable<R> executeCommandWithSpecifiedIsolation(final AbstractCommand<R> _cmd) {
// Check whether thread pool isolation is configured
if (properties.executionIsolationStrategy().get() == ExecutionIsolationStrategy.THREAD) {
// mark that we are executing in a thread (even if we end up being rejected we still were a THREAD execution and not SEMAPHORE)
// Observable.defer() again: the Func0 is executed to obtain an Observable
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
executionResult = executionResult.setExecutionOccurred();
if (!commandState.compareAndSet(CommandState.OBSERVABLE_CHAIN_CREATED, CommandState.USER_CODE_EXECUTED)) {
return Observable.error(new IllegalStateException("execution attempted while in state : " + commandState.get().name()));
}
// Record the command start in the metrics that feed the circuit breaker
metrics.markCommandStart(commandKey, threadPoolKey, ExecutionIsolationStrategy.THREAD);
if (isCommandTimedOut.get() == TimedOutStatus.TIMED_OUT) {
// the command timed out in the wrapping thread so we will return immediately
// and not increment any of the counters below or other such logic
return Observable.error(new RuntimeException("timed out before executing run()"));
}
if (threadState.compareAndSet(ThreadState.NOT_USING_THREAD, ThreadState.STARTED)) {
//we have not been unsubscribed, so should proceed
HystrixCounters.incrementGlobalConcurrentThreads();
threadPool.markThreadExecution();
// store the command that is being run
endCurrentThreadExecutingCommand = Hystrix.startCurrentThreadExecutingCommand(getCommandKey());
executionResult = executionResult.setExecutedInThread();
/**
* If any of these hooks throw an exception, then it appears as if the actual execution threw an error
*/
try {
executionHook.onThreadStart(_cmd);
executionHook.onRunStart(_cmd);
executionHook.onExecutionStart(_cmd);
// Key point: obtain the Observable that wraps the real user task
return getUserExecutionObservable(_cmd);
} catch (Throwable ex) {
return Observable.error(ex);
}
} else {
//command has already been unsubscribed, so return immediately
return Observable.error(new RuntimeException("unsubscribed before executing run()"));
}
}
})
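// Attach the callbacks
.doOnTerminate(new Action0() {...})
.doOnUnsubscribe(new Action0() {...})
// Key point: subscribe on the command's thread pool (thread isolation); the Func0 tells the scheduler whether to interrupt the thread on timeout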
.subscribeOn(threadPool.getScheduler(new Func0<Boolean>() {
@Override
public Boolean call() {
return properties.executionIsolationThreadInterruptOnTimeout().get() && _cmd.isCommandTimedOut.get() == TimedOutStatus.TIMED_OUT;
}
}));
} else {
// Semaphore isolation; the logic is much the same as the thread isolation branch above
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
executionResult = executionResult.setExecutionOccurred();
if (!commandState.compareAndSet(CommandState.OBSERVABLE_CHAIN_CREATED, CommandState.USER_CODE_EXECUTED)) {
return Observable.error(new IllegalStateException("execution attempted while in state : " + commandState.get().name()));
}
metrics.markCommandStart(commandKey, threadPoolKey, ExecutionIsolationStrategy.SEMAPHORE);
// semaphore isolated
// store the command that is being run
endCurrentThreadExecutingCommand = Hystrix.startCurrentThreadExecutingCommand(getCommandKey());
try {
executionHook.onRunStart(_cmd);
executionHook.onExecutionStart(_cmd);
return getUserExecutionObservable(_cmd); //the getUserExecutionObservable method already wraps sync exceptions, so this shouldn't throw
} catch (Throwable ex) {
//If the above hooks throw, then use that as the result of the run method
return Observable.error(ex);
}
}
});
}
}
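- Two calls matter most here (using thread pool isolation as the example; semaphore isolation works the same way)
  - getUserExecutionObservable(_cmd): runs the actual business logic
  - threadPool.getScheduler: provides thread isolation
- threadPool.getScheduler and thread pool isolation
(1) First, where is the thread pool initialized? We only know that thread pool settings can be configured in the @HystrixCommand annotation, but when is the pool actually created?
(2) Go back to the HystrixCircuitBreakerConfiguration class (registered in spring.factories, as described above) and trace it step by step with screenshots
Keep following the constructors, stepping into each superclass constructor, until you reach the constructor of AbstractCommand
As the screenshots above show, that constructor initializes many things, including the thread pool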
this.threadPool = initThreadPool(threadPool, this.threadPoolKey, threadPoolPropertiesDefaults);
// Thread pool initialization
private static HystrixThreadPool initThreadPool(HystrixThreadPool fromConstructor, HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties.Setter threadPoolPropertiesDefaults) {
if (fromConstructor == null) {
// get the default implementation of HystrixThreadPool
// Obtain the HystrixThreadPool object
return HystrixThreadPool.Factory.getInstance(threadPoolKey, threadPoolPropertiesDefaults);
} else {
return fromConstructor;
}
}
------------------------
// threadPools is a Map: the key is the thread-pool key name and the value is the HystrixThreadPool; it is used below
final static ConcurrentHashMap<String, HystrixThreadPool> threadPools = new ConcurrentHashMap<String, HystrixThreadPool>();
------------------------
/* package */static HystrixThreadPool getInstance(HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties.Setter propertiesBuilder) {
// get the key to use instead of using the object itself so that if people forget to implement equals/hashcode things will still work
// The key is the thread-pool key name (threadPoolKey on @HystrixCommand; the group key is used when it is not set)
String key = threadPoolKey.name();
// this should find it for all but the first time
// Look up the HystrixThreadPool for this key in the map
HystrixThreadPool previouslyCached = threadPools.get(key);
if (previouslyCached != null) {
return previouslyCached;
}
// if we get here this is the first time so we need to initialize
synchronized (HystrixThreadPool.class) {
// If the threadPools map has no entry for this key, create a new pool and put it in
if (!threadPools.containsKey(key)) {
threadPools.put(key, new HystrixThreadPoolDefault(threadPoolKey, propertiesBuilder));
}
}
return threadPools.get(key);
}
-----------------------------------
public HystrixThreadPoolDefault(HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties.Setter propertiesDefaults) {
this.properties = HystrixPropertiesFactory.getThreadPoolProperties(threadPoolKey, propertiesDefaults);
HystrixConcurrencyStrategy concurrencyStrategy = HystrixPlugins.getInstance().getConcurrencyStrategy();
this.queueSize = properties.maxQueueSize().get();
this.metrics = HystrixThreadPoolMetrics.getInstance(threadPoolKey,
concurrencyStrategy.getThreadPool(threadPoolKey, properties),
properties);
this.threadPool = this.metrics.getThreadPool();
this.queue = this.threadPool.getQueue();
/* strategy: HystrixMetricsPublisherThreadPool */
HystrixMetricsPublisherFactory.createOrRetrievePublisherForThreadPool(threadPoolKey, this.metrics, this.properties);
}
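The bulkhead (thread) isolation Hystrix provides comes down to the threadPool initialization code above: different thread-pool keys (the keys of the threadPools map) produce different thread pools. If several methods' @HystrixCommand annotations resolve to the same thread-pool key, they get the same pool, so multiple methods can also share one thread pool.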
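The getInstance code above looks pools up by threadPoolKey.name(). The same "one pool per key" idea can be sketched with computeIfAbsent; `PoolRegistry` and its methods are hypothetical names, and a plain fixed-size ExecutorService stands in for HystrixThreadPool.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical registry: one thread pool per key, created on first use. */
class PoolRegistry {
    private final ConcurrentHashMap<String, ExecutorService> pools = new ConcurrentHashMap<>();

    /** Returns the pool for the key, creating it atomically if it does not exist yet. */
    ExecutorService getInstance(String threadPoolKey, int coreSize) {
        return pools.computeIfAbsent(threadPoolKey, key -> Executors.newFixedThreadPool(coreSize));
    }

    int size() {
        return pools.size();
    }

    void shutdownAll() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }
}
```

Two callers that pass the same key get the same pool and therefore compete for the same threads; callers with different keys cannot exhaust each other's pool, which is the bulkhead effect.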
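getUserExecutionObservable: the real entry point for invoking the command
- Now return to the getUserExecutionObservable method seen earlier
- This method is the entry point where the business logic actually gets executed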
private Observable<R> getUserExecutionObservable(final AbstractCommand<R> _cmd) {
Observable<R> userObservable;
try {
// Follow this method
userObservable = getExecutionObservable();
} catch (Throwable ex) {
// the run() method is a user provided implementation so can throw instead of using Observable.onError
// so we catch it here and turn it into Observable.error
userObservable = Observable.error(ex);
}
return userObservable
.lift(new ExecutionHookApplication(_cmd))
.lift(new DeprecatedOnRunHookApplication(_cmd));
}
----------------------
// What runs is getExecutionObservable in HystrixCommand
@Override
final protected Observable<R> getExecutionObservable() {
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
try {
// The real business logic is invoked here, through run()
// This is the run method of GenericCommand introduced at the start
return Observable.just(run());
} catch (Throwable ex) {
return Observable.error(ex);
}
}
}).doOnSubscribe(new Action0() {
@Override
public void call() {
// Save thread on which we get subscribed so that we can interrupt it later if needed
executionThread.set(Thread.currentThread());
}
});
}
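- Tracing step by step shows that what finally runs is the run method of GenericCommand
Summary:
- Read in reverse, the layered calls above first create an Observable and then attach an Observer (callback) for each kind of event, as the figure below shows:
Sliding window and common configuration
- A circuit breaker can be understood as follows: request outcomes (successes, failures, timeouts, rejections) are counted per second, and each decision uses the data from the most recent 10 s. If the failure rate reaches 50%, the breaker trips and requests are no longer processed.
- The figure below, from Hystrix, illustrates the sliding window
- The figure above shows Hystrix's sliding window strategy. Assume request outcomes are counted per second: each cell represents 1 s, the numbers in a cell are the request counts per outcome within that second, and a cell is called a Bucket
- If every decision is based on 10 Buckets, the outcomes of those 10 Buckets are aggregated and the breaker trips once the failure rate reaches 50%. 10 Buckets cover 10 s, and that 10 s span is one sliding window
- Why is it called a sliding window? Because while the breaker has not tripped, each time a new Bucket is collected the oldest Bucket is discarded. The dark cell (23 5 2 0) in the figure above is the discarded bucket
- The figure below is the complete official flow chart. The strategy is: keep collecting data and trip when the condition is met; after tripping, reject all requests for a period (sleepWindow); then let one request through. If that request succeeds, the breaker closes; otherwise it stays open
- In other words, the figure above is the three-state transition diagram
My own understanding of the sliding window:
(1) Within a period of time, when failed calls / total calls exceeds a threshold we set, the circuit breaker opens. That period of time is what we call the sliding window.
- For example, take the current time and go back a fixed span. With a 10-minute span, if it is now 8:40 the window is 8:30-8:40; if it is now 8:31 the window is 8:21-8:31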
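The bucket idea can be tried out with a small ring of counters. The sketch below is my own simplification, not the Hystrix implementation: `RollingWindow` is a hypothetical class, it is not thread-safe, it only tracks success and failure, and timestamps are passed in explicitly so the behavior is easy to follow.

```java
import java.util.Arrays;

/** Hypothetical rolling counter over fixed-size buckets; not thread-safe. */
class RollingWindow {
    private final long bucketMillis;
    private final long[] bucketStart; // start time of the bucket held in each slot
    private final int[] successes;
    private final int[] failures;

    RollingWindow(long windowMillis, int numBuckets) {
        this.bucketMillis = windowMillis / numBuckets;
        this.bucketStart = new long[numBuckets];
        this.successes = new int[numBuckets];
        this.failures = new int[numBuckets];
        Arrays.fill(bucketStart, -1L); // -1 marks a slot that was never used
    }

    void record(long nowMillis, boolean success) {
        long start = nowMillis - (nowMillis % bucketMillis);
        int slot = (int) ((nowMillis / bucketMillis) % bucketStart.length);
        if (bucketStart[slot] != start) {
            // The slot still holds an expired bucket: discard it and start a new one.
            bucketStart[slot] = start;
            successes[slot] = 0;
            failures[slot] = 0;
        }
        if (success) {
            successes[slot]++;
        } else {
            failures[slot]++;
        }
    }

    /** Failure percentage over the buckets that are still inside the window. */
    int errorPercentage(long nowMillis) {
        long newest = nowMillis - (nowMillis % bucketMillis);
        long oldest = newest - bucketMillis * (bucketStart.length - 1);
        int ok = 0;
        int bad = 0;
        for (int i = 0; i < bucketStart.length; i++) {
            if (bucketStart[i] >= oldest && bucketStart[i] <= newest) {
                ok += successes[i];
                bad += failures[i];
            }
        }
        int total = ok + bad;
        return total == 0 ? 0 : bad * 100 / total;
    }
}
```

With a 10 s window and 10 buckets, failures recorded in the first second stop counting once the window has slid past that bucket, which is the "discard the oldest bucket" step described above.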
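Related configuration (important)
- The defaults are all defined in the HystrixCommandProperties class
- metrics.rollingStats.timeInMilliseconds
  The length of the sliding window; the default is 10000 (10 s). This is the basic unit over which the circuit breaker computes its statistics.
- metrics.rollingStats.numBuckets
  The number of Buckets in the sliding window; the default is 10. The duration of each Bucket follows from timeInMilliseconds and numBuckets. metrics.rollingStats.timeInMilliseconds % metrics.rollingStats.numBuckets must equal 0, otherwise an exception is thrown.
  - In other words, numBuckets says how many cells the window is divided into, so timeInMilliseconds / numBuckets is how long each cell collects data, and the window slides forward one cell at a time. Put simply, numBuckets determines how often the statistics are recomputed.
  - The larger numBuckets is, the finer the granularity and the more precise the statistics
- circuitBreaker.requestVolumeThreshold
  The minimum number of requests in the sliding window before the breaker can trip. If the value is 20 and only 19 requests arrive within the window, the breaker does not trip even if all 19 fail. The threshold must be reached, otherwise the sample is too small to be meaningful.
- circuitBreaker.sleepWindowInMilliseconds
  Related to the breaker's automatic recovery: how long after tripping the breaker moves to the half-open state. To check whether the backend has recovered, one request is let through as a probe. sleepWindow is the time that must pass after the breaker trips before a request is allowed through to test whether the service has recovered. The default is 5 s.
- circuitBreaker.errorThresholdPercentage
  The error rate threshold at which the breaker trips. With the default of 50%, the breaker trips when the failure rate within one sliding window reaches 50%.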
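How these settings interact can be expressed as a small value class. This is a sketch under my own naming (`BreakerSettings` is hypothetical); the default values are the ones listed above, and the trip rule follows the two checks described for requestVolumeThreshold and errorThresholdPercentage.

```java
/** Hypothetical holder for the settings discussed above. */
class BreakerSettings {
    final int rollingWindowMillis;
    final int numBuckets;
    final int requestVolumeThreshold;
    final int errorThresholdPercentage;
    final int sleepWindowMillis;

    BreakerSettings(int rollingWindowMillis, int numBuckets, int requestVolumeThreshold,
                    int errorThresholdPercentage, int sleepWindowMillis) {
        // The window must divide evenly into buckets.
        if (rollingWindowMillis % numBuckets != 0) {
            throw new IllegalArgumentException(
                    "timeInMilliseconds % numBuckets must be 0: " + rollingWindowMillis + " / " + numBuckets);
        }
        this.rollingWindowMillis = rollingWindowMillis;
        this.numBuckets = numBuckets;
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    /** The default values listed in the text: 10 s window, 10 buckets, 20 requests, 50%, 5 s. */
    static BreakerSettings defaults() {
        return new BreakerSettings(10000, 10, 20, 50, 5000);
    }

    int bucketMillis() {
        return rollingWindowMillis / numBuckets;
    }

    /** Trip only when there are enough samples and the error rate has reached the threshold. */
    boolean shouldTrip(int totalRequests, int errorPercentage) {
        if (totalRequests < requestVolumeThreshold) {
            return false; // too few requests to judge
        }
        return errorPercentage >= errorThresholdPercentage;
    }
}
```

With the defaults, 19 requests that all fail do not trip the breaker, while 20 requests with half of them failing do.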
Circuit breaker source walkthrough
- Read the figure above together with the flow diagram at the beginning
- The AbstractCommand constructor
- This is the same constructor that initialized the thread pool earlier; when a GenericCommand is created, initCircuitBreaker is called as well
- initCircuitBreaker, code below
private static HystrixCircuitBreaker initCircuitBreaker(boolean enabled, HystrixCircuitBreaker fromConstructor,
HystrixCommandGroupKey groupKey, HystrixCommandKey commandKey,
HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
// If the circuit breaker is enabled
if (enabled) {
// If none was passed in through the constructor, look up or create the default one
if (fromConstructor == null) {
// get the default implementation of HystrixCircuitBreaker
return HystrixCircuitBreaker.Factory.getInstance(commandKey, groupKey, properties, metrics);
} else {
// Otherwise return the one that was passed in
return fromConstructor;
}
} else {
// If the circuit breaker is disabled, use a no-op implementation
return new NoOpCircuitBreaker();
}
}
HystrixCircuitBreaker.Factory.getInstance(commandKey, groupKey, properties, metrics)
/**
* Get the {@link HystrixCircuitBreaker} instance for a given {@link HystrixCommandKey}.
* <p>
* This is thread-safe and ensures only 1 {@link HystrixCircuitBreaker} per {@link HystrixCommandKey}.
*
* @param key
* {@link HystrixCommandKey} of {@link HystrixCommand} instance requesting the {@link HystrixCircuitBreaker}
* @param group
* Pass-thru to {@link HystrixCircuitBreaker}
* @param properties
* Pass-thru to {@link HystrixCircuitBreaker}
* @param metrics
* Pass-thru to {@link HystrixCircuitBreaker}
* @return {@link HystrixCircuitBreaker} for {@link HystrixCommandKey}
*/
public static HystrixCircuitBreaker getInstance(HystrixCommandKey key, HystrixCommandGroupKey group, HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
// this should find it for all but the first time
// circuitBreakersByCommand is a map, just like threadPools
// private static ConcurrentHashMap<String, HystrixCircuitBreaker> circuitBreakersByCommand = new ConcurrentHashMap<String, HystrixCircuitBreaker>();
// Return the existing breaker if there is one; key.name() is the commandKey
HystrixCircuitBreaker previouslyCached = circuitBreakersByCommand.get(key.name());
if (previouslyCached != null) {
return previouslyCached;
}
// If this commandKey has no circuit breaker yet, create one and put it into the map
HystrixCircuitBreaker cbForCommand = circuitBreakersByCommand.putIfAbsent(key.name(), new HystrixCircuitBreakerImpl(key, group, properties, metrics));
if (cbForCommand == null) {
// this means the putIfAbsent step just created a new one so let's retrieve and return it
return circuitBreakersByCommand.get(key.name());
} else {
// this means a race occurred and while attempting to 'put' another one got there before
// and we instead retrieved it and will now return it
return cbForCommand;
}
}
--------------------------------
// Initialize the circuit breaker
protected HystrixCircuitBreakerImpl(HystrixCommandKey key, HystrixCommandGroupKey commandGroup, final HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
this.properties = properties;
// The command's metrics object; metrics are also kept per commandKey
this.metrics = metrics;
//On a timer, this will set the circuit between OPEN/CLOSED as command executions occur
// Subscribe to the event stream (key point)
Subscription s = subscribeToStream();
activeSubscription.set(s);
}
-------------------------------
// Subscribe to the event stream; events flow into the stream as structured data
private Subscription subscribeToStream() {
/*
* This stream will recalculate the OPEN/CLOSED status on every onNext from the health stream
*/
// HealthCountsStream is the key part
return metrics.getHealthCountsStream()
.observe()
// Use the aggregated HealthCounts to drive the circuit breaker
.subscribe(new Subscriber<HealthCounts>() {
@Override
public void onCompleted() {
}
@Override
public void onError(Throwable e) {
}
@Override
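// Called for every HealthCounts value emitted by the stream
public void onNext(HealthCounts hc) {
// check if we are past the statisticalWindowVolumeThreshold
if (hc.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
// Below the minimum request count (default 20): do not trip even if every request failed
} else {
if (hc.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
// The error percentage is below the configured threshold: do nothing
} else {
// our failure rate is too high, we need to set the state to OPEN
// The conditions for opening the breaker are met
// Move the breaker to the OPEN state
if (status.compareAndSet(Status.CLOSED, Status.OPEN)) {
// Record the current time as the moment the circuit opened
// It is used later to decide when the breaker may enter the half-open state after the sleep window
// The field's default value is -1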
circuitOpened.set(System.currentTimeMillis());
}
}
}
}
});
}
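- Metrics can be thought of as the computation center for all the statistics; it holds the failed request count and the total request count
attemptExecution
- Now look at this function again; it decides whether the circuit breaker lets a request through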
@Override
public boolean attemptExecution() {
if (properties.circuitBreakerForceOpen().get()) {
return false;
}
if (properties.circuitBreakerForceClosed().get()) {
return true;
}
// -1 is the default value and means the breaker is closed, so return true
// When the breaker opens, this value is set to the time it opened
if (circuitOpened.get() == -1) {
return true;
} else {
// isAfterSleepWindow() checks currentTime > circuitOpenTime + sleepWindowTime,
// that is, whether the configured sleep window has elapsed since the breaker opened
if (isAfterSleepWindow()) {
// Move the breaker to the half-open state
if (status.compareAndSet(Status.OPEN, Status.HALF_OPEN)) {
//only the first request after sleep window should execute
return true;
} else {
return false;
}
} else {
return false;
}
}
}
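The three states can be put together in a compact sketch. It mirrors the attemptExecution logic shown above, but the rest is assumed: `MiniCircuitBreaker` is a hypothetical class, time is passed in explicitly, and the `markSuccess` and `markNonSuccess` transitions are my reading of how a half-open breaker closes or reopens, not quoted source.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical three-state breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED or OPEN. */
class MiniCircuitBreaker {
    enum Status { CLOSED, OPEN, HALF_OPEN }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.CLOSED);
    private final AtomicLong circuitOpened = new AtomicLong(-1); // -1 means "not open"
    private final long sleepWindowMillis;

    MiniCircuitBreaker(long sleepWindowMillis) {
        this.sleepWindowMillis = sleepWindowMillis;
    }

    /** Called when the statistics say the failure rate is too high. */
    void trip(long nowMillis) {
        if (status.compareAndSet(Status.CLOSED, Status.OPEN)) {
            circuitOpened.set(nowMillis);
        }
    }

    boolean attemptExecution(long nowMillis) {
        if (circuitOpened.get() == -1) {
            return true; // closed: everything passes
        }
        if (nowMillis > circuitOpened.get() + sleepWindowMillis) {
            // Only the first caller after the sleep window gets the trial request.
            return status.compareAndSet(Status.OPEN, Status.HALF_OPEN);
        }
        return false;
    }

    /** Assumed transition: the trial request succeeded, so close the breaker. */
    void markSuccess() {
        if (status.compareAndSet(Status.HALF_OPEN, Status.CLOSED)) {
            circuitOpened.set(-1);
        }
    }

    /** Assumed transition: the trial request failed, so reopen and restart the sleep window. */
    void markNonSuccess(long nowMillis) {
        if (status.compareAndSet(Status.HALF_OPEN, Status.OPEN)) {
            circuitOpened.set(nowMillis);
        }
    }

    Status status() {
        return status.get();
    }
}
```

The compareAndSet in attemptExecution is what guarantees that only one probe request goes through while the breaker is half-open.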
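Zuul source flow diagram
- Zuul is essentially a chain of filters
- If a route is configured, Zuul maps the configured route to the real service name registered in Eureka, obtains the corresponding list of instance addresses, and makes the call through Ribbon.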
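The "chain of filters" idea can be sketched as follows. This is a toy model rather than Zuul's API: `MiniGateway` and `Filter` are hypothetical, the request context is a plain map, and I am assuming the usual pre, route, post ordering with a numeric order inside each type.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical gateway that runs filters by type and then by order. */
class MiniGateway {
    /** A filter has a type ("pre", "route" or "post"), an order, and an action on the context. */
    static final class Filter {
        final String type;
        final int order;
        final Consumer<Map<String, String>> action;

        Filter(String type, int order, Consumer<Map<String, String>> action) {
            this.type = type;
            this.order = order;
            this.action = action;
        }
    }

    private final List<Filter> filters = new ArrayList<>();

    void register(Filter filter) {
        filters.add(filter);
    }

    /** Runs all pre filters, then route filters, then post filters, each group sorted by order. */
    Map<String, String> handle(String path) {
        Map<String, String> context = new HashMap<>();
        context.put("path", path);
        for (String type : new String[] {"pre", "route", "post"}) {
            filters.stream()
                    .filter(f -> f.type.equals(type))
                    .sorted(Comparator.comparingInt(f -> f.order))
                    .forEach(f -> f.action.accept(context));
        }
        return context;
    }
}
```

In this model, the route-to-service mapping and the load-balanced call described above would live in a "route" filter, with authentication or logging in "pre" filters and response decoration in "post" filters.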