hystrix
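The code logic Hystrix follows to execute a request is shown in the figure below.
The following are the more important Hystrix circuit-breaker parameters.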
hystrix:
  command:
    default:
      execution:
        isolation:
          strategy: THREAD
          thread:
            timeoutInMilliseconds: 10000
      circuitBreaker:
        requestVolumeThreshold: 30
        sleepWindowInMilliseconds: 5000
        errorThresholdPercentage: 90
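strategy: the isolation strategy, set to THREAD here. There are two options, SEMAPHORE and THREAD. With THREAD the request logic runs in a separate thread; with SEMAPHORE it runs in the calling request thread.
timeoutInMilliseconds: the thread execution timeout. It is not drawn in the figure.
errorThresholdPercentage: the percentage of failed requests within the window. For example, with a value of 90 the circuit breaker may open only once at least 90% of the requests have failed.
requestVolumeThreshold: for the circuit breaker to trip, the current time window must contain at least requestVolumeThreshold requests. Suppose errorThresholdPercentage is 50% and requestVolumeThreshold is 1: if a single request comes in and fails, the breaker trips immediately, which is unreasonable.
sleepWindowInMilliseconds: how long the circuit stays open after it trips. During this period every request returns the fallback result.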
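The interaction between requestVolumeThreshold and errorThresholdPercentage can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: TripDecision and shouldTrip are hypothetical names, not Hystrix classes, and the counts are passed in directly instead of being read from Hystrix metrics.

```java
// Hypothetical helper, not Hystrix code: mirrors the two threshold checks described above.
final class TripDecision {
    /**
     * Returns true when the window holds at least requestVolumeThreshold requests
     * and the error percentage has reached errorThresholdPercentage.
     */
    static boolean shouldTrip(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests < requestVolumeThreshold) {
            return false; // too few samples in the window to judge
        }
        int errorPercentage = totalRequests == 0 ? 0 : failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        // Volume threshold of 1: a single failed request is enough to trip.
        System.out.println(shouldTrip(1, 1, 1, 50));
        // Volume threshold of 30: the same single failure is ignored.
        System.out.println(shouldTrip(1, 1, 30, 90));
        // 30 requests with 27 failures (90%): both conditions are met.
        System.out.println(shouldTrip(30, 27, 30, 90));
    }
}
```

With a volume threshold of 1 the first call reports a trip after one failure, which is the unreasonable case described above; raising the threshold to 30 makes the breaker wait for a meaningful sample.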
requestVolumeThreshold, sleepWindowInMilliseconds and errorThresholdPercentage are all used in circuitBreaker.allowRequest(). Hystrix decides whether a request may execute based on the outcome of the requests it has actually observed.
public boolean allowRequest() {
    if (properties.circuitBreakerForceOpen().get()) {
        // properties have asked us to force the circuit open so we will allow NO requests
        return false;
    }
    if (properties.circuitBreakerForceClosed().get()) {
        // we still want to allow isOpen() to perform its calculations so we simulate normal behavior
        isOpen();
        // properties have asked us to ignore errors so we will ignore the results of isOpen and just allow all traffic through
        return true;
    }
    // Allow the request if the circuit is closed, or if it is open and the sleep window has elapsed
    return !isOpen() || allowSingleTest();
}
public boolean allowSingleTest() {
    long timeCircuitOpenedOrWasLastTested = circuitOpenedOrLastTestedTime.get();
    // If the circuit is open and has been open for longer than sleepWindowInMilliseconds,
    // allow the request and update circuitOpenedOrLastTestedTime via compareAndSet.
    // 1) if the circuit is open
    // 2) and it's been longer than 'sleepWindow' since we opened the circuit
    if (circuitOpen.get() && System.currentTimeMillis() > timeCircuitOpenedOrWasLastTested
            + properties.circuitBreakerSleepWindowInMilliseconds().get()) {
        // We push the 'circuitOpenedTime' ahead by 'sleepWindow' since we have allowed one request to try.
        // If it succeeds the circuit will be closed, otherwise another singleTest will be allowed at the end of the 'sleepWindow'.
        if (circuitOpenedOrLastTestedTime.compareAndSet(timeCircuitOpenedOrWasLastTested, System.currentTimeMillis())) {
            // if this returns true that means we set the time so we'll return true to allow the singleTest
            // if it returned false it means another thread raced us and allowed the singleTest before we did
            return true;
        }
    }
    return false;
}
@Override
public boolean isOpen() {
    if (circuitOpen.get()) {
        // if we're open we immediately return true and don't bother attempting to 'close' ourself as that is left to allowSingleTest and a subsequent successful test to close
        return true;
    }
    // we're closed, so let's see if errors have made us so we should trip the circuit open
    HealthCounts health = metrics.getHealthCounts();
    // check if we are past the statisticalWindowVolumeThreshold:
    // the window holds fewer than requestVolumeThreshold requests, so the circuit stays closed
    if (health.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
        // we are not past the minimum volume threshold for the statisticalWindow so we'll return false immediately and not calculate anything
        return false;
    }
    // the error percentage in the window is below errorThresholdPercentage, so the circuit stays closed
    if (health.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
        return false;
    } else {
        // our failure rate is too high, trip the circuit
        if (circuitOpen.compareAndSet(false, true)) {
            // if the previousValue was false then we want to set the currentTime
            circuitOpenedOrLastTestedTime.set(System.currentTimeMillis());
            return true;
        } else {
            // How could previousValue be true? If another thread was going through this code at the same time a race-condition could have
            // caused another thread to set it to true already even though we were in the process of doing the same
            // In this case, we know the circuit is open, so let the other thread set the currentTime and report back that the circuit is open
            return true;
        }
    }
}
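The closed, open and trial-request transitions in the code above can be followed more easily in a stripped-down model. The sketch below is an assumption-laden simplification, not the Hystrix implementation: SimpleBreaker, trip and markSuccess are hypothetical names, the clock is passed in explicitly so the behaviour is deterministic, and the atomic compare-and-set logic is replaced by plain fields, so it is not thread-safe.

```java
// Simplified, single-threaded model of the breaker states; not the Hystrix implementation.
final class SimpleBreaker {
    private final long sleepWindowMs;
    private boolean open;
    private long openedOrLastTestedAt;

    SimpleBreaker(long sleepWindowMs) {
        this.sleepWindowMs = sleepWindowMs;
    }

    /** Opens the breaker at the given time (stands in for isOpen() finding the thresholds exceeded). */
    void trip(long nowMs) {
        open = true;
        openedOrLastTestedAt = nowMs;
    }

    /** Closes the breaker again, as happens after a successful trial request. */
    void markSuccess() {
        open = false;
    }

    /** Same shape as allowRequest(): pass when closed, or once per elapsed sleep window. */
    boolean allowRequest(long nowMs) {
        if (!open) {
            return true;
        }
        if (nowMs > openedOrLastTestedAt + sleepWindowMs) {
            openedOrLastTestedAt = nowMs; // restart the window; one trial request goes through
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SimpleBreaker breaker = new SimpleBreaker(5000);
        System.out.println(breaker.allowRequest(0));    // breaker is closed
        breaker.trip(1000);
        System.out.println(breaker.allowRequest(3000)); // open, still inside the sleep window
        System.out.println(breaker.allowRequest(6001)); // sleep window elapsed: trial request
        System.out.println(breaker.allowRequest(6002)); // window was restarted by the trial
        breaker.markSuccess();
        System.out.println(breaker.allowRequest(6003)); // closed again after the trial succeeded
    }
}
```

Passing the time as a parameter is a design choice for the sketch only; the real code reads System.currentTimeMillis() and relies on compareAndSet so that only one thread wins the trial request.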
Besides the parameters above, the following ones are also important.
hystrix:
  threadpool:
    default:
      metrics:
        rollingStats:
          timeInMilliseconds: 60000
          numBuckets: 1
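timeInMilliseconds is the length of the statistics window and numBuckets is the number of buckets it is divided into, so each bucket covers timeInMilliseconds / numBuckets milliseconds (60000 ms with the values above). Hystrix uses the error count and error percentage collected over these buckets to decide whether to trip the circuit.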
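The bucket arithmetic can be checked with a small Java sketch. RollingWindowMath, bucketSizeMs and bucketIndex are hypothetical names introduced for illustration. The even-division check reflects my understanding that Hystrix expects the window to divide evenly into buckets, and the timestamp-to-bucket mapping is just one simple way to picture a rolling window, not how Hystrix stores its buckets.

```java
// Hypothetical illustration of the bucket arithmetic; not Hystrix code.
final class RollingWindowMath {
    /** Duration covered by one bucket: timeInMilliseconds / numBuckets. */
    static int bucketSizeMs(int timeInMilliseconds, int numBuckets) {
        // Assumption: the window must divide evenly into the buckets.
        if (numBuckets <= 0 || timeInMilliseconds % numBuckets != 0) {
            throw new IllegalArgumentException("window must divide evenly into buckets");
        }
        return timeInMilliseconds / numBuckets;
    }

    /** Index of the bucket a timestamp falls into, wrapping around the window. */
    static int bucketIndex(long timestampMs, int timeInMilliseconds, int numBuckets) {
        int size = bucketSizeMs(timeInMilliseconds, numBuckets);
        return (int) ((timestampMs / size) % numBuckets);
    }

    public static void main(String[] args) {
        // Values from the configuration above: one bucket spanning the whole window.
        System.out.println(bucketSizeMs(60000, 1));
        // A finer-grained example: a 10 s window split into 10 buckets of 1 s each.
        System.out.println(bucketSizeMs(10000, 10));
        System.out.println(bucketIndex(12500, 10000, 10));
    }
}
```

With numBuckets set to 1 the whole 60-second window is a single bucket, so the statistics are replaced all at once instead of sliding forward in smaller steps.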