Ribbon Load Balancing Source Code Analysis

RLCRAFT

Last modified 2024-11-19 08:38:47

License: CC 4.0 BY-SA

First published 2024-11-18 22:31:29

Article link: https://blog.youkuaiyun.com/RLCRAFT/article/details/143868354

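Ribbon is a client-side load balancer from Netflix. It is widely used in distributed systems, particularly together with Spring Cloud in microservice architectures. Ribbon supports several load balancing rules, each of which selects a server instance for a client request according to a different strategy. The sections below cover the common rules and how they are implemented.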

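1. Round Robin

Description: Round robin selects service instances in order. Each request goes to the next instance in the list, and once every instance has been visited the cycle starts again from the beginning.
Use case: Instances with roughly equal capacity and load.
Ribbon implementation: the RoundRobinRule class.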
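
A minimal sketch of the round-robin idea, assuming a shared counter that is advanced with a compare-and-set loop so that concurrent callers do not receive the same slot. The class RoundRobinPicker and the use of plain strings for servers are hypothetical stand-ins for illustration, not the RoundRobinRule source.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a round-robin rule; not Ribbon's actual class. */
class RoundRobinPicker {
    // Shared position in the cycle
    private final AtomicInteger counter = new AtomicInteger(0);

    /** Advances the counter with a CAS loop and returns the new index in [0, modulo). */
    int incrementAndGetModulo(int modulo) {
        for (;;) {
            int current = counter.get();
            int next = (current + 1) % modulo;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    /** Returns the next server in cyclic order, or null when the list is empty. */
    String choose(List<String> servers) {
        if (servers.isEmpty()) {
            return null;
        }
        return servers.get(incrementAndGetModulo(servers.size()));
    }

    public static void main(String[] args) {
        RoundRobinPicker picker = new RoundRobinPicker();
        List<String> servers = List.of("a:8080", "b:8080", "c:8080");
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.choose(servers));
        }
    }
}
```

Because the counter is advanced before it is read, the first call in this sketch returns the element at index 1, and the sequence then cycles through the list in order.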

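2. Random

Description: The random rule picks one instance at random from the available servers. Over a large number of requests the traffic spreads roughly evenly across instances.
Use case: Instances with similar capacity, where no other factor needs to be considered.
Ribbon implementation: the RandomRule class.
Code example: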

/**
 * A load balancing strategy that randomly distributes traffic amongst existing
 * servers.
 *
 * @author stonse
 */
public class RandomRule extends AbstractLoadBalancerRule {

    /**
     * Randomly choose from all living servers
     */
    @edu.umd.cs.findbugs.annotations.SuppressWarnings(value = "RCN_REDUNDANT_NULLCHECK_OF_NULL_VALUE")
    public Server choose(ILoadBalancer lb, Object key) {
        // lb is the load balancer to pick from; key is an optional selection hint
        // (typically derived from the request) that this rule does not use
        if (lb == null) {
            // No load balancer, nothing to choose from
            return null;
        }
        // Holds the selected server instance
        Server server = null;

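        // Loop until a usable server is selected
        while (server == null) {
            // Give up if the current thread has been interrupted
            if (Thread.interrupted()) {
                return null;
            }
            // Servers currently considered reachable
            List<Server> upList = lb.getReachableServers();
            // All known servers, including unreachable ones
            List<Server> allList = lb.getAllServers();

            // Total number of known servers
            int serverCount = allList.size();
            // No servers are registered at all
            if (serverCount == 0) {
                /*
                 * No servers. End regardless of pass, because subsequent passes
                 * only get more restrictive.
                 */
                return null;
            }

            // Pick a random index in [0, serverCount)
            int index = chooseRandomInt(serverCount);
            // Look the index up in the reachable list. The index is bounded by
            // allList.size(), so upList.get(index) throws IndexOutOfBoundsException
            // when upList is shorter than allList and the index falls past its end.
            server = upList.get(index);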
            // The selected slot holds no server
            if (server == null) {
                /*
                 * The only time this should happen is if the server list were
                 * somehow trimmed. This is a transient condition. Retry after
                 * yielding.
                 */
                // Yield the CPU, then retry the selection
                Thread.yield();
                continue;
            }

            // Return the server if it is alive
            if (server.isAlive()) {
                return (server);
            }

            // Shouldn't actually happen.. but must be transient or a bug.
            // The server is not alive: discard it and pick again
            server = null;
            // Yield the CPU before the next attempt
            Thread.yield();
        }

        // Return the selected server
        return server;

    }

    // Returns a random integer in the range [0, serverCount)
    protected int chooseRandomInt(int serverCount) {
        return ThreadLocalRandom.current().nextInt(serverCount);
    }

    @Override
    public Server choose(Object key) {
        // Delegates to choose(lb, key) with the load balancer configured on this rule
        return choose(getLoadBalancer(), key);
    }
}
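
The listing above draws the random index from the size of allList but looks it up in upList. As a minimal sketch of a variant that bounds the index by the list it actually reads, consider the following; RandomPicker and its pick method are hypothetical names used for illustration and are not part of Ribbon.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper, not part of Ribbon. */
class RandomPicker {
    /** Picks a random element of the reachable list, or returns null when it is empty. */
    static <T> T pick(List<T> reachable) {
        if (reachable.isEmpty()) {
            return null;
        }
        // The bound is the size of the same list that is indexed below
        int index = ThreadLocalRandom.current().nextInt(reachable.size());
        return reachable.get(index);
    }

    public static void main(String[] args) {
        System.out.println(pick(List.of("a:8080", "b:8080")));
    }
}
```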

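3. Weighted Round Robin

Description: Weighted round robin extends round robin by assigning each instance a weight, so that instances with higher weights are selected more often. It suits deployments where instances differ in processing capacity.
Use case: Instances whose capacity differs significantly, so that requests should be distributed according to what each instance can handle.
Ribbon implementation: the WeightedResponseTimeRule class, which derives the weights from each server's average response time and then picks a server by weighted random selection.
Code example: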

/** 
 * Rule that use the average/percentile response times
 * to assign dynamic "weights" per Server which is then used in 
 * the "Weighted Round Robin" fashion. 
 * <p>
 * The basic idea for weighted round robin has been obtained from JCS
 * The implementation for choosing the endpoint from the list of endpoints
 * is as follows:Let's assume 4 endpoints:A(wt=10), B(wt=30), C(wt=40), 
 * D(wt=20). 
 * <p>
 * Using the Random API, generate a random number between 1 and 10+30+40+20.
 * Let's assume that the above list is randomized. Based on the weights, we
 * have intervals as follows:
 * <p>
 * 1-----10 (A's weight)
 * <br>
 * 11----40 (A's weight + B's weight)
 * <br>
 * 41----80 (A's weight + B's weight + C's weight)
 * <br>
 * 81----100(A's weight + B's weight + C's weight + D's weight)
 * <p>
 * Here's the pseudo code for deciding where to send the request:
 * <p>
 * if (random_number between 1 &amp; 10) {send request to A;}
 * <br>
 * else if (random_number between 11 &amp; 40) {send request to B;}
 * <br>
 * else if (random_number between 41 &amp; 80) {send request to C;}
 * <br>
 * else if (random_number between 81 &amp; 100) {send request to D;}
 * <p>
 * When there is not enough statistics gathered for the servers, this rule
 * will fall back to use {@link RoundRobinRule}. 
 * @author stonse
 */
public class WeightedResponseTimeRule extends RoundRobinRule {

    // Config key that controls how often server weights are recalculated, in milliseconds
    public static final IClientConfigKey<Integer> WEIGHT_TASK_TIMER_INTERVAL_CONFIG_KEY = new IClientConfigKey<Integer>() {
        @Override
        public String key() {
            return "ServerWeightTaskTimerInterval";
        }
        
        @Override
        public String toString() {
            return key();
        }

        @Override
        public Class<Integer> type() {
            return Integer.class;
        }
    };
    // Default interval: 30 seconds
    public static final int DEFAULT_TIMER_INTERVAL = 30 * 1000;
    // Interval between runs of the weight-update task
    private int serverWeightTaskTimerInterval = DEFAULT_TIMER_INTERVAL;
    // Logger
    private static final Logger logger = LoggerFactory.getLogger(WeightedResponseTimeRule.class);
    
    // holds the accumulated weight from index 0 to current index
    // for example, element at index 2 holds the sum of weight of servers from 0 to 2
    private volatile List<Double> accumulatedWeights = new ArrayList<Double>();

    // Random number generator used to draw a point in [0, total weight)
    private final Random random = new Random();

    // Timer that periodically recalculates the server weights
    protected Timer serverWeightTimer = null;

    // Flag that prevents concurrent weight calculations
    protected AtomicBoolean serverWeightAssignmentInProgress = new AtomicBoolean(false);

    // Name of the load balancer, used in the timer thread name and in log messages
    String name = "unknown";

    // Default constructor
    public WeightedResponseTimeRule() {
        super();
    }

    // Constructor that takes a load balancer and passes it to the parent constructor
    public WeightedResponseTimeRule(ILoadBalancer lb) {
        super(lb);
    }

    // Sets the load balancer on the parent, records its name if it is a
    // BaseLoadBalancer, and (re)starts the weight-update timer
    @Override
    public void setLoadBalancer(ILoadBalancer lb) {
        super.setLoadBalancer(lb);
        if (lb instanceof BaseLoadBalancer) {
            name = ((BaseLoadBalancer) lb).getName();
        }
        // Start the timer and compute the initial weights
        initialize(lb);
    }

    // Starts the periodic weight-update task for the given load balancer
    void initialize(ILoadBalancer lb) {        
        if (serverWeightTimer != null) {
            serverWeightTimer.cancel();
        }
        // Create a daemon timer that recalculates the weights at a fixed interval
        serverWeightTimer = new Timer("NFLoadBalancer-serverWeightTimer-"
                + name, true);
        serverWeightTimer.schedule(new DynamicServerWeightTask(), 0,
                serverWeightTaskTimerInterval);
        // do a initial run
        ServerWeight sw = new ServerWeight();
        sw.maintainWeights();

        // Cancel the timer when the JVM shuts down
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                logger
                        .info("Stopping NFLoadBalancer-serverWeightTimer-"
                                + name);
                serverWeightTimer.cancel();
            }
        }));
    }

    // Stops the weight-update timer
    public void shutdown() {
        if (serverWeightTimer != null) {
            logger.info("Stopping NFLoadBalancer-serverWeightTimer-" + name);
            serverWeightTimer.cancel();
        }
    }

    // Returns a read-only view of the accumulated weights
    List<Double> getAccumulatedWeights() {
        return Collections.unmodifiableList(accumulatedWeights);
    }

    // Chooses a server by weighted random selection over the accumulated weights
    @edu.umd.cs.findbugs.annotations.SuppressWarnings(value = "RCN_REDUNDANT_NULLCHECK_OF_NULL_VALUE")
    @Override
    public Server choose(ILoadBalancer lb, Object key) {
        // No load balancer, nothing to choose from
        if (lb == null) {
            return null;
        }
        Server server = null;

        while (server == null) {
            // get hold of the current reference in case it is changed from the other thread
            List<Double> currentWeights = accumulatedWeights;
            if (Thread.interrupted()) {
                return null;
            }
            // All known servers
            List<Server> allList = lb.getAllServers();

            // Total number of known servers
            int serverCount = allList.size();

            if (serverCount == 0) {
                return null;
            }

            int serverIndex = 0;

            // last one in the list is the sum of all weights
            double maxTotalWeight = currentWeights.size() == 0 ? 0 : currentWeights.get(currentWeights.size() - 1); 
            // No server has been hit yet and total weight is not initialized
            // fallback to use round robin
            // (the same fallback applies when the weight list and the server list differ in size)
            if (maxTotalWeight < 0.001d || serverCount != currentWeights.size()) {
                server =  super.choose(getLoadBalancer(), key);
                if(server == null) {
                    return server;
                }
            } else {
                // generate a random weight between 0 (inclusive) to maxTotalWeight (exclusive)
                double randomWeight = random.nextDouble() * maxTotalWeight;
                // pick the server index based on the randomIndex:
                // the first entry whose accumulated weight is >= randomWeight
                int n = 0;
                for (Double d : currentWeights) {
                    if (d >= randomWeight) {
                        serverIndex = n;
                        break;
                    } else {
                        n++;
                    }
                }

                server = allList.get(serverIndex);
            }

            if (server == null) {
                /* Transient. */
                Thread.yield();
                continue;
            }

            if (server.isAlive()) {
                return (server);
            }

            // Next.
            // The chosen server is not alive, so try again
            server = null;
        }
        return server;
    }

    // Timer task that periodically recalculates the server weights
    class DynamicServerWeightTask extends TimerTask {
        public void run() {
            ServerWeight serverWeight = new ServerWeight();
            try {
                // Recalculate the server weights
                serverWeight.maintainWeights();
            } catch (Exception e) {
                logger.error("Error running DynamicServerWeightTask for {}", name, e);
            }
        }
    }

    // Computes and maintains the weight of each server
    class ServerWeight {

        public void maintainWeights() {
            ILoadBalancer lb = getLoadBalancer();
            if (lb == null) {
                return;
            }
            // The AtomicBoolean guard ensures only one thread recalculates the weights at a time
            if (!serverWeightAssignmentInProgress.compareAndSet(false,  true))  {
                return; 
            }
            
            try {
                logger.info("Weight adjusting job started");
                AbstractLoadBalancer nlb = (AbstractLoadBalancer) lb;
                LoadBalancerStats stats = nlb.getLoadBalancerStats();
                if (stats == null) {
                    // no statistics, nothing to do
                    return;
                }
                // Sum the average response times of all servers
                double totalResponseTime = 0;
                // find maximal 95% response time
                for (Server server : nlb.getAllServers()) {
                    // this will automatically load the stats if not in cache
                    ServerStats ss = stats.getSingleServerStat(server);
                    totalResponseTime += ss.getResponseTimeAvg();
                }
                // weight for each server is (sum of responseTime of all servers - responseTime)
                // so that the longer the response time, the less the weight and the less likely to be chosen
                Double weightSoFar = 0.0;
                
                // create new list and hot swap the reference
                List<Double> finalWeights = new ArrayList<Double>();
                for (Server server : nlb.getAllServers()) {
                    ServerStats ss = stats.getSingleServerStat(server);
                    double weight = totalResponseTime - ss.getResponseTimeAvg();
                    weightSoFar += weight;
                    finalWeights.add(weightSoFar);   
                }
                // Publish the new accumulated weights
                setWeights(finalWeights);
            } catch (Exception e) {
                logger.error("Error calculating server weights", e);
            } finally {
                // Clear the flag so the next calculation can run
                serverWeightAssignmentInProgress.set(false);
            }

        }
    }

    // Swaps in the new accumulated weight list
    void setWeights(List<Double> weights) {
        this.accumulatedWeights = weights;
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
        super.initWithNiwsConfig(clientConfig);
        // Read the weight-update interval from the client configuration
        serverWeightTaskTimerInterval = clientConfig.get(WEIGHT_TASK_TIMER_INTERVAL_CONFIG_KEY, DEFAULT_TIMER_INTERVAL);
    }

}
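
To make the weight calculation in maintainWeights concrete, the following sketch applies the same arithmetic to three hypothetical average response times (100, 200, and 300 ms). The class ResponseTimeWeights and its methods are illustrative names only; they mirror the idea rather than reproduce Ribbon's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that mirrors the weighting idea; not Ribbon's actual class. */
class ResponseTimeWeights {
    /** Builds accumulated weights, where each weight is (sum of all averages - own average). */
    static List<Double> accumulate(double[] avgResponseTimes) {
        double total = 0;
        for (double t : avgResponseTimes) {
            total += t;
        }
        List<Double> accumulated = new ArrayList<>();
        double soFar = 0;
        for (double t : avgResponseTimes) {
            soFar += total - t;
            accumulated.add(soFar);
        }
        return accumulated;
    }

    /** Returns the index of the first accumulated weight that is >= point. */
    static int indexFor(List<Double> accumulated, double point) {
        for (int i = 0; i < accumulated.size(); i++) {
            if (accumulated.get(i) >= point) {
                return i;
            }
        }
        return 0; // same default as the listing, where serverIndex starts at 0
    }

    public static void main(String[] args) {
        List<Double> weights = accumulate(new double[] {100, 200, 300});
        System.out.println(weights);                // prints [500.0, 900.0, 1200.0]
        System.out.println(indexFor(weights, 650)); // prints 1
    }
}
```

The slowest server (300 ms) ends up with the narrowest interval, 900 to 1200, so a uniformly drawn point lands on it least often.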

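4. Response Time Based

Description: A response-time-based strategy favors the instances that respond fastest. It adjusts the distribution dynamically from the response time measured for each instance, so it depends on per-server statistics being collected.
Use case: Reducing request latency, especially when response times differ significantly between instances.
Ribbon implementation: WeightedResponseTimeRule weights servers by their average response time. BestAvailableRule, listed below, is a related rule that does not look at response time: it skips servers whose circuit breaker has tripped and picks the one with the fewest concurrent requests.
Code example: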

/**
 * A rule that skips servers with "tripped" circuit breaker and picks the
 * server with lowest concurrent requests.
 * <p>
 * This rule should typically work with {@link ServerListSubsetFilter} which puts a limit on the
 * servers that is visible to the rule. This ensure that it only needs to find the minimal
 * concurrent requests among a small number of servers. Also, each client will get a random list of
 * servers which avoids the problem that one server with the lowest concurrent requests is
 * chosen by a large number of clients and immediately gets overwhelmed.
 * Extends ClientConfigEnabledRoundRobinRule: the rule is round-robin based, with a preference for the server that has the fewest concurrent connections.
 *
 * @author awang
 */
public class BestAvailableRule extends ClientConfigEnabledRoundRobinRule {

    // Statistics kept by the load balancer, including each server's concurrent request count
    private LoadBalancerStats loadBalancerStats;

    @Override
    public Server choose(Object key) {
        // Selects the server instance that should handle the request
        if (loadBalancerStats == null) {
            // No statistics are available, so fall back to the parent's round-robin choice
            return super.choose(key);
        }
        // All servers known to the load balancer, both reachable and unreachable
        List<Server> serverList = getLoadBalancer().getAllServers();
        // Start from the largest int so that any real count compares lower
        int minimalConcurrentConnections = Integer.MAX_VALUE;
        // Current timestamp, used to evaluate circuit breaker state and active request counts
        long currentTime = System.currentTimeMillis();
        // The server selected so far
        Server chosen = null;
        // Find the server with the fewest concurrent connections whose circuit breaker has not tripped
        for (Server server : serverList) {
            // Per-server statistics: active request count, circuit breaker state, and so on
            ServerStats serverStats = loadBalancerStats.getSingleServerStat(server);
            // Consider only servers whose circuit breaker has not tripped
            if (!serverStats.isCircuitBreakerTripped(currentTime)) {
                // Number of requests currently in flight to this server
                int concurrentConnections = serverStats.getActiveRequestsCount(currentTime);
                // Keep this server if it has fewer in-flight requests than the best so far
                if (concurrentConnections < minimalConcurrentConnections) {
                    minimalConcurrentConnections = concurrentConnections;
                    chosen = server;
                }
            }
        }
        // No server qualified (every circuit breaker has tripped, or the list is empty):
        // fall back to the parent's round-robin choice
        if (chosen == null) {
            return super.choose(key);
        } else {
            return chosen;
        }
    }

    // Stores the load balancer's statistics when the load balancer is set
    @Override
    public void setLoadBalancer(ILoadBalancer lb) {
        super.setLoadBalancer(lb);
        // Statistics are only available from an AbstractLoadBalancer
        if (lb instanceof AbstractLoadBalancer) {
            loadBalancerStats = ((AbstractLoadBalancer) lb).getLoadBalancerStats();
        }
    }
}

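5. Least Connections

Description: Least connections sends each request to the instance that currently has the fewest active connections. It suits scenarios with heavy traffic.
Use case: Instances whose connection counts differ significantly, where you want to keep any single instance from being overloaded.
Ribbon implementation: Ribbon has no rule named after this strategy. BestAvailableRule, listed in the previous section, comes closest because it picks the server with the fewest concurrent requests. Other variants can be written as a custom rule.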
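
The selection step of a least-connections strategy can be sketched without any Ribbon types. In the example below, LeastActivePicker and the map of active request counts are hypothetical; in a real rule the counts would come from the load balancer's statistics.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Ribbon. */
class LeastActivePicker {
    /**
     * Returns the server with the smallest active request count,
     * or null when the map is empty. Ties go to the first entry seen.
     */
    static String pick(Map<String, Integer> activeRequests) {
        String chosen = null;
        int minimal = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : activeRequests.entrySet()) {
            if (entry.getValue() < minimal) {
                minimal = entry.getValue();
                chosen = entry.getKey();
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, Integer> active = new LinkedHashMap<>();
        active.put("a:8080", 7);
        active.put("b:8080", 2);
        active.put("c:8080", 5);
        System.out.println(pick(active)); // prints b:8080
    }
}
```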

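6. Weighted Response Time

Description: This strategy assigns each instance a weight derived from its response time, so that instances with shorter response times are more likely to be chosen. It suits services whose instances differ in capacity.
Use case: Servers with different response times, where requests should be distributed dynamically according to those times.
Ribbon implementation: the WeightedResponseTimeRule class, whose source is listed in section 3.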
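
The weight computation in maintainWeights and the selection in choose can be written out as follows. The symbols are introduced here for illustration: t_i is the average response time of server i, n is the number of servers, and the random point is assumed to be uniform on [0, W_n).

```latex
T = \sum_{j=1}^{n} t_j, \qquad
w_i = T - t_i, \qquad
W_k = \sum_{i=1}^{k} w_i, \qquad
W_n = (n-1)\,T

P(\text{server } i) = \frac{w_i}{W_n} = \frac{T - t_i}{(n-1)\,T}
```

For example, with average response times of 100, 200, and 300 ms: T = 600, w = (500, 400, 300), W = (500, 900, 1200), and the selection probabilities are 5/12, 1/3, and 1/4. With a single server W_1 = 0, which is below the 0.001 threshold checked in choose, so the rule falls back to round robin.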

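7. Zone Avoidance

Description: Zone avoidance uses zone information when selecting an instance, so that requests are not sent to zones that are currently unavailable or heavily loaded. It is commonly used for calls that span regions.
Use case: Load balancing across zones or regions.
Ribbon implementation: the ZoneAvoidanceRule class.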
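
The idea of steering traffic away from loaded zones can be illustrated with a small filter. This is a deliberate simplification under stated assumptions and not the algorithm in ZoneAvoidanceRule: ZoneFilter, the per-zone load map, and the threshold parameter are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; a simplification, not Ribbon's ZoneAvoidanceRule. */
class ZoneFilter {
    /**
     * Returns the zones whose average load per server is at or below the threshold.
     * If every zone exceeds it, all zones are returned so requests can still be served.
     */
    static Set<String> usableZones(Map<String, Double> loadPerServer, double threshold) {
        Set<String> usable = new LinkedHashSet<>();
        for (Map.Entry<String, Double> entry : loadPerServer.entrySet()) {
            if (entry.getValue() <= threshold) {
                usable.add(entry.getKey());
            }
        }
        if (usable.isEmpty()) {
            // Avoiding every zone would leave nothing to call
            usable.addAll(loadPerServer.keySet());
        }
        return usable;
    }

    public static void main(String[] args) {
        Map<String, Double> load = Map.of("zone-a", 0.1, "zone-b", 0.9);
        System.out.println(usableZones(load, 0.5)); // prints [zone-a]
    }
}
```

A server would then be picked, for instance by round robin, from the instances that belong to the returned zones.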

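Custom Load Balancing Rules

Ribbon lets you plug in your own load balancing strategy through the IRule interface. Besides choose, IRule also declares setLoadBalancer and getLoadBalancer, so in practice it is simpler to extend AbstractLoadBalancerRule and override choose.

public class CustomRule extends AbstractLoadBalancerRule {

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
        // No custom configuration in this example
    }

    @Override
    public Server choose(Object key) {
        // Custom logic goes here; as a placeholder, return the first reachable server
        List<Server> servers = getLoadBalancer().getReachableServers();
        return servers.isEmpty() ? null : servers.get(0);
    }
}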

Then register the custom rule as a bean in your Spring configuration:

@Bean
public IRule ribbonRule() {
    return new CustomRule(); // Use the custom load balancing rule
}

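Load Balancing Configuration Example
The following application.yml fragment configures Ribbon to use RoundRobinRule (round robin):

ribbon:
  NFLoadBalancerRuleClassName: com.netflix.loadbalancer.RoundRobinRule  # Load balancing rule
  ConnectTimeout: 3000  # Connection timeout, in milliseconds
  ReadTimeout: 3000  # Read timeout, in milliseconds
  MaxTotalConnections: 200  # Maximum total connections
  MaxConnectionsPerHost: 50  # Maximum connections per host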