Hadoop MapReduce Next Generation - Capacity Scheduler

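The Capacity Scheduler is a pluggable scheduler for Hadoop that allows multiple users to securely share a large cluster, so that their applications are allocated the resources they need within the constraints of their allocated capacities.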

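Overview

The CapacityScheduler is designed to run Hadoop applications on a shared, multi-tenant cluster in an operator-friendly manner while maximizing the throughput and utilization of the cluster.

Traditionally, each organization has had its own private set of machines with enough capacity to meet its needs at peak or near-peak load. This generally leads to poor average utilization and to the overhead of managing multiple independent clusters. Sharing a cluster between organizations is a cost-effective alternative, but organizations worry that others will take resources they depend on and that their SLAs will suffer.

The CapacityScheduler is designed to allow sharing a large cluster while giving each organization a capacity guarantee. The central idea is that the resources of the Hadoop cluster are shared among multiple organizations that collectively fund the cluster according to their computing needs. An added benefit is that an organization can use any excess capacity that others are not using, which gives organizations elasticity in a cost-effective manner.

Sharing clusters across organizations requires strong support for multi-tenancy, since each organization must be guaranteed capacity and safe-guards that keep the shared cluster unaffected by a single rogue application or user. The CapacityScheduler provides a stringent set of limits to ensure that a single application or user cannot consume a disproportionate amount of the cluster's resources. It also limits the initialized and pending applications of a single user or queue to ensure fairness and stability of the cluster.

The primary abstraction provided by the CapacityScheduler is the queue. Queues are set up by administrators to reflect how resources are allocated.

To provide further control and predictability over the sharing of resources, the CapacityScheduler supports hierarchical queues. Free resources are shared among the sub-queues of a queue before other queues are allowed to use them, which gives the applications of one organization priority when sharing that organization's free resources.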

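Features

The CapacityScheduler supports the following features:

Hierarchical Queues

A hierarchy of queues ensures that resources are shared among the sub-queues of an organization before other queues are allowed to use free resources, which provides more control and predictability.

Capacity Guarantees

Queues are allocated a fraction of the cluster's capacity, in the sense that a certain amount of resources is at their disposal. Every application is submitted to a specific queue and can use the resources allocated to that queue. Administrators can configure soft limits and optional hard limits on the capacity allocated to each queue.

Security

Each queue has strict ACLs (access control lists) that control which users can access it. There are also safe-guards to ensure that users cannot view or modify the applications of other users. Per-queue and system administrator roles are supported.

Elasticity

Free resources can be allocated to any queue beyond its configured capacity. That is, if the cluster has idle resources and some queues need more than their allocated share, the idle resources are assigned to those queues. This makes resources available to queues in a predictable and elastic manner, prevents artificial silos of resources in the cluster, and helps utilization.

Multi-tenancy

A comprehensive set of limits prevents a single application or user from monopolizing the resources of a queue or of the whole cluster. This keeps the cluster from being overwhelmed by one user, so that many users can share it.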

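Operability

Runtime Configuration: Some settings, such as queue capacities and ACLs (access control lists), can be changed at runtime by administrators, which minimizes disruption to users. A console is also provided for users and administrators to view the current allocation of resources to the various queues. Administrators can also add queues at runtime.

Drain applications: Administrators can stop a queue at runtime so that existing applications run to completion while no new applications can be submitted. If a queue is in the STOPPED state, new applications cannot be submitted to it or to any of its child queues, and the applications already in the queue continue to completion. Administrators can also start a stopped queue.

Resource-based Scheduling

Resource-intensive applications are supported: an application can request more resources than the default, so that applications with differing resource requirements can be accommodated. Currently, memory is the only resource requirement supported.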

Configuration

Setting up the ResourceManager to use the CapacityScheduler

To configure the ResourceManager to use the CapacityScheduler, set the following property in conf/yarn-site.xml:

Property	Value
`yarn.resourcemanager.scheduler.class`	`org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler`
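As a minimal sketch, this setting could be written in conf/yarn-site.xml as shown below. The scheduler class lives in the `scheduler.capacity` package; the `<configuration>` wrapper is the usual Hadoop configuration root, and any other properties your cluster needs are assumed to sit alongside this one.

```xml
<configuration>
  <!-- Tell the ResourceManager to use the CapacityScheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
</configuration>
```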

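Setting up queues

conf/capacity-scheduler.xml is the configuration file for the CapacityScheduler.

The CapacityScheduler has a pre-defined queue called root. All queues in the system are children of the root queue.

Further queues can be set up by configuring yarn.scheduler.capacity.root.queues with a comma-separated list of child queues.

The CapacityScheduler uses the concept of a queue path to configure the hierarchy of queues. The queue path is the full path of a queue in the hierarchy, starting at root, with "." (dot) as the delimiter.

The children of a given queue are defined with the configuration knob yarn.scheduler.capacity.<queue-path>.queues, where <queue-path> is the path of the parent queue.

Here is an example with three top-level child queues a, b and c, and some sub-queues for a and b: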

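<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>a,b,c</value>
  <description>The queues at this level (root is the root queue).
  </description>
</property>

<property>
  <name>yarn.scheduler.capacity.root.a.queues</name>
  <!-- value lost in the source; a1,a2 follows the upstream Apache Hadoop example -->
  <value>a1,a2</value>
  <description>The queues at this level (root is the root queue).
  </description>
</property>

<property>
  <name>yarn.scheduler.capacity.root.b.queues</name>
  <!-- value lost in the source; b1,b2,b3 follows the upstream Apache Hadoop example -->
  <value>b1,b2,b3</value>
  <description>The queues at this level (root is the root queue).
  </description>
</property>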

Queue Properties

Resource Allocation

Property	Description
`yarn.scheduler.capacity.<queue-path>.capacity`	Queue capacity in percentage (%). The sum of capacities for all queues, at each level, must be equal to 100. Applications in the queue may consume more resources than the queue's capacity if there are free resources, providing elasticity.
`yarn.scheduler.capacity.<queue-path>.maximum-capacity`	Maximum queue capacity in percentage (%). This limits the elasticity for applications in the queue.
`yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent`	Each queue enforces a limit on the percentage of resources allocated to a user at any given time, if there is demand for resources. The user limit can vary between a minimum and a maximum value. The minimum is set to this property value, and the maximum depends on the number of users who have submitted applications. For example, suppose the value of this property is 25. If two users have submitted applications to a queue, no single user can use more than 50% of the queue's resources. If a third user submits an application, no single user can use more than 33% of the queue's resources. With 4 or more users, no user can use more than 25% of the queue's resources. A value of 100 implies no user limits are imposed.
`yarn.scheduler.capacity.<queue-path>.user-limit-factor`	The multiple of the queue capacity which can be configured to allow a single user to acquire more resources. By default this is set to 1, which ensures that a single user can never take more than the queue's configured capacity, irrespective of how idle the cluster is.
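The properties in this table could be combined as in the following sketch of conf/capacity-scheduler.xml. The queue names a, b and c come from the example above; every numeric value is an illustrative assumption, not a recommendation.

```xml
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>a,b,c</value>
  </property>

  <!-- Capacities of the sibling queues a, b and c add up to 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.a.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.b.capacity</name>
    <value>30</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.c.capacity</name>
    <value>20</value>
  </property>

  <!-- Queue a may borrow idle resources, but its elasticity is capped at 70% -->
  <property>
    <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
    <value>70</value>
  </property>

  <!-- With 4 or more active users in queue a, each is limited to 25% of the queue -->
  <property>
    <name>yarn.scheduler.capacity.root.a.minimum-user-limit-percent</name>
    <value>25</value>
  </property>

  <!-- 1 (the default): a single user never exceeds the queue's configured capacity -->
  <property>
    <name>yarn.scheduler.capacity.root.a.user-limit-factor</name>
    <value>1</value>
  </property>
</configuration>
```

Only queue a carries the optional limits here; b and c would fall back to the scheduler's defaults for any property that is not set.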

Running and Pending Application Limits

The CapacityScheduler supports the following parameters to control running and pending applications:

Property	Description
`yarn.scheduler.capacity.maximum-applications`	Maximum number of applications in the system which can be concurrently active, both running and pending. Limits on each queue are directly proportional to their queue capacities.
`yarn.scheduler.capacity.maximum-am-resource-percent`	Maximum percent of resources in the cluster which can be used to run application masters; this controls the number of concurrently running applications.
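A sketch of these two limits in conf/capacity-scheduler.xml follows. The values shown are assumptions for illustration; note that `maximum-am-resource-percent` is written as a fraction, so 0.1 stands for 10%.

```xml
<configuration>
  <!-- Cluster-wide cap on applications that are running or pending -->
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
  </property>

  <!-- At most 10% of cluster resources may be used by application masters -->
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.1</value>
  </property>
</configuration>
```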

Queue Administration and Permissions

The CapacityScheduler supports the following parameters to administer queues:

Property	Description
`yarn.scheduler.capacity.<queue-path>.state`	The state of the queue. Can be one of `RUNNING` or `STOPPED`. If a queue is in `STOPPED` state, new applications cannot be submitted to itself or any of its child queues. Thus, if the root queue is `STOPPED` no applications can be submitted to the entire cluster. Existing applications continue to completion, thus the queue can be drained gracefully.
`yarn.scheduler.capacity.root.<queue-path>.acl_submit_jobs`	The ACL which controls who can submit jobs to the given queue. If the given user/group has necessary ACLs on the given queue or one of the parent queues in the hierarchy they can submit jobs.
`yarn.scheduler.capacity.root.<queue-path>.acl_administer_jobs`	The ACL which controls who can administer jobs on the given queue. If the given user/group has necessary ACLs on the given queue or one of the parent queues in the hierarchy they can administer jobs.
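The sketch below shows how a queue might be drained and how access might be restricted in conf/capacity-scheduler.xml. It uses the property names exactly as listed in the table above (newer Hadoop releases rename the ACL properties, so check the documentation for your version). The users alice and bob and the groups analysts and ops are hypothetical; the value follows the Hadoop ACL format of comma-separated users, a space, then comma-separated groups.

```xml
<configuration>
  <!-- Drain queue a: running applications finish, new submissions are rejected -->
  <property>
    <name>yarn.scheduler.capacity.root.a.state</name>
    <value>STOPPED</value>
  </property>

  <!-- Hypothetical users alice and bob, plus group analysts, may submit to queue b -->
  <property>
    <name>yarn.scheduler.capacity.root.b.acl_submit_jobs</name>
    <value>alice,bob analysts</value>
  </property>

  <!-- Only the hypothetical group ops may administer jobs in queue b (leading space = no individual users) -->
  <property>
    <name>yarn.scheduler.capacity.root.b.acl_administer_jobs</name>
    <value> ops</value>
  </property>
</configuration>
```

Because ACLs on a parent queue also apply to its children, a permissive ACL on root would override the restriction on queue b.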

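Reviewing the configuration of the CapacityScheduler

Once the installation and configuration are complete, you can review the setup from the web UI after starting the YARN cluster:

Start the YARN cluster in the normal manner.
Open the ResourceManager web UI.
The scheduler web page should show the resource usage of the individual queues.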

Changing queue configuration

Changing queue properties and adding new queues is very simple. Edit conf/capacity-scheduler.xml and then run yarn rmadmin -refreshQueues:

$ vi $HADOOP_CONF_DIR/capacity-scheduler.xml

$ $YARN_HOME/bin/yarn rmadmin -refreshQueues

Note: Queues cannot be deleted; only the addition of new queues is supported. The updated queue configuration must be a valid one, i.e. the queue capacities at each level should add up to 100%.