http://www.verydemo.com/demo_c288_i57027.html
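Before reading the source, I had gone through the Filter Scheduler page in the official documentation. I understood what filtering was about, but I could never quite work out what the weight meant. Now that I have read the source, it turns out to be fairly simple.
The scheduler's job is to find a suitable host for an instance when the instance is created. This takes two steps: first filtering, which narrows all hosts down to those that satisfy the instance's requirements, and then weighing, which picks the most suitable host from the ones that passed. What counts as "most suitable" is really up to you: through the configuration file you can make the scheduler filter and weigh according to your own requirements, as the figure below shows:
This logic lives mainly in the _schedule() method of the FilterScheduler class: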
    def _schedule(self, context, topic, request_spec, filter_properties, *args,
            **kwargs):
        """Returns a list of hosts that meet the required specs,
        ordered by their fitness.
        """
        elevated = context.elevated()
        if topic != "compute":
            msg = _("Scheduler only understands Compute nodes (for now)")
            raise NotImplementedError(msg)
        instance_properties = request_spec['instance_properties']
        instance_type = request_spec.get("instance_type", None)
        cost_functions = self.get_cost_functions()
        config_options = self._get_configuration_options()
        # check retry policy:
        self._populate_retry(filter_properties, instance_properties)
        filter_properties.update({'context': context,
                                  'request_spec': request_spec,
                                  'config_options': config_options,
                                  'instance_type': instance_type})
        self.populate_filter_properties(request_spec,
                                        filter_properties)
        # Find our local list of acceptable hosts by repeatedly
        # filtering and weighing our options. Each time we choose a
        # host, we virtually consume resources on it so subsequent
        # selections can adjust accordingly.
        # unfiltered_hosts_dict is {host : ZoneManager.HostInfo()}
        unfiltered_hosts_dict = self.host_manager.get_all_host_states(
                elevated, topic)
        # Note: remember, we are using an iterator here. So only
        # traverse this list once. This can bite you if the hosts
        # are being scanned in a filter or weighing function.
        hosts = unfiltered_hosts_dict.itervalues()  # HostState objects
        num_instances = request_spec.get('num_instances', 1)
        selected_hosts = []
        for num in xrange(num_instances):
            # Filter local hosts based on requirements ...
            # Get the hosts that currently qualify: a host counts as
            # usable only if it passes every filter.
            hosts = self.host_manager.filter_hosts(hosts,
                    filter_properties)
            if not hosts:
                # Can't get any more locally.
                break
            LOG.debug(_("Filtered %(hosts)s") % locals())
            # weighted_host = WeightedHost() ... the best
            # host for the job.
            # TODO(comstud): filter_properties will also be used for
            # weighing and I plan fold weighing into the host manager
            # in a future patch. I'll address the naming of this
            # variable at that time.
            weighted_host = least_cost.weighted_sum(cost_functions,
                    hosts, filter_properties)
            LOG.debug(_("Weighted %(weighted_host)s") % locals())
            selected_hosts.append(weighted_host)
            # Now consume the resources so the filter/weights
            # will change for the next instance.
            weighted_host.host_state.consume_from_instance(
                    instance_properties)
        selected_hosts.sort(key=operator.attrgetter('weight'))
        return selected_hosts[:num_instances]
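To make the loop in _schedule() easier to follow, here is a minimal sketch of the same filter, weigh, consume cycle, stripped of everything Nova-specific. ToyHost, toy_schedule() and the RAM-only rule are my own illustrative assumptions, not Nova classes or functions, and the sketch is written for Python 3 rather than the Python 2 of the listing above.

```python
from dataclasses import dataclass


@dataclass
class ToyHost:
    """Hypothetical stand-in for a HostState object (not Nova's class)."""
    name: str
    free_ram_mb: int

    def consume(self, ram_mb):
        # Pretend the instance has already been placed on this host.
        self.free_ram_mb -= ram_mb


def toy_schedule(hosts, ram_mb, num_instances):
    """Pick one host per requested instance: filter, weigh, consume."""
    selected = []
    for _ in range(num_instances):
        # Step 1, filter: keep only hosts that can still fit the instance.
        hosts = [h for h in hosts if h.free_ram_mb >= ram_mb]
        if not hosts:
            break
        # Step 2, weigh: the lowest score wins. With a weight of -1.0 on
        # free RAM, the host with the most free RAM has the lowest score.
        best = min(hosts, key=lambda h: -1.0 * h.free_ram_mb)
        selected.append(best.name)
        # Step 3, consume: the next iteration sees the reduced free RAM.
        best.consume(ram_mb)
    return selected


hosts = [ToyHost("node1", 4096), ToyHost("node2", 3072), ToyHost("node3", 512)]
print(toy_schedule(hosts, ram_mb=2048, num_instances=3))
# prints ['node1', 'node2', 'node1']
```

The second instance lands on node2 rather than node1 because the first placement already consumed 2048 MB on node1. That is the point of the consume step: each choice changes what the filters and weights see for the next instance.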
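The classes call each other as shown in the class diagram below:
_schedule() mainly relies on two functions: filter_hosts() in HostManager and weighted_sum() in the least_cost module, which implement filtering and weighing respectively.
Let's look at filter_hosts() first: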
    # Filter each host in hosts and return the ones that pass.
    def filter_hosts(self, hosts, filter_properties, filters=None):
        """Filter hosts and return only ones passing all filters"""
        filtered_hosts = []
        # Get the host_passes() method of each configured filter class.
        filter_fns = self._choose_host_filters(filters)
        for host in hosts:
            if host.passes_filters(filter_fns, filter_properties):
                filtered_hosts.append(host)
        return filtered_hosts
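While looping over the hosts, filter_hosts() calls each host's passes_filters() method, which applies the filter functions taken from the filter classes to decide, based on the given filter conditions (filter_properties), whether the host qualifies: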
    def passes_filters(self, filter_fns, filter_properties):
        """Return whether or not this host passes filters."""
        # Hosts that must be ignored.
        if self.host in filter_properties.get('ignore_hosts', []):
            LOG.debug(_('Host filter fails for ignored host %(host)s'),
                      {'host': self.host})
            return False
        # Hosts the request is forced onto.
        force_hosts = filter_properties.get('force_hosts', [])
        if force_hosts:
            if not self.host in force_hosts:
                LOG.debug(_('Host filter fails for non-forced host %(host)s'),
                          {'host': self.host})
            return self.host in force_hosts
        # Call host_passes() of each filter class to check this host
        # against filter_properties. Return True only if every filter
        # passes; a single failure returns False.
        for filter_fn in filter_fns:
            if not filter_fn(self, filter_properties):
                LOG.debug(_('Host filter function %(func)s failed for '
                            '%(host)s'),
                          {'func': repr(filter_fn),
                           'host': self.host})
                return False
        LOG.debug(_('Host filter passes for %(host)s'), {'host': self.host})
        return True
The hosts that qualify are collected in the filtered_hosts list and returned, which completes the first step.
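The filtering step can be sketched in a few lines. Everything below is a simplified assumption for illustration: ToyHostState, ram_filter(), disk_filter() and the keys memory_mb and root_gb are names I made up, not Nova's real filters or request fields. What it does mirror from the code above is the contract: each filter is a function taking the host and filter_properties, and a host passes only if every filter returns True.

```python
class ToyHostState:
    """Hypothetical host record; the field names are illustrative only."""

    def __init__(self, host, free_ram_mb, free_disk_gb):
        self.host = host
        self.free_ram_mb = free_ram_mb
        self.free_disk_gb = free_disk_gb

    def passes_filters(self, filter_fns, filter_properties):
        # all() stops at the first filter that returns False, just like
        # the early "return False" in the loop of passes_filters() above.
        return all(fn(self, filter_properties) for fn in filter_fns)


def ram_filter(host_state, filter_properties):
    """Pass only hosts with enough free RAM for the request."""
    return host_state.free_ram_mb >= filter_properties["memory_mb"]


def disk_filter(host_state, filter_properties):
    """Pass only hosts with enough free disk for the request."""
    return host_state.free_disk_gb >= filter_properties["root_gb"]


def filter_hosts(hosts, filter_fns, filter_properties):
    """Return the hosts that pass every filter function."""
    return [h for h in hosts if h.passes_filters(filter_fns, filter_properties)]


hosts = [
    ToyHostState("node1", free_ram_mb=8192, free_disk_gb=10),
    ToyHostState("node2", free_ram_mb=1024, free_disk_gb=500),
    ToyHostState("node3", free_ram_mb=4096, free_disk_gb=100),
]
request = {"memory_mb": 2048, "root_gb": 20}
passed = filter_hosts(hosts, [ram_filter, disk_filter], request)
print([h.host for h in passed])
# prints ['node3']
```

node1 fails on disk and node2 fails on RAM, so only node3 survives. Adding a new constraint is just a matter of adding one more function to the list.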
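Next comes the second step, weighing. How should we understand this weight? Judging by what it does, it selects the "most suitable" host among those that passed filtering, and it makes that choice by scoring. When we judge how good a student is, we usually look at the overall result: add up the marks in every subject and go by the total, or take the average. The same idea applies here. Deciding which host is "most suitable" also takes several aspects into account, such as the host's free memory, free disk space, and vCPU usage. The problem is that these quantities are of different kinds and have different units, so they cannot be added directly. What they have in common is that each varies linearly, so each can be multiplied by a weight to bring them to the same order of magnitude, after which they can be summed. The resulting "total score" then reflects the overall state of the host. Here is the code of weighted_sum():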
    def weighted_sum(weighted_fns, host_states, weighing_properties):
        # With weight = -1.0, taking the minimum score means picking
        # the host with the most free memory.
        min_score, best_host = None, None
        for host_state in host_states:
            score = sum(weight * fn(host_state, weighing_properties)
                        for weight, fn in weighted_fns)
            if min_score is None or score < min_score:
                min_score, best_host = score, host_state
        return WeightedHost(min_score, host_state=best_host)
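A small worked example shows how the weights steer the result. This is only a sketch under my own assumptions: the hosts are plain dictionaries, free_ram_cost() and free_disk_cost() are hypothetical cost functions, and the function returns a (score, host) tuple instead of a WeightedHost object. The scoring loop itself follows weighted_sum() above.

```python
def free_ram_cost(host, props):
    """Hypothetical cost function: free RAM in MB."""
    return host["free_ram_mb"]


def free_disk_cost(host, props):
    """Hypothetical cost function: free disk in GB."""
    return host["free_disk_gb"]


def weighted_sum(weighted_fns, hosts, props):
    """Return (score, host) for the host with the lowest weighted score."""
    min_score, best_host = None, None
    for host in hosts:
        score = sum(weight * fn(host, props) for weight, fn in weighted_fns)
        if min_score is None or score < min_score:
            min_score, best_host = score, host
    return min_score, best_host


hosts = [
    {"name": "node1", "free_ram_mb": 2048, "free_disk_gb": 400},
    {"name": "node2", "free_ram_mb": 8192, "free_disk_gb": 50},
]

# RAM only, weight -1.0: the host with the most free RAM scores lowest.
score, host = weighted_sum([(-1.0, free_ram_cost)], hosts, {})
print(score, host["name"])  # prints -8192.0 node2

# Add disk with weight -20.0 so that gigabytes of disk carry weight
# comparable to megabytes of RAM:
#   node1: -1.0 * 2048 + -20.0 * 400 = -10048.0
#   node2: -1.0 * 8192 + -20.0 * 50  = -9192.0
score, host = weighted_sum(
    [(-1.0, free_ram_cost), (-20.0, free_disk_cost)], hosts, {})
print(score, host["name"])  # prints -10048.0 node1
```

With RAM alone, node2 wins. Once disk is weighted in, node1's large free disk outweighs its smaller RAM and it becomes the best host. The sign of a weight decides the direction (negative favors hosts with more of the resource), and its magnitude decides how much that resource counts against the others.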
With that, the figure from the official documentation should be much easier to understand: