kube-scheduler source code walkthrough


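kube-scheduler is one of the simpler services in k8s. It watches the API server for newly created Pods and, for each one, picks a suitable Node out of the many available to run it.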

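Selection happens in two phases: a predicate (filtering) phase and a priority (scoring) phase.

  • Predicate phase: filter out every Node that does not meet the Pod's requirements. Any Node that passes this phase can, in theory, run the target Pod.
  • Priority phase: score the Nodes that passed the previous step and pick the one with the highest score.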
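
The two phases can be sketched in a few lines of Go. This is a minimal illustration, not the scheduler's actual code: the Node and Pod types, the single CPU-based predicate, and the "most free CPU wins" scoring rule are all assumptions made up for the example.

```go
package main

import "fmt"

// Node is a simplified stand-in for a cluster node (hypothetical type).
type Node struct {
	Name    string
	FreeCPU int // free CPU in millicores
}

// Pod is a simplified stand-in for a pod (hypothetical type).
type Pod struct {
	Name   string
	CPUReq int // requested CPU in millicores
}

// filter is the predicate phase: keep only nodes that can fit the pod.
func filter(pod Pod, nodes []Node) []Node {
	var feasible []Node
	for _, n := range nodes {
		if n.FreeCPU >= pod.CPUReq {
			feasible = append(feasible, n)
		}
	}
	return feasible
}

// score is the priority phase: more CPU left after placement scores higher.
// The rule is an assumption for this sketch only.
func score(pod Pod, n Node) int {
	return n.FreeCPU - pod.CPUReq
}

// pickNode runs both phases and returns the best node, or false if none fits.
func pickNode(pod Pod, nodes []Node) (Node, bool) {
	feasible := filter(pod, nodes)
	if len(feasible) == 0 {
		return Node{}, false
	}
	best := feasible[0]
	for _, n := range feasible[1:] {
		if score(pod, n) > score(pod, best) {
			best = n
		}
	}
	return best, true
}

func main() {
	nodes := []Node{{"node-a", 500}, {"node-b", 2000}, {"node-c", 1200}}
	best, ok := pickNode(Pod{"web", 1000}, nodes)
	fmt.Println(best.Name, ok) // prints "node-b true"
}
```

Here node-a is dropped by the predicate because it cannot fit the request, and node-b beats node-c on score.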

This article briefly follows the whole execution flow of kube-scheduler.

Entry point:

// cmd/kube-scheduler/app/server.go:62
// Likewise built on the cobra package.
	cmd := &cobra.Command{
		Use: "kube-scheduler",
		Long: `The Kubernetes scheduler is a policy-rich, topology-aware,
workload-specific function that significantly impacts availability, performance,
and capacity. The scheduler needs to take into account individual and collective
resource requirements, quality of service requirements, hardware/software/policy
constraints, affinity and anti-affinity specifications, data locality, inter-workload
interference, deadlines, and so on. Workload-specific requirements will be exposed
through the API as necessary.`,
		Run: func(cmd *cobra.Command, args []string) {
			if err := runCommand(cmd, args, opts); err != nil {
				fmt.Fprintf(os.Stderr, "%v\n", err)
				os.Exit(1)
			}
		},
	}


// runCommand runs the scheduler.
func runCommand(cmd *cobra.Command, args []string, opts *options.Options) error {
	// Build the configuration needed for scheduling; it mainly includes kubeclient, eventclient, ...
	cc, err := opts.Config()
	if err != nil {
		return err
	}

	stopCh := make(chan struct{})

	// Trim the scheduling algorithms according to the currently enabled feature gates.
	// Apply algorithms based on feature gates.
	// TODO: make configurable?
	algorithmprovider.ApplyFeatureGates()

	// Start the scheduling service.
	return Run(cc, stopCh)
}

Next, look at the Run function:

// Run executes the scheduler based on the given configuration. It only returns on error or when stopCh is closed.
func Run(cc schedulerserverconfig.CompletedConfig, stopCh <-chan struct{}) error {
	// Create the scheduler.
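	// Construct a Scheduler object.
	// This builds the predicate and priority policy lists and registers handler functions on each informer.
	sched, err := scheduler.New(cc.Client, ...) // remaining arguments elided in this excerpt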

	// Start all informers.
	// Start each informer to watch for the relevant changes.
	go cc.PodInformer.Informer().Run(stopCh)
	cc.InformerFactory.Start(stopCh)

	// Wait for all caches to sync before scheduling.
	cc.InformerFactory.WaitForCacheSync(stopCh)

	// Prepare a reusable runCommand function.
	run := func(ctx context.Context) {
		sched.Run()
		<-ctx.Done()
	}

	ctx, cancel := context.WithCancel(context.TODO()) // TODO once Run() accepts a context, it should be used here
	defer cancel()

	go func() {
		select {
		case <-stopCh:
			cancel()
		case <-ctx.Done():
		}
	}()


	// Leader election is disabled, so runCommand inline until done.
	run(ctx)
	return fmt.Errorf("finished without leader elect")
}
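
The goroutine in the middle of Run bridges the older stop-channel convention to a context: closing stopCh cancels ctx, which in turn unblocks run(ctx). The standard-library sketch below isolates that pattern; contextFromStopCh is a hypothetical helper name, not something defined in the scheduler.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// contextFromStopCh returns a context that is cancelled when stopCh is closed.
// The returned cancel func should always be called so the helper goroutine exits.
// (Hypothetical helper; Run inlines the same select instead.)
func contextFromStopCh(stopCh <-chan struct{}) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	go func() {
		select {
		case <-stopCh:
			cancel() // shutdown requested: propagate to the context
		case <-ctx.Done():
			// context was cancelled elsewhere: nothing left to do
		}
	}()
	return ctx, cancel
}

func main() {
	stopCh := make(chan struct{})
	ctx, cancel := contextFromStopCh(stopCh)
	defer cancel()

	// Simulate a shutdown signal arriving shortly after startup.
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(stopCh)
	}()

	<-ctx.Done() // blocks the way run(ctx) does, until stopCh is closed
	fmt.Println("stopped:", ctx.Err())
}
```

The second select case matters: without it, the goroutine would leak whenever the context is cancelled before stopCh is ever closed.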

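Only two functions really matter here.

One is scheduler.New(), which builds a Scheduler object and wires up the handlers for each informer.

The other is run(ctx), which starts the scheduling service; run(ctx) eventually calls scheduleOne.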
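
How run(ctx) reaches scheduleOne is not shown in the excerpts. In scheduler releases from around this time, sched.Run appears to hand scheduleOne to a loop helper that keeps calling it until a stop channel is closed; treat that detail as an assumption about the exact version. The standard-library sketch below shows only the shape of such a loop; until and runLoop are hypothetical names, and the channel of pod names stands in for the real scheduling queue.

```go
package main

import "fmt"

// until calls f repeatedly until stopCh is closed (hypothetical helper).
func until(f func(), stopCh <-chan struct{}) {
	for {
		select {
		case <-stopCh:
			return
		default:
		}
		f()
	}
}

// runLoop feeds the given pods through a scheduleOne-style callback, one pod
// per iteration, and stops the loop once every pod has been handled.
func runLoop(pods []string) []string {
	if len(pods) == 0 {
		return nil // nothing to schedule; avoid blocking on an empty queue
	}
	queue := make(chan string, len(pods))
	for _, p := range pods {
		queue <- p
	}

	stopCh := make(chan struct{})
	var handled []string

	// scheduleOne handles exactly one pod per call, blocking until one is available.
	scheduleOne := func() {
		pod := <-queue
		handled = append(handled, pod)
		if len(handled) == len(pods) {
			close(stopCh) // end the loop for this demo
		}
	}

	until(scheduleOne, stopCh)
	return handled
}

func main() {
	fmt.Println(runLoop([]string{"pod-a", "pod-b", "pod-c"}))
}
```

The point of the shape is that scheduling is serialized: each iteration handles a single pod, and the next pod is not picked up until the callback returns.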

Let's look at scheduleOne first:

// scheduleOne does the entire scheduling workflow for a single pod.  It is serialized on the scheduling algorithm's host fitting.
func (sched *Scheduler) scheduleOne() {
	// Take one Pod that needs scheduling from the pending queue.
	pod := sched.config.NextPod()
	// pod could be nil when schedulerQueue is closed
	if pod == nil {
		return
	}

	// Use the scheduling algorithm to choose a suitable Node to run the Pod.
	scheduleResult, err := sched.schedule(pod)

	assumedPod := pod.DeepCopy()
	
	// Based on scheduleResult, bind the pod to the chosen Node.
}
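
The excerpt stops at the copy named assumedPod and a note that the pod is then bound to the Node. One common way to structure that tail is "assume, then bind": record the placement optimistically in a local cache, perform the bind in the background, and roll the assumption back if the bind fails. The sketch below illustrates that idea only; the cache type, the binder signature, and every function name are hypothetical, and the real scheduler's cache and bind call are more involved.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// cache records which node each pod is assumed to run on (hypothetical type).
type cache struct {
	mu      sync.Mutex
	assumed map[string]string // pod name -> node name
}

func (c *cache) assume(pod, node string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.assumed[pod] = node
}

func (c *cache) forget(pod string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.assumed, pod)
}

func (c *cache) nodeOf(pod string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	node, ok := c.assumed[pod]
	return node, ok
}

// binder persists the decision; a real one would call the API server.
type binder func(pod, node string) error

// acceptBind and rejectBind are stand-in binders for the demo.
func acceptBind(pod, node string) error { return nil }
func rejectBind(pod, node string) error { return errors.New("bind rejected") }

// assumeAndBind records the placement first, then binds in the background and
// forgets the assumption if binding fails. The channel yields the bind result.
func assumeAndBind(c *cache, bind binder, pod, node string) <-chan error {
	c.assume(pod, node)
	done := make(chan error, 1)
	go func() {
		err := bind(pod, node)
		if err != nil {
			c.forget(pod)
		}
		done <- err
	}()
	return done
}

func main() {
	c := &cache{assumed: map[string]string{}}
	fmt.Println(<-assumeAndBind(c, acceptBind, "web", "node-b"))
	fmt.Println(<-assumeAndBind(c, rejectBind, "db", "node-c"))
	_, stillAssumed := c.nodeOf("db")
	fmt.Println(stillAssumed)
}
```

Assuming before binding lets the next scheduling decision see the node's reduced capacity without waiting for the bind round trip.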