For source code analysis of the other ceph-csi components, see this post: ceph-csi suite analysis table of contents
Overview
This post analyzes the source code of the external-provisioner component.
In the external-provisioner component, rbd and cephfs share the same processing logic; the same code handles both rbd storage and cephfs storage.
The source code analysis of the external-provisioner component is split into three parts:
(1) analysis of the main processing logic;
(2) analysis of the main method and leader election;
(3) analysis of the component startup parameters.
Based on tag v1.6.0:
https://github.com/kubernetes-csi/external-provisioner/releases/tag/v1.6.0
What external-provisioner does
(1) When a PVC is created, external-provisioner takes part in creating the storage resource and the PV object. After the external-provisioner component observes the PVC creation event, it assembles the request and calls the CreateVolume method of the ceph-csi component to create the storage; once the storage is created successfully, it creates the PV object (see the sketch after this list);
(2) When a PVC is deleted, external-provisioner takes part in deleting the storage resource and the PV object. When the PVC is deleted, the pv controller updates the status of its bound PV object from Bound to Released; after external-provisioner observes the PV update event, it calls the DeleteVolume method of ceph-csi to delete the storage and then deletes the PV object.
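To make the create path concrete, here is a minimal sketch of what the CreateVolume gRPC call that external-provisioner eventually issues to the ceph-csi driver roughly looks like, written directly against the CSI spec's generated Go client. This is an illustration, not the provisioner's actual code (which is walked through below); the socket path, the createVolume helper name, the capacity and the StorageClass parameters are all assumptions.

// A hedged sketch of the CreateVolume RPC sent to a CSI driver such as ceph-csi.
// Assumed: socket path, volume size, and StorageClass parameters.
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
)

// createVolume is a simplified stand-in for what external-provisioner does after it
// sees a new PVC: build a CreateVolumeRequest from the PVC/StorageClass and call the
// driver's controller service over gRPC.
func createVolume(conn *grpc.ClientConn, pvcUID string, sizeBytes int64, scParams map[string]string) error {
	client := csi.NewControllerClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	req := &csi.CreateVolumeRequest{
		// external-provisioner derives the volume name from a prefix plus the PVC UID.
		Name:          "pvc-" + pvcUID,
		CapacityRange: &csi.CapacityRange{RequiredBytes: sizeBytes},
		// Parameters come from the StorageClass (e.g. clusterID, pool for ceph-csi).
		Parameters: scParams,
		VolumeCapabilities: []*csi.VolumeCapability{{
			AccessType: &csi.VolumeCapability_Mount{Mount: &csi.VolumeCapability_MountVolume{}},
			AccessMode: &csi.VolumeCapability_AccessMode{Mode: csi.VolumeCapability_AccessMode_SINGLE_NODE_WRITER},
		}},
	}

	resp, err := client.CreateVolume(ctx, req)
	if err != nil {
		return err
	}
	fmt.Printf("created volume %s\n", resp.GetVolume().GetVolumeId())
	return nil
}

func main() {
	// The driver's controller endpoint; the socket path here is an assumption.
	conn, err := grpc.Dial("unix:///csi/csi-provisioner.sock", grpc.WithInsecure())
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	params := map[string]string{"clusterID": "ceph", "pool": "rbd"} // assumed StorageClass parameters
	if err := createVolume(conn, "example-uid", 1<<30, params); err != nil {
		panic(err)
	}
}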
external-provisioner source code analysis (1): main processing logic
In the external-provisioner component, the main business logic lives in the provisionController, so the analysis starts from the provisionController.
The provisionController is responsible for two queues. It processes the claimQueue (i.e. the add and update events of PVC objects), calling the CreateVolume method of the ceph-csi component to create storage when needed and then creating the PV object; and it processes the volumeQueue (i.e. the add and update events of PV objects), deciding from the PV's status and reclaim policy whether to call the DeleteVolume method of the ceph-csi component to delete the storage and then delete the PV object.
The claimQueue and the volumeQueue are analyzed in the sections below.
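Before diving into the controller code, the following is a small, self-contained sketch of the general pattern the provisionController is built on: informer event handlers turn PVC add/update events into namespace/name keys and push them onto a rate-limited claimQueue, and the volumeQueue is wired up the same way for PVs. This is an illustrative assumption of the client-go pattern, not the sig-storage-lib code itself; a fake clientset is used only to keep the sketch runnable without a cluster.

// A hedged sketch of the informer -> workqueue wiring behind claimQueue.
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes/fake"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

func main() {
	// A fake clientset keeps the sketch runnable without a cluster (assumption).
	client := fake.NewSimpleClientset()

	claimQueue := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "claims")
	defer claimQueue.ShutDown()

	factory := informers.NewSharedInformerFactory(client, 15*time.Minute)
	claimInformer := factory.Core().V1().PersistentVolumeClaims().Informer()

	// Only the object's key is queued; workers later re-read the PVC from the informer cache.
	enqueue := func(obj interface{}) {
		if key, err := cache.MetaNamespaceKeyFunc(obj); err == nil {
			claimQueue.Add(key)
		}
	}
	claimInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    enqueue,
		UpdateFunc: func(_, newObj interface{}) { enqueue(newObj) },
	})

	stopCh := make(chan struct{})
	defer close(stopCh)
	go claimInformer.Run(stopCh)
	cache.WaitForCacheSync(stopCh, claimInformer.HasSynced)

	fmt.Println("claimQueue length:", claimQueue.Len())
}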
The main method calls provisionController.Run(wait.NeverStop), so provisionController.Run() is taken as the entry point of the analysis.
provisionController.Run()
provisionController.Run() defines a run function and executes it. The parts to focus on are ctrl.runClaimWorker and ctrl.runVolumeWorker inside run, which carry the main processing logic.
// Run starts all of this controller's control loops
func (ctrl *ProvisionController) Run(_ <-chan struct{}) {
	// TODO: arg is as of 1.12 unused. Nothing can ever be cancelled. Should
	// accept a context instead and use it instead of context.TODO(), but would
	// break API. Not urgent: realistically, users are simply passing in
	// wait.NeverStop() anyway.
	run := func(ctx context.Context) {
		glog.Infof("Starting provisioner controller %s!", ctrl.component)
		defer utilruntime.HandleCrash()
		defer ctrl.claimQueue.ShutDown()
		defer ctrl.volumeQueue.ShutDown()

		ctrl.hasRunLock.Lock()
		ctrl.hasRun = true
		ctrl.hasRunLock.Unlock()
		if ctrl.metricsPort > 0 {
			prometheus.MustRegister([]prometheus.Collector{
				metrics.PersistentVolumeClaimProvisionTotal,
				metrics.PersistentVolumeClaimProvisionFailedTotal,
				metrics.PersistentVolumeClaimProvisionDurationSeconds,
				metrics.PersistentVolumeDeleteTotal,
				metrics.PersistentVolumeDeleteFailedTotal,
				metrics.PersistentVolumeDeleteDurationSeconds,
			}...)
			http.Handle(ctrl.metricsPath, promhttp.Handler())
			address := net.JoinHostPort(ctrl.metricsAddress, strconv.FormatInt(int64(ctrl.metricsPort), 10))
			glog.Infof("Starting metrics server at %s\n", address)
			go wait.Forever(func() {
				err := http.ListenAndServe(address, nil)
				if err != nil {
					glog.Errorf("Failed to listen on %s: %v", address, err)
				}
			}, 5*time.Second)
		}

		// If a external SharedInformer has been passed in, this controller
		// should not call Run again
		if !ctrl.customClaimInformer {
			go ctrl.claimInformer.Run(ctx.Done())
		}
		if !ctrl.customVolumeInformer {
			go ctrl.volumeInformer.Run(ctx.Done())
		}
		if !ctrl.customClassInformer {
			go ctrl.classInformer.Run(ctx.Done())
		}

		if !cache.WaitForCacheSync(ctx.Done(), ctrl.claimInformer.HasSynced, ctrl.volumeInformer.HasSynced, ctrl.classInformer.HasSynced) {
			return
		}

		// start ctrl.threadiness goroutines for each of the two workers
		for i := 0; i < ctrl.threadiness; i++ {
			go wait.Until(ctrl.runClaimWorker, time.Second, context.TODO().Done())
			go wait.Until(ctrl.runVolumeWorker, time.Second, context.TODO().Done())
		}

		glog.Infof("Started provisioner controller %s!", ctrl.component)

		select {}
	}

	go ctrl.volumeStore.Run(context.TODO(), DefaultThreadiness)

	// leader election
	if ctrl.leaderElection {
		rl, err := resourcelock.New("endpoints",
			ctrl.leaderElectionNamespace,
			strings.Replace(ctrl.provisionerName, "/", "-", -1),
			ctrl.client.CoreV1(),
			nil,
			resourcelock.ResourceLockConfig{
				Identity:      ctrl.id,
				EventRecorder: ctrl.eventRecorder,
			})
		if err != nil {
			glog.Fatalf("Error creating lock: %v", err)
		}

		leaderelection.RunOrDie(context.TODO(), leaderelection.LeaderElectionConfig{
			Lock:          rl,
			LeaseDuration: ctrl.leaseDuration,
			RenewDeadline: ctrl.renewDeadline,
			RetryPeriod:   ctrl.retryPeriod,
			Callbacks: leaderelection.LeaderCallbacks{
				OnStartedLeading: run,
				OnStoppedLeading: func() {
					glog.Fatalf("leaderelection lost")
				},
			},
		})
		panic("unreachable")
	} else {
		run(context.TODO())
	}
}
Next, ctrl.runClaimWorker and ctrl.runVolumeWorker from the run function are analyzed in turn.
1. ctrl.runClaimWorker
According to threadiness, the corresponding number of goroutines are started, each running ctrl.runClaimWorker.
It is mainly responsible for consuming the claimQueue, handling the add and update events of PVC objects, calling the CreateVolume method of the csi component to create storage when needed, and creating the PV object.
for i := 0; i < ctrl.threadiness; i++ {
	go wait.Until(ctrl.runClaimWorker, time.Second, context.TODO().Done())
	go wait.Until(ctrl.runVolumeWorker, time.Second, context.TODO().Done())
}
// vendor/sigs.k8s.io/sig-storage-lib-external-provisioner/v5/controller/controller.go
func (ctrl *ProvisionController) runClaimWorker() {
	// loop forever calling processNextClaimWorkItem
	for ctrl.processNextClaimWorkItem() {
	}
}
Call chain: main() --> provisionController.Run() --> ctrl.runClaimWorker() --> ctrl.processNextClaimWorkItem() --> ctrl.syncClaimHandler() --> ctrl.syncClaim() --> ctrl.provisionClaimOperation() --> ctrl.provisioner.Provision()
1.1 ctrl.processNextClaimWorkItem
Main logic:
(1) get a PVC key from the claimQueue;
(2) call ctrl.syncClaimHandler for further processing;
(3) on success, clear the PVC's rateLimiter entry and remove the PVC from claimsInProgress;
(4) on failure, retry a limited number of times by re-adding the PVC through the rateLimiter;
(5) finally, whether ctrl.syncClaimHandler succeeds or not, mark the PVC done in the claimQueue.
// Map UID -> *PVC with all claims that may be provisioned in the background.
claimsInProgress sync.Map

// processNextClaimWorkItem processes items from claimQueue
func (ctrl *ProvisionController) processNextClaimWorkItem() bool {
	// get a pvc key from the claimQueue
	obj, shutdown := ctrl.claimQueue.Get()

	if shutdown {
		return false
	}

	err := func(obj interface{}) error {
		// finally, whether ctrl.syncClaimHandler succeeds or not, mark the pvc done in the claimQueue
		defer ctrl.claimQueue.Done(obj)
		var key string
		var ok bool
		if key, ok = obj.(string); !ok {
			ctrl.claimQueue.Forget(obj)
			return fmt.Errorf("expected string in workqueue but got %#v", obj)
		}

		// call ctrl.syncClaimHandler for further processing
		if err := ctrl.syncClaimHandler(key); err != nil {
			// on failure, retry a limited number of times by re-adding the pvc through the rateLimiter
			if ctrl.failedProvisionThreshold == 0 {
				glog.Warningf("Retrying syncing claim %q, failure %v", key, ctrl.claimQueue.NumRequeues(obj))
				ctrl.claimQueue.AddRateLimited(obj)
			} else if ctrl.claimQueue.NumRequeues(obj) < ctrl.failedProvisionThreshold {
				glog.Warningf("Retrying syncing claim %q because failures %v < threshold %v", key, ctrl.claimQueue.NumRequeues(obj), ctrl.failedProvisionThreshold)
				ctrl.claimQueue.AddRateLimited(obj)
			} else {
				glog.Errorf("Giving up syncing claim %q because failures %v >= threshold %v", key, ctrl.claimQueue.NumRequeues(obj), ctrl.failedProvisionThreshold)
				glog.V(2).Infof("Removing PVC %s from claims in progress", key)
				ctrl.claimsInProgress.Delete(key) // This can leak a volume that's being provisioned in the background!
				// Done but do not Forget: it will not be in the queue but NumRequeues
				// will be saved until the obj is deleted from kubernetes
			}
			return fmt.Errorf("error syncing claim %q: %s", key, err.Error())
		}

		// on success, clear the pvc's rateLimiter entry and remove the pvc from claimsInProgress
		ctrl.claimQueue.Forget(obj)
		// Silently remove the PVC from list of volumes in progress. The provisioning either succeeded
		// or the PVC was ignored by this provisioner.
		ctrl.claimsInProgress.Delete(key)
		return nil
	}(obj)

	if err != nil {
		utilruntime.HandleError(err)
		return true
	}

	return true
}
Let's first look at the relevant methods of the claimQueue:
(1) Done: marks the item as done and removes it from the claimQueue
// Done marks item as done processing, and if it has been marked as dirty again
// while it was being processed, it will be re-added to the queue for
// re-processing.
func (q *Type) Done(item interface{}) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()

	q.metrics.done(item)

	q.processing.delete(item)
	if q.dirty.has(item) {
		q.queue = append(q.queue, item)
		q.cond.Signal()
	}
}
(2) Forget: only clears the item from the rateLimiter
func (q *rateLimitingType) Forget(item interface{}) {
	q.rateLimiter.Forget(item)
}
(3) AddRateLimited: re-adds the item to the claimQueue once the rate limiter says it is OK
// AddRateLimited AddAfter's the item based on the time when the rate limiter says it's ok
func (q *rateLimitingType) AddRateLimited(item interface{}) {
	q.DelayingInterface.AddAfter(item, q.rateLimiter.When(item))
}

// AddAfter adds the given item to the work queue after the given delay
func (q *delayingType) AddAfter(item interface{}, duration time.Duration) {
	// don't add if we're already shutting down
	if q.ShuttingDown() {
		return
	}

	q.metrics.retry()

	// immediately add things with no delay
	if duration <= 0 {
		q.Add(item)
		return
	}
	select {
	case <-q.stopCh:
		// unblock if ShutDown() is called
	case q.waitingForAddCh <- &waitFor{data: item, readyAt: q.clock.Now().Add(duration)}:
	}
}
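To see how AddRateLimited and Forget interact in practice, here is a small, self-contained sketch using the same client-go workqueue package. It is an illustration rather than code from external-provisioner; the item key is an assumed example. The default controller rate limiter backs off exponentially per item on each requeue, and Forget wipes that per-item history once a sync finally succeeds.

// A hedged sketch of per-item backoff in a rate-limited workqueue.
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	// DefaultControllerRateLimiter combines a per-item exponential backoff
	// (starting at 5ms, capped at 1000s) with an overall token bucket.
	rl := workqueue.DefaultControllerRateLimiter()
	q := workqueue.NewRateLimitingQueue(rl)
	defer q.ShutDown()

	item := "default/my-pvc" // assumed namespace/name key

	// Each failed sync would call AddRateLimited(item); When() shows the
	// growing delay the queue would apply before handing the item out again.
	for i := 0; i < 5; i++ {
		fmt.Printf("requeue #%d delayed by %v\n", i+1, rl.When(item))
	}

	// After a successful sync the worker calls Forget, wiping the item's
	// backoff history so a later failure starts from the smallest delay again.
	q.Forget(item)
	fmt.Println("requeues remembered after Forget:", q.NumRequeues(item))
}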