Kubernetes Monitoring (Part 2): Introduction to cAdvisor

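This article introduces cAdvisor, the component Kubernetes uses to monitor node resources and container performance. cAdvisor collects CPU, memory, network, and filesystem usage. The article also walks through cAdvisor's internal structure and source code, and shows how to interact with it over its HTTP interface.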


http://www.huweihuang.com/article/kubernetes/monitoring/cadvisor-introduction/

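1. cAdvisor Overview

cAdvisor monitors the resources and containers on a Node in real time and collects performance data, including CPU usage, memory usage, network throughput, and filesystem usage. cAdvisor is integrated into the kubelet and starts automatically when the kubelet starts, so each cAdvisor instance monitors exactly one Node. The kubelet startup flag --cadvisor-port sets the port on which cAdvisor serves, 4194 by default, and the service can be opened in a browser. Project home page: http://github.com/google/cadvisor .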

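Because cAdvisor serves plain HTTP on that port, the same data a browser shows can also be read programmatically. The following is a minimal sketch, not cAdvisor's own client: it assumes the REST path `/api/v1.3/machine` and the JSON field names `num_cores` and `memory_capacity`, and the helper names (`machineInfoURL`, `decodeMachineSummary`, `run`) are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// machineSummary keeps two fields of the machine info document.
// The JSON names are assumptions about cAdvisor's v1 API.
type machineSummary struct {
	NumCores       int    `json:"num_cores"`
	MemoryCapacity uint64 `json:"memory_capacity"`
}

// machineInfoURL appends the (assumed) machine endpoint to a base address
// such as http://<node-ip>:4194, tolerating a trailing slash.
func machineInfoURL(base string) string {
	return strings.TrimSuffix(base, "/") + "/api/v1.3/machine"
}

// decodeMachineSummary parses a machine info JSON document.
func decodeMachineSummary(body []byte) (machineSummary, error) {
	var m machineSummary
	err := json.Unmarshal(body, &m)
	return m, err
}

// run fetches the machine info from one node and prints a short summary.
func run(base string) error {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(machineInfoURL(base))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	m, err := decodeMachineSummary(body)
	if err != nil {
		return err
	}
	fmt.Printf("cores=%d memory=%d bytes\n", m.NumCores, m.MemoryCapacity)
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: machineinfo http://<node-ip>:4194")
		return
	}
	if err := run(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Since each cAdvisor instance covers a single Node, a cluster-wide view would need one such request per node.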
2. cAdvisor Architecture Diagram

[Figure: cAdvisor architecture diagram]

3. Metrics

| Category   | Field              | Description                                             |
|------------|--------------------|---------------------------------------------------------|
| cpu        | cpu_usage_total    |                                                         |
|            | cpu_usage_system   |                                                         |
|            | cpu_usage_user     |                                                         |
|            | cpu_usage_per_cpu  |                                                         |
|            | load_average       | Smoothed average of number of runnable threads x 1000   |
| memory     | memory_usage       | Memory usage                                            |
|            | memory_working_set | Working set size                                        |
| network    | rx_bytes           | Cumulative count of bytes received                      |
|            | rx_errors          | Cumulative count of receive errors encountered          |
|            | tx_bytes           | Cumulative count of bytes transmitted                   |
|            | tx_errors          | Cumulative count of transmit errors encountered         |
| filesystem | fs_device          | Filesystem device                                       |
|            | fs_limit           | Filesystem limit                                        |
|            | fs_usage           | Filesystem usage                                        |
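Several of these metrics are cumulative counters, so a usage rate has to be derived from the difference between two samples. The sketch below shows that arithmetic for CPU. It assumes `cpu_usage_total` is cumulative CPU time in nanoseconds (the table does not state the unit), and `coresUsed` is a hypothetical helper, not part of cAdvisor.

```go
package main

import (
	"fmt"
	"time"
)

// coresUsed returns the average number of CPU cores in use between two
// samples of a cumulative CPU counter expressed in nanoseconds (assumed unit).
// It reports false when the samples cannot be compared, for example after
// a counter reset or when no time has elapsed.
func coresUsed(prevTotal, curTotal uint64, elapsed time.Duration) (float64, bool) {
	if elapsed <= 0 || curTotal < prevTotal {
		return 0, false
	}
	return float64(curTotal-prevTotal) / float64(elapsed.Nanoseconds()), true
}

func main() {
	// Two hypothetical samples taken 10 seconds apart:
	// 5 seconds of CPU time were consumed in between.
	cores, ok := coresUsed(10_000_000_000, 15_000_000_000, 10*time.Second)
	if !ok {
		fmt.Println("samples not comparable")
		return
	}
	fmt.Printf("average CPU usage: %.2f cores\n", cores) // prints "average CPU usage: 0.50 cores"
}
```

The same delta-over-elapsed-time approach applies to the cumulative network counters such as rx_bytes and tx_bytes.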

4. cAdvisor Source Code

4.1. cAdvisor entry point

cadvisor.go

func main() {
    defer glog.Flush()
    flag.Parse()

    if *versionFlag {
        fmt.Printf("cAdvisor version %s (%s)\n", version.Info["version"], version.Info["revision"])
        os.Exit(0)
    }

    setMaxProcs()

    memoryStorage, err := NewMemoryStorage()
    if err != nil {
        glog.Fatalf("Failed to initialize storage driver: %s", err)
    }

    sysFs, err := sysfs.NewRealSysFs()
    if err != nil {
        glog.Fatalf("Failed to create a system interface: %s", err)
    }

    collectorHttpClient := createCollectorHttpClient(*collectorCert, *collectorKey)

    containerManager, err := manager.New(memoryStorage, sysFs, *maxHousekeepingInterval, *allowDynamicHousekeeping, ignoreMetrics.MetricSet, &collectorHttpClient)
    if err != nil {
        glog.Fatalf("Failed to create a Container Manager: %s", err)
    }

    mux := http.NewServeMux()

    if *enableProfiling {
        mux.HandleFunc("/debug/pprof/", pprof.Index)
        mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
        mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
        mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
    }

    // Register all HTTP handlers.
    err = cadvisorhttp.RegisterHandlers(mux, containerManager, *httpAuthFile, *httpAuthRealm, *httpDigestFile, *httpDigestRealm)
    if err != nil {
        glog.Fatalf("Failed to register HTTP handlers: %v", err)
    }

    cadvisorhttp.RegisterPrometheusHandler(mux, containerManager, *prometheusEndpoint, nil)

    // Start the manager.
    if err := containerManager.Start(); err != nil {
        glog.Fatalf("Failed to start container manager: %v", err)
    }

    // Install signal handler.
    installSignalHandler(containerManager)

    glog.Infof("Starting cAdvisor version: %s-%s on port %d", version.Info["version"], version.Info["revision"], *argPort)

    addr := fmt.Sprintf("%s:%d", *argIp, *argPort)
    glog.Fatal(http.ListenAndServe(addr, mux))
}

Key code (error handling omitted):

memoryStorage, err := NewMemoryStorage()
sysFs, err := sysfs.NewRealSysFs()
// Create the containerManager.
containerManager, err := manager.New(memoryStorage, sysFs, *maxHousekeepingInterval, *allowDynamicHousekeeping, ignoreMetrics.MetricSet, &collectorHttpClient)
// Start the containerManager.
err = containerManager.Start()

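4.2. Using the cAdvisor Client

import "github.com/google/cadvisor/client"

func main() {
    // The base URL has the form http://<host-ip>:<port>/
    c, err := client.NewClient("http://192.168.19.30:4194/")
    if err != nil {
        panic(err)
    }
    _ = c // call c.MachineInfo(), c.ContainerInfo(...), and so on
}

4.2.1. Client definition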

cadvisor/client/client.go

// Client represents the base URL for a cAdvisor client.
type Client struct {
    baseUrl string
}

// NewClient returns a new v1.3 client with the specified base URL.
func NewClient(url string) (*Client, error) {
    if !strings.HasSuffix(url, "/") {
        url += "/"
    }

    return &Client{
        baseUrl: fmt.Sprintf("%sapi/v1.3/", url),
    }, nil
}
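The effect of NewClient's normalization is that a base URL with or without a trailing slash yields the same API prefix. The standalone sketch below reproduces that step with the standard library only. `apiBaseURL` and `resourceURL` are hypothetical names, and the resource path segments (`machine`, `containers/<name>`) are assumptions about how the client builds endpoint URLs on top of the prefix.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// apiBaseURL applies the same normalization as NewClient: ensure a trailing
// slash on the host URL, then append the versioned API prefix.
func apiBaseURL(url string) string {
	if !strings.HasSuffix(url, "/") {
		url += "/"
	}
	return fmt.Sprintf("%sapi/v1.3/", url)
}

// resourceURL joins path segments onto the API prefix. path.Join collapses
// duplicate slashes, so a container name such as "/docker/abc" is safe.
func resourceURL(base string, parts ...string) string {
	return base + path.Join(parts...)
}

func main() {
	for _, u := range []string{"http://192.168.19.30:4194", "http://192.168.19.30:4194/"} {
		base := apiBaseURL(u)
		fmt.Println(resourceURL(base, "machine"))
		fmt.Println(resourceURL(base, "containers", "/docker/abc"))
	}
}
```

Both inputs produce identical URLs, which is why callers do not need to care whether the address they pass ends with a slash.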

4.2.2. Client methods

1) MachineInfo

// MachineInfo returns the JSON machine information for this client.
// A non-nil error result indicates a problem with obtaining
// the JSON machine information data.
func (self *Client) MachineInfo() (minfo *v1.MachineInfo, err error) {
    u := self.machineInfoUrl()
    ret := new(v1.MachineInfo)
    if err = self.httpGetJsonData(ret, nil, u, "machine info"); err != nil {
        return
    }
    minfo = ret
    return
}

2) ContainerInfo

// ContainerInfo returns the JSON container information for the specified
// container and request.
func (self *Client) ContainerInfo(name string, query *v1.ContainerInfoRequest) (cinfo *v1.ContainerInfo, err error) {
    u := self.containerInfoUrl(name)
    ret := new(v1.ContainerInfo)
    if err = self.httpGetJsonData(ret, query, u, fmt.Sprintf("container info for %q", name)); err != nil {
        return
    }
    cinfo = ret
    return
}

3) DockerContainer

// Returns the JSON container information for the specified
// Docker container and request.
func (self *Client) DockerContainer(name string, query *v1.ContainerInfoRequest) (cinfo v1.ContainerInfo, err error) {
    u := self.dockerInfoUrl(name)
    ret := make(map[string]v1.ContainerInfo)
    if err = self.httpGetJsonData(&ret, query, u, fmt.Sprintf("Docker container info for %q", name)); err != nil {
        return
    }
    if len(ret) != 1 {
        err = fmt.Errorf("expected to only receive 1 Docker container: %+v", ret)
        return
    }
    for _, cont := range ret {
        cinfo = cont
    }
    return
}

4) AllDockerContainers

// Returns the JSON container information for all Docker containers.
func (self *Client) AllDockerContainers(query *v1.ContainerInfoRequest) (cinfo []v1.ContainerInfo, err error) {
    u := self.dockerInfoUrl("/")
    ret := make(map[string]v1.ContainerInfo)
    if err = self.httpGetJsonData(&ret, query, u, "all Docker containers info"); err != nil {
        return
    }
    cinfo = make([]v1.ContainerInfo, 0, len(ret))
    for _, cont := range ret {
        cinfo = append(cinfo, cont)
    }
    return
}
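Both Docker methods decode the response into a map and then either require exactly one entry or flatten the map into a slice. Because Go randomizes map iteration order, the slice returned by AllDockerContainers has no stable order, so a caller that needs reproducible output may want to sort. The sketch below illustrates both patterns with the standard library only. `containerInfo` is a hypothetical stand-in for v1.ContainerInfo, and `exactlyOne` and `flattenSorted` are illustrative helpers rather than cAdvisor functions.

```go
package main

import (
	"fmt"
	"sort"
)

// containerInfo is a hypothetical, reduced stand-in for v1.ContainerInfo.
type containerInfo struct {
	Name string
}

// exactlyOne mirrors the check in DockerContainer: the response map must
// hold a single entry, which is then returned.
func exactlyOne(byName map[string]containerInfo) (containerInfo, error) {
	if len(byName) != 1 {
		return containerInfo{}, fmt.Errorf("expected exactly 1 container, got %d", len(byName))
	}
	for _, c := range byName {
		return c, nil
	}
	return containerInfo{}, nil // not reached: the map has one entry
}

// flattenSorted converts the response map into a slice ordered by map key,
// giving a deterministic result regardless of map iteration order.
func flattenSorted(byName map[string]containerInfo) []containerInfo {
	keys := make([]string, 0, len(byName))
	for k := range byName {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]containerInfo, 0, len(byName))
	for _, k := range keys {
		out = append(out, byName[k])
	}
	return out
}

func main() {
	resp := map[string]containerInfo{
		"/docker/b": {Name: "b"},
		"/docker/a": {Name: "a"},
	}
	for _, c := range flattenSorted(resp) {
		fmt.Println(c.Name)
	}
	if _, err := exactlyOne(resp); err != nil {
		fmt.Println("error:", err)
	}
}
```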

Reference: http://blog.opskumu.com/cadvisor.html

Reposted from: https://www.cnblogs.com/liuhongru/p/11215475.html

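### Monitoring a Kubernetes Cluster with Prometheus

#### Defining scrape targets

Prometheus scrape targets can be defined with `ServiceMonitor` and `PodMonitor` resources. These resources are managed by the Prometheus Operator, which makes configuring Prometheus in Kubernetes considerably easier[^2].

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-servicemonitor
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http-metrics
```

This YAML file creates a `ServiceMonitor` that tells Prometheus which services to collect metrics from. Any application whose service matches the selector can be exposed to Prometheus this way.

#### Key metrics and alerting rules

Good Kubernetes monitoring practice covers more than application-level performance. It also tracks the state of core components, such as node health and container restart counts, and sets reasonable thresholds that trigger alert notifications[^3].

For example, an alerting rule for high CPU usage:

```yaml
groups:
  - name: example
    rules:
      - alert: HighCpuUsage
        expr: sum(rate(container_cpu_usage_seconds_total{job="kubernetes-cadvisor"}[5m])) by (node) > 0.9
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "High CPU usage on {{ $labels.node }}"
          description: "{{ $labels.node }} has had greater than 90% cpu utilization for the last minute."
```

This PromQL expression sums the per-container CPU usage rate over a 5-minute window for each node and fires once the sum has stayed above 0.9 for one minute, prompting an administrator to act. Because the rate is measured in CPU cores, 0.9 corresponds to 90% utilization only on a single-core node; divide by the node's core count to alert on a true percentage.

#### Monitoring Prometheus itself

To keep the monitoring system as a whole stable, the health of the Prometheus server itself should also be tracked continuously, so that it does not become a hidden point of failure.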