ingress-nginx Custom Metrics: HPA Autoscaling

Project repository: ingress-nginx (Ingress-NGINX Controller for Kubernetes), https://gitcode.com/GitHub_Trending/in/ingress-nginx

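Introduction

In modern cloud-native architectures, autoscaling is a key technique for keeping applications highly available while using resources efficiently. The Kubernetes Horizontal Pod Autoscaler (HPA) usually makes scaling decisions from CPU and memory utilization, but for a network entry point such as ingress-nginx, scaling on request volume and latency is often more accurate.

This article explains how to use the Prometheus metrics exposed by ingress-nginx as custom metrics, so that HPA scales on actual traffic patterns and the ingress layer becomes more elastic and efficient.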

ingress-nginx Metrics Overview

ingress-nginx exposes Prometheus metrics on port 10254. They fall into the following groups:

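Request metrics

# Request processing time histogram (seconds)
nginx_ingress_controller_request_duration_seconds

# Time spent receiving the response from the upstream server, histogram (seconds)
nginx_ingress_controller_response_duration_seconds

# Time spent receiving the first header from the upstream server, histogram (seconds)
nginx_ingress_controller_header_duration_seconds

# Time spent establishing a connection to the upstream server, histogram (seconds)
nginx_ingress_controller_connect_duration_seconds

# Response size histogram (bytes)
nginx_ingress_controller_response_size

# Request size histogram (bytes)
nginx_ingress_controller_request_size

# Total number of client requests (counter)
nginx_ingress_controller_requests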

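NGINX process metrics

# Current number of client connections, by state (active, reading, writing, waiting)
nginx_ingress_controller_nginx_process_connections

# Total number of connections, by state (accepted, handled)
nginx_ingress_controller_nginx_process_connections_total

# CPU time used (seconds)
nginx_ingress_controller_nginx_process_cpu_seconds_total

# Resident memory in use (bytes)
nginx_ingress_controller_nginx_process_resident_memory_bytes

# Virtual memory in use (bytes)
nginx_ingress_controller_nginx_process_virtual_memory_bytes

# Total number of client requests
nginx_ingress_controller_nginx_process_requests_total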

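Controller metrics

# Hash of the running configuration
nginx_ingress_controller_config_hash

# Whether the last configuration reload succeeded
nginx_ingress_controller_config_last_reload_successful

# SSL certificate information
nginx_ingress_controller_ssl_certificate_info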
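
To get a feel for these series, the following minimal Python sketch parses a few lines in the Prometheus text format and totals a counter by one of its labels. The sample text and the helper names `parse_metrics` and `total_by_label` are made up for illustration; the parser skips comment lines and does not handle timestamps or every escaping rule of the format.

```python
import re
from collections import defaultdict

# Hypothetical sample of what a controller Pod might return from :10254/metrics.
SAMPLE = """\
# TYPE nginx_ingress_controller_requests counter
nginx_ingress_controller_requests{ingress="web",status="200"} 1500
nginx_ingress_controller_requests{ingress="web",status="502"} 30
nginx_ingress_controller_requests{ingress="api",status="200"} 470
nginx_ingress_controller_nginx_process_connections{state="active"} 12
"""

# metric_name{label="value",...} value
LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, ignoring comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue  # e.g. lines with a trailing timestamp are skipped in this sketch
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total_by_label(samples, metric, label):
    """Sum the values of one metric, grouped by a single label."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == metric:
            totals[labels.get(label, "")] += value
    return dict(totals)


if __name__ == "__main__":
    print(total_by_label(parse_metrics(SAMPLE), "nginx_ingress_controller_requests", "ingress"))
```

Grouping by `status` instead of `ingress` gives the raw material for an error-rate view of the same counter.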

Environment Setup and Configuration

Enabling ingress-nginx metrics export

When installing or upgrading ingress-nginx with Helm, enable metrics export:

helm upgrade ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx \
  --set controller.metrics.enabled=true \
  --set-string controller.podAnnotations."prometheus\.io/scrape"="true" \
  --set-string controller.podAnnotations."prometheus\.io/port"="10254"

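Configuring Prometheus scraping

Create a ServiceMonitor (a Prometheus Operator resource) so that Prometheus discovers and scrapes the ingress-nginx metrics automatically: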

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/instance: ingress-nginx
      app.kubernetes.io/component: controller
  endpoints:
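  # Name of the port on the metrics Service; the Helm chart calls it "metrics" by default (controller.metrics.portName)
  - port: metrics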
    interval: 30s
    path: /metrics

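HPA Configuration Based on Custom Metrics

Option 1: HPA based on request rate

nginx_ingress_controller_requests is a monotonically increasing counter, so HPA cannot scale on it directly. Prometheus Adapter converts it into a per-second rate and exposes it as nginx_requests_per_second (see the Prometheus Adapter Configuration section). The HPA then scales on the average number of requests per second per Pod:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-hpa
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-controller
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        # Name exposed by Prometheus Adapter, not the raw counter
        name: nginx_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000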
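
As a rough illustration of how HPA turns a Pods metric with an AverageValue target into a replica count, the sketch below applies the documented formula desiredReplicas = ceil(currentReplicas × currentValue / targetValue), with the default 10% tolerance and clamping to minReplicas and maxReplicas. The function name `desired_replicas` and the per-Pod rates are hypothetical; the real controller also accounts for Pods that are not ready or have no metrics, which this sketch leaves out.

```python
import math


def desired_replicas(per_pod_values, target_average, min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA calculation for a Pods metric with an AverageValue target."""
    current = len(per_pod_values)
    average = sum(per_pod_values) / current
    ratio = average / target_average
    # Within the tolerance band HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current
    desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Three Pods averaging 1600 rps against a 1000 rps target.
    print(desired_replicas([1500, 1700, 1600], 1000, min_replicas=2, max_replicas=10))
```

With these made-up numbers the average is 1.6 times the target, so three replicas grow to five. An average within 10% of the target leaves the replica count alone, which is one reason small fluctuations do not trigger scaling.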

Option 2: HPA based on response time

Scale on the P95 request latency that Prometheus Adapter derives from the nginx_ingress_controller_request_duration_seconds histogram and exposes as nginx_request_duration_seconds, so that response times stay within an acceptable range. The target 500m means 0.5 seconds:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-hpa-latency
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-controller
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metric:
        # Name exposed by Prometheus Adapter, not the raw histogram
        name: nginx_request_duration_seconds
      describedObject:
        apiVersion: v1
        kind: Service
        name: ingress-nginx-controller
      target:
        type: Value
        value: 500m
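
The `m` suffix in a target such as 500m is Kubernetes quantity notation for thousandths. The sketch below, which handles only plain numbers and the milli suffix, shows how such a target could be compared with a measured latency for an Object metric with a Value target. The function names and the sample latencies are assumptions made for illustration.

```python
import math
from decimal import Decimal


def parse_quantity(text):
    """Parse a small subset of Kubernetes quantities: plain numbers and the milli suffix "m"."""
    if text.endswith("m"):
        return Decimal(text[:-1]) / 1000
    return Decimal(text)


def desired_for_value_target(current_replicas, metric_value, target):
    """ceil(currentReplicas * metric / target) for an Object metric with a Value target."""
    ratio = parse_quantity(metric_value) / parse_quantity(target)
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    print(parse_quantity("500m"))                       # 0.5 (seconds)
    print(desired_for_value_target(4, "750m", "500m"))  # P95 of 0.75 s against a 0.5 s target
```

Because latency does not fall in proportion to the number of replicas, a latency target tends to need more conservative scaling behavior than a request-rate target.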

Option 3: HPA combining multiple metrics

Combine request rate and latency in one HPA. HPA computes a desired replica count for each metric and uses the largest one. Only one HPA should target a given Deployment, so Options 1 to 3 are alternatives rather than resources to apply together:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-hpa-combined
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-controller
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Pods
    pods:
      metric:
        name: nginx_requests_per_second
      target:
        type: AverageValue
        averageValue: 800
  - type: Object
    object:
      metric:
        name: nginx_request_duration_seconds
      describedObject:
        apiVersion: v1
        kind: Service
        name: ingress-nginx-controller
      target:
        type: Value
        value: 300m
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
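
The interaction between several metrics and the behavior policies can be sketched as follows. This is a deliberately simplified model: it takes the largest per-metric proposal and caps one step at +2 or -1 Pods, but it ignores the stabilization windows and the bookkeeping of changes already made within periodSeconds. The function names and sample values (latency expressed in milliseconds to keep the arithmetic exact) are hypothetical.

```python
import math


def proposal(current_replicas, current_value, target_value):
    """Replica count a single metric asks for: ceil(current * value / target)."""
    return math.ceil(current_replicas * current_value / target_value)


def next_replicas(current, proposals, max_up=2, max_down=1, min_replicas=2, max_replicas=15):
    """Pick the largest proposal, clamp it, then limit the size of one scaling step."""
    desired = max(proposals)
    desired = max(min_replicas, min(max_replicas, desired))
    if desired > current:
        return min(desired, current + max_up)    # scaleUp policy: at most 2 Pods per period
    if desired < current:
        return max(desired, current - max_down)  # scaleDown policy: at most 1 Pod per period
    return current


if __name__ == "__main__":
    by_rate = proposal(4, 1200, 800)    # average rps per Pod vs. target
    by_latency = proposal(4, 600, 300)  # P95 in ms vs. target in ms
    print(by_rate, by_latency, next_replicas(4, [by_rate, by_latency]))
```

In this example the latency metric asks for eight replicas and the rate metric for six, yet the step limit allows only six in the first period. Further periods would be needed to reach eight.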

Prometheus Adapter Configuration

For HPA to see these custom metrics, deploy Prometheus Adapter and give it rules that turn Prometheus series into custom metrics API metrics. The rules below belong in the adapter's configuration file, which is commonly supplied through a ConfigMap or the adapter Helm chart's rules.custom value:

rules:
# Per-Pod request rate, consumed by the Pods-type HPA metric.
# Assumes the scraped series carry namespace and pod labels that identify the controller Pod.
- seriesQuery: 'nginx_ingress_controller_requests{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^nginx_ingress_controller_requests$"
    as: "nginx_requests_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

# P95 request latency per Service, consumed by the Object-type HPA metric.
# Assumes a service label whose value matches the Service named in describedObject.
- seriesQuery: 'nginx_ingress_controller_request_duration_seconds_bucket{namespace!="",service!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      service: {resource: "service"}
  name:
    matches: "^nginx_ingress_controller_request_duration_seconds_bucket$"
    as: "nginx_request_duration_seconds"
  metricsQuery: 'histogram_quantile(0.95, sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (le, <<.GroupBy>>))'
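
To make the latency rule less of a black box, here is a minimal Python sketch of the estimate that histogram_quantile produces: it finds the bucket containing the requested rank and interpolates linearly inside it. The bucket boundaries and counts are invented for the example, and the function is an approximation of the PromQL behavior rather than a copy of the Prometheus implementation.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) pairs, including the +Inf bucket."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the rank.
    for index, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    if count == below:
        return upper
    # Linear interpolation inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


if __name__ == "__main__":
    # Hypothetical cumulative bucket values for le = 0.1, 0.25, 0.5, 1 and +Inf.
    sample = [(0.1, 50), (0.25, 80), (0.5, 95), (1.0, 99), (math.inf, 100)]
    print(histogram_quantile(0.95, sample))
```

Because the result is interpolated, its precision depends on how the bucket boundaries are laid out around the target latency.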

Advanced Helm Configuration

The chart can also render the HPA itself. Set the custom metrics in values.yaml through controller.autoscalingTemplate:

controller:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 50
  autoscalingTemplate:
  - type: Pods
    pods:
      metric:
        # Names exposed by Prometheus Adapter
        name: nginx_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
  - type: Object
    object:
      metric:
        name: nginx_request_duration_seconds
      describedObject:
        apiVersion: v1
        kind: Service
        name: ingress-nginx-controller
      target:
        type: Value
        value: 500m

Monitoring and Alerting

Grafana dashboard

Create a dedicated HPA dashboard to track scaling state in real time:

{
  "panels": [
    {
      "title": "HPA Replica Count",
      "targets": [{
        "expr": "kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler='ingress-nginx-hpa'}",
        "legendFormat": "Current Replicas"
      }]
    },
    {
      "title": "Request Rate vs HPA Target",
      "targets": [
        {
          "expr": "avg(sum by (pod) (rate(nginx_ingress_controller_requests[2m])))",
          "legendFormat": "Average Request Rate per Pod"
        },
        {
          "expr": "1000",
          "legendFormat": "HPA Target (1000 rps per Pod)"
        }
      ]
    }
  ]
}

Prometheus alerting rules

Set up alerts to catch scaling problems early:

groups:
- name: ingress-nginx-hpa-alerts
  rules:
  # ScalingLimited is true when the desired replica count is capped by minReplicas or maxReplicas
  - alert: HPAScalingLimited
    expr: kube_horizontalpodautoscaler_status_condition{condition="ScalingLimited", status="true"} == 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "HPA scaling is limited for {{ $labels.horizontalpodautoscaler }}"
      description: "HPA {{ $labels.horizontalpodautoscaler }} wants a replica count outside its minReplicas/maxReplicas range"
  
  # Aggregate the buckets by le so that one P95 is computed across all series
  - alert: HighRequestLatency
    expr: histogram_quantile(0.95, sum by (le) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m]))) > 1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "High request latency in ingress-nginx"
      description: "95th percentile request latency exceeds 1 second"

Best Practices and Tuning Recommendations

1. Capacity planning and threshold settings

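2. Multi-dimensional monitoring strategy

| Dimension | Key metric | Alert threshold | Response |
|---|---|---|---|
| Request volume | nginx_ingress_controller_requests | > 80% of capacity | Scale out early |
| Response time | nginx_ingress_controller_request_duration_seconds | P95 > 500 ms | Scale out immediately |
| Error rate | Share of 5xx requests | > 5% | Check backend services |
| Connections | nginx_ingress_controller_nginx_process_connections | > 80% of the limit | Tune the configuration |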
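
The error-rate row has no ready-made metric; it is usually derived from the status label of nginx_ingress_controller_requests, for example with a PromQL ratio such as sum(rate(nginx_ingress_controller_requests{status=~"5.."}[5m])) / sum(rate(nginx_ingress_controller_requests[5m])). The Python sketch below performs the same arithmetic on hypothetical per-status request rates; the function name `error_ratio` and the numbers are assumptions for illustration.

```python
def error_ratio(rate_by_status):
    """Share of requests answered with a 5xx status, given request rates keyed by status code."""
    total = sum(rate_by_status.values())
    if total == 0:
        return 0.0  # no traffic, so nothing to alert on
    errors = sum(rate for status, rate in rate_by_status.items() if status.startswith("5"))
    return errors / total


if __name__ == "__main__":
    # Hypothetical requests per second, grouped by HTTP status.
    rates = {"200": 940.0, "404": 10.0, "502": 30.0, "503": 20.0}
    ratio = error_ratio(rates)
    print(f"5xx ratio: {ratio:.1%}, above 5% threshold: {ratio > 0.05}")
```

A high 5xx share usually points at the backends rather than at ingress capacity, which is why the table pairs it with checking backend services instead of scaling out.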

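3. Performance tuning

# NGINX tuning through the controller ConfigMap
controller:
  config:
    # Number of worker processes; "auto" uses one per CPU core
    worker-processes: "auto"
    # Maximum simultaneous connections per worker process
    max-worker-connections: "10240"
    # Keep-alive timeout for client connections (seconds)
    keep-alive: "75"
    # Size of the buffer used to read the upstream response
    proxy-buffer-size: "16k"
    # Number of proxy buffers
    proxy-buffers-number: "4"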

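Troubleshooting

Common problems and fixes

  1. HPA has no effect

    • Check the Prometheus Adapter logs
    • Verify that the metrics are served by the custom metrics API: kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
    • Check the HPA events: kubectl describe hpa ingress-nginx-hpa -n ingress-nginx
  2. Metric data is missing

    • Verify the Prometheus scrape configuration
    • Check the ingress-nginx metrics endpoint from inside the cluster: curl http://ingress-nginx-controller-metrics.ingress-nginx.svc:10254/metrics
    • Check that the Pod annotations are set correctly
  3. Scaling happens too often

    • Increase stabilizationWindowSeconds
    • Lengthen the time window used in the metric queries
    • Set reasonable scaling policies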
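
When checking the custom metrics API by hand, it can help to reduce the JSON response to one value per Pod. The sketch below does that for a hand-written payload that resembles a v1beta1 MetricValueList. The Pod names, values and helper names are invented, and field names may differ in other API versions, so treat it as a starting point rather than a reference parser.

```python
import json

# Hand-written sample resembling the output of:
#   kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/ingress-nginx/pods/*/nginx_requests_per_second"
SAMPLE_RESPONSE = """
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {"describedObject": {"kind": "Pod", "namespace": "ingress-nginx", "name": "ingress-nginx-controller-abc"},
     "metricName": "nginx_requests_per_second", "value": "1250"},
    {"describedObject": {"kind": "Pod", "namespace": "ingress-nginx", "name": "ingress-nginx-controller-def"},
     "metricName": "nginx_requests_per_second", "value": "987500m"}
  ]
}
"""


def quantity_to_float(quantity):
    """Handle only plain numbers and the milli suffix "m"."""
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000
    return float(quantity)


def values_by_pod(response_text):
    """Map each described object name to its metric value."""
    items = json.loads(response_text).get("items", [])
    return {item["describedObject"]["name"]: quantity_to_float(item["value"]) for item in items}


if __name__ == "__main__":
    for pod, value in values_by_pod(SAMPLE_RESPONSE).items():
        print(f"{pod}: {value:.1f} rps")
```

An empty items list would suggest that the adapter rule matched no series, which points back to the scrape configuration or to the label mapping in the adapter rules.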

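Summary

Driving HPA with ingress-nginx custom metrics offers the following benefits for a Kubernetes cluster:

  1. Accurate scaling: decisions follow actual traffic patterns rather than resource utilization alone
  2. Cost optimization: avoids over-provisioning and allocates compute on demand
  3. Performance assurance: keeps response times within an acceptable range
  4. High availability: responds automatically to traffic peaks and bursts

The configurations and practices in this article can help you build a more elastic ingress layer for cloud-native workloads.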

Authoring note: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
