APISIX Production Deployment Guide

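This article describes a complete production deployment of Apache APISIX: deploying the Kubernetes Ingress controller, configuring a highly available cluster, managing and backing up the etcd configuration store, and troubleshooting and day-to-day operations. It covers architecture, performance tuning, monitoring and alerting, and security configuration for teams running APISIX as an enterprise API gateway.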

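Kubernetes Ingress Controller Deployment

As a cloud-native API gateway, Apache APISIX provides a Kubernetes Ingress controller that integrates with the Kubernetes ecosystem. The controller adds dynamic configuration and traffic management on top of the APISIX data plane.

APISIX Ingress Controller Architecture

The APISIX Ingress controller follows the standard controller pattern: it watches the Kubernetes API server for resource changes and translates Ingress, Service, and related resources into APISIX configuration rules. Its architecture is as follows:

[Figure: APISIX Ingress controller architecture]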

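Core Components and How They Work

The APISIX Ingress controller has the following core components:

Controller Manager: watches Kubernetes resources for changes
Config Translator: converts Kubernetes resources into APISIX configuration
APISIX Admin API Client: communicates with the APISIX data plane

The workflow is as follows:

[Figure: APISIX Ingress controller workflow]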
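
The translation step performed by the Config Translator can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the controller's actual code: `ingress_to_routes` is a hypothetical helper, and the output fields (`host`, `uris`, `upstream`) only loosely mirror an APISIX route object.

```python
def ingress_to_routes(ingress: dict) -> list[dict]:
    """Translate a minimal Ingress manifest (as a dict) into route-like dicts."""
    routes = []
    for rule in ingress["spec"]["rules"]:
        for path in rule["http"]["paths"]:
            service = path["backend"]["service"]
            uris = [path["path"]]
            # Simplification: a Prefix path also matches everything below it.
            if path.get("pathType") == "Prefix":
                uris.append(path["path"].rstrip("/") + "/*")
            routes.append({
                "host": rule.get("host"),
                "uris": uris,
                # Hypothetical upstream naming: "<service>:<port>".
                "upstream": f'{service["name"]}:{service["port"]["number"]}',
            })
    return routes


ingress = {
    "spec": {"rules": [{
        "host": "example.com",
        "http": {"paths": [{
            "path": "/api",
            "pathType": "Prefix",
            "backend": {"service": {"name": "api-service", "port": {"number": 8080}}},
        }]},
    }]},
}
print(ingress_to_routes(ingress))
```

In the real controller the resulting objects would be pushed to the data plane through the Admin API client; that step is left out here.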

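Deployment Options

The APISIX Ingress controller supports several deployment modes for different scenarios:

1. Helm Chart Deployment (Recommended)

The official Helm chart deploys a complete APISIX Ingress stack:

# Example values.yaml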
apisix:
  gateway:
    type: LoadBalancer
    http:
      enabled: true
      servicePort: 80
      containerPort: 9080
    https:
      enabled: true
      servicePort: 443
      containerPort: 9443
  
  admin:
    enabled: true
    type: ClusterIP
    port: 9180

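# ingress-controller and etcd are subcharts, so Helm reads their values
# from top-level keys. Key names vary between chart versions; compare
# with `helm show values apisix/apisix` before applying.
ingress-controller:
  enabled: true
  config:
    apisix:
      baseURL: http://apisix-admin:9180/apisix/admin
      adminKey: edd1c9f0985e76a2  # placeholder; set your own Admin API key

etcd:
  enabled: true
  replicaCount: 3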

Deployment commands:

helm repo add apisix https://charts.apiseven.com
helm repo update
helm install apisix apisix/apisix -f values.yaml

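2. Standalone Mode Deployment

For scenarios that call for declarative configuration, APISIX can run in standalone mode, where configuration is loaded from a local YAML file instead of etcd: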

# config.yaml
deployment:
  role: data_plane
  role_data_plane:
    config_provider: yaml

apisix:
  node_listen:
    - port: 9080
  enable_admin: false

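3. Custom Resource Deployment

APISIX Ingress supports the following custom resources:

Resource type	Description	Example use
ApisixRoute	Custom routing rules	Complex route matching
ApisixUpstream	Upstream service configuration	Load-balancing policy
ApisixPluginConfig	Plugin configuration	Authentication, rate limiting, and so on

Configuration Examples

Basic Ingress Configuration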

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: apisix-ingress
spec:
  # ingressClassName replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: apisix
  rules:
  - host: example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080

Advanced Routing Configuration

apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  name: custom-route
spec:
  http:
  - name: api-route
    match:
      hosts:
      - api.example.com
      paths:
      - /v1/*
    backends:
    - serviceName: api-v1-service
      servicePort: 8080
      weight: 80
    - serviceName: api-v2-service
      servicePort: 8080
      weight: 20
    plugins:
    - name: key-auth
      enable: true
    - name: limit-count
      enable: true
      config:
        count: 100
        time_window: 60
        key: remote_addr
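
The 80/20 weights in the route above describe how traffic is divided between the two backends. As a rough illustration of what such weights mean, the following Python sketch uses smooth weighted round-robin. This is an assumption for illustration only; the algorithm APISIX actually applies to weighted backends may differ, and `smooth_weighted_round_robin` is a hypothetical helper.

```python
def smooth_weighted_round_robin(weights: dict[str, int], n: int) -> list[str]:
    """Return n picks distributed in proportion to the given weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every backend gains its weight, then the leader is picked and penalized.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


picks = smooth_weighted_round_robin({"api-v1-service": 80, "api-v2-service": 20}, 100)
print(picks.count("api-v1-service"), picks.count("api-v2-service"))
```

Over 100 requests the sketch sends 80 to `api-v1-service` and 20 to `api-v2-service`, interleaved rather than in two long runs.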

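Performance Tuning

For best performance, consider the following settings:

# High-performance configuration example
nginx_config:
  worker_processes: auto
  worker_rlimit_nofile: 65535
  # APISIX's config.yaml names this block "event" (singular)
  event:
    worker_connections: 20480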
  http:
    keepalive_timeout: 65s
    client_header_timeout: 60s
    client_body_timeout: 60s
    send_timeout: 60s
    upstream:
      keepalive: 1024
      keepalive_requests: 10000
      keepalive_timeout: 60s
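
To put the numbers above in context, a common Nginx rule of thumb is that total connections are bounded by worker_processes × worker_connections, and that a reverse proxy spends two connections per client request (one to the client, one to the upstream). The Python sketch below applies that rule; the 8-core figure is an assumption for illustration, and `proxy_capacity` is a hypothetical helper.

```python
def proxy_capacity(worker_processes: int, worker_connections: int) -> int:
    """Upper bound on concurrently proxied client connections."""
    # Each proxied request holds two connections: client side and upstream side.
    return worker_processes * worker_connections // 2


workers = 8  # assumption: "auto" resolves to 8 CPU cores
print(proxy_capacity(workers, 20480))  # 81920
```

This is an upper bound only; memory, file descriptor limits (worker_rlimit_nofile), and upstream capacity usually become the constraint first.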

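Monitoring and Operations

The APISIX Ingress controller supports the following monitoring features:

Prometheus metrics: built-in metrics exposed for scraping
Health checks: readiness and liveness probes
Log collection: structured log output for analysis

Example monitoring configuration:

# Enable Prometheus monitoring
plugins:
  - name: prometheus
    enable: true

# Health check probes (verify the path and port against your APISIX version
# before relying on them)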
livenessProbe:
  httpGet:
    path: /apisix/admin/healthz
    port: 9180
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /apisix/admin/healthz
    port: 9180
  initialDelaySeconds: 5
  periodSeconds: 5

High-Availability Deployment

For production, use a highly available deployment:

# High-availability configuration
replicaCount: 3
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

antiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - apisix
      topologyKey: kubernetes.io/hostname

Security Configuration

Harden the Ingress controller as follows:

# Security configuration
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  readOnlyRootFilesystem: true
  capabilities:
    drop:
    - ALL

networkPolicy:
  enabled: true
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            project: my-app
      ports:
      - protocol: TCP
        port: 9080
      - protocol: TCP
        port: 9443

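With the configuration and deployment options above, the APISIX Ingress controller can serve as a fast, reliable, and secure API gateway for a Kubernetes cluster.

High-Availability Cluster Configuration

A highly available cluster configuration is what keeps Apache APISIX stable in production. With the right architecture and settings, the cluster has no single point of failure, fails over automatically, and scales horizontally.

Cluster Architecture

APISIX supports several high-availability architectures; choose the one that fits your workload and available resources:

[Figure: high-availability cluster architecture]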

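Multi-Node etcd Cluster Configuration

etcd is the configuration store for APISIX, so its availability is critical. The following example configures a three-node etcd cluster: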

deployment:
  role: traditional
  role_traditional:
    config_provider: etcd
  etcd:
    host:
      - http://etcd-node1:2379
      - http://etcd-node2:2379  
      - http://etcd-node3:2379
    prefix: /apisix
    timeout: 30
    tls:
      verify: false
    user: etcd-user
    password: etcd-password

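Key parameters:

Parameter	Description	Recommended value
host	List of etcd cluster node addresses	At least 3 nodes
prefix	Key prefix APISIX uses in etcd	/apisix
timeout	Connection timeout (seconds)	30
tls.verify	TLS certificate verification	true in production

Load Balancing Across Data Plane Nodes

Multiple APISIX data plane nodes sit behind a load balancer that exposes them to clients:

# Example Nginx load-balancing configuration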
upstream apisix_cluster {
    server apisix-node1:9080 weight=3;
    server apisix-node2:9080 weight=3;
    server apisix-node3:9080 weight=3;
    keepalive 32;
}

server {
    listen 80;
    server_name api.example.com;
    
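    location / {
        proxy_pass http://apisix_cluster;
        # HTTP/1.1 with an empty Connection header is needed for the
        # upstream "keepalive" pool above to take effect
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Passive failover: retry the next node on errors and 5xx responses
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
        proxy_connect_timeout 2s;
        proxy_read_timeout 30s;
        proxy_send_timeout 30s;
    }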
}

Health Checks and Failover

APISIX supports both active and passive health checks for upstream nodes:

# Upstream health check configuration
upstreams:
  - id: 1
    nodes:
      "192.168.1.10:8080": 1
      "192.168.1.11:8080": 1
      "192.168.1.12:8080": 1
    type: roundrobin
    checks:
      active:
        type: http
        http_path: /health
        healthy:
          interval: 5
          successes: 2
        unhealthy:
          interval: 2
          http_failures: 3
      passive:
        healthy:
          http_statuses: [200, 201, 202]
          successes: 2
        unhealthy:
          http_statuses: [500, 502, 503, 504]
          http_failures: 3

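Health check parameters:

Check type	Parameter	Description	Recommended value
Active	interval	Probe interval (seconds)	5
Active	successes	Successes before a node is marked healthy	2
Passive	http_statuses	Status codes counted as healthy	[200, 201, 202]
Passive	http_failures	Failures before a node is marked unhealthy	3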
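
The `successes` and `http_failures` thresholds act as a small state machine per upstream node. The Python sketch below shows one plausible reading, in which consecutive results are counted and the opposite counter resets on each result. It is an illustration only; `NodeHealth` is a hypothetical class and the real health checker may count differently.

```python
class NodeHealth:
    """Track one node's health from a stream of check results."""

    def __init__(self, successes_to_heal: int = 2, failures_to_fail: int = 3):
        self.successes_to_heal = successes_to_heal
        self.failures_to_fail = failures_to_fail
        self.healthy = True
        self._ok = 0
        self._bad = 0

    def report(self, ok: bool) -> bool:
        """Record one check result and return the node's current state."""
        if ok:
            self._ok += 1
            self._bad = 0
            if not self.healthy and self._ok >= self.successes_to_heal:
                self.healthy = True
        else:
            self._bad += 1
            self._ok = 0
            if self.healthy and self._bad >= self.failures_to_fail:
                self.healthy = False
        return self.healthy


node = NodeHealth()
print([node.report(False) for _ in range(3)])  # third failure marks the node unhealthy
print([node.report(True) for _ in range(2)])   # second success restores it
```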

Session Persistence and Consistent Hashing

For applications that need session persistence, APISIX offers consistent-hash load balancing:

routes:
  - uri: /session/*
    upstream:
      nodes:
        "192.168.1.10:8080": 1
        "192.168.1.11:8080": 1
        "192.168.1.12:8080": 1
      type: chash
      key: cookie_JSESSIONID

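Consistent hashing options:

Parameter	Description	Example value
type	Load-balancing type	chash
key	Hash on a cookie	cookie_JSESSIONID
key	Hash on the client IP	remote_addr
key	Hash on a request argument	arg_user_id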
Monitoring and Alerting Integration

A highly available cluster needs thorough monitoring. APISIX supports several monitoring options:

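# Prometheus exporter settings (these belong under plugin_attr in config.yaml)
plugin_attr:
  prometheus:
    export_addr:
      ip: 0.0.0.0
      port: 9091
    export_uri: /metrics
    # Illustrative metric list; verify the supported schema against the
    # prometheus plugin documentation for your APISIX version
    metrics: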
      - name: http_requests_total
        type: counter
        desc: total http requests
      - name: http_request_duration_seconds
        type: histogram
        desc: http request duration in seconds
        buckets: [0.005, 0.01, 0.05, 0.1, 0.5, 1, 5]

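Metric categories:

Category	Description	Key metrics
Request metrics	HTTP request statistics	Request count, latency, error rate
Upstream metrics	Backend service state	Health status, response time
System metrics	Resource usage	CPU, memory, connection count

Disaster Recovery and Backup Strategy

Keep configuration data safe and recoverable: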

#!/bin/bash
# etcd backup script
set -euo pipefail

# "snapshot save" must target exactly one endpoint
ETCDCTL_API=3 etcdctl \
  --endpoints=https://etcd-node1:2379 \
  --cacert=/etc/ssl/etcd/ca.crt \
  --cert=/etc/ssl/etcd/server.crt \
  --key=/etc/ssl/etcd/server.key \
  snapshot save "/backup/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"

# Delete backups older than 7 days
find /backup -name "etcd-snapshot-*.db" -mtime +7 -delete

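Recommended backup schedule:

Backup type	Frequency	Retention	Storage location
Full backup	Daily	7 days	Local disk
Incremental backup	Hourly	24 hours	Object storage
Configuration export	Real time	30 days	Version control system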
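
The retention rule for full backups can also be expressed over the timestamp embedded in each file name rather than over file modification time. The Python sketch below assumes the `etcd-snapshot-YYYYmmdd-HHMMSS.db` naming used by the backup script; `expired_snapshots` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def expired_snapshots(names: list[str], now: datetime, keep_days: int = 7) -> list[str]:
    """Return snapshot file names whose embedded timestamp is older than keep_days."""
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for name in names:
        stamp = name.removeprefix("etcd-snapshot-").removesuffix(".db")
        try:
            taken = datetime.strptime(stamp, "%Y%m%d-%H%M%S")
        except ValueError:
            continue  # ignore files that do not follow the naming scheme
        if taken < cutoff:
            expired.append(name)
    return expired


files = [
    "etcd-snapshot-20231201-000000.db",
    "etcd-snapshot-20231205-120000.db",
    "notes.txt",
]
print(expired_snapshots(files, now=datetime(2023, 12, 10)))
```

Unlike `find -mtime`, this keeps working after backups are copied to another store, where modification times are often reset.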

Performance Tuning

Tuning recommendations for high-concurrency workloads:

# APISIX performance tuning (config.yaml)
nginx_config:
  worker_processes: auto
  event:
    worker_connections: 10240
  http:
    keepalive_timeout: 60s
    client_header_timeout: 60s
    client_body_timeout: 60s
    send_timeout: 60s
    client_max_body_size: 100m
    upstream:
      keepalive_requests: 10000
  # Nginx directives without a dedicated key go into a snippet
  http_configuration_snippet: |
    proxy_buffer_size 128k;
    proxy_buffers 4 256k;
    proxy_busy_buffers_size 256k;

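Key tuning parameters:

Parameter	Description	Recommended value
worker_processes	Number of worker processes	Number of CPU cores
worker_connections	Connections per worker process	10240
keepalive_requests	Requests per keepalive connection	10000
proxy_buffers	Proxy buffers	4 256k

With the high-availability configuration above, APISIX can provide a stable, fast API gateway for enterprise applications and keep the business running through node failures.

Configuration Store (etcd) Management and Backup

In a production APISIX deployment, etcd is the central configuration store: it holds and synchronizes route rules, upstream services, plugin configuration, and other critical data. Sound etcd management and backup practices are essential for availability and data safety.

etcd Cluster Architecture

APISIX supports multi-node etcd clusters; deploy at least 3 nodes in production for high availability. etcd uses the Raft consensus algorithm, so a cluster of N nodes tolerates the failure of (N-1)/2 nodes, rounded down: 1 of 3, or 2 of 5.

[Figure: etcd cluster architecture]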
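
The (N-1)/2 rule follows from Raft needing a majority (quorum) of members to commit a write. A short Python sketch of the arithmetic, with hypothetical helper names:

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a quorum remains."""
    return (members - 1) // 2


for n in (1, 3, 4, 5, 7):
    print(f"members={n} quorum={quorum(n)} tolerates={fault_tolerance(n)}")
```

Note that 4 members tolerate no more failures than 3, which is why odd cluster sizes are the usual choice.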

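etcd Configuration in Detail

APISIX reads its etcd connection parameters from conf/config.yaml. The main options are:

Option	Description	Default	Required
host	Array of etcd node addresses	-	Yes
prefix	etcd key prefix	/apisix	Yes
timeout	Connection timeout (seconds)	30	No
user	Username for authentication	-	No
password	Password for authentication	-	No
tls.cert	Path to the TLS certificate	-	No
tls.key	Path to the TLS private key	-	No

Example configuration: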

deployment:
  role: traditional
  role_traditional:
    config_provider: etcd
  etcd:
    host:
      # Use https so that the client certificate configured below is actually used
      - https://etcd1:2379
      - https://etcd2:2379
      - https://etcd3:2379
    prefix: /apisix
    timeout: 30
    user: apisix
    password: secure_password
    tls:
      cert: /path/to/client.crt
      key: /path/to/client.key

Data Backup Strategy

1. etcd Snapshot Backups

Use etcdctl to take regular snapshots:

# Create an etcd snapshot
ETCDCTL_API=3 etcdctl --endpoints=https://etcd1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/client.crt \
  --key=/etc/etcd/client.key \
  snapshot save /backup/etcd-snapshot-$(date +%Y%m%d).db

# Check the snapshot status
ETCDCTL_API=3 etcdctl --write-out=table snapshot status /backup/etcd-snapshot-20231201.db

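2. Incremental Backup Approach

etcd has no native incremental snapshot. A practical approximation is to take frequent full snapshots tagged with the store revision, so that successive backups can be ordered and compared; a true incremental stream would have to be built on the watch API:

#!/bin/bash
# Revision-tagged backup script
# (TLS flags --cacert, --cert, and --key are omitted here for brevity)
set -euo pipefail
BACKUP_DIR="/backup/etcd/incremental"
ETCD_ENDPOINTS="https://etcd1:2379,https://etcd2:2379,https://etcd3:2379"
SNAPSHOT_ENDPOINT="https://etcd1:2379"  # snapshot save accepts a single endpoint

mkdir -p "$BACKUP_DIR"

# Read the current revision from the first endpoint's status
CURRENT_REV=$(ETCDCTL_API=3 etcdctl --endpoints="$ETCD_ENDPOINTS" endpoint status --write-out=json | \
  jq -r '.[0].Status.header.revision')

# Take the snapshot, named after the revision
ETCDCTL_API=3 etcdctl --endpoints="$SNAPSHOT_ENDPOINT" \
  snapshot save "$BACKUP_DIR/snapshot-rev-$CURRENT_REV.db"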

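Data Recovery Procedure

1. Restore from a Snapshot

# Stop the etcd service
systemctl stop etcd

# Restore the snapshot (repeat on each member with its own --name and peer URL;
# pass --data-dir if the service does not use the default data directory)
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot-20231201.db \
  --name etcd1 \
  --initial-cluster "etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380" \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls http://etcd1:2380

# Start the etcd service
systemctl start etcd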

2. Verify Data Integrity

After the restore, verify that the APISIX configuration data is complete:

# Check routes
curl http://127.0.0.1:9180/apisix/admin/routes -H "X-API-KEY: your-admin-key"

# Check upstreams
curl http://127.0.0.1:9180/apisix/admin/upstreams -H "X-API-KEY: your-admin-key"

# Check plugin configuration
curl http://127.0.0.1:9180/apisix/admin/plugins -H "X-API-KEY: your-admin-key"
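
Eyeballing the Admin API output does not scale to many routes. One option is to record the resource IDs before an incident and compare them with the IDs present after the restore. The Python sketch below shows only the comparison; it assumes the ID sets have already been collected (how to extract them from the Admin API responses depends on the APISIX version), and `diff_ids` is a hypothetical helper.

```python
def diff_ids(before: dict[str, set[str]], after: dict[str, set[str]]) -> dict[str, dict[str, set[str]]]:
    """Report resource IDs that disappeared or appeared, grouped by resource kind."""
    report = {}
    for kind in sorted(set(before) | set(after)):
        missing = before.get(kind, set()) - after.get(kind, set())
        extra = after.get(kind, set()) - before.get(kind, set())
        if missing or extra:
            report[kind] = {"missing": missing, "extra": extra}
    return report


before = {"routes": {"1", "2"}, "upstreams": {"1"}}
after = {"routes": {"1"}, "upstreams": {"1"}}
print(diff_ids(before, after))  # route "2" is reported as missing
```

An empty report means every recorded ID is present again; it does not prove that the resource bodies are identical.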

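Monitoring and Alerting

Key Metrics

Metric	What it tracks	Alert threshold
etcd_server_leader_changes_seen_total	Leader changes	> 5 per hour
etcd_server_slow_apply_total	Slow applies	> 10 per minute
etcd_disk_backend_commit_duration_seconds	Backend commit latency	> 100 ms
etcd_network_client_grpc_received_bytes_total	Client gRPC traffic	Sudden 50% increase

Prometheus Configuration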

# prometheus.yml
scrape_configs:
  - job_name: 'etcd'
    static_configs:
      - targets: ['etcd1:2379', 'etcd2:2379', 'etcd3:2379']
    scheme: https
    tls_config:
      ca_file: /etc/etcd/ca.crt
      cert_file: /etc/etcd/client.crt
      key_file: /etc/etcd/client.key

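Security Best Practices

1. Access Control

# Enable etcd authentication (under deployment: in config.yaml)
etcd:
  user: apisix
  # APISIX substitutes environment variables written as ${{VAR}}
  password: ${{ETCD_PASSWORD}}
  tls:
    cert: /etc/ssl/etcd/client.crt
    key: /etc/ssl/etcd/client.key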

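2. Network Isolation

[Figure: network isolation topology]

3. Regular Key Rotation

#!/bin/bash
# Key rotation script
set -euo pipefail

# Generate a new key and certificate (the subject and validity period are example values)
openssl genrsa -out /etc/ssl/etcd/client-new.key 2048
openssl req -new -key /etc/ssl/etcd/client-new.key -subj "/CN=apisix" -out /etc/ssl/etcd/client-new.csr
openssl x509 -req -in /etc/ssl/etcd/client-new.csr -CA /etc/ssl/etcd/ca.crt -CAkey /etc/ssl/etcd/ca.key -CAcreateserial -days 365 -out /etc/ssl/etcd/client-new.crt

# Point the APISIX configuration at the new files, then reload
systemctl reload apisix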

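Disaster Recovery Drills

Regular disaster recovery drills are how you confirm that backups actually work:

# Disaster recovery drill procedure
1. Build a test environment outside production
2. Restore the latest etcd snapshot
3. Start an APISIX instance connected to the test etcd
4. Verify all routes, upstreams, and plugin configuration
5. Send simulated traffic to verify functionality
6. Record the results and refine the procedure

With sound etcd management and backups, supported by regular drills and monitoring, a production APISIX deployment can recover quickly from failures and keep the business running.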

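Troubleshooting and Day-to-Day Operations

In a production APISIX deployment, a solid troubleshooting and operations practice is essential. This section covers monitoring, log management, fault diagnosis, and routine maintenance.

Monitoring and Health Checks

APISIX offers several ways to monitor the gateway. The Status API reports its running state in real time:

# Check overall APISIX status
curl http://127.0.0.1:7085/status

# Check that all workers are ready
curl http://127.0.0.1:7085/status/ready

Example JSON response from the Status API: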

{
  "status": "ok"
}

When a worker has not finished loading its configuration:

{
  "status": "error",
  "error": "worker id: 9 has not received configuration"
}
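
A monitoring script can turn these two response shapes into a readiness decision. The Python sketch below parses a response body only; fetching it over HTTP is left out, and `is_ready` is a hypothetical helper that assumes the fields shown in the examples above.

```python
import json


def is_ready(body: str) -> tuple[bool, str]:
    """Interpret a Status API response body as (ready, reason)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False, "response is not valid JSON"
    if not isinstance(payload, dict):
        return False, "unexpected response shape"
    if payload.get("status") == "ok":
        return True, ""
    return False, payload.get("error", "unknown error")


print(is_ready('{"status": "ok"}'))
print(is_ready('{"status": "error", "error": "worker id: 9 has not received configuration"}'))
```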

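Log Management and Analysis

APISIX supports several log outputs; logger plugins can push logs to different log management systems:

HTTP Logger Plugin Configuration Example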

{
  "plugins": {
    "http-logger": {
      "uri": "http://your-log-server:8080/logs",
      "timeout": 3,
      "include_req_body": true,
      "include_resp_body": true,
      "log_format": {
        "host": "$host",
        "@timestamp": "$time_iso8601",
        "client_ip": "$remote_addr",
        "status": "$status",
        "request_time": "$request_time"
      }
    }
  }
}

Default Log Format Fields

Field	Description	Example
`service_id`	Service ID	""
`apisix_latency`	APISIX processing latency (ms)	100.99999809265
`start_time`	Request start timestamp	1703907485819
`latency`	Total latency (ms)	101.99999809265
`upstream_latency`	Upstream latency (ms)	1
`client_ip`	Client IP	"127.0.0.1"
`route_id`	Route ID	"1"
Debug Mode Configuration

APISIX has built-in debugging support, enabled through the debug.yaml configuration file:

basic:
  enable: true

hook_conf:
  enable: true
  name: hook_phase
  log_level: warn
  is_print_input_args: true
  is_print_return_value: true

hook_phase:
  apisix:
    - http_access_phase
    - http_header_filter_phase
    - http_body_filter_phase
    - http_log_phase

# Dynamic debugging configuration
http_filter:
  enable: true
  enable_header_name: X-APISIX-Dynamic-Debug
#END

With debug mode enabled, APISIX adds debugging information to the response headers:

HTTP/1.1 200 OK
Apisix-Plugins: limit-conn, limit-count
X-RateLimit-Limit: 2
X-RateLimit-Remaining: 1

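Common Troubleshooting Scenarios

1. Configuration Validation Failures

Use the APISIX CLI to validate the configuration:

# Test the generated nginx.conf
apisix test

# Check configuration syntax
nginx -t -p /usr/local/apisix -c conf/nginx.conf

2. Plugin Loading Problems

Check plugin load status and the error log:

# Follow the error log
tail -f logs/error.log

# List available plugins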
curl http://127.0.0.1:9180/apisix/admin/plugins/list \
  -H "X-API-KEY: $admin_key"

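3. Diagnosing Performance Problems

Use the built-in Prometheus metrics for performance monitoring:

[Figure: performance diagnosis flow]

Routine Operations Commands

APISIX ships a CLI for day-to-day operations:

# Start APISIX
apisix start

# Stop gracefully
apisix quit

# Reload configuration
apisix reload

# Restart the service
apisix restart

# Show the version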
apisix version

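Resource Monitoring and Alerting

Define a set of monitored metrics with alert thresholds:

Metric	Alert threshold	Check interval
CPU usage	> 80% for 5 minutes	Every minute
Memory usage	> 90%	Every minute
Request error rate	> 5%	Every 5 minutes
Average response time	> 500 ms	Every 5 minutes
Active connections	> 10000	Every minute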
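Backup and Recovery Strategy

Configuration Backup

# Back up the etcd data
etcdctl snapshot save apisix-backup.db

# Back up local configuration files
tar -czf apisix-config-$(date +%Y%m%d).tar.gz conf/ logs/

Disaster Recovery

# Restore etcd from the backup
etcdctl snapshot restore apisix-backup.db

# Reload the APISIX configuration
apisix reload

Security Operations Practices

Rotate the Admin API key regularly
Monitor for abnormal access patterns
Renew SSL certificates regularly
Analyze audit logs
Enforce network access control

With monitoring, logging, backup, and incident response in place, APISIX can run stably in production, and faults can be located and resolved quickly.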

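Summary

A production APISIX deployment has to address several things together: a highly available architecture for service stability, a multi-node etcd cluster for reliable configuration data, and monitoring and backup practices that cover the likely failure scenarios. The deployment options, configuration examples, and operational practices in this article are intended to help teams build a fast, highly available API gateway that provides dependable traffic management and security for their business systems.

Author's note: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.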