Formbricks Cloud Monitoring: An AWS CloudWatch Configuration Tutorial

formbricks: Open Source Survey Toolbox. Project URL: https://gitcode.com/GitHub_Trending/fo/formbricks

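Why monitor with CloudWatch?

Has your production environment ever gone down without leaving a clue about the root cause? Have users reported slow responses while you had no data to confirm or refute them? For modern SaaS applications, "no monitoring, no production" has become an industry consensus. Formbricks is an open-source survey toolbox, and when its cloud-native architecture runs on AWS it needs end-to-end monitoring to keep the service stable.

AWS CloudWatch provides a complete monitoring solution. This article covers: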

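  • A CloudWatch monitoring setup deployed with Terraform
  • Alarm configuration for key infrastructure and business metrics
  • A real-time alerting flow through Slack
  • A visualization setup integrated with Grafana
  • Security monitoring aligned with the CIS benchmark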

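Architecture overview: the Formbricks monitoring data flow

[Diagram: Formbricks monitoring data flow. The Mermaid source was not preserved.]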

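Prerequisites and environment setup

Environment requirements

Component | Required version | Purpose
Terraform | ≥ 1.3.0 | Infrastructure-as-code tool
AWS CLI | ≥ 2.0 | AWS resource management
kubectl | ≥ 1.24 | Kubernetes cluster management
AWS account | Administrator access | Creating CloudWatch resources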
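
The minimum versions in the table can be checked before deploying. The sketch below shows one way to compare version strings in shell. The helper name version_ge is hypothetical, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# version_ge ACTUAL MINIMUM
# Succeeds when ACTUAL is greater than or equal to MINIMUM.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

# Example with literal version strings; substitute the versions reported by
# `terraform version`, `aws --version`, and `kubectl version --client`.
version_ge "1.5.7" "1.3.0" && echo "Terraform version is sufficient"
version_ge "1.22.0" "1.24" || echo "kubectl version is too old"
```

Comparing with a version sort rather than a plain string comparison avoids treating 1.9 as newer than 1.10.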

Permission setup

# Configure AWS credentials
aws configure
# Verify permissions
aws cloudwatch describe-alarms --max-items 1

Cloning the repository

git clone https://gitcode.com/GitHub_Trending/fo/formbricks
cd formbricks/infra/terraform

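Core configuration: deploying CloudWatch resources

1. Infrastructure code layout

terraform/
├── cloudwatch.tf        # Core CloudWatch configuration
├── observability.tf     # Observability-related resources
├── main.tf              # Main configuration file
└── variables.tf         # Variable definitions

2. Log management

Create the CloudWatch log group (cloudwatch.tf):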

resource "aws_cloudwatch_log_group" "cloudwatch_cis_benchmark" {
  name              = "/aws/cis-benchmark-group"
  retention_in_days = 365  # Keep logs for 365 days
  
  tags = {
    Project     = "formbricks"
    Environment = "prod"
    ManagedBy   = "Terraform"
  }
}

Key log groups

  • EKS cluster logs: /aws/eks/formbricks-prod-eks/cluster
  • Application logs: /aws/ecs/formbricks-app
  • Database logs: /aws/rds/instance/formbricks-prod/postgresql

3. Alert notifications

Slack notification integration (cloudwatch.tf):

module "notify-slack" {
  source  = "terraform-aws-modules/notify-slack/aws"
  version = "6.6.0"

  slack_channel     = "formbricks-alerts"  # Target Slack channel
  slack_username    = "formbricks-cloudwatch"
  slack_webhook_url = data.aws_ssm_parameter.slack_notification_channel.value
  sns_topic_name    = "cloudwatch-alarms"  # SNS topic name
  create_sns_topic  = true
}

Configuring the Slack webhook

  1. Create an Incoming Webhook in your Slack workspace.
  2. Store the webhook URL in an AWS SSM parameter:
aws ssm put-parameter \
  --name "/prod/formbricks/slack-webhook-url" \
  --type "SecureString" \
  --value "https://hooks.slack.com/services/XXXXX/XXXXX/XXXX"
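
Before wiring the webhook into Terraform, it can help to confirm that the stored parameter is readable and that Slack accepts a message. The following is a minimal sketch under those assumptions. The function names slack_payload and send_test_message are hypothetical, and the escaping handles only backslashes and double quotes, not newlines.

```shell
# Parameter name taken from the put-parameter command above.
PARAM_NAME="/prod/formbricks/slack-webhook-url"

# slack_payload TEXT
# Prints a minimal Slack webhook body, escaping backslashes and double quotes.
slack_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"text":"%s"}' "$escaped"
}

# send_test_message TEXT
# Reads the webhook URL from SSM and posts TEXT to it.
# Requires AWS credentials with ssm:GetParameter and network access to Slack.
send_test_message() {
  url=$(aws ssm get-parameter --name "$PARAM_NAME" --with-decryption \
        --query 'Parameter.Value' --output text) || return 1
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$(slack_payload "$1")" "$url"
}

# Usage (not run automatically):
#   send_test_message "Formbricks webhook test"
```

The --with-decryption flag is needed because the parameter was stored as a SecureString.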

4. Alarms on key business metrics

ALB load balancer monitoring

locals {
  alb_id = "app/k8s-formbricks-21ab9ecd60/342ed65d128ce4cb"
  alarms = {
    ALB_HTTPCode_Target_5XX_Count = {
      alarm_description   = "Too many 5XX errors from API targets"
      comparison_operator = "GreaterThanThreshold"
      evaluation_periods  = 5        # 5 consecutive periods
      threshold           = 5        # More than 5 errors per period
      period              = 600      # Each period is 10 minutes
      unit                = "Count"
      namespace           = "AWS/ApplicationELB"
      metric_name         = "HTTPCode_Target_5XX_Count"
      statistic           = "Sum"
      dimensions = {
        LoadBalancer = local.alb_id
      }
    }
  }
}

RDS database monitoring

# Entry in the local.alarms map
RDS_CPUUtilization = {
  alarm_description   = "RDS CPU utilization above threshold"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 5
  threshold           = 80          # 80% CPU utilization
  period              = 60          # Each period is 1 minute
  unit                = "Percent"
  namespace           = "AWS/RDS"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  dimensions = {
    DBInstanceIdentifier = module.rds-aurora["prod"].cluster_instances["one"].id
  }
}
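
The period and evaluation_periods settings together determine how long a problem must persist before an alarm fires. Assuming the CloudWatch default in which every evaluated period must breach (the source does not set datapoints_to_alarm), the shortest time from the first breaching period to notification is evaluation_periods × period. The arithmetic for the two alarms above can be sketched as follows; the function name is hypothetical.

```shell
# alarm_window_seconds EVALUATION_PERIODS PERIOD_SECONDS
# Prints the total length of the evaluation window in seconds.
alarm_window_seconds() {
  echo $(( $1 * $2 ))
}

alb_window=$(alarm_window_seconds 5 600)
rds_window=$(alarm_window_seconds 5 60)
echo "ALB 5XX alarm window: ${alb_window}s ($(( alb_window / 60 )) minutes)"
echo "RDS CPU alarm window: ${rds_window}s ($(( rds_window / 60 )) minutes)"
```

A 50-minute window makes the 5XX alarm resistant to short spikes but slow to react. Lowering period or evaluation_periods shortens it.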

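Full alarm list

Alarm name | Monitored resource | Threshold | Period | Description
ALB_HTTPCode_Target_5XX_Count | Application Load Balancer | 5 errors | 10 minutes | Too many 5XX errors from the API target group
ALB_TargetResponseTime | Application Load Balancer | 5 seconds | 1 minute | Target group response time too high
RDS_CPUUtilization | RDS database | 80% | 1 minute | Database CPU utilization too high
RDS_FreeStorageSpace | RDS database | 5 GB | 1 minute | Database storage space running low
RDS_FreeableMemory | RDS database | 100 MB | 1 minute | Database freeable memory running low
DynamoDB_ConsumedReadCapacityUnits | DynamoDB | 90% | 1 minute | Read capacity unit consumption too high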
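
CloudWatch reports the RDS FreeStorageSpace and FreeableMemory metrics in bytes, so the 5 GB and 100 MB thresholds in the table have to be written as byte counts in the alarm definitions. The sketch below does the conversion. It assumes binary units (1 GB = 1024³ bytes), which the source does not specify, and the helper name to_bytes is hypothetical.

```shell
# to_bytes VALUE UNIT
# UNIT is MiB or GiB; treating the table's MB/GB as binary units is an assumption.
to_bytes() {
  case "$2" in
    MiB) echo $(( $1 * 1024 * 1024 )) ;;
    GiB) echo $(( $1 * 1024 * 1024 * 1024 )) ;;
    *)   echo "unsupported unit: $2" >&2; return 1 ;;
  esac
}

echo "RDS_FreeStorageSpace threshold: $(to_bytes 5 GiB) bytes"
echo "RDS_FreeableMemory threshold:   $(to_bytes 100 MiB) bytes"
```

Because these two alarms should fire when the value drops, they would presumably use LessThanThreshold rather than the GreaterThanThreshold operator shown for the CPU alarm.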

5. CIS benchmark compliance monitoring

Deploying the CIS benchmark alarms

module "cloudwatch_cis-alarms" {
  source         = "terraform-aws-modules/cloudwatch/aws//modules/cis-alarms"
  version        = "5.7.1"
  # The metric filters read CloudTrail events, so CloudTrail must deliver to this log group.
  log_group_name = aws_cloudwatch_log_group.cloudwatch_cis_benchmark.name
  alarm_actions  = [module.notify-slack.slack_topic_arn]

  # The module creates alarms for its supported CIS controls by default.
  # To skip individual controls, use its disabled_controls input.
}

Grafana visualization

1. IAM permissions

CloudWatch access for Grafana (observability.tf):

module "observability_grafana_iam_policy" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-policy"
  version = "5.53.0"

  name_prefix = "grafana-"
  description = "Allow Grafana to read from CloudWatch"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowReadingMetricsFromCloudWatch"
        Effect = "Allow"
        Action = [
          "cloudwatch:DescribeAlarmsForMetric",
          "cloudwatch:ListMetrics",
          "cloudwatch:GetMetricData"
        ]
        Resource = "*"
      },
      {
        Sid    = "AllowReadingLogsFromCloudWatch"
        Effect = "Allow"
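        Action = [
          "logs:DescribeLogGroups",
          "logs:GetLogGroupFields",
          # StartQuery and StopQuery let Grafana run Logs Insights queries;
          # GetQueryResults alone cannot start one.
          "logs:StartQuery",
          "logs:StopQuery",
          "logs:GetQueryResults",
          "logs:GetLogEvents"
        ]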
        Resource = "*"
      }
    ]
  })
}

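2. Configuring the Grafana data source

Adding the CloudWatch data source

  1. Log in to the Grafana console.
  2. Go to Configuration > Data Sources.
  3. Click Add data source and select CloudWatch.
  4. Configure AWS authentication:
    • Authentication provider: AWS SDK Default
    • Default region: us-west-2 (adjust to your environment)
  5. Click Save & Test to verify the connection.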

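3. Importing the Formbricks dashboard

# Download the Formbricks dashboard
wget https://gitcode.com/GitHub_Trending/fo/formbricks/raw/main/infra/terraform/grafana-dashboard.json

# Import it through the Grafana API.
# /api/dashboards/db expects a body of the form {"dashboard": {...}, "overwrite": true};
# wrap the file first if it is a bare dashboard export.
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <grafana-api-key>" \
  -d @grafana-dashboard.json \
  "http://<grafana-url>/api/dashboards/db"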
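
Grafana's dashboard API takes the dashboard JSON nested under a "dashboard" key, so a bare export needs to be wrapped before it is posted. Here is a minimal sketch of that wrapping, written without jq. It assumes the file contains exactly one JSON object, and the function name wrap_dashboard is hypothetical.

```shell
# wrap_dashboard FILE
# Prints FILE wrapped as {"dashboard": <contents>, "overwrite": true}.
# Assumes FILE holds a single bare dashboard JSON object.
wrap_dashboard() {
  printf '{"dashboard":'
  cat "$1"
  printf ',"overwrite":true}'
}

# Usage (placeholders as in the curl command above, not run automatically):
#   wrap_dashboard grafana-dashboard.json | curl -X POST \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer <grafana-api-key>" \
#     -d @- "http://<grafana-url>/api/dashboards/db"
```

Setting "overwrite" to true replaces an existing dashboard with the same identity. Leave it out of the wrapper if an accidental overwrite would be a problem.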

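Core dashboard panels

  • System overview: CPU, memory, and disk utilization
  • Application performance: request latency, error rate, throughput
  • Database performance: query latency, connection count, cache hit ratio
  • User experience: page load time, survey submission success rate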

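Deployment and verification

1. Deploying with Terraform

# Initialize Terraform
terraform init

# Preview the resource changes
terraform plan -var-file=prod.tfvars

# Apply the configuration
# (-auto-approve skips the confirmation prompt; omit it to review the plan interactively)
terraform apply -var-file=prod.tfvars -auto-approve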

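2. Verifying the monitoring setup

Checking the CloudWatch resources

# Verify the log group (log groups belong to the `aws logs` command, not `aws cloudwatch`)
aws logs describe-log-groups --log-group-name-prefix /aws/cis-benchmark-group

# Verify the alarms
aws cloudwatch describe-alarms --alarm-name-prefix ALB_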

Triggering a test alarm

# Force the alarm into the ALARM state with the AWS CLI
aws cloudwatch set-alarm-state \
  --alarm-name ALB_HTTPCode_Target_5XX_Count \
  --state-value ALARM \
  --state-reason "Test alarm trigger"

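Check that the test notification arrives in the Slack channel to confirm that the notification flow works. The forced state is temporary: CloudWatch resets it at the alarm's next evaluation.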

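Advanced configuration: custom metrics

1. Custom application metrics

Collecting custom metrics with the CloudWatch Agent

# Install the CloudWatch Agent (yum package, as on Amazon Linux)
sudo yum install amazon-cloudwatch-agent -y

# Configure the agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

Example custom metrics configuration (/etc/cloudwatch-agent-config.json):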

{
  "metrics": {
    "metrics_collected": {
      "statsd": {
        "service_address": ":8125",
        "metrics_collection_interval": 10,
        "metrics_aggregation_interval": 60
      }
    }
  }
}
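
With this configuration the agent listens for StatsD datagrams on UDP port 8125, and an application reports a metric by sending a line of the form name:value|type. The following sketch formats and sends such a line. The metric name and function names are hypothetical, and it assumes nc is installed and the agent runs on the same host.

```shell
# statsd_line NAME VALUE TYPE
# TYPE is a StatsD type code such as c (counter), g (gauge), or ms (timer).
statsd_line() {
  printf '%s:%s|%s' "$1" "$2" "$3"
}

# statsd_send NAME VALUE TYPE
# Sends one datagram to the agent's StatsD listener on UDP port 8125.
# Assumes `nc` is installed and the agent runs on the same host.
statsd_send() {
  statsd_line "$1" "$2" "$3" | nc -u -w 1 127.0.0.1 8125
}

# Print an example line; the metric name is hypothetical.
statsd_line "formbricks.survey.submission_errors" 1 c; echo

# Usage (not run automatically):
#   statsd_send "formbricks.survey.submission_errors" 1 c
```

Unless a "namespace" is set in the "metrics" section of the agent configuration, metrics collected this way are published under the agent's default CWAgent namespace, so any alarm on them has to reference that namespace.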

2. Custom alarm rules

Adding a business metric alarm

locals {
  alarms = {
    # Existing alarms...
    
    # New business alarm
    Survey_Submission_Error_Rate = {
      alarm_description   = "Survey submission error rate too high"
      comparison_operator = "GreaterThanThreshold"
      evaluation_periods  = 3
      threshold           = 5
      period              = 60
      unit                = "Percent"
      namespace           = "Formbricks/Business"
      metric_name         = "SurveySubmissionErrorRate"
      statistic           = "Average"
      dimensions = {
        Project = "formbricks"
      }
    }
  }
}
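
This alarm only has data to evaluate if something publishes SurveySubmissionErrorRate to the Formbricks/Business namespace with the Project dimension. The source does not show that publisher. One possible approach, sketched below under that assumption, computes the percentage from two counters and sends it with put-metric-data. The function names and the sample counts are hypothetical.

```shell
# error_rate_percent FAILED TOTAL
# Prints the failure percentage with two decimals; prints 0.00 when TOTAL is 0.
error_rate_percent() {
  awk -v failed="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "0.00" } else { printf "%.2f\n", 100 * failed / total }
  }'
}

# publish_error_rate FAILED TOTAL
# Publishes the rate under the namespace, metric name, and dimension used by the alarm.
# Requires AWS credentials with cloudwatch:PutMetricData.
publish_error_rate() {
  aws cloudwatch put-metric-data \
    --namespace "Formbricks/Business" \
    --metric-name "SurveySubmissionErrorRate" \
    --unit Percent \
    --dimensions Project=formbricks \
    --value "$(error_rate_percent "$1" "$2")"
}

echo "3 failures out of 120 submissions: $(error_rate_percent 3 120)%"

# Usage (not run automatically):
#   publish_error_rate 3 120
```

With a threshold of 5 and three 60-second evaluation periods, the alarm would fire once the published rate stays above 5% for three consecutive minutes.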

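Maintenance and best practices

1. Cost optimization

Resource | Optimization | Expected effect (estimate)
Log retention | Keep non-critical logs for 30 days | 40% lower storage cost
Metric granularity | Use 5-minute granularity for non-real-time metrics | 60% lower metric storage cost
Alarm thresholds | Tune thresholds against historical data | 90% fewer false alarms

Setting log retention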

resource "aws_cloudwatch_log_group" "application_logs" {
  name              = "/aws/formbricks/application"
  retention_in_days = 30  # Keep non-critical logs for 30 days
}
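
For log groups that are not managed by Terraform, the same retention can be applied from the CLI. CloudWatch Logs accepts only a fixed set of retention values, so the sketch below checks the value first. The list reflects the documented values as far as known at the time of writing and should be confirmed against the current API reference; the function names are hypothetical.

```shell
# is_valid_retention DAYS
# Succeeds when DAYS is one of the retention values CloudWatch Logs accepts.
is_valid_retention() {
  case "$1" in
    1|3|5|7|14|30|60|90|120|150|180|365|400|545|731|1096|1827|2192|2557|2922|3288|3653)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# set_retention LOG_GROUP DAYS
# Applies the retention policy after validating DAYS.
# Requires AWS credentials with logs:PutRetentionPolicy.
set_retention() {
  is_valid_retention "$2" || { echo "invalid retention: $2" >&2; return 1; }
  aws logs put-retention-policy --log-group-name "$1" --retention-in-days "$2"
}

# Usage (not run automatically):
#   set_retention /aws/formbricks/application 30
```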

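2. Monitoring maintenance checklist

Daily checks

  • Alarm states (AWS Console or Grafana)
  • Trends in key metrics (response time, error rate)

Weekly maintenance

  • Review the alarm trigger history
  • Tune thresholds and periods
  • Delete log groups that are no longer needed

Monthly optimization

  • Review the scope of metric collection
  • Assess storage costs
  • Update the CIS benchmark rules

3. Troubleshooting workflow

[Diagram: troubleshooting workflow. The Mermaid source was not preserved.]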

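Summary and next steps

With the configuration in this article, you have deployed AWS CloudWatch monitoring for Formbricks, including:

  • Infrastructure and application monitoring
  • Real-time alert notifications through Slack
  • Security monitoring aligned with the CIS benchmark
  • Dashboards integrated with Grafana

Suggested follow-ups

  1. Archive monitoring data for the long term (S3 + Glacier)
  2. Set up automated operational responses (AWS Systems Manager Automation)
  3. Build custom business dashboards
  4. Integrate cost monitoring and budget alerts

Get started

  • Clone the repository and start deploying: git clone https://gitcode.com/GitHub_Trending/fo/formbricks
  • Read the full documentation: formbricks/docs/self-hosting/setup/monitoring.mdx
  • Suggest improvements: open an issue in the project repository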

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
