Collecting Metrics and Logs

This article describes how to configure Istio to automatically gather telemetry for service calls, including a new metric and a new log stream, using the Bookinfo application as the example.

This task shows how to configure Istio to automatically gather telemetry for services in a mesh. At the end of this task, a new metric and a new log stream will be enabled for calls to services within your mesh.
The Bookinfo sample is used as the example application throughout this task.

Before you begin

  • Install Istio in your cluster and deploy an application. This task assumes that Mixer is set up in a default configuration (--configDefaultNamespace=istio-system). If you use a different value, update the configuration and commands in this task to match.
  • Install the Prometheus add-on. Prometheus will be used to verify task success.
kubectl apply -f install/kubernetes/addons/prometheus.yaml

See Prometheus for details.
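Before continuing, you can optionally confirm that the add-on is running. This is a minimal check, assuming the add-on creates a service named prometheus in the istio-system namespace:

kubectl -n istio-system get svc prometheus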

Collecting new telemetry data

1. Create a new YAML file to hold the configuration for the new metric and log stream that Istio will generate and collect automatically.
Save the following as new_telemetry.yaml:

# Configuration for metric instances
apiVersion: "config.istio.io/v1alpha2"
kind: metric
metadata:
  name: doublerequestcount
  namespace: istio-system
spec:
  value: "2" # count each request twice
  dimensions:
    source: source.service | "unknown"
    destination: destination.service | "unknown"
    message: '"twice the fun!"'
  monitored_resource_type: '"UNSPECIFIED"'
---
# Configuration for a Prometheus handler
apiVersion: "config.istio.io/v1alpha2"
kind: prometheus
metadata:
  name: doublehandler
  namespace: istio-system
spec:
  metrics:
  - name: double_request_count # Prometheus metric name
    instance_name: doublerequestcount.metric.istio-system # Mixer instance name (fully-qualified)
    kind: COUNTER
    label_names:
    - source
    - destination
    - message
---
# Rule to send metric instances to a Prometheus handler
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: doubleprom
  namespace: istio-system
spec:
  actions:
  - handler: doublehandler.prometheus
    instances:
    - doublerequestcount.metric
---
# Configuration for logentry instances
apiVersion: "config.istio.io/v1alpha2"
kind: logentry
metadata:
  name: newlog
  namespace: istio-system
spec:
  severity: '"warning"'
  timestamp: request.time
  variables:
    source: source.labels["app"] | source.service | "unknown"
    user: source.user | "unknown"
    destination: destination.labels["app"] | destination.service | "unknown"
    responseCode: response.code | 0
    responseSize: response.size | 0
    latency: response.duration | "0ms"
  monitored_resource_type: '"UNSPECIFIED"'
---
# Configuration for a stdio handler
apiVersion: "config.istio.io/v1alpha2"
kind: stdio
metadata:
  name: newhandler
  namespace: istio-system
spec:
  severity_levels:
    warning: 1 # Params.Level.WARNING
  outputAsJson: true
---
# Rule to send logentry instances to a stdio handler
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: newlogstdio
  namespace: istio-system
spec:
  match: "true" # match for all requests
  actions:
  - handler: newhandler.stdio
    instances:
    - newlog.logentry
---

2. Push the new configuration.

istioctl create -f new_telemetry.yaml

The expected output is similar to:

Created config metric/istio-system/doublerequestcount at revision 1973035
Created config prometheus/istio-system/doublehandler at revision 1973036
Created config rule/istio-system/doubleprom at revision 1973037
Created config logentry/istio-system/newlog at revision 1973038
Created config stdio/istio-system/newhandler at revision 1973039
Created config rule/istio-system/newlogstdio at revision 1973041

3. Send traffic to the sample application.
For the Bookinfo sample, visit http://$GATEWAY_URL/productpage in your web browser or issue the following command:

curl http://$GATEWAY_URL/productpage
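Metric values accumulate per request, so it can help to send several requests. A minimal sketch, assuming GATEWAY_URL is already set as in the Bookinfo task:

for i in $(seq 1 5); do curl -s -o /dev/null http://$GATEWAY_URL/productpage; done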

4. Verify that the new metric values are being generated and collected.
In a Kubernetes environment, set up port-forwarding for Prometheus by executing the following command:

kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=prometheus -o jsonpath='{.items[0].metadata.name}') 9090:9090 &

View values for the new metric via the Prometheus UI (available at http://localhost:9090 once the port-forward above is active), and execute a query for the istio_double_request_count metric. The table shown in the Console tab includes entries similar to:

istio_double_request_count{destination="details.default.svc.cluster.local",instance="istio-mixer.istio-system:42422",job="istio-mesh",message="twice the fun!",source="productpage.default.svc.cluster.local"}  2
istio_double_request_count{destination="ingress.istio-system.svc.cluster.local",instance="istio-mixer.istio-system:42422",job="istio-mesh",message="twice the fun!",source="unknown"}   2
istio_double_request_count{destination="productpage.default.svc.cluster.local",instance="istio-mixer.istio-system:42422",job="istio-mesh",message="twice the fun!",source="ingress.istio-system.svc.cluster.local"} 2
istio_double_request_count{destination="reviews.default.svc.cluster.local",instance="istio-mixer.istio-system:42422",job="istio-mesh",message="twice the fun!",source="productpage.default.svc.cluster.local"}  2

For more on querying Prometheus for metric values, see Querying Istio Metrics.
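As a sketch of the kind of queries that can be run against the new metric (standard PromQL; the exact label values depend on your mesh), either of the following can be entered in the Prometheus UI:

istio_double_request_count{destination="details.default.svc.cluster.local"}
sum(istio_double_request_count) by (destination)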

5. Verify that the log stream has been created and is being populated for requests.
In a Kubernetes environment, search through the logs for the Mixer pod as follows:

kubectl -n istio-system logs $(kubectl -n istio-system get pods -l istio=mixer -o jsonpath='{.items[0].metadata.name}') mixer | grep \"instance\":\"newlog.logentry.istio-system\"

The expected output is similar to:

{"level":"warn","ts":"2017-09-21T04:33:31.249Z","instance":"newlog.logentry.istio-system","destination":"details","latency":"6.848ms","responseCode":200,"responseSize":178,"source":"productpage","user":"unknown"}
{"level":"warn","ts":"2017-09-21T04:33:31.291Z","instance":"newlog.logentry.istio-system","destination":"ratings","latency":"6.753ms","responseCode":200,"responseSize":48,"source":"reviews","user":"unknown"}
{"level":"warn","ts":"2017-09-21T04:33:31.263Z","instance":"newlog.logentry.istio-system","destination":"reviews","latency":"39.848ms","responseCode":200,"responseSize":379,"source":"productpage","user":"unknown"}
{"level":"warn","ts":"2017-09-21T04:33:31.239Z","instance":"newlog.logentry.istio-system","destination":"productpage","latency":"67.675ms","responseCode":200,"responseSize":5599,"source":"ingress.istio-system.svc.cluster.local","user":"unknown"}
{"level":"warn","ts":"2017-09-21T04:33:31.233Z","instance":"newlog.logentry.istio-system","destination":"ingress.istio-system.svc.cluster.local","latency":"74.47ms","responseCode":200,"responseSize":5599,"source":"unknown","user":"unknown"}

Understanding the telemetry configuration

In this task, you added Istio configuration that instructs Mixer to automatically generate and report a new metric and a new log stream for all traffic within the mesh.

The added configuration controlled three pieces of Mixer functionality:
1. Generation of instances (in this example, metric values and log entries) from Istio attributes
2. Creation of handlers (configured Mixer adapters) capable of processing the generated instances
3. Dispatch of instances to handlers according to a set of rules

Understanding the metrics configuration

The metrics configuration directs Mixer to send metric values to Prometheus. It uses three stanzas (or blocks) of configuration: instance configuration, handler configuration, and rule configuration.

The kind: metric stanza of config defines a schema for generated metric values (or instances) for a new metric named doublerequestcount. This instance configuration tells Mixer how to generate metric values for any given request, based on the attributes reported by Envoy (and generated by Mixer itself).

For each instance of doublerequestcount.metric, the configuration directs Mixer to supply a value of 2. Because Istio generates an instance for each request, this means that the metric records a value equal to twice the total number of requests received.

A set of dimensions is specified for every doublerequestcount.metric instance. Dimensions provide a way to slice, aggregate, and analyze metric data according to different needs and directions of inquiry. For instance, it may be desirable to only consider requests for a particular destination service when troubleshooting application behavior.

The configuration instructs Mixer to populate values for these dimensions based on attribute values and literal values. For instance, for the source dimension, the new configuration requests that the value be taken from the source.service attribute. If that attribute value is not populated, the rule instructs Mixer to use a default value of "unknown". For the message dimension, a literal value of "twice the fun!" will be used for all instances.
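Any supported attribute expression can be used for a dimension. As a hypothetical variation (not part of this task), the metric could also be broken down by response code, reusing the response.code attribute that the logentry configuration above already references; the new dimension would also need to be added to label_names in the Prometheus handler:

  dimensions:
    source: source.service | "unknown"
    destination: destination.service | "unknown"
    responseCode: response.code | 0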

The kind: prometheus stanza of config defines a handler named doublehandler. The handler spec configures how the Prometheus adapter code translates received metric instances into Prometheus-formatted values that can be processed by a Prometheus backend. This configuration specifies a new Prometheus metric named double_request_count. The Prometheus adapter prepends the istio_ namespace to all metric names, so this metric will show up in Prometheus as istio_double_request_count. The metric has three labels matching the dimensions configured for doublerequestcount.metric instances.

For kind: prometheus handlers, Mixer instances are matched to Prometheus metrics via the instance_name parameter. The instance_name values must be the fully-qualified names for Mixer instances (example: doublerequestcount.metric.istio-system).

The kind: rule stanza of config defines a new rule named doubleprom. The rule directs Mixer to send all doublerequestcount.metric instances to the doublehandler.prometheus handler. Because there is no match clause in the rule, and because the rule is in the configured default configuration namespace (istio-system), the rule is executed for all requests in the mesh.
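For comparison, a hypothetical rule with a match clause (a sketch, not part of this task; the rule name doubleprom-reviews is invented for illustration) would restrict the metric to requests destined for a single service:

apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: doubleprom-reviews
  namespace: istio-system
spec:
  match: destination.service == "reviews.default.svc.cluster.local"
  actions:
  - handler: doublehandler.prometheus
    instances:
    - doublerequestcount.metric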

Understanding the logs configuration

The logs configuration directs Mixer to send log entries to stdout. It uses three stanzas (or blocks) of configuration: instance configuration, handler configuration, and rule configuration.

The kind: logentry stanza of config defines a schema for generated log entries (or instances) named newlog. This instance configuration tells Mixer how to generate log entries for requests based on the attributes reported by Envoy.

The severity parameter is used to indicate the log level for any generated logentry. In this example, a literal value of "warning" is used. This value will be mapped to supported logging levels by a logentry handler.

The timestamp parameter provides time information for all log entries. In this example, the time is provided by the attribute value of request.time, as reported by Envoy.

The variables parameter allows operators to configure which values should be included in each logentry. A set of expressions controls the mapping from Istio attributes and literal values into the values that constitute a logentry. In this example, each logentry instance has a field named latency populated with the value from the attribute response.duration. If there is no known value for response.duration, the latency field will be set to 0ms.
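The variables map can reference any available attribute. A hypothetical extension (not part of this task) could also capture the HTTP method and path via the standard request.method and request.path attributes, which would then show up as additional fields in the JSON log lines:

  variables:
    method: request.method | ""
    path: request.path | ""
    latency: response.duration | "0ms"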

The kind: stdio stanza of config defines a handler named newhandler. The handler spec configures how the stdio adapter code processes received logentry instances. The severity_levels parameter controls how logentry values for the severity field are mapped to supported logging levels. Here, the value of "warning" is mapped to the WARNING log level. The outputAsJson parameter directs the adapter to generate JSON-formatted log lines.

The kind: rule stanza of config defines a new rule named newlogstdio. The rule directs Mixer to send all newlog.logentry instances to the newhandler.stdio handler. Because the match parameter is set to true, the rule is executed for all requests in the mesh.

A match: true expression in the rule specification is not required for the configured rule to be executed for all requests. Omitting the match parameter from the spec entirely is equivalent to setting match: true. It is included here to illustrate how to use match expressions to control rule execution.
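As an illustration of such control (a hypothetical variation, not part of this task), changing the rule's match parameter to an expression like the following would restrict the log stream to requests handled by a single workload:

match: destination.labels["app"] == "reviews"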

Cleanup

  • Remove the new telemetry configuration:
istioctl delete -f new_telemetry.yaml
  • If you are not planning to explore any follow-on tasks, refer to the Bookinfo cleanup instructions to shut down the application.