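8.1.1 What is SpringCloud Sleuth
- SpringCloud Sleuth essentials
  - What SpringCloud Sleuth does: it automatically sets up tracing for every communication channel of the current application
    - Requests passed through RabbitMQ, Kafka (or any other message middleware implemented as a SpringCloud Stream binder)
    - Requests proxied through Zuul or Gateway
    - Requests sent with RestTemplate
- How SpringCloud Sleuth tracing works
  - Request tracing: when a request arrives at the entry endpoint of the distributed system, the tracing framework only needs to create one unique identifier for it, the Trace ID
  - Latency measurement: when the request reaches a service component, or the processing logic reaches a certain state, another unique identifier marks its start, intermediate steps, and end, the Span ID
With Span IDs, a latency can be computed as the difference between two recorded points, for example spanIdA11 - spanIdA1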
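The relationship between the two identifiers can be sketched in a few lines. This is only an illustration under stated assumptions: `SimpleSpan` and all values in it are hypothetical, and it is not the class Sleuth or Brave uses internally. It shows that every span of one request carries the same trace ID, and that the latency of a unit is its end timestamp minus its start timestamp.

```java
/** Hypothetical, simplified span model for illustration; not a Sleuth or Brave class. */
final class SimpleSpan {
    final long traceId;     // shared by every span that belongs to one request
    final long spanId;      // unique to one processing unit
    final long startMicros; // when the unit started
    final long endMicros;   // when the unit finished

    SimpleSpan(long traceId, long spanId, long startMicros, long endMicros) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.startMicros = startMicros;
        this.endMicros = endMicros;
    }

    /** Latency of this unit: end timestamp minus start timestamp. */
    long durationMicros() {
        return endMicros - startMicros;
    }

    public static void main(String[] args) {
        long traceId = 1L; // made-up value
        SimpleSpan entry = new SimpleSpan(traceId, 10L, 0L, 1_000L);
        SimpleSpan downstream = new SimpleSpan(traceId, 11L, 200L, 650L);
        System.out.println("same trace: " + (entry.traceId == downstream.traceId));
        System.out.println("downstream latency (us): " + downstream.durationMicros());
    }
}
```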
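8.1.2 What is Zipkin
- Zipkin basics
  - Zipkin addresses latency problems in microservice architectures, covering the collection, storage, lookup, and display of trace data
  - Zipkin consists of four core components
    - Collector: the collector component
    - Storage: the storage component
    - API: a RESTful API that provides the external access interface
    - UI: a web UI that provides the visual query page
8.2.1 Integrating SpringCloud Sleuth to trace microservice communication
8.2.1.1 Integration steps
- Make sure the services communicate with each other across processes
- Maven dependency
8.2.1.2 Writing the test code
Add the Maven dependency to sca-commerce-gateway and sca-commerce-alibaba-nacos-client
<!-- Link tracing through Sleuth -->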
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
Write the test service and controller code in sca-commerce-alibaba-nacos-client
SleuthTraceInfoService
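package com.edcode.commerce.service;
import brave.Span;
import brave.Tracer;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
/**
 * @author eddie.lee
 * @blog blog.eddilee.cn
 * @description Use code to see the trace information generated by Sleuth more directly
 */
@Slf4j
@Service
@RequiredArgsConstructor
public class SleuthTraceInfoService {

    /** brave.Tracer, the tracing object */
    private final Tracer tracer;

    /**
     * Write the current trace information to the log
     */
    public void logCurrentTraceInfo() {
        // currentSpan() returns null when no span is in scope, so guard against an NPE
        Span span = tracer.currentSpan();
        if (span == null) {
            log.warn("No current span, nothing to log");
            return;
        }
        // traceId() and spanId() return longs, so they are logged as decimal numbers
        log.info("Sleuth trace id: [{}]", span.context().traceId());
        log.info("Sleuth span id: [{}]", span.context().spanId());
    }
}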
SleuthTraceInfoController
package com.edcode.commerce.controller;
import com.edcode.commerce.service.SleuthTraceInfoService;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
/**
 * @author eddie.lee
 * @blog blog.eddilee.cn
 * @description Print the trace information
 */
@Slf4j
@RestController
@RequestMapping("/sleuth")
@RequiredArgsConstructor
public class SleuthTraceInfoController {

    private final SleuthTraceInfoService traceInfoService;

    /**
     * Log the current trace information
     */
    @GetMapping("/trace-info")
    public void logCurrentTraceInfo() {
        traceInfoService.logCurrentTraceInfo();
    }
}
8.2.1.3 Test request and console logs
Send the request
### View the Sleuth trace information
GET http://127.0.0.1:9001/edcode/scacommerce-nacos-client/sleuth/trace-info
Accept: application/json
sca-commerce-user: eyJhbGciOiJSUzI1NiJ9.eyJzY2EtY29tbWVyY2UtdXNlciI6IntcImlkXCI6MTEsXCJ1c2VybmFtZVwiOlwiZWRkaWVAcXEuY29tXCJ9IiwianRpIjoiZjQ3M2NhZjctY2RjMi00ZmE4LWExNzQtZjZhYmQ5ZDFjMzAzIiwiZXhwIjoxNjM1ODY4ODAwfQ.iTtQE2gHzjPxVP5SEFHrDBkvrzI-yt6oy-w1x--Q3ahhTvYLTiYnvndtIx7IIyYipr_ayZnAQyluPt3oiLaS80r9qByaN3zQF-6gBW_wu_fd0yd89hIjPnQeP1mY2NcchV2FaMUW7Jlq8CUDPurEhW4GUDXOqBXgmxai5UTu4yoXBUfyXUXznKTx697cGo5aoVKTAKvMReJg-77n5sQuafZNDu6pz2D1KMvEucNyZtbXw0JRIl1CsK777Jt3IG1bnOnwRBt8o1tkodZ3zJbfgTGVCHJmfEuUnXwdf4DLAq568pNVvylPLh4_r-UUGGxE6Az9XwOtl1w4vzK1M2ATzw
token: edcode
Response
GET http://127.0.0.1:9001/edcode/scacommerce-nacos-client/sleuth/trace-info
HTTP/1.1 200 OK
transfer-encoding: chunked
Content-Type: application/json
Date: Tue, 02 Nov 2021 13:04:55 GMT
{
"code": 0,
"message": "",
"data": null
}
Response code: 200 (OK); Time: 1160ms; Content length: 35 bytes
View the logs
sca-commerce-gateway
2021-11-02 21:04:55.332 INFO [sca-commerce-gateway,353ea734cc43d6ee,353ea734cc43d6ee,true] 1060 --- [ctor-http-nio-2] c.netflix.config.ChainedDynamicProperty : Flipping property: sca-commerce-nacos-client.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
2021-11-02 21:04:55.347 INFO [sca-commerce-gateway,353ea734cc43d6ee,353ea734cc43d6ee,true] 1060 --- [ctor-http-nio-2] c.netflix.loadbalancer.BaseLoadBalancer : Client: sca-commerce-nacos-client instantiated a LoadBalancer: DynamicServerListLoadBalancer:{NFLoadBalancer:name=sca-commerce-nacos-client,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:null
2021-11-02 21:04:55.353 INFO [sca-commerce-gateway,353ea734cc43d6ee,353ea734cc43d6ee,true] 1060 --- [ctor-http-nio-2] c.n.l.DynamicServerListLoadBalancer : Using serverListUpdater PollingServerListUpdater
2021-11-02 21:04:55.372 INFO [sca-commerce-gateway,353ea734cc43d6ee,353ea734cc43d6ee,true] 1060 --- [ctor-http-nio-2] c.netflix.config.ChainedDynamicProperty : Flipping property: sca-commerce-nacos-client.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
2021-11-02 21:04:55.374 INFO [sca-commerce-gateway,353ea734cc43d6ee,353ea734cc43d6ee,true] 1060 --- [ctor-http-nio-2] c.n.l.DynamicServerListLoadBalancer : DynamicServerListLoadBalancer for client sca-commerce-nacos-client initialized: DynamicServerListLoadBalancer:{NFLoadBalancer:name=sca-commerce-nacos-client,current list of Servers=[192.168.3.192:8000],Load balancer stats=Zone stats: {unknown=[Zone:unknown; Instance count:1; Active connections count: 0; Circuit breaker tripped count: 0; Active connections per server: 0.0;]
},Server stats: [[Server:192.168.3.192:8000; Zone:UNKNOWN; Total Requests:0; Successive connection failure:0; Total blackout seconds:0; Last connection made:Thu Jan 01 08:00:00 CST 1970; First connection made: Thu Jan 01 08:00:00 CST 1970; Active Connections:0; total failure count in last (1000) msecs:0; average resp time:0.0; 90 percentile resp time:0.0; 95 percentile resp time:0.0; min resp time:0.0; max resp time:0.0; stddev resp time:0.0]
]}ServerList:com.alibaba.cloud.nacos.ribbon.NacosServerList@72186c8f
2021-11-02 21:04:55.592 INFO [sca-commerce-gateway,353ea734cc43d6ee,353ea734cc43d6ee,true] 1060 --- [ctor-http-nio-2] c.e.c.filter.GlobalElapsedLogFilter : [/edcode/scacommerce-nacos-client/sleuth/trace-info] elapsed: [1034ms]
2021-11-02 21:04:56.358 INFO [sca-commerce-gateway,,,] 1060 --- [erListUpdater-0] c.netflix.config.ChainedDynamicProperty : Flipping property: sca-commerce-nacos-client.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
sca-commerce-alibaba-nacos-client
2021-11-02 21:04:55.543 INFO [sca-commerce-nacos-client,353ea734cc43d6ee,c85be2c1bb127558,true] 33740 --- [nio-8000-exec-1] c.e.c.service.SleuthTraceInfoService : Sleuth trace id: [3836687777773377262]
2021-11-02 21:04:55.543 INFO [sca-commerce-nacos-client,353ea734cc43d6ee,c85be2c1bb127558,true] 33740 --- [nio-8000-exec-1] c.e.c.service.SleuthTraceInfoService : Sleuth span id: [-4009361721548180136]
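Breakdown of [sca-commerce-nacos-client,353ea734cc43d6ee,c85be2c1bb127558,true]
First field: service name
Second field: trace id
Third field: span id
Fourth field: whether the span is exported to a collector such as Zipkin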
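The IDs in the log prefix are hexadecimal, while the service logs the `long` values returned by `traceId()` and `spanId()` as decimal numbers. They are the same 64-bit values. The sketch below uses only the JDK to convert between the two forms, with the values taken from the log lines above; the class name `TraceIdHex` is made up for this example. Brave's `TraceContext` also exposes `traceIdString()` and `spanIdString()`, which should return the hex form directly.

```java
final class TraceIdHex {

    /** Formats a 64-bit ID as 16 lower-case hex characters, treating the long as unsigned. */
    static String toLowerHex(long id) {
        return String.format("%016x", id);
    }

    /** Parses a 16-character hex ID back into the (possibly negative) long value. */
    static long fromLowerHex(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    public static void main(String[] args) {
        System.out.println(toLowerHex(3836687777773377262L));   // 353ea734cc43d6ee
        System.out.println(toLowerHex(-4009361721548180136L));  // c85be2c1bb127558
        System.out.println(fromLowerHex("c85be2c1bb127558"));   // -4009361721548180136
    }
}
```

A negative decimal span ID is therefore not an error: it is simply a 64-bit value whose highest bit is set.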
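8.3.1 Setting up Zipkin Server to collect trace information
8.3.1.1 Zipkin Server (ZS) setup steps
- Tip: from SpringCloud Finchley onward (inclusive), building your own Zipkin Server is officially discouraged; a prebuilt jar (a SpringBoot application) is provided, so just download and start it
- Download
  - curl -sSL https://zipkin.io/quickstart.sh | bash -s
  - Pick the version you need
  - Choose the jar whose name ends in -exec.jar
8.3.1.2 Linux terminal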
[root@localhost opt]# curl -sSL https://zipkin.io/quickstart.sh | bash -s
Thank you for trying Zipkin!
This installer is provided as a quick-start helper, so you can try Zipkin out
without a lengthy installation process.
Fetching version number of latest io.zipkin:zipkin-server release...
Latest release of io.zipkin:zipkin-server seems to be 2.23.4
Downloading io.zipkin:zipkin-server:2.23.4:exec to zipkin.jar...
> curl -fL -o 'zipkin.jar' 'https://repo1.maven.org/maven2/io/zipkin/zipkin-server/2.23.4/zipkin-server-2.23.4-exec.jar'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 59.0M 100 59.0M 0 0 34146 0 0:30:14 0:30:14 --:--:-- 33309
Verifying checksum...
> curl -fL -o 'zipkin.jar.md5' 'https://repo1.maven.org/maven2/io/zipkin/zipkin-server/2.23.4/zipkin-server-2.23.4-exec.jar.md5'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 32 100 32 0 0 17 0 0:00:01 0:00:01 --:--:-- 17
> md5sum -c <<< "$(cat zipkin.jar.md5) zipkin.jar"
zipkin.jar: OK
Checksum for zipkin.jar passes verification
Verifying GPG signature of zipkin.jar...
> curl -fL -o 'zipkin.jar.asc' 'https://repo1.maven.org/maven2/io/zipkin/zipkin-server/2.23.4/zipkin-server-2.23.4-exec.jar.asc'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 833 100 833 0 0 91 0 0:00:09 0:00:09 --:--:-- 180
GPG signing key is not known, skipping signature verification.
Use the following commands to manually verify the signature of zipkin.jar:
gpg --keyserver keyserver.ubuntu.com --recv FF31B515
# Optionally trust the key via 'gpg --edit-key FF31B515', then typing 'trust',
# choosing a trust level, and exiting the interactive GPG session by 'quit'
gpg --verify zipkin.jar.asc zipkin.jar
You can now run the downloaded executable jar:
java -jar zipkin.jar
[root@localhost opt]# nohup java -jar zipkin.jar &
[1] 30238
[root@localhost opt]# nohup: ignoring input and appending output to ‘nohup.out’
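8.3.1.3 Zipkin Web UI
Open: IP:9411
8.3.2 Configuring Zipkin Server to collect trace information
- Configuring ZS
  - Why does ZS need custom configuration?
    - By default ZS keeps trace data in memory (inside the JVM), so it is lost on restart
    - By default trace data is reported to ZS over HTTP, which performs poorly
  - Persisting ZS trace data to MySQL (Elasticsearch is also supported)
    - Create the tables in MySQL: https://github.com/openzipkin/zipkin/blob/master/zipkin-storage/mysql-v1/src/main/resources/mysql.sql
    - Pass the MySQL connection settings when starting ZS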
8.4.1 Integrating SpringCloud Sleuth with Zipkin for distributed trace collection
- Steps to integrate SpringCloud Sleuth with Zipkin
  - Two simple steps (Zipkin Server persists trace data to MySQL)
    - Maven dependency
    - Add the Zipkin configuration to bootstrap.yml
- Download and install Kafka
  - Download Kafka: https://kafka.apache.org/quickstart
  - Extract it, then start ZooKeeper (ZK) and the Kafka server (default configuration)
8.4.1.1 Downloading and extracting Kafka
Downloads
https://kafka.apache.org/downloads
Steps on Linux
[root@localhost opt]# wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz
[root@localhost opt]# tar -zxf kafka_2.13-3.0.0.tgz
[root@localhost opt]# ls -la | grep kafka
drwxr-xr-x. 8 root root 134 Nov 3 01:48 kafka_2.13-3.0.0
-rw-r--r--. 1 root root 86396520 Sep 20 04:46 kafka_2.13-3.0.0.tgz
8.4.1.2 Starting ZooKeeper and Kafka
For access from other hosts, edit Kafka's server.properties
Find advertised.listeners, or add it yourself, and set it to the host's IP address
[root@localhost kafka_2.13-3.0.0]# vim /opt/kafka_2.13-3.0.0/config/server.properties
advertised.listeners=PLAINTEXT://192.168.3.250:9092
Start ZooKeeper in the background
[root@localhost kafka_2.13-3.0.0]# nohup /opt/kafka_2.13-3.0.0/bin/zookeeper-server-start.sh config/zookeeper.properties &
[1] 31998
[root@localhost kafka_2.13-3.0.0]# nohup: ignoring input and appending output to ‘nohup.out’
Start Kafka in the background
[root@localhost kafka_2.13-3.0.0]# nohup /opt/kafka_2.13-3.0.0/bin/kafka-server-start.sh config/server.properties &
[1] 32574
[root@localhost kafka_2.13-3.0.0]# nohup: ignoring input and appending output to ‘nohup.out’
8.4.1.3 Running Zipkin with Kafka and MySQL
[root@localhost opt]# nohup java -DKAFKA_BOOTSTRAP_SERVERS=127.0.0.1:9092 -jar zipkin.jar --STORAGE_TYPE=mysql --MYSQL_USER=root --MYSQL_PASS=123456 --MYSQL_HOST=127.0.0.1 --MYSQL_TCP_PORT=3306 --MYSQL_DB=zipkin &
[1] 601
[root@localhost opt]# nohup: ignoring input and appending output to ‘nohup.out’
Kafka connection: -DKAFKA_BOOTSTRAP_SERVERS=127.0.0.1:9092
MySQL connection: --STORAGE_TYPE=mysql --MYSQL_USER=root --MYSQL_PASS=123456 --MYSQL_HOST=127.0.0.1 --MYSQL_TCP_PORT=3306 --MYSQL_DB=zipkin
8.4.1.4 Checking that the Linux services are running
[root@localhost opt]# ps -aux | grep -E 'nacos|zipkin|kafka|zookeeper'
[root@localhost opt]# netstat -ltnp | grep -E '8848|9092|9411|2181'
tcp6 0 0 :::8848 :::* LISTEN 1932/java
tcp6 0 0 :::9411 :::* LISTEN 601/java
tcp6 0 0 :::9092 :::* LISTEN 32574/java
tcp6 0 0 :::2181 :::* LISTEN 31998/java
8.4.1.4 Starting the services in IDEA and sending a test request
Maven dependencies (zipkin and kafka)
<!-- Link tracing through Sleuth -->
<!-- <dependency>-->
<!-- <groupId>org.springframework.cloud</groupId>-->
<!-- <artifactId>spring-cloud-starter-sleuth</artifactId>-->
<!-- </dependency>-->
<!-- zipkin = spring-cloud-starter-sleuth + spring-cloud-sleuth-zipkin-->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
<version>2.5.0.RELEASE</version>
</dependency>
Update the configuration files of the Gateway and Nacos-Client services
spring:
  kafka:
    bootstrap-servers: ${KAFKA_SERVER:127.0.0.1}:${KAFKA_PORT:9092}
    producer:
      retries: 3
    consumer:
      auto-offset-reset: latest
  zipkin:
    sender:
      type: ${ZIPKIN_KAFKA_SENDER:web} # default is web; set ZIPKIN_KAFKA_SENDER=kafka to report through Kafka
    base-url: http://${ZIPKIN_URL:localhost}:${ZIPKIN_PORT:9411}/
Both services need the Kafka and Zipkin connection settings
Start the services
- Start the following services
  - AuthorityCenterApplication :7000/ # reissue the token if it has expired
  - NacosClientApplication :8000/
  - GatewayApplication :9001/
Test request
sca-commerce-gateway/src/main/resources/http/nacos-client.http
### Query the service
GET http://127.0.0.1:9001/edcode/scacommerce-nacos-client/nacos-client/service-instance?serviceId=sca-commerce-gateway
Accept: application/json
sca-commerce-user: eyJhbGciOiJSUzI1NiJ9.eyJzY2EtY29tbWVyY2UtdXNlciI6IntcImlkXCI6MTEsXCJ1c2VybmFtZVwiOlwiZWRkaWVAcXEuY29tXCJ9IiwianRpIjoiMWU1MGI2ZWYtNmUzOS00YmY2LWJlMjktZDc4NWU3NWQyNmY1IiwiZXhwIjoxNjM1OTU1MjAwfQ.P7GxZuMUrgiMUbD4dNYzQiV3A6YkaFpvlzg8cpBdu_hvxqDsVEuuiYODQSzZPQeN3xTQPbJ70zkSY084HV7Vsk929en1lqNiX_dpQEuGSbz2JSPqyJuLZ6v7hRX9GI32sPrZAnaKVXMdeHUXCMMmaS1L3osimSvAlaoDE0n2UukDLgu83xRlL3bddHIJbmFD5BrV6Y-u9d-blqXPOpxFEYkdwS_XrljYiULTH7Olr71TAwODUPdttnmVhHPXB0_dnOG5DZMOC0OxqokHGZJ7CC86paE4TvdNPwqotB6u6zh_d_YCCBWM3t1LmKYB6E_bnz2taL5Q4AYHlRaZZotaAA
token: edcode ## HeaderTokenGatewayFilter
###
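8.4.1.5 Zipkin Web UI
Basic usage
Open http://192.168.3.250:9411
Leave the default filter (all) and run the search; every request is listed. Click one of them
to see which services the request passed through and how long each step took
You can also take a trace ID from the console, for example 669b59f38adf2c38, and follow its call chain
trace ID search box: 669b59f38adf2c38
Viewing the dependencies between services
Click [Dependencies] at the top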
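The same lookup can be done without the UI, since the UI sits on top of Zipkin's REST API. The sketch below only builds the request URLs for the v2 endpoints `/api/v2/trace/{traceId}` and `/api/v2/dependencies`; the class and method names are made up for this example, and the base URL is the one used in this walkthrough, so adjust it to your own host. Fetching the URL with any HTTP client should return the trace as JSON.

```java
import java.net.URI;

final class ZipkinApiUrls {

    /** Strips a single trailing slash so paths can be appended safely. */
    private static String normalize(String baseUrl) {
        return baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
    }

    /** URL that returns all spans of one trace, given its hex trace ID. */
    static URI traceById(String baseUrl, String traceIdHex) {
        return URI.create(normalize(baseUrl) + "/api/v2/trace/" + traceIdHex);
    }

    /** URL that returns service dependency links up to endTsMillis (epoch milliseconds). */
    static URI dependencies(String baseUrl, long endTsMillis) {
        return URI.create(normalize(baseUrl) + "/api/v2/dependencies?endTs=" + endTsMillis);
    }

    public static void main(String[] args) {
        String base = "http://192.168.3.250:9411/"; // host from this walkthrough
        System.out.println(traceById(base, "669b59f38adf2c38"));
        System.out.println(dependencies(base, System.currentTimeMillis()));
    }
}
```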
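8.5.1 Spring Cloud Sleuth sampling rate and sampling strategies
8.5.1.1 Spring Cloud Sleuth sampling
- Collecting trace data is a double-edged sword and needs a trade-off
  - The more trace data is collected, the better it reflects how the system actually behaves
  - Under high concurrency, the volume of requests produces a flood of trace logs, and the performance overhead becomes too high
A high sampling rate is fine in development and test environments, but it is not recommended in production.
- Two built-in sampling strategies can be chosen freely
  - ProbabilityBasedSampler: probability sampling
    - The default strategy. It collects trace data for a percentage of requests; the default value is 0.1, meaning 10% of requests are traced
    - spring.sleuth.sampler.probability=0.5
  - RateLimitingSampler: rate-limited sampling
    - Caps the number of traced requests per second; it takes precedence over the probability setting
    - spring.sleuth.sampler.rate=10 ## at most 10 traces are sampled per second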
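To make the difference between the two strategies concrete, here is a minimal sketch of each decision rule. It is a simplified illustration, not the implementation used by Sleuth or Brave: the class `SamplingSketch`, the modulo-based probability rule, and the fixed one-second window are all assumptions chosen for readability. The clock is passed in as a parameter so the behaviour is easy to reason about.

```java
final class SamplingSketch {

    /** Probability-style rule: keep roughly the given fraction of all trace IDs. */
    static boolean sampleByProbability(long traceId, double probability) {
        long boundary = Math.round(probability * 10_000); // 0.5 -> 5000
        return Math.floorMod(traceId, 10_000L) < boundary;
    }

    /** Rate-limit-style rule: at most tracesPerSecond decisions are positive per one-second window. */
    static final class RateLimiter {
        private final int tracesPerSecond;
        private long windowStartNanos;
        private int usedInWindow;
        private boolean started;

        RateLimiter(int tracesPerSecond) {
            this.tracesPerSecond = tracesPerSecond;
        }

        synchronized boolean trySample(long nowNanos) {
            // Open a new window on first use or once a full second has passed
            if (!started || nowNanos - windowStartNanos >= 1_000_000_000L) {
                started = true;
                windowStartNanos = nowNanos;
                usedInWindow = 0;
            }
            if (usedInWindow < tracesPerSecond) {
                usedInWindow++;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        int kept = 0;
        for (long id = 0; id < 10_000; id++) {
            if (sampleByProbability(id, 0.5)) kept++;
        }
        System.out.println("probability 0.5 kept " + kept + " of 10000");

        RateLimiter limiter = new RateLimiter(10);
        int allowed = 0;
        for (int i = 0; i < 25; i++) {
            if (limiter.trySample(System.nanoTime())) allowed++;
        }
        System.out.println("rate 10/s allowed " + allowed + " of 25 in one burst");
    }
}
```

The probability rule scales with traffic (more requests means more traces), while the rate limit puts a hard ceiling on trace volume, which is why it tends to be the safer choice under high concurrency.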
8.5.1.2 Configuring Sleuth in bootstrap.yml
sca-commerce-alibaba-nacos-client
spring:
  sleuth:
    sampler:
      # RateLimitingSampler: once rate limiting is set, spring.sleuth.sampler.probability is ignored
      rate: 100 # maximum number of traces accepted per second
      # Probability sampling
      probability: 1.0 # sampling ratio, 1.0 means 100%, default: 0.1
8.5.1.3 Configuring Sleuth in code
package com.edcode.commerce.sampler;
import brave.sampler.RateLimitingSampler;
import brave.sampler.Sampler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
/**
 * @author eddie.lee
 * @blog blog.eddilee.cn
 * @description Set the sampling strategy in code (choose one of the two)
 */
@Configuration
public class SamplerConfig {

    /**
     * Rate-limited sampling (recommended)
     */
    @Bean
    public Sampler sampler() {
        return RateLimitingSampler.create(100);
    }

    // /**
    //  * Probability sampling, the default strategy; the default value is 0.1
    //  */
    // @Bean
    // public Sampler defaultSampler() {
    //     // Sampler.create(float) is Brave's factory for a probability-based sampler
    //     return Sampler.create(0.5f);
    // }
}
Code and YAML configuration are alternatives; YAML is the simpler and more convenient of the two
Whether in code or in YAML, rate-limited sampling and probability sampling are also alternatives: pick one
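8.6.1 SpringCloud Sleuth + Zipkin distributed log tracing summary
8.6.1.1 SpringCloud Sleuth + Zipkin logical architecture diagram
- Three components (modules) are involved in tracing and collection: Sleuth, Zipkin, and Brave
- How the three relate
  - Brave is a tracer library; it provides the tracer API
  - Sleuth uses Brave as its tracer library
  - Sleuth can be used without Zipkin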
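8.6.1.2 Understanding Brave
- The two most basic and most central concepts in Brave
  - trace: can be seen as the whole chain of one logical execution (it can be viewed as a tree)
  - span: the basic unit of tracing within a trace
- Common Brave data structures
  - Tracing: utility class that creates Tracer instances
  - Tracer: also a utility class; it creates Spans
  - Span: the class that actually records the execution of each functional block
  - TraceContext: the class that holds the metadata of a trace while it executes
  - Propagation: utility for passing the TraceContext along when tracing across processes or in a distributed environment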
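The "trace as a tree" view can be sketched by linking spans through their parent IDs. The code below is a hypothetical model, not Brave's data structures: `TraceTree`, `Node`, and the span names are invented for this example. It rebuilds the tree from a flat list of spans, which is roughly what a trace viewer has to do before it can draw the call chain.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TraceTree {

    /** One span in the tree; parentId is null for the root span. */
    static final class Node {
        final String spanId;
        final String parentId;
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String spanId, String parentId, String name) {
            this.spanId = spanId;
            this.parentId = parentId;
            this.name = name;
        }
    }

    /** Links spans by parentId and returns the root, or null if no span lacks a parent. */
    static Node build(List<Node> spans) {
        Map<String, Node> byId = new HashMap<>();
        for (Node span : spans) {
            byId.put(span.spanId, span);
        }
        Node root = null;
        for (Node span : spans) {
            if (span.parentId == null) {
                root = span;
            } else {
                Node parent = byId.get(span.parentId);
                if (parent != null) {
                    parent.children.add(span);
                }
            }
        }
        return root;
    }

    /** Number of levels in the tree, counting the given node as level 1. */
    static int depth(Node node) {
        int deepestChild = 0;
        for (Node child : node.children) {
            deepestChild = Math.max(deepestChild, depth(child));
        }
        return 1 + deepestChild;
    }

    public static void main(String[] args) {
        List<Node> spans = new ArrayList<>();
        spans.add(new Node("b", "a", "nacos-client")); // arrival order does not matter
        spans.add(new Node("a", null, "gateway"));
        Node root = build(spans);
        System.out.println(root.name + " -> " + root.children.get(0).name);
        System.out.println("depth: " + depth(root));
    }
}
```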
8.6.1.3 How SpringCloud Sleuth traces across services
- Cross-service tracing in SpringCloud Sleuth
  - SpringCloud Sleuth and Brave support many distributed frameworks, for example gRPC, Kafka, and HTTP
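For HTTP, the trace context typically travels in B3 headers such as `X-B3-TraceId` and `X-B3-SpanId`. The sketch below models that hand-off with plain maps. It is a simplified assumption-laden illustration, not Sleuth's or Brave's propagation code: the class and method names are invented, and real instrumentation handles more cases (for example missing headers and 128-bit trace IDs). The IDs are the ones from the logs earlier in this article.

```java
import java.util.HashMap;
import java.util.Map;

final class B3PropagationSketch {
    static final String TRACE_ID = "X-B3-TraceId";
    static final String SPAN_ID = "X-B3-SpanId";
    static final String PARENT_SPAN_ID = "X-B3-ParentSpanId";
    static final String SAMPLED = "X-B3-Sampled";

    /** Caller side: copy the current trace context into the outgoing request headers. */
    static Map<String, String> inject(String traceId, String spanId, boolean sampled) {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_ID, traceId);
        headers.put(SPAN_ID, spanId);
        headers.put(SAMPLED, sampled ? "1" : "0");
        return headers;
    }

    /** Callee side: keep the trace ID, make the caller's span the parent, and use a new span ID. */
    static Map<String, String> continueTrace(Map<String, String> incoming, String newSpanId) {
        Map<String, String> child = new HashMap<>();
        child.put(TRACE_ID, incoming.get(TRACE_ID));
        child.put(PARENT_SPAN_ID, incoming.get(SPAN_ID));
        child.put(SPAN_ID, newSpanId);
        child.put(SAMPLED, incoming.get(SAMPLED));
        return child;
    }

    public static void main(String[] args) {
        // Gateway span and trace IDs as seen in the gateway log
        Map<String, String> outgoing = inject("353ea734cc43d6ee", "353ea734cc43d6ee", true);
        // The downstream service continues the same trace with its own span ID
        Map<String, String> downstream = continueTrace(outgoing, "c85be2c1bb127558");
        System.out.println(downstream.get(TRACE_ID));       // 353ea734cc43d6ee
        System.out.println(downstream.get(PARENT_SPAN_ID)); // 353ea734cc43d6ee
        System.out.println(downstream.get(SPAN_ID));        // c85be2c1bb127558
    }
}
```

This matches what the logs show: both services print the same trace ID, while each has its own span ID.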