tcpdump pt-query-digest
学习目标:
- 使用tcpdump命令按照一定规则抓取数据包,然后转储到中间文件
- 使用pt-query-digest对中间文件按照一定规则进行分析
tcpdump
tcpdump是一个强大的命令,这里我们只是讨论一些关于tcp的输出典型方法,因为MySQL就是使用tcp(可靠)传输数据的。
[root@archpredb214 ~]#tcpdump -s 65535 -x -nn -q -tttt -i any -c 100000 port 3306 >dmp.txt
- -s 65535 或者 -s 0都表示抓取一个完整的数据包,在IP数据包中==总长度=首部+数据部分==,因此如下图所示,总长度共16~31,16bit,2^16-1=65535,即抓取整个数据包。
- -x 打印每个数据包包头跟数据包内容,要想分析数据,这个参数是必须的;下面是命令加 -x参数 跟不加 –x参数的区别。
[root@archpredb214 ~]# tcpdump -s 65535 -x -nn -q -tttt -i any -c 100000 port 3306 #有选项 -x
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
2018-01-24 09:49:54.443184 IP 10.27.139.178.57940 > 10.27.139.176.3306: tcp 0
0x0000: 4500 0034 430d 4000 4006 cc1e 0a1b 8bb2
0x0010: 0a1b 8bb0 e254 0cea e057 2c39 e536 d965
0x0020: 8011 007b 7867 0000 0101 080a 1676 5325
0x0030: 1606 9833
#有16进制数据包内容
...
[root@archpredb214 ~]# tcpdump -s 65535 -nn -q -tttt -i any -c 100000 port 3306 #无选项 -x
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
2018-01-24 09:50:16.107281 IP 127.0.0.1.43489 > 127.0.0.1.3306: tcp 0
2018-01-24 09:50:16.107293 IP 127.0.0.1.3306 > 127.0.0.1.43489: tcp 0
...
#注意这里是没有数据包内容的
- -nn 不解析ip到主机名、端口号到服务名,而是直接以 ip、port的形式显示。
[root@archpredb214 ~]# tcpdump -s 65535 -x -nn -q -tttt -i any -c 100000 port 3306 #有选项 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
2018-01-24 09:49:54.443184 IP 10.27.139.178.57940 > 10.27.139.176.3306: tcp 0
#注意主机名、端口部分
0x0000: 4500 0034 430d 4000 4006 cc1e 0a1b 8bb2
0x0010: 0a1b 8bb0 e254 0cea e057 2c39 e536 d965
0x0020: 8011 007b 7867 0000 0101 080a 1676 5325
0x0030: 1606 9833
...
...
[root@archpredb214 ~]# tcpdump -s 65535 -x -q -tttt -i any -c 100000 port 3306 #无选项 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
2018-01-24 10:01:31.397868 IP localhost.43489 > localhost.mysql: tcp 19
#注意主机名、端口部分
0x0000: 4508 0047 e96b 4000 4006 533b 7f00 0001
0x0010: 7f00 0001 a9e1 0cea ae56 4f04 fecc 12a2
0x0020: 8018 0101 fe3b 0000 0101 080a 1680 f5a0
0x0030: 1676 a7c7 0f00 0000 0373 686f 7720 6461
0x0040: 7461 6261 7365 73
- -q 安静输出,很少打印有关协议的信息,所以这个选项也要使用。下面是不加-q参数额显示,有很长的一部分对于统计分析没有意义
[root@archpredb214 ~]# tcpdump -s 65535 -x -nn -tttt -i any -c 100000 port 3306
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
2018-01-24 10:09:41.913854 IP 127.0.0.1.43489 > 127.0.0.1.3306: Flags [P.], seq 2924891927:2924891946, ack 4274787189, win 265, options [nop,nop,TS val 378040756 ecr 377550240], length 19 #不加-q的额外输出显示
0x0000: 4508 0047 e96d 4000 4006 5339 7f00 0001
0x0010: 7f00 0001 a9e1 0cea ae56 4f17 fecc 1375
0x0020: 8018 0109 fe3b 0000 0101 080a 1688 71b4
0x0030: 1680 f5a0 0f00 0000 0373 686f 7720 6461
0x0040: 7461 6261 7365 73
-tttt 在每行打印前打印日期,有必要,作为时间统计。
-i any 抓取所有网口的包(不包括lo),其实这里抓取 eth0就可以了。下面是关于
man tcpdump
的 -i选项输出
-i Listen on interface. If unspecified, tcpdump searches the system interface list for the lowest numbered, configured up interface (excluding
loopback). Ties are broken by choosing the earliest match.
On Linux systems with 2.2 or later kernels, an interface argument of ‘‘any’’ can be used to capture packets from all interfaces. Note that
captures on the ‘‘any’’ device will not be done in promiscuous mode.
If the -D flag is supported, an interface number as printed by that flag can be used as the interface argument.
- -c 100000 抓取10w个数据包
- port 3306 端口号
pt-query-digest
这其实是percona的一个脚本工具,当然需要解决依赖关系。
[root@archpredb214 ~]#yum -y install perl perl-IO-Socket-SSL perl-DBD-MySQL perl-Time-HiRes
...
[root@archpredb214 ~]#wget percona.com/get/percona-toolkit.rpm
...
rpm -ivh percona-toolkit-2.1.7-1.noarch.rpm
...
[root@archpredb214 ~]# find / -name 'pt-query-digest'
/home/mysql/mysql_tools/pt-query-digest
[root@archpredb214 ~]#
Usage: pt-query-digest [OPTIONS] [FILES] [DSN]
该命令支持stdin
典型分析用法
分析慢日志
[root@archpredb214 ~]#/home/mysql/mysql_tools/pt-query-digest slow.log
分析tcpdump工具抓取的数据包转储文件
mysql.tcp.txt
[root@archpredb214 ~]#tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txt
[root@archpredb214 ~]#/home/mysql/mysql_tools/pt-query-digest --type tcpdump mysql.tcp.txt
- -type [binlog | genlog | slowlog | tcpdump]
指定pt-query-digest
要分析的文件类型
- 第一部分,总体统计结果(这里以慢查询日志为例子分析)
[root@archpredb214 data]# /home/mysql/mysql_tools/pt-query-digest --type=slowlog ./archpredb214-slow.log
# 170ms user time, 10ms system time, 23.39M rss, 180.74M vsz
# Current date: Wed Jan 24 14:23:02 2018
# Hostname: archpredb214
# Files: ./archpredb214-slow.log
# Overall: 2 total, 2 unique, 0 QPS, 0x concurrency ______________________
# Time range: all events occurred at 2017-10-13 15:36:25
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 2s 517ms 1s 759ms 1s 342ms 759ms
# Lock time 176us 63us 113us 88us 113us 35us 88us
# Rows sent 0 0 0 0 0 0 0
# Rows examine 208.00k 0 208.00k 104.00k 208.00k 147.08k 104.00k
# Rows affecte 104.00k 0 104.00k 52.00k 104.00k 73.54k 52.00k
# Rows read 0 0 0 0 0 0 0
# Bytes sent 72 11 61 36 61 35.36 36
# Tmp tables 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0
# Query size 271 121 150 135.50 150 20.51 135.50
...
如上输出,后面的数据省略。
- Overall(位于第6行):2 total 表示总共有多少查询,这里一共两行; 2 unique 唯一查询的数量,这里一共两个;假设都是同一个SQL那么这个值应该为1。
- Time range :表示时间范围,上表中表示的是所有的慢查询都发生在 2017-10-13 15:36:25
Attribute 属性 | total 共计 | min 最小 | max 最大 | avg 平均 | 95% 95%置信区间 | stddev 标准差 | median 中位数 |
---|---|---|---|---|---|---|---|
Exec time 执行时间 | |||||||
Lock time 锁时间 | |||||||
Rows sent 查询返回的行数 | |||||||
Rows examine 查询检查的行数 |
….
95%:正态分布,95%,就是去掉极端情况
- 第二部分,查询分组统计结果(还是以上面第一部分的命令为例)
# Profile
# Rank Query ID Response time Calls R/Call V/M Item
# ==== ================== ============= ===== ====== ===== ===============
# 1 0x106F26B3D217B2C9 1.0003 65.9% 1 1.0003 0.00
# 2 0x01C3BF9ABB566E2C 0.5170 34.1% 1 0.5170 0.00 INSERT SELECT test.group_message group_message
Column | Meaning |
---|---|
Rank | The query’s rank within the entire set of queries analyzed |
Query ID | The query’s fingerprint |
Response time | The total response time, and percentage of overall total |
Calls | The number of times this query was executed |
R/Call | The mean response time per execution |
V/M | The Variance-to-mean ratio of response time |
Item | The distilled query |
- 第三部分,每个被统计的查询的详细信息
# Query 1: 0 QPS, 0x concurrency, ID 0x106F26B3D217B2C9 at byte 0 ________
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: all events occurred at 2017-10-13 15:36:25
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 50 1
# Exec time 65 1s 1s 1s 1s 1s 0 1s
# Lock time 35 63us 63us 63us 63us 63us 0 63us
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 0 0 0 0 0 0 0 0
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 0 0 0 0 0 0 0 0
# Bytes sent 15 11 11 11 11 11 0 11
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 44 121 121 121 121 121 0 121
# String:
# Hosts
# Last errno 0
# Users root
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.27.139.%' IDENTIFIED BY PASSWORD '*5C2A4F610EF47B9C3C868FF5E88A69965D887ED4'\G
常见操作
- 分析最近12小时内的查询
[root@archpredb214 lianxi]# /home/mysql/mysql_tools/pt-query-digest --since=12h /mysql/data/archpredb214-slow.log >anal1.txt
- 分析指定时间范围内的查询
[root@archpredb214 lianxi]# /home/mysql/mysql_tools/pt-query-digest 'unixtimestamp1' –until 'unixtimestamp2' >anal2.txt
- –filter参数是一个很强大的选项,甚至可以自己创造一些输出,这里只是列举一些常见的输出
- 分析只含有select语句的慢查询
[root@archpredb214 lianxi]# /home/mysql/mysql_tools/pt-query-digest --filter '$event->{fingerprint} =~ m/^select/i' /mysql/data/archpredb214-slow.log >anal3.txt
- 针对某个用户的慢查询
root@archpredb214 lianxi]# /home/mysql/mysql_tools/pt-query-digest --filter '($event->{user} || "") =~ m/^root/i' /mysql/data/archpredb214-slow.log >anal4.txt
- 所有的全表扫描或full join的慢查询
[root@archpredb214 lianxi]# /home/mysql/mysql_tools/pt-query-digest --filter '(($event->{Full_scan} || "") eq "yes") || (($event->{Full_join} || "") eq "yes")' /mysql/data/archpredb214-slow.log >anal5.txt
- 其他更详细的示例参考percona官方文档。