Common Linux performance analysis tools and methods

牛不才

Article link: https://blog.youkuaiyun.com/niu91/article/details/112497344

1. uptime: checking system load

[root@localhost ~]# uptime
 10:18:27 up 6 days, 17:24,  3 users,  load average: 0.00, 0.01, 0.05

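The output shows the current time, how long the system has been running, the number of logged-in users, and the average load over the past 1, 5, and 15 minutes. On a single-core CPU a load average of 1 is the ideal value: the CPU is fully used and no process has to wait for a CPU cycle. As a rule of thumb, 1 to 2 is normal on a single core, and on a machine with n cores a load between n and 2n is also normal. Note that on Linux the load average also counts processes in uninterruptible sleep (usually waiting on I/O), so a high load does not always mean CPU contention.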
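
The rule of thumb above can be sketched in Python. This is a minimal illustration, not part of the original article: the function names and the labels "ok", "busy", and "overloaded" are assumptions chosen for the example, and the thresholds simply encode the n and 2n boundaries mentioned above.

```python
import os

def load_per_core(load_avg, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores

def classify(load_avg, cores):
    """Apply the rule of thumb: up to n is fine, n to 2n is busy, above 2n is too much.

    The three labels are illustrative assumptions, not standard terminology.
    """
    ratio = load_per_core(load_avg, cores)
    if ratio <= 1.0:
        return "ok"
    if ratio <= 2.0:
        return "busy"
    return "overloaded"

if __name__ == "__main__":
    # os.getloadavg() returns the same three numbers that uptime prints.
    one, five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1
    for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        print(f"{label}: load {value:.2f}, "
              f"per core {load_per_core(value, cores):.2f}, {classify(value, cores)}")
```

Dividing by the core count makes readings from machines of different sizes comparable, which the raw uptime numbers are not.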

2. free: viewing free and used memory

[root@kafka3 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           7703        2318        3750         281        1635        4802
Swap:          7935           0        7935

Similar tools include top and htop. htop can be seen as an enhanced version of top, but it has to be installed separately.

3. vmstat

[root@localhost ~]# vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 582132   1260 2738456    0    0     1     2   30   15  0  0 100  0  0
 0  0      0 582116   1260 2738456    0    0     0     0   65  144  0  0 100  0  0
 0  0      0 582116   1260 2738456    0    0     0     0   49  115  0  0 100  0  0
 0  0      0 582116   1260 2738456    0    0     0     0   69  149  0  1 100  0  0
 0  0      0 582116   1260 2738456    0    0     0     0   55  126  0  0 100  0  0
[root@localhost ~]# vmstat -a
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 580708 748036 2106624    0    0     1     2   30   16  0  0 100  0  0

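vmstat 1 5 means: report once every second and stop after 5 reports. The first report shows averages since boot; the following ones cover each one-second interval.
The output has six column groups: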

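procs
- r: number of runnable processes (running or waiting to run). It shows how many processes are competing for the CPU.
- b: number of processes in uninterruptible sleep, that is, how many processes are waiting for I/O.
memory
- swpd: amount of swap space in use
- free: amount of idle memory
- buff: amount of memory used as buffers
- cache: amount of memory used as cache
- inact: amount of inactive memory (shown with -a)
- active: amount of active memory (shown with -a)
swap
- si: amount of memory swapped in from disk per second
- so: amount of memory swapped out to disk per second. Both values are 0 most of the time; if they stay above roughly ten blocks, performance will degrade under high concurrency.
io (reflects I/O throughput)
- bi: blocks received from a block device per second
- bo: blocks sent to a block device per second
system (interrupt and context-switch activity)
- in: interrupts per second, including the clock
- cs: context switches per second
cpu (the values below add up to 100% and break down CPU usage)
- us: time spent running non-kernel code
- sy: time spent running kernel code
- id: idle time
- wa: time spent waiting for I/O
- st: time stolen from this virtual machine by the hypervisor

With vmstat we can see the state of the CPU, the I/O devices, and memory at a glance, which gives us a basis for locating system bottlenecks.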
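
To make the column list concrete, here is a small Python sketch that parses one data row of the default vmstat output shown above and applies a few checks. The function names and the thresholds (run queue longer than the core count, any blocked process, swap activity above ten) are assumptions for illustration only; the swap limit comes from the rule of thumb in the list.

```python
# Column names of the default vmstat layout (without -a).
FIELDS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

def parse_vmstat_line(line):
    """Turn one vmstat data row into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, map(int, values)))

def find_hints(sample, cores, swap_limit=10):
    """Return human-readable hints for one sample (illustrative thresholds)."""
    hints = []
    if sample["r"] > cores:
        hints.append("run queue longer than core count: CPU may be saturated")
    if sample["b"] > 0:
        hints.append("processes in uninterruptible sleep: check I/O")
    if sample["si"] > swap_limit or sample["so"] > swap_limit:
        hints.append("swapping in progress: memory may be short")
    return hints

# First data row from the vmstat 1 5 output above.
row = " 1  0      0 582132   1260 2738456    0    0     1     2   30   15  0  0 100  0  0"
sample = parse_vmstat_line(row)
print(sample["free"], sample["id"], find_hints(sample, cores=2))
```

For the idle machine in the sample the hint list is empty, which matches what the raw numbers suggest.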

4. mpstat: viewing per-CPU statistics

[root@localhost ~]# mpstat -P ALL 1 2
Linux 3.10.0-327.el7.x86_64 (localhost.localdomain) 	01/11/2021 	_x86_64_	(2 CPU)

10:51:17 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
10:51:18 AM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
10:51:18 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
10:51:18 AM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

10:51:18 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
10:51:19 AM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
10:51:19 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
10:51:19 AM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

As shown above, the machine has two CPUs, and mpstat reports the metrics for each of them separately.

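%usr: percentage of CPU time spent in user-level processes
%nice: percentage of CPU time spent in user-level processes running with a nice priority
%sys: percentage of CPU time spent in kernel mode (excluding the time spent servicing hardware and software interrupts)
%iowait: percentage of time the CPU was idle while there was an outstanding disk I/O request
%irq: percentage of CPU time spent servicing hardware interrupts
%soft: percentage of CPU time spent servicing software interrupts
%steal: percentage of time the virtual CPU spent in involuntary wait while the hypervisor was servicing another virtual processor
%guest: percentage of CPU time spent running a virtual processor
%gnice: percentage of CPU time spent running a niced guest
%idle: percentage of time the CPU was idle with no outstanding disk I/O request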
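
One common use of the per-CPU breakdown is spotting a single hot core. The sketch below is an illustration under stated assumptions: the two data rows are made-up sample values (not taken from the output above, where every core is idle), and the helper names are hypothetical. It treats 100 minus %idle as the busy share.

```python
COLUMNS = ["usr", "nice", "sys", "iowait", "irq", "soft",
           "steal", "guest", "gnice", "idle"]

def parse_mpstat_row(line):
    """Parse one mpstat data row into (cpu_label, {column: percent})."""
    parts = line.split()
    numbers = [float(p) for p in parts[-len(COLUMNS):]]
    cpu = parts[-len(COLUMNS) - 1]      # "all", "0", "1", ...
    return cpu, dict(zip(COLUMNS, numbers))

def busiest_cpu(lines):
    """Return (cpu, busy_percent) for the core with the least idle time."""
    parsed = [parse_mpstat_row(line) for line in lines]
    per_cpu = [(cpu, 100.0 - m["idle"]) for cpu, m in parsed if cpu != "all"]
    return max(per_cpu, key=lambda item: item[1])

# Hypothetical rows in the same layout as the mpstat output above.
rows = [
    "10:51:18 AM  all   18.00    0.00    3.00    0.00    0.00    0.00    0.00    0.00    0.00   79.00",
    "10:51:18 AM    0   35.00    0.00    5.00    0.00    0.00    0.00    0.00    0.00    0.00   60.00",
    "10:51:18 AM    1    1.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   98.00",
]
print(busiest_cpu(rows))
```

A large gap between the busiest core and the "all" average often means a single-threaded process is the bottleneck, which the aggregate figure alone would hide.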

5. pidstat: per-task statistics, similar to top

[root@localhost ~]# pidstat 1 3
Linux 3.10.0-327.el7.x86_64 (localhost.localdomain) 	01/11/2021 	_x86_64_	(2 CPU)

11:10:12 AM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
11:10:13 AM     0     11997    0.98    0.98    0.00    1.96     0  pidstat

11:10:13 AM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
11:10:14 AM     0     11997    1.00    2.00    0.00    3.00     0  pidstat

11:10:14 AM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
11:10:15 AM     0     11997    0.00    0.99    0.00    0.99     0  pidstat

Average:      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     11997    0.66    1.32    0.00    1.98     -  pidstat

6. iostat: statistics for I/O devices (usually disks and partitions)

[root@kafka3 ~]# iostat -xz 1 5
Linux 3.10.0-1127.19.1.el7.x86_64 (kafka3.sd.cn) 	01/10/2021 	_x86_64_	(4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.35    0.00    0.20    0.00    0.00   99.44

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.02    0.01    1.12     0.22     5.35     9.88     0.01    8.58    7.13    8.59   7.58   0.85
sdb               0.00     0.00    0.00    0.00     0.00     0.00    58.72     0.00    0.33    0.33    0.00   0.14   0.00
dm-0              0.00     0.00    0.01    0.87     0.20     5.35    12.67     0.01   11.39    7.11   11.44   9.75   0.85
dm-1              0.00     0.00    0.00    0.00     0.00     0.00    50.09     0.00   14.48   14.48    0.00  11.82   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00    48.19     0.00    0.56    0.56    0.00   0.30   0.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00    48.19     0.00    0.40    0.40    0.00   0.23   0.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00    48.19     0.00    0.42    0.42    0.00   0.30   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00    83.98     0.00   13.35   16.40    7.48  11.57   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    0.25    0.50    0.00   99.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    1.00    3.00     8.00    12.00    10.00     0.04    9.25   14.00    7.67   9.25   3.70
dm-0              0.00     0.00    1.00    2.00     8.00    12.00    13.33     0.04   12.33   14.00   11.50  12.33   3.70

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    0.00    0.00    0.00   99.75

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    0.25    0.00    0.00   99.50

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    0.00    0.00    0.00   99.75

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util

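There are many fields; see the man page for the full details. Only the key ones are covered here.
rrqm/s wrqm/s
The number of read (r) and write (w) requests merged per second, that is, how many logical requests were merged into a single request before being issued to the physical disk.
r/s w/s rkB/s wkB/s
r/s and w/s are the read and write requests issued to the device per second; rkB/s and wkB/s are the kilobytes read from and written to the device per second.
avgrq-sz avgqu-sz
avgrq-sz is the average size, in sectors, of the requests issued to the device (rq stands for request). avgqu-sz is the average length of the request queue (qu stands for queue).
await r_await w_await
The average time, in milliseconds, that I/O requests (all, read, and write respectively) take to be served, including the time spent waiting in the queue.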
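
The relationship between these fields can be checked with a short Python sketch. It parses the sda row from the second report above and recomputes avgrq-sz from the throughput and request-rate columns. The helper names are hypothetical, and the 512-byte sector size is an assumption matching how iostat reports sectors.

```python
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz "
          "await r_await w_await svctm %util").split()

def parse_iostat_row(line):
    """Parse one `iostat -x` device row into a dict keyed by column name."""
    parts = line.split()
    stats = dict(zip(HEADER[1:], map(float, parts[1:])))
    stats["device"] = parts[0]
    return stats

def avg_request_sectors(stats, sector_bytes=512):
    """Recompute avgrq-sz: bytes transferred per second / sector size / requests."""
    requests = stats["r/s"] + stats["w/s"]
    if requests == 0:
        return 0.0
    kib = stats["rkB/s"] + stats["wkB/s"]
    return kib * 1024 / sector_bytes / requests

# sda row from the second iostat report above.
row = ("sda               0.00     0.00    1.00    3.00     8.00    12.00"
       "    10.00     0.04    9.25   14.00    7.67   9.25   3.70")
stats = parse_iostat_row(row)
print(stats["device"], avg_request_sectors(stats), stats["avgrq-sz"])
```

Here 20 kB/s spread over 4 requests per second is 5 kB per request, or 10 sectors, which agrees with the avgrq-sz value of 10.00 that iostat printed.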

7. sar: network monitoring
7.1 Checking NIC throughput

[root@localhost ~]# sar -n DEV 1 3
Linux 3.10.0-327.el7.x86_64 (localhost.localdomain) 	01/11/2021 	_x86_64_	(2 CPU)

11:33:07 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
11:33:08 AM     ens32      2.00      0.00      0.14      0.00      0.00      0.00      0.00
11:33:08 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:08 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:08 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:08 AM     ens33      2.00      0.00      0.14      0.00      0.00      0.00      0.00

11:33:08 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
11:33:09 AM     ens32      2.00      1.00      0.12      0.65      0.00      0.00      0.00
11:33:09 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:09 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:09 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:09 AM     ens33      1.00      0.00      0.06      0.00      0.00      0.00      0.00

11:33:09 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
11:33:10 AM     ens32      1.00      1.00      0.06      0.65      0.00      0.00      0.00
11:33:10 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:10 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:10 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:33:10 AM     ens33      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:        IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
Average:        ens32      1.67      0.67      0.10      0.43      0.00      0.00      0.00
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:    virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:       virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        ens33      1.00      0.00      0.07      0.00      0.00      0.00      0.00

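rxpck/s: total number of packets received per second.
txpck/s: total number of packets transmitted per second.
rxkB/s txkB/s: total number of kilobytes received and transmitted per second.
rxcmp/s txcmp/s: number of compressed packets received and transmitted per second (cmp stands for compressed).

7.2 Checking TCP statistics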

[root@localhost ~]#  sar -n TCP 1 3
Linux 3.10.0-327.el7.x86_64 (localhost.localdomain) 	01/11/2021 	_x86_64_	(2 CPU)

11:39:10 AM  active/s passive/s    iseg/s    oseg/s
11:39:11 AM      0.00      0.00      0.00      0.00
11:39:12 AM      0.00      0.00      1.00      1.00
11:39:13 AM      0.00      0.00      1.00      1.00
Average:         0.00      0.00      0.67      0.67
[root@localhost ~]#  sar -n TCP,ETCP 1 3
Linux 3.10.0-327.el7.x86_64 (localhost.localdomain) 	01/11/2021 	_x86_64_	(2 CPU)

11:40:04 AM  active/s passive/s    iseg/s    oseg/s
11:40:05 AM      0.00      0.00      1.00      0.00

11:40:04 AM  atmptf/s  estres/s retrans/s isegerr/s   orsts/s
11:40:05 AM      0.00      0.00      0.00      0.00      0.00

11:40:05 AM  active/s passive/s    iseg/s    oseg/s
11:40:06 AM      0.00      0.00      1.00      1.00

11:40:05 AM  atmptf/s  estres/s retrans/s isegerr/s   orsts/s
11:40:06 AM      0.00      0.00      0.00      0.00      0.00

11:40:06 AM  active/s passive/s    iseg/s    oseg/s
11:40:07 AM      0.00      0.00      1.00      1.00

11:40:06 AM  atmptf/s  estres/s retrans/s isegerr/s   orsts/s
11:40:07 AM      0.00      0.00      0.00      0.00      0.00

Average:     active/s passive/s    iseg/s    oseg/s
Average:         0.00      0.00      1.00      0.67

Average:     atmptf/s  estres/s retrans/s isegerr/s   orsts/s
Average:         0.00      0.00      0.00      0.00      0.00

Three key metrics:

active/s: The number of times TCP connections have made a direct transition to the SYN-SENT state from the CLOSED state per second [tcpActiveOpens]. In other words, the number of TCP connections initiated by the local host per second.
passive/s: The number of times TCP connections have made a direct transition to the SYN-RCVD state from the LISTEN state per second [tcpPassiveOpens]. In other words, the number of TCP connections initiated by remote hosts per second.
retrans/s: The total number of segments retransmitted per second - that is, the number of TCP segments transmitted containing one or more previously transmitted octets [tcpRetransSegs]. In other words, the number of segments retransmitted per second.
A high retransmission rate usually indicates packet loss.
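
A raw retrans/s number is easier to judge relative to the outgoing traffic. The sketch below is a rough indicator only, not something sar itself reports: it divides retrans/s by oseg/s using the two Average rows above, and the function names are hypothetical.

```python
def parse_sar_numbers(line, count):
    """Return the last `count` numeric fields of a sar row as floats."""
    return [float(x) for x in line.split()[-count:]]

def retransmit_ratio(oseg_per_s, retrans_per_s):
    """Retransmitted segments as a percentage of segments sent (rough indicator)."""
    if oseg_per_s <= 0:
        return 0.0
    return 100.0 * retrans_per_s / oseg_per_s

# Average rows from the `sar -n TCP,ETCP 1 3` output above.
tcp_row = "Average:         0.00      0.00      1.00      0.67"
etcp_row = "Average:         0.00      0.00      0.00      0.00      0.00"

active, passive, iseg, oseg = parse_sar_numbers(tcp_row, 4)
atmptf, estres, retrans, isegerr, orsts = parse_sar_numbers(etcp_row, 5)
print(f"retransmit ratio: {retransmit_ratio(oseg, retrans):.2f}%")
```

What counts as too high depends on the network; the ratio is mainly useful for comparing the same host over time.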

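8. netstat
Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
This tool is mainly for inspecting network state. It may not be directly about "performance", but it can still reveal a lot, for example how many connections are in TIME_WAIT and which service port they belong to. (TIME_WAIT sockets are no longer owned by a process, so the PID/Program name column shows - for them.)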

[root@localhost ~]# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      16308/mysqld        
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN      1584/distccd        
tcp        0      0 192.168.122.1:53        0.0.0.0:*               LISTEN      2823/dnsmasq        
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1577/sshd           
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      1580/cupsd          
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2235/master         
tcp        0      0 10.10.154.22:22        10.10.157.52:51734     ESTABLISHED 27116/sshd: root@no 
tcp        0      0 10.10.154.22:22        10.10.157.52:56956     ESTABLISHED 7862/sshd: root@pts 
tcp        0      0 10.10.154.22:3306      10.10.154.148:58896    ESTABLISHED 16308/mysqld        
tcp6       0      0 :::22                   :::*                    LISTEN      1577/sshd           
tcp6       0      0 ::1:631                 :::*                    LISTEN      1580/cupsd          
tcp6       0      0 ::1:25                  :::*                    LISTEN      2235/master

Commonly used options:

-t, --tcp                display TCP sockets
-a, --all                display all sockets (default: connected)
-n, --numeric            don't resolve names
-p, --programs           display PID/Program name for sockets
-l, --listening          display listening server sockets
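
Counting sockets per state is a quick way to spot a pile-up of TIME_WAIT or CLOSE_WAIT connections. The following Python sketch tallies the State column of netstat-style text; the function name is hypothetical and the sample is a subset of the output above.

```python
from collections import Counter

def count_states(netstat_output):
    """Count TCP sockets per state from `netstat -ant`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Data rows start with tcp or tcp6; the state is the sixth column.
        if len(parts) >= 6 and parts[0].startswith("tcp"):
            states[parts[5]] += 1
    return states

sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      16308/mysqld
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1577/sshd
tcp        0      0 10.10.154.22:22        10.10.157.52:51734     ESTABLISHED 27116/sshd: root@no
tcp        0      0 10.10.154.22:3306      10.10.154.148:58896    ESTABLISHED 16308/mysqld
tcp6       0      0 :::22                   :::*                    LISTEN      1577/sshd
"""

for state, count in count_states(sample).most_common():
    print(state, count)
```

On a live system the same function could be fed the text captured from `netstat -ant`, for example through Python's `subprocess.run` with `capture_output=True` and `text=True`.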

