eBCC性能分析最佳实践(1) - 线上lstat, vfs_fstatat 开销高情景分析 ...

本文深入探讨eBCC在Aliyun Linux 2上的应用,通过案例分析lstat系统调用的高频问题,揭示logagent如何影响系统性能,并提供解决策略。

Guide:

0. Intro

BCC是基于4.x kernel版本上的ebpf发展出来的一套性能分析工具集;

eBCC,顾名思义则是extended BCC的缩写,是阿里巴巴内核团队在Aliyun Linux 2 上对BCC项目的拓展,包含BCC本身已有的工具集,和我们新开发的一些小的工具; eBCC则是基于在最新的BCC版本0.9之上做了一些拓展。

Aliyun Linux则使用了相对比较前沿,较新的kernel版本,支持ebpf特性,所以,如果想尝试ebpf,eBCC带来的系统上对“性能优化调试” 和 “问题诊断”上的新体验,那就请赶快升级到 Aliyun Linux 2 上吧。

1. eBCC 分析 "A业务线" 线上机器cpu sys - lstat开销大

对于kernel团队的同学来说,排查问题,和性能数据分析的时候,通常情况下会先从宏观的脚本去分析整机各个子系统:SCHED,MEM,IO,NET的性能是否存在bottleneck。然后再通过各种专业的工具来剖析对应的子系统。

当我们遇到cpu sys很高的情况的时候,根据我们的经验,比较常见的可能原因为:

  1. 可能是某个kernel thread在疯狂
  2. 可能是某个syscall被用户层频繁调用,或者是某个syscall的链路上出现很大的latency

第一种情况可能比较好发现问题,针对第二种情况,比较麻烦,难以发现。

于是我们开发了eBCC工具集,来帮助用户定位问题。

下面让我们开始寻找一个线上机器, 看是否存在性能瓶颈吧....

  • 业务方: A业务
  • 诊断物理机: xxxxx
  • 内核版本: 4.19
  • 诊断工具: eBCC

1.1 SYS级别诊断 - 发现问题

如下图所示,是目标机器top10开销较大的syscall。其中TIME, MIN, MAX的时间单位都是ns

注意,这里我们的工具实际上采集了4个数据,次数,时间,最小时间,最大时间。

实际上,只有这四个数据都存在的时候,我们才能很快的分析出这里是否存在问题,是否正常。 如果我们仅仅知道COUNT,不知道时间,这是不能说明问题的。

因为次数多,不一定代表开销大。开销大,不一定代表次数多。

实际上select,poll这些操作所占用的时间开销是属于正常的的,但是,看到lstat系统调用时间开销多达338769324ns(338ms), 就是不正常的现象了! 因此,这可能就是一个可以优化的地方...

进一步分析lstat的MIN TIME, MAX TIME看上去都不大,因此可以确定这个syscall链路上并没有出现latency很大的问题,但是COUNT多达124724次,这也许才是开销很大的关键。。。

换几台线上机器,发现都存在lstat开销比较大的问题...

image

因此我们需要进一步排查哪个pid导致?

1.2 PID级别诊断 - 定位问题

  • eBCC的syscall性能分析模块,支持SYS级别和PID级别的数据采集, 我们一般应该先从SYS级别入手,发现问题的方向之后,再通过PID级别找到触发问题的元凶。
  • 第1列: pid, 第2列: syscall NR , 第3列: syscall name, 第4列 是调用次数, 第5列 是调用时间开销,单位ns, 第6列: min time (ns), 第7列: max time(ns)

image

1.3 找到元凶: PID:91885

image

1.4 Code分析

tools/perf/builtin-trace.c:     { .name     = "lstat",      .errmsg = true, .alias = "newlstat", },

SYSCALL_DEFINE2(newlstat, const char __user *, filename,
        struct stat __user *, statbuf)
{
    struct kstat stat;
    int error;

    error = vfs_lstat(filename, &stat);
    if (error)
        return error;

    return cp_new_stat(&stat, statbuf);
}

int vfs_lstat(const char __user *name, struct kstat *stat)
{
    return vfs_fstatat(AT_FDCWD, name, stat, AT_SYMLINK_NOFOLLOW);
}
EXPORT_SYMBOL(vfs_lstat);

1.5 perf辅助证明

通过perf来帮助证明eBCC的数据采集是否精确。

[root@xxx /root]
#perf stat -e syscalls:sys_enter_newlstat  -a sleep 4

 Performance counter stats for 'system wide':

            449931      syscalls:sys_enter_newlstat

       4.000941076 seconds time elapsed

基本可以说明,lstat的触发次数确实很大。

1.6 syscall次数统计

[root@xxx /root]
#ebcc syscount
Tracing syscalls, printing top 10... Ctrl+C to quit.

^C[14:58:52]
SYSCALL                   COUNT
close                   1173795
lstat                   1052795
futex                    667981
read                     325349
getdents64               257280
epoll_wait               255386
epoll_ctl                250355
write                    144328
openat                   115240
open                     111810

eBCC另外一个工具syscount 同样可以采集出次数。

1.7 vfs层证明

lstat 系统调用同样会反应到vfs的触发次数: vfs_fstatat

[root@xxx /root]
#ebcc vfscount
Tracing... Ctrl-C to end.

^C
ADDR             FUNC                          COUNT
ffffffff8124f8b1 vfs_symlink                       1
ffffffff8124f711 vfs_create                       11
ffffffff812426e1 vfs_readv                        22
ffffffff8124fbd1 vfs_unlink                       27
ffffffff81277c31 vfs_fsync_range                  98
ffffffff812790d1 vfs_statfs                     1126
ffffffff81269d51 vfs_getxattr                   1248
ffffffff812428f1 vfs_writev                     1749
ffffffff81250b01 vfs_rename                     2353
ffffffff8129bb51 vfs_lock_file                  6167
ffffffff8124d031 vfs_get_link                  21829
ffffffff812476b1 vfs_fstat                     50744
ffffffff81242301 vfs_write                    151061
ffffffff81240891 vfs_open                     181734
ffffffff812421d1 vfs_read                     336899
ffffffff81247711 vfs_fstatat                 1055146    --- tigger counts
ffffffff81247681 vfs_getattr                 1109493
ffffffff81247401 vfs_getattr_nosec           1130229

1.8 核心分析:开销来自于 “用户调用次数” 还是 “内核里vfs_fstatat 函数本身的latency” ?

Answer:

如下图所示,基本都在2-3个us就完成了。基本上可以说,latency不是很大,基本符合disk,nvme本身的latency,因此,开销主要来自counts,因此,优化调用次数,比 优化 这个内核函数本身的latency要有意义。

image

1.9 根本原因

通过eBCC中的工具追踪,发现logagent疯狂的使用vfs_fstatat,遍历A业务/home/admin/XXXX/worker/slave/disk_links/9999/ 目录下的文件, 至此,我们基本就把问题分析清楚了。

[root@xxx /home]
#/usr/share/ebcc/tools/trace -p 91885 'vfs_fstatat "%s", arg2'
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc
91885   121566  logagent        vfs_fstatat      /home/admin/XXXX/worker/slave/disk_links/9999/auto_umars_YYYY_4_YYYY.TPP_mc

2. Conclution

image

  • logagent在这个机器上有大量的lstat操作,平均每秒:11w次; 整机open次数,平均每秒:12w次。
  • 整机看vfs_fstatat 本身的latency,基本都在2-3us内完成, 没有太大的内核态似乎没有太大latency可以优化,因此,用户态的程序 优化 也许有更好的 效果。
  • logagent在机器运行1s内,大概开销占用总cpu的:338ms; 现在整机96core,那么就相当于有96s, 如果整机cpu sys占用:3%, 机器运行1s内,内核sys大概占用了3.456s,那么如果优化掉这338ms,cpu sys利用率会降低9.7%=(338ms/3.456s)。
  • 查看“A业务线”多个机器,发现是个通用现象。
pip uninstall numpy Found existing installation: numpy 2.0.2 Uninstalling numpy-2.0.2: Would remove: c:\users\liang\appdata\roaming\python\python39\scripts\f2py.exe c:\users\liang\appdata\roaming\python\python39\scripts\numpy-config.exe c:\users\liang\appdata\roaming\python\python39\site-packages\numpy-2.0.2-cp39-cp39-win_amd64.whl c:\users\liang\appdata\roaming\python\python39\site-packages\numpy-2.0.2.dist-info\* c:\users\liang\appdata\roaming\python\python39\site-packages\numpy.libs\.load-order-numpy-2.0.2 c:\users\liang\appdata\roaming\python\python39\site-packages\numpy.libs\libscipy_openblas64_-caad452230ae4ddb57899b8b3a33c55c.dll c:\users\liang\appdata\roaming\python\python39\site-packages\numpy.libs\msvcp140-23ebcc0b37c8e3d074511f362feac48b.dll c:\users\liang\appdata\roaming\python\python39\site-packages\numpy\* Proceed (Y/n)? y Successfully uninstalled numpy-2.0.2 ERROR: Exception: Traceback (most recent call last): File "D:\anaconda\lib\site-packages\pip\_internal\cli\base_command.py", line 167, in exc_logging_wrapper status = run_func(*args) File "D:\anaconda\lib\site-packages\pip\_internal\commands\uninstall.py", line 103, in run uninstall_pathset.commit() File "D:\anaconda\lib\site-packages\pip\_internal\req\req_uninstall.py", line 424, in commit self._moved_paths.commit() File "D:\anaconda\lib\site-packages\pip\_internal\req\req_uninstall.py", line 277, in commit save_dir.cleanup() File "D:\anaconda\lib\site-packages\pip\_internal\utils\temp_dir.py", line 173, in cleanup rmtree(self._path) File "D:\anaconda\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 326, in wrapped_f return self(f, *args, **kw) File "D:\anaconda\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 406, in __call__ do = self.iter(retry_state=retry_state) File "D:\anaconda\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 362, in iter raise retry_exc.reraise() File "D:\anaconda\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 195, in reraise raise self.last_attempt.result() File "D:\anaconda\lib\concurrent\futures\_base.py", line 439, in result return self.__get_result() File "D:\anaconda\lib\concurrent\futures\_base.py", line 391, in __get_result raise self._exception File "D:\anaconda\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 409, in __call__ result = fn(*args, **kwargs) File "D:\anaconda\lib\site-packages\pip\_internal\utils\misc.py", line 128, in rmtree shutil.rmtree(dir, ignore_errors=ignore_errors, onerror=rmtree_errorhandler) File "D:\anaconda\lib\shutil.py", line 759, in rmtree return _rmtree_unsafe(path, onerror) File "D:\anaconda\lib\shutil.py", line 629, in _rmtree_unsafe onerror(os.unlink, fullname, sys.exc_info()) File "D:\anaconda\lib\shutil.py", line 627, in _rmtree_unsafe os.unlink(fullname) PermissionError: [WinError 5] 拒绝访问。: 'C:\\Users\\liang\\AppData\\Roaming\\Python\\Python39\\site-packages\\~umpy.libs\\libscipy_openblas64_-caad452230ae4ddb57899b8b3a33c55c.dll' C:\Users\liang>pip install numpy==1.23.5 Defaulting to user installation because normal site-packages is not writeable Collecting numpy==1.23.5 Downloading numpy-1.23.5-cp39-cp39-win_amd64.whl (14.7 MB) ╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.2/14.7 MB 8.5 kB/s eta 0:28:18 ERROR: Exception: Traceback (most recent call last): File "D:\anaconda\lib\site-packages\pip\_vendor\urllib3\response.py", line 435, in _error_catcher yield File "D:\anaconda\lib\site-packages\pip\_vendor\urllib3\response.py", line 516, in read data = self._fp.read(amt) if not fp_closed else b"" File "D:\anaconda\lib\site-packages\pip\_vendor\cachecontrol\filewrapper.py", line 90, in read data = self.__fp.read(amt) File "D:\anaconda\lib\http\client.py", line 463, in read n = self.readinto(b) File "D:\anaconda\lib\http\client.py", line 507, in readinto n = self.fp.readinto(b) File "D:\anaconda\lib\socket.py", line 704, in readinto return self._sock.recv_into(b) File "D:\anaconda\lib\ssl.py", line 1242, in recv_into return self.read(nbytes, buffer) File "D:\anaconda\lib\ssl.py", line 1100, in read return self._sslobj.read(len, buffer) socket.timeout: The read operation timed out During handling of the above exception, another exception occurred: Traceback (most recent call last): File "D:\anaconda\lib\site-packages\pip\_internal\cli\base_command.py", line 167, in exc_logging_wrapper status = run_func(*args) File "D:\anaconda\lib\site-packages\pip\_internal\cli\req_command.py", line 247, in wrapper return func(self, options, args) File "D:\anaconda\lib\site-packages\pip\_internal\commands\install.py", line 369, in run requirement_set = resolver.resolve( File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\resolver.py", line 92, in resolve result = self._result = resolver.resolve( File "D:\anaconda\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 481, in resolve state = resolution.resolve(requirements, max_rounds=max_rounds) File "D:\anaconda\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 348, in resolve self._add_to_criteria(self.state.criteria, r, parent=None) File "D:\anaconda\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 172, in _add_to_criteria if not criterion.candidates: File "D:\anaconda\lib\site-packages\pip\_vendor\resolvelib\structs.py", line 151, in __bool__ return bool(self._sequence) File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\found_candidates.py", line 155, in __bool__ return any(self) File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\found_candidates.py", line 143, in <genexpr> return (c for c in iterator if id(c) not in self._incompatible_ids) File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\found_candidates.py", line 47, in _iter_built candidate = func() File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\factory.py", line 206, in _make_candidate_from_link self._link_candidate_cache[link] = LinkCandidate( File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 297, in __init__ super().__init__( File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 162, in __init__ self.dist = self._prepare() File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 231, in _prepare dist = self._prepare_distribution() File "D:\anaconda\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 308, in _prepare_distribution return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True) File "D:\anaconda\lib\site-packages\pip\_internal\operations\prepare.py", line 438, in prepare_linked_requirement return self._prepare_linked_requirement(req, parallel_builds) File "D:\anaconda\lib\site-packages\pip\_internal\operations\prepare.py", line 483, in _prepare_linked_requirement local_file = unpack_url( File "D:\anaconda\lib\site-packages\pip\_internal\operations\prepare.py", line 165, in unpack_url file = get_http_url( File "D:\anaconda\lib\site-packages\pip\_internal\operations\prepare.py", line 106, in get_http_url from_path, content_type = download(link, temp_dir.path) File "D:\anaconda\lib\site-packages\pip\_internal\network\download.py", line 147, in __call__ for chunk in chunks: File "D:\anaconda\lib\site-packages\pip\_internal\cli\progress_bars.py", line 53, in _rich_progress_bar for chunk in iterable: File "D:\anaconda\lib\site-packages\pip\_internal\network\utils.py", line 63, in response_chunks for chunk in response.raw.stream( File "D:\anaconda\lib\site-packages\pip\_vendor\urllib3\response.py", line 573, in stream data = self.read(amt=amt, decode_content=decode_content) File "D:\anaconda\lib\site-packages\pip\_vendor\urllib3\response.py", line 538, in read raise IncompleteRead(self._fp_bytes_read, self.length_remaining) File "D:\anaconda\lib\contextlib.py", line 137, in __exit__ self.gen.throw(typ, value, traceback) File "D:\anaconda\lib\site-packages\pip\_vendor\urllib3\response.py", line 440, in _error_catcher raise ReadTimeoutError(self._pool, None, "Read timed out.") pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.
最新发布
06-03
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值