Debugging slab memory leaks in the Linux kernel: field notes

  1. Enable the kernel's SLUB debug options

+CONFIG_SLUB_DEBUG=y

+CONFIG_SLUB_DEBUG_ON=y

  2. Watch slabinfo

cat /proc/slabinfo

Record slabinfo right after boot, let the system run for a while, then record it again.

Look for the slab caches that have grown the most.
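
The before/after comparison can be automated. The following is a minimal sketch, assuming two snapshots of /proc/slabinfo have been saved to files. The function name slab_growth and the snapshot file names are made up for illustration. It prints every cache whose active object count (the second column) grew, largest growth first.

```bash
#!/bin/bash
# slab_growth BEFORE AFTER
# Compare two saved copies of /proc/slabinfo and print "<cache> <growth>"
# for every cache whose active object count (column 2) increased.
slab_growth() {
    awk '
        /^slabinfo/ || /^#/ { next }             # skip the two header lines
        NR == FNR { before[$1] = $2; next }      # first file: remember counts
        {
            delta = $2 - before[$1]              # caches absent from BEFORE count from 0
            if (delta > 0) printf "%s %d\n", $1, delta
        }
    ' "$1" "$2" | sort -k2,2nr
}
```

A typical session would save `cat /proc/slabinfo > slab.before` shortly after boot, save slab.after the same way later, and then run `slab_growth slab.before slab.after | head`. Reading /proc/slabinfo usually requires root.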

  3. Turn on slab tracing

echo 1 > /sys/kernel/slab/<leaking_slab>/trace
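Once enabled, the slab trace is printed to the console.

If the console is a serial port, the volume of output can easily make the system unresponsive. It is best to write a script that turns tracing off again after a short while: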

echo 1 > /sys/kernel/slab/<leaking_slab>/trace

sleep 60

echo 0 > /sys/kernel/slab/<leaking_slab>/trace
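
A slightly more defensive variant of the script above is sketched here. It assumes bash, and the function name trace_slab_for is hypothetical. The only difference is a trap that switches tracing off even if the script is interrupted during the sleep, so a flooded serial console is not left tracing forever.

```bash
#!/bin/bash
# trace_slab_for SECONDS KNOB
# Enable a slab trace knob for a bounded time and always switch it off again.
trace_slab_for() {
    local seconds=$1 knob=$2
    # Make sure tracing is disabled even if the script is interrupted.
    trap "echo 0 > '$knob'" EXIT INT TERM
    echo 1 > "$knob"
    sleep "$seconds"
    echo 0 > "$knob"
    trap - EXIT INT TERM
}
```

For the example in this article the call would be `trace_slab_for 60 /sys/kernel/slab/kmalloc-128/trace`. Afterwards the kernel log can be saved with `dmesg > kmalloc-t.txt`, bearing in mind that the log ring buffer may have wrapped if the trace was very chatty.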
  4. Analysis

The printed slab trace looks roughly like this:

[47744.480000] TRACE kmalloc-128 alloc 0x83df8300 inuse=16 fp=0x (null)
[47744.480000] Call Trace:
[47744.480000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.480000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.480000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.480000] [<800df2c0>] __kmalloc+0x98/0x148
[47744.480000] [<8308ad74>] amalloc_private+0x38/0x13c [asf]
[47744.480000] [<82aba2a8>] osif_forward_mgmt_to_app+0xa0/0x280 [umac]
[47744.480000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.480000]
[47744.530000] TRACE kmalloc-128 free 0x83df8300 inuse=16 fp=0x (null)
[47744.530000] Object 83df8300: 4d 61 6e 61 67 65 2e 70 72 6f 62 5f 72 65 71 20 Manage.prob_req
[47744.530000] Object 83df8310: 35 30 00 00 00 00 00 00 00 00 00 00 00 00 40 00 50............@.
[47744.530000] Object 83df8320: 00 00 ff ff ff ff ff ff 78 11 dc 0c 55 34 ff ff ........x...U4..
[47744.530000] Object 83df8330: ff ff ff ff 70 ad 00 08 63 68 5f 42 38 5f 32 47 ....p...ch_B8_2G
[47744.530000] Object 83df8340: 01 08 8b 96 82 84 0c 18 30 60 32 04 6c 12 24 48 ........0`2.l.$H
[47744.530000] Object 83df8350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.530000] Object 83df8360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.530000] Object 83df8370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.530000] Call Trace:
[47744.530000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.530000] [<8027d81c>] free_debug_processing+0x19c/0x218
[47744.530000] [<8027d8dc>] __slab_free+0x44/0x280
[47744.530000] [<82aba324>] osif_forward_mgmt_to_app+0x11c/0x280 [umac]
[47744.530000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.530000]
[47744.650000] TRACE kmalloc-128 alloc 0x830e0b00 inuse=16 fp=0x (null)
[47744.650000] Call Trace:
[47744.650000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.650000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.650000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.650000] [<800df2c0>] __kmalloc+0x98/0x148
[47744.650000] [<8308ad74>] amalloc_private+0x38/0x13c [asf]
[47744.650000] [<82aba2a8>] osif_forward_mgmt_to_app+0xa0/0x280 [umac]
[47744.650000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.650000]
[47744.700000] TRACE kmalloc-128 free 0x830e0b00 inuse=10 fp=0x830e0300
[47744.700000] Object 830e0b00: 4d 61 6e 61 67 65 2e 70 72 6f 62 5f 72 65 71 20 Manage.prob_req
[47744.700000] Object 830e0b10: 38 36 00 00 00 00 00 00 00 00 00 00 00 00 40 00 86............@.
[47744.700000] Object 830e0b20: 00 00 ff ff ff ff ff ff 78 11 dc 32 e2 53 ff ff ........x..2.S..
[47744.700000] Object 830e0b30: ff ff ff ff f0 8f 00 0d 58 69 61 6f 6d 69 5f 46 ........Xiaomi_F
[47744.700000] Object 830e0b40: 61 6d 69 6c 79 01 08 02 04 0b 0c 12 16 18 24 03 amily.........$.
[47744.700000] Object 830e0b50: 01 04 2d 1a 00 00 03 ff 00 00 00 00 00 00 00 00 ..-.............
[47744.700000] Object 830e0b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 32 04 ..............2.
[47744.700000] Object 830e0b70: 30 48 60 6c 00 00 00 00 00 00 00 00 00 00 00 00 0H`l............
[47744.700000] Call Trace:
[47744.700000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.700000] [<8027d81c>] free_debug_processing+0x19c/0x218
[47744.700000] [<8027d8dc>] __slab_free+0x44/0x280
[47744.700000] [<82aba324>] osif_forward_mgmt_to_app+0x11c/0x280 [umac]
[47744.700000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.700000]
[47744.810000] TRACE kmalloc-128 alloc 0x830e0b00 inuse=16 fp=0x (null)
[47744.810000] Call Trace:
[47744.810000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.810000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.810000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.810000] [<800dec80>] kmem_cache_alloc+0x3c/0xe4
[47744.810000] [<801c53b8>] sock_alloc_inode+0x4c/0xc4
[47744.810000] [<800f9080>] alloc_inode+0x28/0xac
[47744.810000] [<800fa328>] new_inode_pseudo+0x10/0x30
[47744.810000] [<801c6560>] sock_alloc+0x1c/0x80
[47744.810000] [<801c6b30>] __sock_create+0x8c/0x1cc
[47744.810000] [<801c6cec>] sock_create+0x38/0x44
[47744.810000] [<801c7294>] sys_socket+0x38/0x7c
[47744.810000] [<8006d8c4>] stack_done+0x20/0x40
[47744.810000]
[47744.900000] TRACE kmalloc-128 alloc 0x830e0500 inuse=16 fp=0x (null)
[47744.900000] Call Trace:
[47744.900000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.900000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.900000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.900000] [<800df2c0>] __kmalloc+0x98/0x148
[47744.900000] [<8308ad74>] amalloc_private+0x38/0x13c [asf]
[47744.900000] [<82aba2a8>] osif_forward_mgmt_to_app+0xa0/0x280 [umac]
[47744.900000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.900000]
[47744.950000] TRACE kmalloc-128 free 0x830e0500 inuse=11 fp=0x830e0300
[47744.950000] Object 830e0500: 4d 61 6e 61 67 65 2e 70 72 6f 62 5f 72 65 71 20 Manage.prob_req
[47744.950000] Object 830e0510: 37 39 00 00 00 00 00 00 00 00 00 00 00 00 40 00 79............@.
[47744.950000] Object 830e0520: 00 00 ff ff ff ff ff ff f0 b4 29 07 10 22 ff ff ..........)..".. 
[47744.950000] Object 830e0530: ff ff ff ff 00 9f 00 06 4d 49 2d 4d 41 43 01 08 ........MI-MAC..
[47744.950000] Object 830e0540: 02 04 0b 0c 12 16 18 24 03 01 03 2d 1a 00 00 03 .......$...-....
[47744.950000] Object 830e0550: ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.950000] Object 830e0560: 00 00 00 00 00 00 00 32 04 30 48 60 6c 00 00 00 .......2.0H`l...
[47744.950000] Object 830e0570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
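
The raw trace is hard to analyze by eye, so I wrote a crude script. Save the trace as kmalloc-t.txt:

grep "TRACE kmalloc-128 alloc" kmalloc-t.txt | awk '{print $5}' | sort > alloc.txt

grep "TRACE kmalloc-128 free" kmalloc-t.txt | awk '{print $5}' | sort > free.txt

This gives a sorted list of allocated addresses and a sorted list of freed addresses. Then use bcompare or vimdiff to check whether each object's alloc and free show up as a pair.

The memory blocks that were never freed stand out quite clearly.

Finally, look up in kmalloc-t.txt the addresses that are missing from free.txt, and judge by hand from their call traces whether each one is a plausible leak site.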
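
The pairing can also be done in a single pass instead of diffing two sorted lists. The sketch below is one possible approach and is not the script used for this article. The function name unfreed_objects is hypothetical. It replays the TRACE lines in order and keeps only the addresses whose most recent event is an alloc, which copes with an address being freed and then reused.

```bash
#!/bin/bash
# unfreed_objects TRACE_LOG CACHE
# Print the object addresses whose latest TRACE event for CACHE is an alloc.
unfreed_objects() {
    awk -v cache="$2" '
        {
            # Locate "TRACE <cache> alloc|free <addr>" wherever it sits on the
            # line, so a timestamp column that contains spaces does no harm.
            for (i = 1; i + 3 <= NF; i++) {
                if ($i == "TRACE" && $(i+1) == cache) {
                    if ($(i+2) == "alloc") live[$(i+3)] = 1
                    else if ($(i+2) == "free") delete live[$(i+3)]
                    break
                }
            }
        }
        END { for (addr in live) print addr }
    ' "$1" | sort
}
```

Run on the excerpt shown earlier with `unfreed_objects kmalloc-t.txt kmalloc-128`, this reports 0x830e0b00, the object last allocated through sock_alloc_inode with no free inside the excerpt. An object that is still live when tracing stops is only a candidate. It may simply be freed later, so the call trace still needs a manual look.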

SLAB
If the kernel uses the SLAB allocator instead, there are two common approaches: the slab leak helper in a debug kernel, or a tool such as systemtap. See https://access.redhat.com/solutions/358933

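Using the kernel's DEBUG_SLAB_LEAK feature
This requires a kernel built with the "CONFIG_DEBUG_SLAB_LEAK" option, which is off by default.

On RHEL or CentOS, the debug kernel has this option enabled: install the kernel-debug-* rpm package, reboot, and select the debug kernel.

Afterwards a file named slab_allocators appears under /proc. It records slab allocations like the ones below. Watching which code allocates from each slab helps locate suspicious leak sites. The drawback is that it records only the immediate caller, not a full backtrace: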

buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
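
To focus on one suspicious cache, the file can be filtered and ranked by count. The following is a minimal sketch, assuming a copy of /proc/slab_allocators saved to a file in the format shown above. The function name callers_of_cache is hypothetical.

```bash
#!/bin/bash
# callers_of_cache SNAPSHOT CACHE
# From a saved copy of /proc/slab_allocators, list "<count> <caller>" for
# one cache, with the caller that holds the most objects first.
callers_of_cache() {
    awk -v cache="$2:" '$1 == cache { print $2, $3 }' "$1" | sort -k1,1nr
}
```

With the sample above saved as slab_allocators.txt, `callers_of_cache slab_allocators.txt vm_area_struct` lists dup_mm+0x18f/0x370 (384) ahead of do_mmap_pgoff+0x2e0/0x7c3 (151). Comparing the ranking across two snapshots taken some time apart shows which caller keeps growing.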
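Using systemtap
Besides a debug kernel, another option is systemtap: placing probes at suitable points in the kernel helps find suspicious slab allocations, though this takes some familiarity with the kernel.

Ordinary slab caches are allocated through kmem_cache_alloc and can be observed with the ready-made systemtap probe vm.kmem_cache_alloc. That does not work in this case, because the "size-4096" cache here is one of slab's general-purpose caches that serve kmalloc(), so systemtap should probe kmalloc() instead. There is a ready-made script, "kmalloc-top", which works by probing __kmalloc() and recording backtraces. __kmalloc is the core function behind kmalloc(), and some code calls it directly, so probing it rather than kmalloc() avoids missing allocations. The stock script does not record the kmalloc size, so I modified it to add the size. The modified part is: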

#The systemtap script that instruments the kmalloc
$script="
global kmalloc_stack

probe kernel.function(\"__kmalloc\") { kmalloc_stack[\$size, backtrace()]++ }

probe timer.ms(100), end
{
  foreach ([size, stack] in kmalloc_stack) {
    printf(\"\\n\")
    printf(\" kmalloc size %d\\n\", size)
    print_syms(stack)
    printf(\"\\n\")
    printf(\"%d\\n\", kmalloc_stack[size, stack])
  }
  delete kmalloc_stack
}
";
Run it as root:

./kmalloc-top -o '--all-modules' > /tmp/kmtop.out

After a while, press Ctrl-C to exit. The result looks like this:


This path seen 1021 times:
kmalloc size 4096
0xffffffff811783e0 : __kmalloc+0x0/0x230 [kernel]
0xffffffffa022401e : 0xffffffffa022401e [sisips]
0xffffffffa024d46f : 0xffffffffa024d46f [sisips]
0xffffffffa023b763 : 0xffffffffa023b763 [sisips]
0xffffffffa022abca : 0xffffffffa022abca [sisips]
0xffffffffa022d51a : 0xffffffffa022d51a [sisips]
0xffffffff81290745 : _atomic_dec_and_lock+0x55/0x80 [kernel]
0xffffffff81193611 : __fput+0x1a1/0x210 [kernel]
0xffffffff810e884e : __audit_syscall_exit+0x25e/0x290 [kernel]
0xffffffff8100b0d2 : system_call_fastpath+0x16/0x1b [kernel]

This path seen 1021 times:
kmalloc size 4096
0xffffffff811783e0 : __kmalloc+0x0/0x230 [kernel]
0xffffffffa022401e : 0xffffffffa022401e [sisips]
0xffffffffa0224a32 : 0xffffffffa0224a32 [sisips]
0xffffffffa022abac : 0xffffffffa022abac [sisips]
0xffffffffa024f5c8 : 0xffffffffa024f5c8 [sisips]
0xffffffffa022d51a : 0xffffffffa022d51a [sisips]
0xffffffff81290745 : _atomic_dec_and_lock+0x55/0x80 [kernel]
0xffffffff81193611 : __fput+0x1a1/0x210 [kernel]
0xffffffff810e884e : __audit_syscall_exit+0x25e/0x290 [kernel]
0xffffffff8100b0d2 : system_call_fastpath+0x16/0x1b [kernel]

This path seen 853 times:
kmalloc size 4096
0xffffffff811783e0 : __kmalloc+0x0/0x230 [kernel]
0xffffffffa022401e : 0xffffffffa022401e [sisips]
0xffffffffa0222c30 : 0xffffffffa0222c30 [sisips]
0xffffffffa024d46f : 0xffffffffa024d46f [sisips]
0xffffffffa024dd49 : 0xffffffffa024dd49 [sisips]
0xffffffff81178001 : s_show+0x2c1/0x330 [kernel]
0xffffffffa02240bc : 0xffffffffa02240bc [sisips]
0xffffffffa023b783 : 0xffffffffa023b783 [sisips]
0xffffffffa022abca : 0xffffffffa022abca [sisips]
0xffffffffa022d51a : 0xffffffffa022d51a [sisips]
0xffffffff81290745 : _atomic_dec_and_lock+0x55/0x80 [kernel]
0xffffffff81193611 : __fput+0x1a1/0x210 [kernel]
0xffffffff810e884e : __audit_syscall_exit+0x25e/0x290 [kernel]
0xffffffff8100b0d2 : system_call_fastpath+0x16/0x1b [kernel]

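As shown, a large share of the size-4096 allocations come from the kernel module "sisips", which makes it a reasonable suspect. (This is a Symantec kernel module with no debuginfo on the system, so systemtap cannot resolve its backtrace symbols and can only print hexadecimal addresses.) To verify whether the module really causes the leak, disable it temporarily and watch /proc/slabinfo to see whether size-4096 stops its runaway growth. If it does, the module is clearly at fault.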
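
With many distinct paths in the output, a per-module total is easier to read than the raw stacks. The sketch below assumes the output format shown above ("This path seen N times:" followed by stack frames ending in "[module]"). The function name paths_by_module is hypothetical. Each path is attributed to the first frame that is not in the core kernel.

```bash
#!/bin/bash
# paths_by_module KMTOP_OUTPUT
# Sum the "This path seen N times" counts per module, attributing each path
# to the first stack frame whose trailing "[module]" tag is not "[kernel]".
paths_by_module() {
    awk '
        function commit() { if (count) total[owner] += count; count = 0 }
        /^This path seen/ { commit(); count = $4; owner = "kernel"; next }
        count && owner == "kernel" && $NF ~ /^\[.*\]$/ && $NF != "[kernel]" {
            owner = substr($NF, 2, length($NF) - 2)   # strip the brackets
        }
        END { commit(); for (m in total) printf "%d %s\n", total[m], m }
    ' "$1" | sort -k1,1nr
}
```

Running `paths_by_module /tmp/kmtop.out` prints one line per module, busiest first. These numbers count allocation calls, not leaked objects, so a high total marks a module worth checking and does not prove a leak.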

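Another approach: kmemleak
Another way to detect kernel memory leaks is the kmemleak tool. It is not tied to a single slab cache but covers all kernel memory. For details, see:
Detecting kernel memory leaks with KMEMLEAK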
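
As a rough idea of what using kmemleak involves, here is a minimal sketch. It assumes a kernel built with CONFIG_DEBUG_KMEMLEAK and debugfs mounted at /sys/kernel/debug, neither of which is covered in this article. The function name kmemleak_report is hypothetical, and the optional argument exists only so the control file path can be overridden.

```bash
#!/bin/bash
# kmemleak_report [CONTROL_FILE]
# Trigger an immediate kmemleak scan, then print the suspected leaks.
kmemleak_report() {
    local ctl=${1:-/sys/kernel/debug/kmemleak}
    echo scan > "$ctl"     # ask kmemleak to scan memory now
    cat "$ctl"             # reports include the allocation backtrace
}
```

Run as root, `kmemleak_report` prints the unreferenced objects kmemleak has found so far. Writing `clear` to the same file discards the current list, so a later scan shows only new suspects.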
