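Analysis of a system softlockup caused by a spinlock

This article analyzes in detail a softlockup failure caused by a spinlock. The stack traces show that the problem is a deadlock on a spinlock in the ext4 filesystem. Tracing further to the specific process and code path confirms that it is a bug in ext4 itself.

The business team reported that machines in the data center frequently hung and services stopped responding. Logging in to investigate showed that a softlockup had occurred.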

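I. Causes of softlockup

1. Deadlock (waiting for a lock resource).

2. A process keeps running in a loop without a scheduling check (cond_resched()).

3. The current task keeps preemption disabled for too long (preempt_disable(), spin_lock()).

4. An interrupt storm (irq storm) prevents the CPU from scheduling.

5. A softirq/tasklet runs for too long and prevents the CPU from scheduling.

6. A real-time thread monopolizes the CPU and starves the watchdog thread (resource overcommit).

7. A scheduler bug (nr_running is computed incorrectly).

8. A hardware bug (the CPU stays in the idle state).

9. Virtualization overcommit causes severe CPU steal, which leads to a softlockup (steal is very high).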

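II. Analysis

1. First, check what triggered the softlockup: the CPU cannot acquire the ext4 block group spinlock. This establishes that the softlockup is a deadlock on a spinlock.

2. Who took the spinlock?

bt -a shows the stacks on all CPUs. Every CPU is deadlocked, and all of them are stuck at one of two points: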

[2766246.212211]  [<ffffffff81741bb0>] _raw_spin_lock+0x20/0x30
[2766246.212238]  [<ffffffffa029e606>] ext4_free_inode+0x536/0x650 [ext4]
[2766246.212249]  [<ffffffffa02a8cfb>] ext4_evict_inode+0x44b/0x4c0 [ext4]
[2766246.212252]  [<ffffffff8126d05a>] evict+0xba/0x190
[2766246.212254]  [<ffffffff8126d4d2>] iput+0x1b2/0x230
[2766246.212257]  [<ffffffff8126720b>] dentry_unlink_inode+0xab/0xe0
[2766246.212260]  [<ffffffff812681e6>] __dentry_kill+0xb6/0x160
[2766246.212262]  [<ffffffff812683f1>] dput+0x161/0x270
[2766246.212266]  [<ffffffffa050c170>] ovl_dentry_release+0x20/0x60 [overlay]
[2766246.212268]  [<ffffffff81268205>] __dentry_kill+0xd5/0x160


[2766246.212454]  [<ffffffff81741bb0>] _raw_spin_lock+0x20/0x30
[2766246.212454]  [<ffffffffa029eb41>] __ext4_new_inode+0x421/0x14b0 [ext4]
[2766246.212455]  [<ffffffffa02b29f6>] ext4_create+0xc6/0x1c0 [ext4]
[2766246.212456]  [<ffffffff8125cd17>] vfs_create+0x127/0x1a0
[2766246.212456]  [<ffffffffa050f3bb>] ovl_create_real+0xab/0x220 [overlay]
[2766246.212457]  [<ffffffffa0510693>] ovl_create_or_link.part.5+0x1e3/0x6e0 [overlay]
[2766246.212457]  [<ffffffffa050dba9>] ? ovl_override_creds+0x19/0x20 [overlay]
[2766246.212458]  [<ffffffffa0512a38>] ? ovl_copy_up+0xc8/0x137 [overlay]
[2766246.212459]  [<ffffffff8126c1c0>] ? alloc_inode+0x30/0x80
[2766246.212459]  [<ffffffff8126c05b>] ? inode_sb_list_add+0x3b/0x50

In other words, the process that took the spinlock is not running on any CPU: it acquired the spinlock and was then scheduled out.
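
The triage step above (every CPU spinning at one of two call sites) can be sketched as a small log parser. This is only an illustrative Python sketch, not part of the original analysis: the name spin_sites and the regular expression are assumptions, and the sample keeps just the top two frames of each trace shown above.

```python
import re
from collections import Counter

# One frame of an oops-style backtrace, e.g.
# "[<ffffffffa029e606>] ext4_free_inode+0x536/0x650 [ext4]".
# Group 1 captures the optional '?' marker (unreliable frame), group 2 the symbol.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def spin_sites(log_text, lock_fn="_raw_spin_lock"):
    """Count which function directly called lock_fn in each backtrace."""
    sites = Counter()
    prev_was_lock = False
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if not m:
            prev_was_lock = False   # a non-frame line ends the current trace
            continue
        unreliable, symbol = m.group(1), m.group(2)
        if unreliable:
            continue                # skip '?' frames (stale stack contents)
        if prev_was_lock:
            sites[symbol] += 1      # frame right below lock_fn is its caller
        prev_was_lock = (symbol == lock_fn)
    return sites

# Top two frames of the two traces from the article.
SAMPLE = """\
[2766246.212211]  [<ffffffff81741bb0>] _raw_spin_lock+0x20/0x30
[2766246.212238]  [<ffffffffa029e606>] ext4_free_inode+0x536/0x650 [ext4]

[2766246.212454]  [<ffffffff81741bb0>] _raw_spin_lock+0x20/0x30
[2766246.212454]  [<ffffffffa029eb41>] __ext4_new_inode+0x421/0x14b0 [ext4]
"""

sites = spin_sites(SAMPLE)
for fn, count in sites.most_common():
    print(fn, count)
```

If every spinning CPU falls into a handful of call sites and none of those sites is past the lock acquisition, the holder has to be looked for among the tasks that are off-CPU.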

Listing the stacks of all processes with foreach bt -a turns up one suspicious process:

PID: 20410  TASK: ffff8831bb6d0000  CPU: 2   COMMAND: "nginx"
 #0 [ffffc900465d7820] __schedule at ffffffff8173ca3b
 #1 [ffffc900465d78a8] _cond_resched at ffffffff8173d1c6
 #2 [ffffc900465d78c0] __getblk_gfp at ffffffff81289acf
 #3 [ffffc900465d7930] find_inode_bit at ffffffffa029d368 [ext4]
 #4 [ffffc900465d7978] __ext4_new_inode at ffffffffa029ee33 [ext4]
 #5 [ffffc900465d7a30] ext4_create at ffffffffa02b29f6 [ext4]
 #6 [ffffc900465d7aa8] vfs_create at ffffffff8125cd17
 #7 [ffffc900465d7ae8] ovl_create_real at ffffffffa050f3bb [overlay]
 #8 [ffffc900465d7b20] ovl_create_or_link at ffffffffa0510693 [overlay]
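
Scanning a long per-task backtrace dump by eye is tedious, so the search for the suspect can be sketched as a filter. This is a minimal Python sketch under stated assumptions: the input is text in the format of the crash output above, the helper names parse_tasks and sleeping_in are hypothetical, and "top frame is __schedule" is used as a heuristic for "scheduled out".

```python
import re

# Header line of one task in crash "bt" output.
HEADER_RE = re.compile(
    r'^PID:\s*(\d+)\s+TASK:\s*(\w+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"([^"]*)"'
)
# Frame line, e.g. " #3 [ffffc900465d7930] find_inode_bit at ffffffffa029d368 [ext4]".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[\w+\]\s+(\S+)\s+at\s+\w+")

def parse_tasks(text):
    """Split the dump into tasks, each with its list of frame symbols."""
    tasks = []
    for line in text.splitlines():
        h = HEADER_RE.match(line)
        if h:
            tasks.append({"pid": int(h.group(1)), "comm": h.group(4), "frames": []})
            continue
        f = FRAME_RE.match(line)
        if f and tasks:
            tasks[-1]["frames"].append(f.group(2))
    return tasks

def sleeping_in(tasks, suspect_fn):
    """Tasks that are scheduled out while somewhere below suspect_fn."""
    return [
        t for t in tasks
        if t["frames"][:1] == ["__schedule"] and suspect_fn in t["frames"]
    ]

# The task from the article.
SAMPLE = '''\
PID: 20410  TASK: ffff8831bb6d0000  CPU: 2   COMMAND: "nginx"
 #0 [ffffc900465d7820] __schedule at ffffffff8173ca3b
 #1 [ffffc900465d78a8] _cond_resched at ffffffff8173d1c6
 #2 [ffffc900465d78c0] __getblk_gfp at ffffffff81289acf
 #3 [ffffc900465d7930] find_inode_bit at ffffffffa029d368 [ext4]
 #4 [ffffc900465d7978] __ext4_new_inode at ffffffffa029ee33 [ext4]
 #5 [ffffc900465d7a30] ext4_create at ffffffffa02b29f6 [ext4]
'''

tasks = parse_tasks(SAMPLE)
for t in sleeping_in(tasks, "__ext4_new_inode"):
    print(t["pid"], t["comm"], " <- ".join(t["frames"]))
```

The filter only narrows the candidates. Whether a candidate actually holds the lock still has to be confirmed against the source, as done next.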

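Looking at the source code:
After taking the group spinlock, __ext4_new_inode calls find_inode_bit, which eventually reaches the sleepable interface __getblk_gfp, so the task is scheduled out while still holding the lock.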
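
Why sleeping under a spinlock is fatal can be illustrated with a user-space analogy. This Python sketch is not kernel code and not the ext4 implementation: threading.Lock stands in for the group spinlock, time.sleep stands in for the reschedule inside __getblk_gfp, and the function name simulate and the hold time are assumptions.

```python
import threading
import time

def simulate(hold_seconds=0.3):
    """Holder sleeps with the lock held; the waiter busy-waits the whole time."""
    lock = threading.Lock()              # stand-in for the group spinlock
    holder_has_lock = threading.Event()
    result = {}

    def holder():
        with lock:
            holder_has_lock.set()
            time.sleep(hold_seconds)     # "scheduled out" while holding the lock

    def spinner():
        holder_has_lock.wait()
        start = time.monotonic()
        spins = 0
        # Spin like a spinlock waiter: retry without ever blocking.
        while not lock.acquire(blocking=False):
            spins += 1
        result["spins"] = spins
        result["stuck_for"] = time.monotonic() - start
        lock.release()

    threads = [threading.Thread(target=holder), threading.Thread(target=spinner)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result

result = simulate()
print(f"spun {result['spins']} times, stuck for {result['stuck_for']:.2f}s")
```

In the kernel the waiter spins with preemption disabled, so it monopolizes its CPU until the holder runs again. Once every CPU is occupied by such a waiter, the holder can never be scheduled back in, which matches the state seen in the bt -a output.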

This is clearly a bug in ext4 itself. Checking Linux mainline shows that the bug has already been fixed.
