1. Problem description
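During monkey + reboot stress testing on a project that uses a third-party GPU, the device showed black-screen and frozen-screen symptoms. Inspecting the device in its failed state established the following:
a. The backlight is on. Pressing the power key or touching the screen produces no input events in the framework-level log, but getevent can still read the event nodes, so the kernel input subsystem is reporting events.
b. adb shell can connect. There is no large number of processes in D or R state, and system_server, launcher, systemui, and surfaceflinger have all started.
c. The data partition is not full.
d. The watchdog did not trigger a system_server restart.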
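Check (b) can be scripted rather than eyeballed. The following is a minimal sketch, assuming the contents of the /proc/<pid>/stat files have already been collected from the device into a list of lines. The function names and the sample lines are hypothetical and are not taken from the original investigation. The only format fact relied on is that the process state is the first field after the parenthesized command name.

```python
from collections import Counter


def proc_state(stat_line):
    """Return (comm, state) parsed from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so locate the LAST ')' instead of splitting on spaces.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return comm, state


def count_states(stat_lines):
    """Count how many processes are in each scheduler state (S, D, R, ...)."""
    usable = (line for line in stat_lines if "(" in line and ")" in line)
    return Counter(proc_state(line)[1] for line in usable)


# Hypothetical sample input, shortened to the first few stat fields.
SAMPLE_STAT = [
    "3949 (system_server) S 620 620 0 0 -1",
    "4501 (com.android.systemui) S 620 620 0 0 -1",
    "5120 (kworker/u16:3) D 2 0 0 0 -1",
    "6001 (my (odd) name) R 1 1 0 0 -1",
]

if __name__ == "__main__":
    for state, count in sorted(count_states(SAMPLE_STAT).items()):
        print(state, count)
```

A large D count would point toward I/O or driver stalls, and a large R count toward CPU starvation. Neither was the case here, which is why the analysis moves on to lock contention inside system_server.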
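2. Problem analysis
The preliminary checks above indicate that the frozen/black screen is caused by the system_server main thread being blocked while the watchdog thread fails to kill system_server. With that direction established, we start with the system_server stack dump.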
----- pid 3949 at 2018-03-16 11:29:13 -----
Cmd line: system_server
"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x72a4fc60 self=0x75804bea00
  | sysTid=3949 nice=-2 cgrp=default sched=0/0 handle=0x76051f19b0
  | state=S schedstat=( 8228478377 501420108 5264 ) utm=683 stm=139 core=1 HZ=100
  | stack=0x7fc85c6000-0x7fc85c8000 stackSize=8MB
  | held mutexes=
  at com.android.server.am.ActivityManagerService.broadcastIntent(ActivityManagerService.java:20195)
  - waiting to lock <0x0b6f5f01> (a com.android.server.am.ActivityManagerServiceEx) held by thread 105
  at android.app.ActivityManager.broadcastStickyIntent(ActivityManager.java:4189)
  at android.app.ActivityManager.broadcastStickyIntent(ActivityManager.java:4179)
  at com.android.server.BatteryService$9.run(BatteryService.java:587)
The dump shows that the system_server main thread is waiting for lock 0x0b6f5f01, which is held by the thread with tid=105. Next, the stack of the tid=105 thread:
"Binder:3949_9" prio=5 tid=105 Blocked | group="main"sCount=1 dsCount=0 flags=1 obj=0x13a001d8 self=0x7566be8800 | sysTid=4812 nice=-2cgrp=default sched=0/0 handle=0x755b5f54f0 | state=S schedstat=(3587124213 577266984 5344 ) utm=290 stm=68 core=5 HZ=100 |stack=0x755b4fb000-0x755b4fd000 stackSize=1005KB | held mutexes= atcom.android.server.wm.StackWindowController.getBounds(StackWindowController.java:244) - waiting to lock<0x0452a7e7> (a com.android.server.wm.WindowHashMap) held by thread26 atcom.android.server.am.ActivityStack.getWindowContainerBounds(ActivityStack.java:574) atcom.android.server.am.ActivityStackSupervisor.getStackInfoLocked(ActivityStackSupervisor.java:4095) at com.android.server.am.ActivityStackSupervisor.getStackInfoLocked(ActivityStackSupervisor.java:4135) atcom.android.server.am.ActivityManagerService.getStackInfo(ActivityManagerService.java:11048) - locked<0x0b6f5f01> (a com.android.server.am.ActivityManagerServiceEx) atandroid.app.IActivityManager$Stub.onTransact(IActivityManager.java:2578) atcom.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:3003) atandroid.os.Binder.execTransact(Binder.java:723) |
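This dump shows why the Binder:3949_9 thread cannot release lock 0x0b6f5f01: Binder:3949_9 is itself blocked, waiting for lock 0x0452a7e7, which is held by the thread with tid=26. Next, the tid=26 thread: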
"android.anim" prio=5 tid=26 Native | group="main"sCount=1 dsCount=0 flags=1 obj=0x13884af8 self=0x7566a99a00 | sysTid=4091 nice=-10cgrp=default sched=0/0 handle=0x7563e7e4f0 | state=S schedstat=(7924231078 545363027 4993 ) utm=731 stm=61 core=6 HZ=100 |stack=0x7563d7c000-0x7563d7e000 stackSize=1037KB | held mutexes= kernel:__switch_to+0xa8/0xbc kernel: poll_schedule_timeout+0x68/0xa4 kernel:do_sys_poll+0x3a8/0x48c kernel:SyS_ppoll+0x1f0/0x224 kernel:__sys_trace_return+0x0/0x4 &n |