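For background on Android process startup, see gityuan's write-ups. This post records a problem I ran into, how I worked around it, and the further questions it raised.
1. The problem
On Android O, during normal use, system_server hangs with very low probability. In traces.txt, the system_server main thread is in the Blocked state: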
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x72b927d0 self=0x7eed0bea00
| sysTid=2196 nice=-2 cgrp=default sched=0/0 handle=0x7ef17a99b0
| state=S schedstat=( 0 0 0 ) utm=292 stm=117 core=1 HZ=100
| stack=0x7ff187a000-0x7ff187c000 stackSize=8MB
| held mutexes=
at com.android.server.am.ActivityManagerService.checkContentProviderAccess(ActivityManagerService.java:11204)
- waiting to lock <0x07934e4e> (a com.android.server.am.ActivityManagerService) held by thread 103
at com.android.server.am.ActivityManagerService$LocalService.checkContentProviderAccess(ActivityManagerService.java:23909)
at com.android.server.content.ContentService.notifyChange(ContentService.java:368)
at android.content.ContentResolver.notifyChange(ContentResolver.java:2046)
at com.android.providers.settings.SettingsProvider$SettingsRegistry$MyHandler.handleMessage(SettingsProvider.java:2877)
at android.os.Handler.dispatchMessage(Handler.java:105)
at android.os.Looper.loop(Looper.java:164)
at com.android.server.SystemServer.run(SystemServer.java:425)
at com.android.server.SystemServer.main(SystemServer.java:264)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:240)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:820)
……
"Binder:2196_1A" prio=5 tid=103 Native
| group="main" sCount=1 dsCount=0 flags=1 obj=0x13f0b4c8 self=0x7ed3a92c00
| sysTid=11637 nice=-2 cgrp=default sched=1073741824/0 handle=0x7eca5ff4f0
| state=S schedstat=( 0 0 0 ) utm=622 stm=333 core=1 HZ=100
| stack=0x7eca505000-0x7eca507000 stackSize=1005KB
| held mutexes=
kernel: (couldn't read /proc/self/task/11637/stack)
native: #00 pc 0000000000069184 /system/lib64/libc.so (recvmsg+4)
native: #01 pc 000000000011844c /system/lib64/libandroid_runtime.so (???)
native: #02 pc 00000000001180b4 /system/lib64/libandroid_runtime.so (???)
native: #03 pc 00000000005c74a4 /system/framework/arm64/boot-framework.oat (Java_android_net_LocalSocketImpl_readba_1native___3BIILjava_io_FileDescriptor_2+196)
at android.net.LocalSocketImpl.readba_native(Native method)
at android.net.LocalSocketImpl.-wrap1(LocalSocketImpl.java:-1)
at android.net.LocalSocketImpl$SocketInputStream.read(LocalSocketImpl.java:110)
- locked <0x04da015a> (a java.lang.Object)
at java.io.DataInputStream.readFully(DataInputStream.java:198)
at java.io.DataInputStream.readInt(DataInputStream.java:389)
at android.os.ZygoteProcess.zygoteSendArgsAndGetResult(ZygoteProcess.java:297)
at android.os.ZygoteProcess.startViaZygote(ZygoteProcess.java:431)
- locked <0x02d82f8b> (a java.lang.Object)
at android.os.ZygoteProcess.start(ZygoteProcess.java:207)
at android.os.Process.startWebView(Process.java:470)
at com.android.server.am.ActivityManagerService.startProcessLocked(ActivityManagerService.java:3929)
at com.android.server.am.ActivityManagerService.startProcessLocked(ActivityManagerService.java:3760)
at com.android.server.am.ActivityManagerService.startProcessLocked(ActivityManagerService.java:3649)
at com.android.server.am.ActiveServices.bringUpServiceLocked(ActiveServices.java:2161)
at com.android.server.am.ActiveServices.bindServiceLocked(ActiveServices.java:1439)
at com.android.server.am.ActivityManagerService.bindService(ActivityManagerService.java:18465)
- locked <0x07934e4e> (a com.android.server.am.ActivityManagerService)
at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:596)
at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:2944)
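The thread states show that system_server is stuck while starting a process. Thread 103 holds the ActivityManagerService lock and is waiting for zygote to write the new process's pid back over the socket, which is how system_server learns the pid of the app it started. The main thread is blocked behind that lock.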
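The lock pattern in the trace can be reproduced in plain Java. This is only a minimal sketch: `amsLock` stands in for the ActivityManagerService monitor, a `BlockingQueue` stands in for the zygote socket, and the class and thread names are made up for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

public class LockHeldAcrossRead {
    /** Returns the state of the "main" stand-in while the "binder" stand-in waits for a reply. */
    static Thread.State observeWaiterState() throws InterruptedException {
        final Object amsLock = new Object(); // hypothetical stand-in for the AMS monitor
        final BlockingQueue<Integer> zygoteReply = new LinkedBlockingQueue<>(); // stand-in for the socket
        final CountDownLatch lockTaken = new CountDownLatch(1);

        Thread binder = new Thread(() -> {
            synchronized (amsLock) {
                lockTaken.countDown();
                try {
                    zygoteReply.take(); // blocks with the lock held, like readInt() in the trace
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "binder-stand-in");

        Thread main = new Thread(() -> {
            synchronized (amsLock) {
                // like checkContentProviderAccess: needs the same lock
            }
        }, "main-stand-in");

        binder.start();
        lockTaken.await();   // make sure the binder stand-in owns the lock first
        main.start();

        // Poll until the main stand-in is parked on the monitor (give up after 2 s).
        long deadline = System.nanoTime() + 2_000_000_000L;
        while (main.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(1);
        }
        Thread.State observed = main.getState();

        zygoteReply.put(2927); // the reply finally arrives, so both threads can finish
        binder.join();
        main.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("main stand-in while the reply is missing: " + observeWaiterState());
    }
}
```

As long as the reply never arrives, the second thread stays blocked on the monitor, which is the same shape as "main" waiting on thread 103 in the trace.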
2. Analysis
A picture is worth a thousand words; the diagram is from gityuan.
Is the socket at fault, or is zygote?
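3. Workaround
The source code also has a suggestion to add a timeout here, but none has been added so far. The problem only shows up on Android O; I have not hit it on Android N or Android P.
So I tried adding one myself. The idea is to read the pid from zygote in a loop. One process start needs 5 bytes to be read,
so the loop condition is: current system uptime < deadline && bytes read < 5
deadline = system uptime on first entry + timeout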
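The loop condition above could be sketched roughly as follows. This is not the actual patch: the class and method names are invented, the clock is injected as a `LongSupplier` (on Android it would be `SystemClock.uptimeMillis()`), the stream is assumed to report pending bytes through `available()`, and the 5 bytes are assumed to be a 4-byte int pid followed by one flag byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.function.LongSupplier;

public class BoundedZygoteRead {
    /** Bytes expected per process start, per the text above. */
    static final int REPLY_LENGTH = 5;

    /** Reads REPLY_LENGTH bytes or throws once the monotonic deadline has passed. */
    static byte[] readReply(InputStream in, LongSupplier uptimeMillis, long timeoutMs)
            throws IOException {
        byte[] reply = new byte[REPLY_LENGTH];
        int read = 0;
        // deadline = uptime on first entry + timeout
        long deadline = uptimeMillis.getAsLong() + timeoutMs;
        while (read < REPLY_LENGTH && uptimeMillis.getAsLong() < deadline) {
            if (in.available() > 0) {
                int n = in.read(reply, read, REPLY_LENGTH - read);
                if (n < 0) {
                    throw new IOException("stream closed after " + read + " bytes");
                }
                read += n;
            } else {
                try {
                    Thread.sleep(1); // nothing pending yet; back off briefly
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting for reply", e);
                }
            }
        }
        if (read < REPLY_LENGTH) {
            throw new IOException("timed out after reading " + read + " bytes");
        }
        return reply;
    }

    /** Decodes the first 4 bytes as a big-endian int, as DataInputStream.readInt would. */
    static int pidOf(byte[] reply) {
        return ByteBuffer.wrap(reply, 0, 4).getInt();
    }

    public static void main(String[] args) throws IOException {
        // System.nanoTime() is the plain-Java monotonic stand-in for uptime.
        LongSupplier clock = () -> System.nanoTime() / 1_000_000L;
        byte[] wire = {0, 0, 0x0B, 0x6F, 0}; // pid 2927, then one flag byte (assumed layout)
        byte[] reply = readReply(new ByteArrayInputStream(wire), clock, 1000);
        System.out.println("pid=" + pidOf(reply));
    }
}
```

Injecting the clock keeps the deadline logic testable, and it makes the choice of time source explicit.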
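Note:
The deadline must be computed from system uptime, not simply from currentTimeMillis. Otherwise a jump in the wall-clock time causes the following exception: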
09-06 10:17:09.476 1000 2255 2509 E ZygoteProcess: Starting VM process through Zygote failed
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: Failure starting process com.****.******.****
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: java.lang.RuntimeException: Starting VM process through Zygote failed
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.os.ZygoteProcess.start(ZygoteProcess.java:214)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.os.Process.start(Process.java:453)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActivityManagerService.startProcessLocked(ActivityManagerService.java:3951)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActivityManagerService.startProcessLocked(ActivityManagerService.java:3777)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActivityManagerService.startProcessLocked(ActivityManagerService.java:3666)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActiveServices.bringUpServiceLocked(ActiveServices.java:2161)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActiveServices.bindServiceLocked(ActiveServices.java:1439)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActivityManagerService.bindService(ActivityManagerService.java:18520)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:596)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:2961)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.os.Binder.execTransact(Binder.java:674)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: Caused by: android.os.ZygoteStartFailedEx: fork() failed
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.os.ZygoteProcess.zygoteSendArgsAndGetResult(ZygoteProcess.java:319)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.os.ZygoteProcess.startViaZygote(ZygoteProcess.java:469)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: at android.os.ZygoteProcess.start(ZygoteProcess.java:208)
09-06 10:17:09.476 1000 2255 2509 E ActivityManager: ... 10 more
09-06 10:17:09.649 1000 2255 2846 W ActivityManager: No pending application record for pid 2927 (IApplicationThread android.app.IApplicationThread$Stub$Proxy@ba0a631); dropping process
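After this, process starts never succeed again and the "No pending application record for pid" error keeps repeating. The time jump made the code believe the read had timed out, so it threw the exception above. The app was in fact still starting normally and went on to call attachApplication. The pid obtained at that point, however, is the pid of the next process being started, so the lookup in mPidsSelfLocked finds nothing and returns null.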
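The difference between the two clocks can be shown with a small simulation. This is a sketch with fake clocks and a made-up helper `timedOut`; on a device the monotonic source would be `SystemClock.uptimeMillis()` and the wall clock `System.currentTimeMillis()`.

```java
import java.util.function.LongSupplier;

public class DeadlineClockDemo {
    /** True once the given clock has reached start + timeout. */
    static boolean timedOut(long startMs, long timeoutMs, LongSupplier clock) {
        return clock.getAsLong() >= startMs + timeoutMs;
    }

    public static void main(String[] args) {
        long[] wall = {1_000_000L};   // fake wall clock, in ms
        long[] uptime = {50_000L};    // fake monotonic uptime, in ms
        long timeoutMs = 10_000L;

        long wallStart = wall[0];
        long uptimeStart = uptime[0];

        // 100 ms really pass, and meanwhile the wall clock is set one hour forward.
        uptime[0] += 100L;
        wall[0] += 100L + 3_600_000L;

        System.out.println("wall-clock deadline expired: "
                + timedOut(wallStart, timeoutMs, () -> wall[0]));     // true: false timeout
        System.out.println("uptime deadline expired: "
                + timedOut(uptimeStart, timeoutMs, () -> uptime[0])); // false: still waiting
    }
}
```

Only 100 ms of the 10 s budget has elapsed, yet the wall-clock check already reports a timeout, which matches the spurious "fork() failed" seen in the log.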
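4. Open questions
The failure above was reproduced by simulating a time jump; it is not the failure that happens in the field. What actually goes on when the real problem occurs?
Or is there something wrong with my understanding above?
References: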
gityuan