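I. Description:
Application crash flow: a persistent app keeps crashing, the rescue level escalates until it reaches LEVEL_FACTORY_RESET, and the device then enters recovery. (Analysis based on the Android U source code.)
II. Code analysis
1. Application process initialization is completed in ZygoteInit::zygoteInit.
frameworks/base/core/java/com/android/internal/os/ZygoteInit.java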
/**
* The main function called when started through the zygote process. This could be unified with
* main(), if the native code in nativeFinishInit() were rationalized with Zygote startup.<p>
*
* Current recognized args:
* <ul>
* <li> <code> [--] <start class name> <args>
* </ul>
*
* @param targetSdkVersion target SDK version
* @param disabledCompatChanges set of disabled compat changes for the process (all others
* are enabled)
* @param argv arg strings
*/
public static Runnable zygoteInit(int targetSdkVersion, long[] disabledCompatChanges,
String[] argv, ClassLoader classLoader) {
if (RuntimeInit.DEBUG) {
Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
}
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
RuntimeInit.redirectLogStreams();
RuntimeInit.commonInit(); // the crash (uncaught exception) handlers are installed in here
ZygoteInit.nativeZygoteInit();
return RuntimeInit.applicationInit(targetSdkVersion, disabledCompatChanges, argv,
classLoader);
}
2. The commonInit function
frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
@UnsupportedAppUsage
protected static final void commonInit() {
...
LoggingHandler loggingHandler = new LoggingHandler();
// Pre-handler: called back first whenever a thread hits an uncaught exception
RuntimeHooks.setUncaughtExceptionPreHandler(loggingHandler);
Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
...
}
3. The KillApplicationHandler class: an uncaught exception triggers its uncaughtException method.
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
@Override
public void uncaughtException(Thread t, Throwable e) {
try {
ensureLogging(t, e);
// Don't re-enter -- avoid infinite loops if crash-reporting crashes.
if (mCrashing) return;
mCrashing = true;
// Try to end profiling. If a profiler is running at this point, and we kill the
// process (below), the in-memory buffer will be lost. So try to stop, which will
// flush the buffer. (This makes method trace profiling useful to debug crashes.)
if (ActivityThread.currentActivityThread() != null) {
ActivityThread.currentActivityThread().stopProfiling();
}
// Bring up crash dialog, wait for it to be dismissed
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
} catch (Throwable t2) {
if (t2 instanceof DeadObjectException) {
// System process is dead; ignore
} else {
try {
Clog_e(TAG, "Error reporting crash", t2);
} catch (Throwable t3) {
// Even Clog_e() fails! Oh well.
}
}
} finally {
// Try everything to make sure this process goes away.
// (whatever happened above, the current process is killed in the end)
Process.killProcess(Process.myPid());
System.exit(10);
}
}
}
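The re-entrancy guard and the try/finally shape of uncaughtException can be tried on a plain JVM. The following is a minimal sketch, not the framework code: CrashGuardHandler, its reports list and the injected terminate callback are hypothetical names introduced for illustration; only Thread.UncaughtExceptionHandler is a real JDK interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for KillApplicationHandler: report the first crash once,
// and always run the "terminate" step, mirroring the try/finally shape above.
final class CrashGuardHandler implements Thread.UncaughtExceptionHandler {
    private final List<String> reports = new ArrayList<>();
    private final Runnable terminate; // stands in for Process.killProcess + System.exit
    private boolean crashing = false;

    CrashGuardHandler(Runnable terminate) {
        this.terminate = terminate;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            // Don't re-enter: a crash inside crash reporting must not loop forever.
            if (crashing) return;
            crashing = true;
            reports.add(t.getName() + ": " + e);
        } finally {
            // Runs even when the guard above returns early.
            terminate.run();
        }
    }

    List<String> reports() {
        return reports;
    }

    public static void main(String[] args) {
        int[] terminations = {0};
        CrashGuardHandler handler = new CrashGuardHandler(() -> terminations[0]++);
        handler.uncaughtException(Thread.currentThread(), new IllegalStateException("first"));
        handler.uncaughtException(Thread.currentThread(), new IllegalStateException("second"));
        System.out.println(handler.reports().size() + " report(s), "
                + terminations[0] + " termination attempt(s)");
    }
}
```

In a real process the handler would be installed with Thread.setDefaultUncaughtExceptionHandler, as commonInit does; the sketch calls it directly so that the behaviour can be observed without killing the JVM.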
4. The handleApplicationCrash function
frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
public void handleApplicationCrash(IBinder app,
ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
ProcessRecord r = findAppProcess(app, "Crash");
final String processName = app == null ? "system_server"
: (r == null ? "unknown" : r.processName);
handleApplicationCrashInner("crash", r, processName, crashInfo);
}
5. The handleApplicationCrashInner function
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
ApplicationErrorReport.CrashInfo crashInfo) {
...
boolean recoverable = eventType.equals("native_recoverable_crash");
if (recoverable) {
// Event type is native_recoverable_crash: the crash is recoverable, so it is only recorded
mAppErrors.sendRecoverableCrashToAppExitInfo(r, crashInfo);
} else {
mAppErrors.crashApplication(r, crashInfo);
}
}
6. The crashApplication function
frameworks/base/services/core/java/com/android/server/am/AppErrors.java
void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
final int callingPid = Binder.getCallingPid();
final int callingUid = Binder.getCallingUid();
final long origId = Binder.clearCallingIdentity();
try {
crashApplicationInner(r, crashInfo, callingPid, callingUid);
} finally {
Binder.restoreCallingIdentity(origId);
}
}
7. The crashApplicationInner function
private void crashApplicationInner(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo,
int callingPid, int callingUid) {
...
if (r != null) {
mPackageWatchdog.onPackageFailure(r.getPackageListWithVersionCode(),
PackageWatchdog.FAILURE_REASON_APP_CRASH);
synchronized (mService) {
mService.mProcessList.noteAppKill(r, (crashInfo != null
&& "Native crash".equals(crashInfo.exceptionClassName))
? ApplicationExitInfo.REASON_CRASH_NATIVE
: ApplicationExitInfo.REASON_CRASH,
ApplicationExitInfo.SUBREASON_UNKNOWN,
"crash");
}
}
...
}
8. The onPackageFailure function
frameworks/base/services/core/java/com/android/server/PackageWatchdog.java
public void onPackageFailure(List<VersionedPackage> packages,
@FailureReasons int failureReason) {
if (packages == null) { // log a warning and bail out
Slog.w(TAG, "Could not resolve a list of failing packages");
return;
}
mLongTaskHandler.post(() -> {
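synchronized (mLock) {
// Bail out if there are no observers
if (mAllObservers.isEmpty()) {
return;
}
// Does this failure require immediate action?
boolean requiresImmediateAction = (failureReason == FAILURE_REASON_NATIVE_CRASH
|| failureReason == FAILURE_REASON_EXPLICIT_HEALTH_CHECK);
if (requiresImmediateAction) {
// Handle the failure right away
handleFailureImmediately(packages, failureReason);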
handleFailureImmediately(packages, failureReason);
} else {
for (int pIndex = 0; pIndex < packages.size(); pIndex++) {
VersionedPackage versionedPackage = packages.get(pIndex);
// Observer that will receive failure for versionedPackage
PackageHealthObserver currentObserverToNotify = null;
int currentObserverImpact = Integer.MAX_VALUE;
MonitoredPackage currentMonitoredPackage = null;
// Find observer with least user impact
for (int oIndex = 0; oIndex < mAllObservers.size(); oIndex++) {
ObserverInternal observer = mAllObservers.valueAt(oIndex);
PackageHealthObserver registeredObserver = observer.registeredObserver;
// Only observers that are registered and that monitor this package are considered
if (registeredObserver != null
&& observer.onPackageFailureLocked(
versionedPackage.getPackageName())) {
MonitoredPackage p = observer.getMonitoredPackage(
versionedPackage.getPackageName());
int mitigationCount = 1;
if (p != null) { // the next mitigation would be number (count so far + 1)
mitigationCount = p.getMitigationCountLocked() + 1;
}
int impact = registeredObserver.onHealthCheckFailed(
versionedPackage, failureReason, mitigationCount);
if (impact != PackageHealthObserverImpact.USER_IMPACT_LEVEL_0
&& impact < currentObserverImpact) {
currentObserverToNotify = registeredObserver;
currentObserverImpact = impact;
currentMonitoredPackage = p;
}
}
}
// Execute action with least user impact
if (currentObserverToNotify != null) {
int mitigationCount = 1;
if (currentMonitoredPackage != null) {
currentMonitoredPackage.noteMitigationCallLocked();
mitigationCount =
currentMonitoredPackage.getMitigationCountLocked();
}
currentObserverToNotify.execute(versionedPackage,
failureReason, mitigationCount);
}
}
}
}
});
}
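The selection loop in onPackageFailure boils down to "pick the observer reporting the smallest non-zero user impact". A minimal sketch of that rule follows. The observer names and impact values are made up, and treating 0 as "this observer cannot help" is an assumption taken from the USER_IMPACT_LEVEL_0 check above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: observers are reduced to a name and the impact they report.
final class LeastImpactSelector {
    // Assumption: level 0 means the observer offers no mitigation for this failure.
    static final int USER_IMPACT_LEVEL_0 = 0;

    // Returns the name of the observer with the smallest non-zero impact, or null.
    static String pick(Map<String, Integer> impactByObserver) {
        String chosen = null;
        int chosenImpact = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : impactByObserver.entrySet()) {
            int impact = entry.getValue();
            if (impact != USER_IMPACT_LEVEL_0 && impact < chosenImpact) {
                chosen = entry.getKey();
                chosenImpact = impact;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, Integer> impacts = new LinkedHashMap<>();
        impacts.put("rollback-observer", 70);     // hypothetical values
        impacts.put("rescue-party-observer", 10);
        impacts.put("idle-observer", 0);
        System.out.println(pick(impacts));
    }
}
```

Only the chosen observer has execute called on it, so a crash is mitigated by one observer at a time.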
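The onPackageFailureLocked function (in ObserverInternal) starts monitoring the package if necessary and reports whether this failure should be acted on:
public boolean onPackageFailureLocked(String packageName) {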
if (getMonitoredPackage(packageName) == null && registeredObserver.isPersistent()
&& registeredObserver.mayObservePackage(packageName)) {
putMonitoredPackage(sPackageWatchdog.newMonitoredPackage(
packageName, DEFAULT_OBSERVING_DURATION_MS, false));
}
MonitoredPackage p = getMonitoredPackage(packageName);
if (p != null) {
return p.onFailureLocked();
}
return false;
}
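onFailureLocked itself is not shown in this walkthrough. Assuming it counts failures inside a short trigger window and returns true only once a threshold is reached, the idea can be sketched as below. The class name, the threshold of 5 and the 60-second window are hypothetical values chosen for illustration, not values taken from the code above.

```java
import java.util.ArrayDeque;

// Sketch under stated assumptions: a failure only "counts" towards a mitigation
// once enough failures have piled up within the trigger window.
final class FailureThresholdSketch {
    private final ArrayDeque<Long> failureTimes = new ArrayDeque<>();
    private final int triggerCount;     // hypothetical threshold
    private final long triggerWindowMs; // hypothetical window

    FailureThresholdSketch(int triggerCount, long triggerWindowMs) {
        this.triggerCount = triggerCount;
        this.triggerWindowMs = triggerWindowMs;
    }

    // Records one failure at nowMs; returns true when the threshold is reached.
    boolean onFailure(long nowMs) {
        failureTimes.addLast(nowMs);
        // Drop failures that fell out of the window. The entry just added always
        // stays, so the queue is never empty here.
        while (nowMs - failureTimes.peekFirst() > triggerWindowMs) {
            failureTimes.removeFirst();
        }
        boolean failed = failureTimes.size() >= triggerCount;
        if (failed) {
            failureTimes.clear(); // start counting afresh for the next mitigation
        }
        return failed;
    }

    public static void main(String[] args) {
        FailureThresholdSketch tracker = new FailureThresholdSketch(5, 60_000L);
        for (int i = 0; i < 5; i++) {
            System.out.println("failure " + (i + 1) + " -> " + tracker.onFailure(i * 1_000L));
        }
    }
}
```

If this assumption holds, a single crash does not necessarily advance the rescue level; the mitigation count discussed next only grows when onPackageFailureLocked returns true.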
The getMitigationCountLocked function returns the number of mitigations recorded for the package within the de-escalation window:
public int getMitigationCountLocked() {
try {
final long now = mSystemClock.uptimeMillis();
// static final long DEFAULT_DEESCALATION_WINDOW_MS = TimeUnit.HOURS.toMillis(1);
// Mitigation calls older than one hour are removed from the count on the next access.
while (now - mMitigationCalls.peekFirst() > DEFAULT_DEESCALATION_WINDOW_MS) {
mMitigationCalls.removeFirst();
}
} catch (NoSuchElementException ignore) {
}
return mMitigationCalls.size();
}
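The one-hour de-escalation window is a sliding-window counter. A minimal, self-contained sketch follows, with the clock passed in as a parameter so the behaviour is easy to check. MitigationWindowSketch and its method names are hypothetical, and java.util.ArrayDeque stands in for the framework's queue type.

```java
import java.util.ArrayDeque;
import java.util.concurrent.TimeUnit;

// Sliding-window counter: mitigations older than the window no longer count.
final class MitigationWindowSketch {
    static final long DEESCALATION_WINDOW_MS = TimeUnit.HOURS.toMillis(1);

    private final ArrayDeque<Long> mitigationCalls = new ArrayDeque<>();

    // Records that a mitigation was executed at nowMs.
    void noteMitigationCall(long nowMs) {
        mitigationCalls.addLast(nowMs);
    }

    // Evicts expired entries, then returns how many mitigations remain in the window.
    int getMitigationCount(long nowMs) {
        while (!mitigationCalls.isEmpty()
                && nowMs - mitigationCalls.peekFirst() > DEESCALATION_WINDOW_MS) {
            mitigationCalls.removeFirst();
        }
        return mitigationCalls.size();
    }

    public static void main(String[] args) {
        long minute = TimeUnit.MINUTES.toMillis(1);
        MitigationWindowSketch window = new MitigationWindowSketch();
        window.noteMitigationCall(0);
        window.noteMitigationCall(10 * minute);
        window.noteMitigationCall(20 * minute);
        System.out.println("at 30 min: " + window.getMitigationCount(30 * minute));
        System.out.println("at 65 min: " + window.getMitigationCount(65 * minute));
    }
}
```

The sketch checks isEmpty() instead of catching NoSuchElementException, because ArrayDeque.peekFirst() returns null on an empty queue rather than throwing.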
9. Rescue mode is implemented in RescueParty by RescuePartyObserver, which implements PackageHealthObserver.
frameworks/base/services/core/java/com/android/server/RescueParty.java
public static class RescuePartyObserver implements PackageHealthObserver {
@Override
public boolean execute(@Nullable VersionedPackage failedPackage,
@FailureReasons int failureReason, int mitigationCount) {
if (isDisabled()) {
return false;
}
if (failureReason == PackageWatchdog.FAILURE_REASON_APP_CRASH
|| failureReason == PackageWatchdog.FAILURE_REASON_APP_NOT_RESPONDING) {
final int level = getRescueLevel(mitigationCount,
mayPerformReboot(failedPackage));
executeRescueLevel(mContext,
failedPackage == null ? null : failedPackage.getPackageName(), level);
return true;
} else {
return false;
}
}
@Override
public boolean mayObservePackage(String packageName) {
PackageManager pm = mContext.getPackageManager();
try {
// A package is a module if this is non-null
if (pm.getModuleInfo(packageName, 0) != null) {
return true;
}
} catch (PackageManager.NameNotFoundException ignore) {
}
// Otherwise the package is observed only if it is a persistent system app
return isPersistentSystemApp(packageName);
}
}
10. The currentObserverToNotify.execute call in onPackageFailure resolves to RescuePartyObserver.execute:
@Override
public boolean execute(@Nullable VersionedPackage failedPackage,
@FailureReasons int failureReason, int mitigationCount) {
if (isDisabled()) {
return false;
}
// In this flow failureReason is PackageWatchdog.FAILURE_REASON_APP_CRASH
if (failureReason == PackageWatchdog.FAILURE_REASON_APP_CRASH
|| failureReason == PackageWatchdog.FAILURE_REASON_APP_NOT_RESPONDING) {
// Map the current mitigation count to a rescue level
final int level = getRescueLevel(mitigationCount,
mayPerformReboot(failedPackage));
executeRescueLevel(mContext,
failedPackage == null ? null : failedPackage.getPackageName(), level);
return true;
} else {
return false;
}
}
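getRescueLevel is called above but its body is not shown. A plausible sketch of the mapping follows, assuming the Nth mitigation selects the Nth level and that packages which may not trigger a reboot are capped at the last settings-reset level. The numeric values of the LEVEL_* constants and the capping rule are assumptions, not copied from the source.

```java
// Sketch of a mitigation-count-to-level mapping; constants are assumed values.
final class RescueLevelSketch {
    static final int LEVEL_NONE = 0;
    static final int LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS = 1;
    static final int LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES = 2;
    static final int LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS = 3;
    static final int LEVEL_WARM_REBOOT = 4;
    static final int LEVEL_FACTORY_RESET = 5;

    static int getRescueLevel(int mitigationCount, boolean mayPerformReboot) {
        if (mitigationCount <= 0) {
            return LEVEL_NONE;
        }
        // Assumption: without permission to reboot, never go past settings resets.
        int maxLevel = mayPerformReboot
                ? LEVEL_FACTORY_RESET
                : LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS;
        // The Nth mitigation maps to level N, saturating at the maximum.
        return Math.min(maxLevel, mitigationCount);
    }

    public static void main(String[] args) {
        for (int count = 1; count <= 6; count++) {
            System.out.println("mitigation " + count
                    + " -> level " + getRescueLevel(count, true)
                    + " (no reboot allowed: " + getRescueLevel(count, false) + ")");
        }
    }
}
```

Under these assumptions LEVEL_FACTORY_RESET is first reached on the fifth mitigation, which matches the "5 times" figure used in this analysis.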
11. The executeRescueLevel function
private static void executeRescueLevel(Context context, @Nullable String failedPackage,
int level) {
Slog.w(TAG, "Attempting rescue level " + levelToString(level));
try {
executeRescueLevelInternal(context, level, failedPackage);
EventLogTags.writeRescueSuccess(level);
String successMsg = "Finished rescue level " + levelToString(level);
if (!TextUtils.isEmpty(failedPackage)) {
successMsg += " for package " + failedPackage;
}
logCriticalInfo(Log.DEBUG, successMsg);
} catch (Throwable t) {
logRescueException(level, failedPackage, t);
}
}
12. The executeRescueLevelInternal function
private static void executeRescueLevelInternal(Context context, int level, @Nullable
String failedPackage) throws Exception {
FrameworkStatsLog.write(FrameworkStatsLog.RESCUE_PARTY_RESET_REPORTED, level);
// Try our best to reset all settings possible, and once finished
// rethrow any exception that we encountered
Exception res = null;
Runnable runnable;
Thread thread;
switch (level) {
case LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS:
try {
resetAllSettingsIfNecessary(context, Settings.RESET_MODE_UNTRUSTED_DEFAULTS,
level);
} catch (Exception e) {
res = e;
}
try {
resetDeviceConfig(context, /*isScoped=*/true, failedPackage);
} catch (Exception e) {
res = e;
}
break;
case LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES:
try {
resetAllSettingsIfNecessary(context, Settings.RESET_MODE_UNTRUSTED_CHANGES,
level);
} catch (Exception e) {
res = e;
}
try {
resetDeviceConfig(context, /*isScoped=*/true, failedPackage);
} catch (Exception e) {
res = e;
}
break;
case LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS:
try {
resetAllSettingsIfNecessary(context, Settings.RESET_MODE_TRUSTED_DEFAULTS,
level);
} catch (Exception e) {
res = e;
}
try {
resetDeviceConfig(context, /*isScoped=*/false, failedPackage);
} catch (Exception e) {
res = e;
}
break;
case LEVEL_WARM_REBOOT:
// Request the reboot from a separate thread to avoid deadlock on PackageWatchdog
// when device shutting down.
SystemProperties.set(PROP_ATTEMPTING_REBOOT, "true");
runnable = () -> {
try {
PowerManager pm = context.getSystemService(PowerManager.class);
if (pm != null) {
pm.reboot(TAG);
}
} catch (Throwable t) {
logRescueException(level, failedPackage, t);
}
};
thread = new Thread(runnable);
thread.start();
break;
// Reached once the mitigation count hits 5
case LEVEL_FACTORY_RESET:
// Before the completion of Reboot, if any crash happens then PackageWatchdog
// escalates to next level i.e. factory reset, as they happen in separate threads.
// Adding a check to prevent factory reset to execute before above reboot completes.
// Note: this reboot property is not persistent resets after reboot is completed.
if (isRebootPropertySet()) {
break;
}
SystemProperties.set(PROP_ATTEMPTING_FACTORY_RESET, "true");
long now = System.currentTimeMillis();
SystemProperties.set(PROP_LAST_FACTORY_RESET_TIME_MS, Long.toString(now));
runnable = new Runnable() {
@Override
public void run() {
try { // reboot into recovery and prompt for a factory reset
RecoverySystem.rebootPromptAndWipeUserData(context, TAG);
} catch (Throwable t) {
logRescueException(level, failedPackage, t);
}
}
};
thread = new Thread(runnable);
thread.start();
break;
}
if (res != null) {
throw res;
}
}
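The settings-reset cases share one pattern: attempt every step, remember a failure, and rethrow only after all steps have been tried. A minimal sketch of that pattern is below. BestEffortSteps and its Step interface are hypothetical names. As in the code above, a later failure overwrites an earlier one.

```java
import java.util.ArrayList;
import java.util.List;

// "Try our best, then rethrow": no step is skipped because an earlier one failed.
final class BestEffortSteps {
    interface Step {
        void run() throws Exception;
    }

    static void runAll(List<Step> steps) throws Exception {
        Exception res = null;
        for (Step step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                res = e; // keep going; the last failure wins
            }
        }
        if (res != null) {
            throw res;
        }
    }

    public static void main(String[] args) {
        List<Step> steps = new ArrayList<>();
        steps.add(() -> { throw new IllegalStateException("settings reset failed"); });
        steps.add(() -> System.out.println("device config reset still ran"));
        try {
            runAll(steps);
        } catch (Exception e) {
            System.out.println("rethrown afterwards: " + e.getMessage());
        }
    }
}
```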
13. The rebootPromptAndWipeUserData function
frameworks/base/core/java/android/os/RecoverySystem.java
/** {@hide} */
public static void rebootPromptAndWipeUserData(Context context, String reason)
throws IOException {
boolean checkpointing = false;
boolean needReboot = false;
IVold vold = null;
try {
vold = IVold.Stub.asInterface(ServiceManager.checkService("vold"));
if (vold != null) {
checkpointing = vold.needsCheckpoint();
} else {
Log.w(TAG, "Failed to get vold");
}
} catch (Exception e) {
Log.w(TAG, "Failed to check for checkpointing");
}
// If we are running in checkpointing mode, we should not prompt a wipe.
// Checkpointing may save us. If it doesn't, we will wind up here again.
if (checkpointing) {
try {
// Abort the checkpointed changes
vold.abortChanges("rescueparty", false);
Log.i(TAG, "Rescue Party requested wipe. Aborting update");
} catch (Exception e) {
Log.i(TAG, "Rescue Party requested wipe. Rebooting instead.");
PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
// Reboot, with "rescueparty" as the reboot reason
pm.reboot("rescueparty");
}
return;
}
String reasonArg = null;
if (!TextUtils.isEmpty(reason)) {
reasonArg = "--reason=" + sanitizeArg(reason);
}
final String localeArg = "--locale=" + Locale.getDefault().toString();
bootCommand(context, null, "--prompt_and_wipe_data", reasonArg, localeArg);
}
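The last three lines assemble the arguments handed to recovery. The sketch below shows how that list could be built. It assumes that sanitizeArg replaces NUL and newline characters and that empty arguments are skipped; neither detail is visible in the excerpt. RecoveryArgsSketch and buildArgs are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative construction of the recovery argument list.
final class RecoveryArgsSketch {
    // Assumption: characters that would break a line-based command file are replaced.
    static String sanitizeArg(String arg) {
        return arg.replace('\0', '?').replace('\n', '?');
    }

    static List<String> buildArgs(String reason, Locale locale) {
        List<String> args = new ArrayList<>();
        args.add("--prompt_and_wipe_data");
        // Assumption: a missing reason is simply left out of the argument list.
        if (reason != null && !reason.isEmpty()) {
            args.add("--reason=" + sanitizeArg(reason));
        }
        args.add("--locale=" + locale.toString());
        return args;
    }

    public static void main(String[] args) {
        for (String arg : buildArgs("RescueParty", Locale.US)) {
            System.out.println(arg);
        }
    }
}
```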
III. Flow summary
frameworks/base/core/java/com/android/internal/os/ZygoteInit.java ::zygoteInit // called during application process initialization
--->frameworks/base/core/java/com/android/internal/os/RuntimeInit.java ::commonInit
--->frameworks/base/core/java/com/android/internal/os/RuntimeInit.java ::KillApplicationHandler::uncaughtException // called when a crash (uncaught exception) occurs
--->frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java :: handleApplicationCrash
--->frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java :: handleApplicationCrashInner
--->frameworks/base/services/core/java/com/android/server/am/AppErrors.java :: crashApplication
--->frameworks/base/services/core/java/com/android/server/am/AppErrors.java :: crashApplicationInner
--->frameworks/base/services/core/java/com/android/server/PackageWatchdog.java ::onPackageFailure
--->frameworks/base/services/core/java/com/android/server/RescueParty.java ::RescuePartyObserver::execute
--->frameworks/base/services/core/java/com/android/server/RescueParty.java ::executeRescueLevel
--->frameworks/base/services/core/java/com/android/server/RescueParty.java ::executeRescueLevelInternal // once the mitigation count reaches 5, the recovery-related path is taken
--->frameworks/base/services/core/java/com/android/server/PackageWatchdog.java ::onPackageFailureLocked // decides whether this failure should be counted
--->frameworks/base/services/core/java/com/android/server/PackageWatchdog.java ::getMitigationCountLocked // returns the number of recorded mitigations
--->frameworks/base/services/core/java/com/android/server/PackageWatchdog.java ::mayObservePackage // the condition for a package to be observed
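IV. Conclusion
1. If a persistent app keeps crashing so that its mitigation count reaches 5 within one hour (DEFAULT_DEESCALATION_WINDOW_MS), the device enters recovery mode.
2. If the device is in checkpointing mode, no wipe prompt is shown and vold.abortChanges is called instead. If that call throws an exception, the device reboots with the reason "rescueparty" rather than entering recovery.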