Perfetto Performance Optimization: A Detailed Walkthrough of the Official FrameTimeline Documentation for Jank Detection

Android Jank detection with FrameTimeline

NOTE: FrameTimeline requires Android 12 (S) or higher.

A frame is said to be janky if the time the frame was presented on screen does
not match the predicted present time given by the scheduler.

A jank can cause:

  • Unstable frame rate
  • Increased latency

FrameTimeline is a module within SurfaceFlinger that detects janks and reports
the source of the jank. SurfaceViews are currently not supported, but will be
in the future.

UI

Two new tracks are added for every application that had at least one frame on
screen.

  • Expected Timeline
    Each slice represents the time given to the app for rendering the
    frame. To avoid janks in the system, the app is expected to finish within this
    time frame. The start time is the time the Choreographer callback was scheduled to run.

  • Actual Timeline
    These slices represent the actual time an app took to complete the frame
    (including GPU work) and send it to SurfaceFlinger for composition. The start time
    is the time that Choreographer#doFrame or AChoreographer_vsyncCallback started to run.
    The end time of each slice is max(gpu time, post time), where post time is the
    time the app’s frame was posted to SurfaceFlinger.

Similarly, SurfaceFlinger also gets these two new tracks representing the
expected time it’s supposed to finish within, and the actual time it took to
finish compositing frames and presenting on-screen. Here, SurfaceFlinger’s work
represents everything underneath it in the display stack. This includes the
Composer and the DisplayHAL. So, the slices span from the start of SurfaceFlinger’s
main thread work to the on-screen update.

The names of the slices represent the token received from Choreographer.
You can compare a slice in the actual timeline track to its corresponding slice
in the expected timeline track to see how the app performed compared to the
expectations. In addition, for debugging purposes, the token is added to the
app’s doFrame and RenderThread slices. For SurfaceFlinger, the same
token is shown in onMessageReceived.
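
The token-based comparison described above can be sketched in a few lines of Python. The durations are the nexuslauncher rows for tokens 3135 and 3137 from the SQL section of this document. The dictionary layout and the helper name `overrun_by_token` are assumptions made for illustration and are not part of Perfetto.

```python
# Expected and actual slice durations keyed by app token, in nanoseconds.
# Values come from the sample query results later in this document.
expected = {3135: 20500000, 3137: 20500000}
actual = {3135: 26526379, 3137: 28235805}


def overrun_by_token(expected_dur, actual_dur):
    """Return actual minus expected duration for every token found in both tracks.

    A positive value means the actual slice was longer than the expected slice.
    """
    return {
        token: actual_dur[token] - expected_dur[token]
        for token in actual_dur
        if token in expected_dur
    }


overruns = overrun_by_token(expected, actual)
```

Both sample frames come out several milliseconds longer than their expected slices, which agrees with the Late Present value reported for them in the actual timeline table.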

Selecting an actual timeline slice

The selection details provide more information on what happened with the frame.
These include:

  • Present Type
    Was the frame presented early, on time, or late?

  • On time finish
    Did the application finish its work for the frame on time?

  • Jank Type
    Was there a jank observed with this frame? If yes, this shows what type of jank
    was observed. If not, the type would be None.

  • Prediction type
    Did the prediction expire by the time this frame was received by FrameTimeline?
    If yes, this will say Expired Prediction. If not, Valid Prediction.

  • GPU Composition
    Boolean that tells if the frame was composited by the GPU or not.

  • Layer Name
    Name of the Layer/Surface to which the frame was presented. Some processes
    update frames to multiple surfaces. Here, multiple slices with the same token
    will be shown in the Actual Timeline. Layer Name can be a good way to
    disambiguate between these slices.

  • Is Buffer?
    Boolean that tells if the frame corresponds to a buffer or an animation.
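
The first two fields can be read as simple comparisons against the expected timeline. The sketch below is a simplified illustration of that reading, not FrameTimeline's actual logic. The tolerance value and both function names are hypothetical, and the "Early Present" label is assumed by analogy with the "Late Present" and "On-time Present" values that appear in the query results later in this document.

```python
MS = 1_000_000  # nanoseconds per millisecond

# Hypothetical tolerance; this document does not state the real threshold.
TOLERANCE_NS = 2 * MS


def present_type(expected_present_ns, actual_present_ns, tolerance_ns=TOLERANCE_NS):
    """Label a frame as early, on time, or late relative to its predicted present time."""
    delta = actual_present_ns - expected_present_ns
    if delta > tolerance_ns:
        return "Late Present"
    if delta < -tolerance_ns:
        return "Early Present"  # label assumed, see the lead-in
    return "On-time Present"


def on_time_finish(expected_end_ns, actual_end_ns):
    """True when the work finished no later than the end of the expected slice."""
    return actual_end_ns <= expected_end_ns
```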

Flow events

Selecting an actual timeline slice in the app also draws a line back to the
corresponding SurfaceFlinger timeline slice.

Since SurfaceFlinger can composite frames from multiple layers into a single
frame-on-screen (called a DisplayFrame), selecting a DisplayFrame draws
arrows to all the frames that were composited together. This can span over
multiple processes.

Color codes

| Color | Description |
| :--- | :--- |
| Green | A good frame. No janks observed. |
| Light Green | High latency state. The framerate is smooth but frames are presented late, resulting in an increased input latency. |
| Red | Janky frame. The process the slice belongs to is the reason for the jank. |
| Yellow | Used only by the apps. The frame is janky but the app wasn’t the reason; SurfaceFlinger caused the jank. |
| Blue | Dropped frame. Not related to jank. The frame was dropped by SurfaceFlinger, preferring an updated frame over this. |
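
One way to internalize the color table is to write it out as a lookup. The sketch below is an interpretation of the table, not the Perfetto UI's actual code. The function name, its parameters, and the mapping of Light Green to the BufferStuffing state are assumptions, and it returns None for combinations the table does not cover (for example the Unknown jank type).

```python
# Jank types attributed to SurfaceFlinger, as named in the "Janks explained" section.
SF_JANKS = {
    "SurfaceFlingerCpuDeadlineMissed",
    "SurfaceFlingerGpuDeadlineMissed",
    "DisplayHAL",
    "PredictionError",
}


def slice_color(jank_type, is_app_slice, dropped=False):
    """Map a slice to the color described in the table (hypothetical helper)."""
    if dropped:
        return "Blue"
    if jank_type == "None":
        return "Green"
    if jank_type == "BufferStuffing":
        # Assumption: the high latency state corresponds to buffer stuffing.
        return "Light Green"
    if jank_type == "AppDeadlineMissed":
        # Red marks a jank caused by the process that owns the slice.
        return "Red" if is_app_slice else None
    if jank_type in SF_JANKS:
        # Yellow is used only on app tracks, when SurfaceFlinger is to blame.
        return "Yellow" if is_app_slice else "Red"
    return None  # not covered by the table
```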

Janks explained

The jank types are defined in JankInfo.h.
Since each app is written differently, there is no common way to go into the
internals of the apps and specify what the reason for the jank was. Our goal is
not to do this but rather to provide a quick way to tell if the app was janky or if
SurfaceFlinger was janky.

None

All good. No jank with the frame. The ideal state that should be aimed for.

App janks

  • AppDeadlineMissed

The app ran longer than expected, causing a jank. The total time taken by the app
frame is calculated by using the choreographer wake-up as the start time and
max(gpu, post time) as the end time. Post time is the time the frame was sent to
SurfaceFlinger. Since the GPU usually runs in parallel, it could be that the GPU
finished later than the post time.

  • BufferStuffing

This is more of a state than a jank. This happens if the app keeps sending new
frames to SurfaceFlinger before the previous frame was even presented. The
internal Buffer Queue is stuffed with buffers that are yet to be presented,
hence the name, Buffer Stuffing. These extra buffers in the queue are presented
one after the other, which results in extra latency.
This can also result in a state where there are no more buffers for the app to
use and it goes into a dequeue blocking wait.
The actual duration of work performed by the app might still be within the
deadline, but because the queue is stuffed, all the frames will be presented at
least one vsync late no matter how quickly the app finishes its work.
Frames will still be smooth in this state but there is an increased input
latency associated with the late present.
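
The two app-side cases above can be illustrated with a little arithmetic. This is a simplified sketch, not FrameTimeline's implementation. All function names are hypothetical, and the latency model assumes exactly one queued buffer is presented per vsync, which is one reading of "presented one after the other".

```python
MS = 1_000_000  # nanoseconds per millisecond


def app_frame_duration_ns(wakeup_ns, gpu_done_ns, post_ns):
    """Total app frame time: choreographer wake-up to max(gpu, post time)."""
    return max(gpu_done_ns, post_ns) - wakeup_ns


def missed_app_deadline(wakeup_ns, gpu_done_ns, post_ns, expected_dur_ns):
    """True when the app ran longer than the time it was given for the frame."""
    return app_frame_duration_ns(wakeup_ns, gpu_done_ns, post_ns) > expected_dur_ns


def stuffing_latency_ns(pending_buffers, vsync_period_ns):
    """Extra latency from buffers already queued ahead of a new frame.

    Assumes one queued buffer is presented per vsync.
    """
    return pending_buffers * vsync_period_ns
```

For example, a frame that wakes up at 0 ms, posts at 12 ms, and whose GPU work finishes at 18 ms has a duration of 18 ms, because the GPU finished after the post time. With one buffer already waiting in the queue, a new frame is delayed by one vsync period even if the app itself finished early.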

SurfaceFlinger Janks

There are two ways SurfaceFlinger can composite frames.

  • Device Composition - uses dedicated hardware
  • GPU/Client Composition - uses the GPU to composite

An important thing to note is that performing device composition happens as a
blocking call on the main thread. However, GPU composition happens in parallel.
SurfaceFlinger performs the necessary draw calls and then hands over the GPU
fence to the display device. The display device then waits for the fence to be
signaled, and then presents the frame.

  • SurfaceFlingerCpuDeadlineMissed

SurfaceFlinger is expected to finish within the given deadline. If the main
thread ran for longer than that, the jank is then
SurfaceFlingerCpuDeadlineMissed. SurfaceFlinger’s CPU time is the time spent on
the main thread. This includes the entire composition time if device composition
was used. If GPU composition was used, this includes the time to write the draw
calls and hand over the frame to the GPU.

  • SurfaceFlingerGpuDeadlineMissed

The time taken by SurfaceFlinger’s main thread on the CPU + the GPU composition
time together were longer than expected. Here, the CPU time would have still
been within the deadline but since the work on the GPU wasn’t ready on time, the
frame got pushed to the next vsync.

  • DisplayHAL

DisplayHAL jank refers to the case where SurfaceFlinger finished its work and
sent the frame down to the HAL on time, but the frame wasn’t presented on the
vsync. It was presented on the next vsync. It could be that SurfaceFlinger did
not give enough time for the HAL’s work or it could be that there was a genuine
delay in the HAL’s work.

  • PredictionError

SurfaceFlinger’s scheduler plans the present time of frames ahead of time. However,
this prediction sometimes drifts away from the actual hardware vsync time. For
example, a frame might have a predicted present time of 20 ms. Due to a drift in
estimation, the actual present time of the frame could be 23 ms. This is called a
Prediction Error in SurfaceFlinger’s scheduler. The scheduler corrects itself
periodically, so this drift isn’t permanent. However, the frames that had a
drift in prediction will still be classified as jank for tracking purposes.

Isolated prediction errors are not usually perceived by the user as the
scheduler is quick to adapt and fix the drift.
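
The four SurfaceFlinger jank types can be read as a decision sequence. The sketch below is a loose interpretation of the descriptions above and is not the logic in FrameTimeline itself. The function name, its parameters, and the rule that separates DisplayHAL from PredictionError (late by at least one vsync period versus a smaller drift) are assumptions made for illustration.

```python
MS = 1_000_000  # nanoseconds per millisecond


def classify_sf_jank(deadline_ns, cpu_done_ns, gpu_done_ns,
                     expected_present_ns, actual_present_ns,
                     vsync_period_ns, tolerance_ns=0):
    """Return a jank type name for a late SurfaceFlinger frame (simplified).

    gpu_done_ns is None when device composition was used.
    Early presents are not modelled here and are reported as "None".
    """
    late_by = actual_present_ns - expected_present_ns
    if late_by <= tolerance_ns:
        return "None"
    if cpu_done_ns > deadline_ns:
        return "SurfaceFlingerCpuDeadlineMissed"
    if gpu_done_ns is not None and gpu_done_ns > deadline_ns:
        return "SurfaceFlingerGpuDeadlineMissed"
    if late_by >= vsync_period_ns:
        # Handed to the HAL on time but shown on a later vsync (assumed rule).
        return "DisplayHAL"
    # On-time work, but the present time drifted by less than one vsync.
    return "PredictionError"
```

With the numbers from the example above (predicted 20 ms, actual 23 ms) and an assumed 16 ms vsync period, a frame whose CPU work finished within its deadline is classified as PredictionError.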

Unknown jank

As the name suggests, the reason for the jank is unknown in this case. An
example here would be that SurfaceFlinger or the App took longer than expected
and missed the deadline but the frame was still presented early. The probability
of such a jank happening is very low but not impossible.

SQL

At the SQL level, frametimeline data is available in two tables:
expected_frame_timeline_slice and actual_frame_timeline_slice.

select ts, dur, surface_frame_token as app_token, display_frame_token as sf_token, process.name
from expected_frame_timeline_slice left join process using(upid)

| ts | dur | app_token | sf_token | name |
| :--- | :--- | :--- | :--- | :--- |
| 60230453475 | 20500000 | 3135 | 3142 | com.google.android.apps.nexuslauncher |
| 60241677540 | 20500000 | 3137 | 3144 | com.google.android.apps.nexuslauncher |
| 60252895412 | 20500000 | 3139 | 3146 | com.google.android.apps.nexuslauncher |
| 60284614241 | 10500000 | 0 | 3144 | /system/bin/surfaceflinger |
| 60295858299 | 10500000 | 0 | 3146 | /system/bin/surfaceflinger |
| 60297798913 | 20500000 | 3147 | 3150 | com.android.systemui |
| 60307075728 | 10500000 | 0 | 3148 | /system/bin/surfaceflinger |
| 60318297746 | 10500000 | 0 | 3150 | /system/bin/surfaceflinger |
| 60320236468 | 20500000 | 3151 | 3154 | com.android.systemui |
| 60329511401 | 10500000 | 0 | 3152 | /system/bin/surfaceflinger |
| 60340732956 | 10500000 | 0 | 3154 | /system/bin/surfaceflinger |
| 60342673064 | 20500000 | 3155 | 3158 | com.android.systemui |

select ts, dur, surface_frame_token as app_token, display_frame_token as sf_token, jank_type, on_time_finish, present_type, layer_name, process.name
from actual_frame_timeline_slice left join process using(upid)

| ts | dur | app_token | sf_token | jank_type | on_time_finish | present_type | layer_name | name |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 60230453475 | 26526379 | 3135 | 3142 | Buffer Stuffing | 1 | Late Present | TX - com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#0 | com.google.android.apps.nexuslauncher |
| 60241677540 | 28235805 | 3137 | 3144 | Buffer Stuffing | 1 | Late Present | TX - com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#0 | com.google.android.apps.nexuslauncher |
| 60252895412 | 2546525 | 3139 | 3142 | None | 1 | On-time Present | TX - NavigationBar0#0 | com.android.systemui |
| 60252895412 | 27945382 | 3139 | 3146 | Buffer Stuffing | 1 | Late Present | TX - com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#0 | com.google.android.apps.nexuslauncher |
| 60284808190 | 10318230 | 0 | 3144 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger |
| 60296067722 | 10265574 | 0 | 3146 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger |
| 60297798913 | 5239227 | 3147 | 3150 | None | 1 | On-time Present | TX - NavigationBar0#0 | com.android.systemui |
| 60307246161 | 10301772 | 0 | 3148 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger |
| 60318497204 | 10281199 | 0 | 3150 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger |
| 60320236468 | 2747559 | 3151 | 3154 | None | 1 | On-time Present | TX - NavigationBar0#0 | com.android.systemui |
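
A common follow-up to the second query is to count frames per process and jank type. The sketch below shows that aggregation using Python's built-in sqlite3 module and the ten sample rows from the actual timeline results above. The in-memory table `actual_rows` and the function name `jank_counts` are stand-ins invented for this example. Against a real trace, the same group-by would presumably be applied to actual_frame_timeline_slice joined with process, as in the query above.

```python
import sqlite3

LAUNCHER = "com.google.android.apps.nexuslauncher"
SYSUI = "com.android.systemui"
SF = "/system/bin/surfaceflinger"

# (ts, dur, jank_type, name) copied from the sample results in this document.
ROWS = [
    (60230453475, 26526379, "Buffer Stuffing", LAUNCHER),
    (60241677540, 28235805, "Buffer Stuffing", LAUNCHER),
    (60252895412, 2546525, "None", SYSUI),
    (60252895412, 27945382, "Buffer Stuffing", LAUNCHER),
    (60284808190, 10318230, "None", SF),
    (60296067722, 10265574, "None", SF),
    (60297798913, 5239227, "None", SYSUI),
    (60307246161, 10301772, "None", SF),
    (60318497204, 10281199, "None", SF),
    (60320236468, 2747559, "None", SYSUI),
]


def jank_counts(rows):
    """Count frames per (process name, jank type) in a stand-in SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table actual_rows (ts integer, dur integer, jank_type text, name text)"
    )
    db.executemany("insert into actual_rows values (?, ?, ?, ?)", rows)
    result = db.execute(
        "select name, jank_type, count(*) as frames "
        "from actual_rows group by name, jank_type order by name, jank_type"
    ).fetchall()
    db.close()
    return result


counts = jank_counts(ROWS)
```

In this sample, every nexuslauncher frame is in the Buffer Stuffing state, while the systemui and SurfaceFlinger frames show no jank.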

TraceConfig

Trace Protos:
FrameTimelineEvent

Datasource:

data_sources {
    config {
        name: "android.surfaceflinger.frametimeline"
    }
}
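
The data source block above is only a fragment; a recordable trace config also needs at least one buffer and usually a duration. The helper below assembles such a config as text. The function name and the default buffer size and duration are arbitrary values chosen for illustration, not recommendations from the original documentation.

```python
def frametimeline_trace_config(duration_ms=10000, buffer_kb=65536):
    """Build a minimal text-format TraceConfig enabling the FrameTimeline data source.

    duration_ms and buffer_kb defaults are illustrative assumptions.
    """
    return (
        "buffers {\n"
        f"    size_kb: {buffer_kb}\n"
        "}\n"
        f"duration_ms: {duration_ms}\n"
        "data_sources {\n"
        "    config {\n"
        '        name: "android.surfaceflinger.frametimeline"\n'
        "    }\n"
        "}\n"
    )


config_text = frametimeline_trace_config(duration_ms=5000)
```

The resulting text can be saved to a file and used as the trace config when recording.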

Original article: https://mp.weixin.qq.com/s/gOvEbqbQ0XNS5BeMEr6Kwg