The principle of Android rendering

Posted May 25, 2020 (7 min read)

Introduction:
When testing smoothness you inevitably run into indicators such as FPS and Jank. To understand them better, today I will take a brief look at how Android rendering works.
PerfDog uses Jank as the indicator of how smooth a game feels. For details, see
Does APP & Game need to pay attention to Jank?

One. CPU and GPU structure

Most mobile devices now ship with a CPU (central processing unit) and a GPU (graphics processing unit), and some also have an NPU for AI workloads. Let's take a brief look at their structure:
![Figure: CPU and GPU structure](https://img-blog.csdnimg.cn/20200428151710644.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0Njk2MjAz,size_16,color_FFFFFF,t_70)

Green is the computing unit (ALU),
orange-red is the storage unit,
orange is the control unit.
The CPU has to be highly general-purpose: it handles many data types and carries out complex arithmetic and logical operations, so its internal structure is extremely complicated.
Much of the CPU's area is taken up by cache, together with complicated control logic and many optimization circuits; the computing units are in fact only a small part of the CPU. In the early days the CPU also handled memory management, graphics display and so on, so its effective computing performance dropped sharply and it could not display complex graphics, which falls far short of what today's 3D games require. That is why the GPU came into being.
The GPU, by contrast, works on large volumes of uniform, mutually independent data in a pure computing environment that does not need to be interrupted, so its structure is very different.
The GPU uses a large number of computing units and long pipelines, with only very simple control logic and little or no cache. It converts the display information required by the system and drives the display by supplying the line-scan signal so that the picture is shown correctly; in short, it is mainly responsible for graphics display.

Two. Android system drawing mechanism

![Figure: Android system drawing mechanism]( https://img-blog.csdnimg.cn/2 ... )

In a typical display system on today's Android devices, the CPU first issues a drawing instruction so that the GPU draws a graphic. The CPU cannot talk to the GPU directly, though; like any process, this one has rules and must follow a fixed procedure. So the CPU first sends instructions to OpenGL ES to describe the graphic. OpenGL ES is a set of interface APIs; through these APIs the driver can be told to make the GPU perform various operations. The GPU receives the commands, starts rasterization, and the graphic is displayed on the screen.

Now we add the application to the display process
![Figure: the display process with the application added](https://img-blog.csdnimg.cn/20200428155132663.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0Njk2MjAz,size_16,color_FFFFFF,t_70)

In the Android application layer, the LayoutInflater inflates the layout XML file into objects and loads them into memory. At this point the UI objects carry information such as size and position. The CPU then takes these UI objects from memory and processes them into multi-dimensional vector graphics, which it hands to the GPU to be rasterized into a bitmap and displayed on the screen.
A brief note on vector graphics and bitmaps:
Vector graphic: described by a function that says how the graphic is generated
Bitmap: described by a matrix of pixels
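
To make the difference concrete, here is a minimal sketch in plain Java (no Android APIs). The class and method names (`RasterizeSketch`, `insideCircle`, `rasterize`) are made up for this illustration. A circle is stored as a function (the vector form) and then sampled once per pixel to produce a matrix of pixels (the bitmap form). A real GPU rasterizer is far more elaborate; this only shows the idea.

```java
// Toy illustration only: a circle described by a function (vector form)
// is sampled into a matrix of pixels (bitmap form).
class RasterizeSketch {
    // Vector description: a point belongs to the circle if it satisfies the equation.
    static boolean insideCircle(double x, double y, double cx, double cy, double r) {
        double dx = x - cx;
        double dy = y - cy;
        return dx * dx + dy * dy <= r * r;
    }

    // Rasterization: evaluate the function at the center of every pixel.
    static boolean[][] rasterize(int width, int height, double cx, double cy, double r) {
        boolean[][] bitmap = new boolean[height][width];
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                bitmap[row][col] = insideCircle(col + 0.5, row + 0.5, cx, cy, r);
            }
        }
        return bitmap;
    }

    public static void main(String[] args) {
        // Print the bitmap as text: '#' for a covered pixel, '.' for an empty one.
        boolean[][] bitmap = rasterize(8, 8, 4.0, 4.0, 3.0);
        for (boolean[] row : bitmap) {
            StringBuilder line = new StringBuilder();
            for (boolean on : row) {
                line.append(on ? '#' : '.');
            }
            System.out.println(line);
        }
    }
}
```

The same function can be rasterized at any resolution, which is why the vector form scales cleanly while a bitmap is fixed to its pixel grid.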

The Android system redraws the Activity roughly every 16 ms, so the application must finish all the logic for a screen refresh within 16 ms to reach 60 frames per second (60 FPS). The achievable frame rate, however, is determined by the phone's hardware. Most phones currently have a 60 Hz screen refresh rate (hertz measures how many times a periodic change repeats each second), which gives a budget of 1000 ms / 60 ≈ 16.67 ms per frame; if a frame takes longer than that, a so-called dropped frame occurs.
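
The arithmetic above can be sketched in a few lines of plain Java. The class name `FrameBudget`, its methods, and the sample timings are all invented for this example; they are not taken from PerfDog or from any Android API.

```java
// Minimal sketch: per-frame time budget and a count of frames that exceed it.
class FrameBudget {
    // Time available per frame, in milliseconds, for a given refresh rate.
    static double frameIntervalMs(int refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    // Count the frames whose work took longer than the per-frame budget.
    static int countOverBudget(double[] frameTimesMs, int refreshRateHz) {
        double budget = frameIntervalMs(refreshRateHz);
        int over = 0;
        for (double t : frameTimesMs) {
            if (t > budget) {
                over++;
            }
        }
        return over;
    }

    public static void main(String[] args) {
        // Made-up frame timings in milliseconds, for illustration only.
        double[] sample = {8.0, 15.9, 17.2, 33.5, 12.0};
        System.out.println("budget at 60 Hz (ms): " + frameIntervalMs(60));
        System.out.println("frames over budget: " + countOverBudget(sample, 60));
    }
}
```

With these sample timings, two of the five frames (17.2 ms and 33.5 ms) miss the 60 Hz budget.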

Three. The complete rendering process of one frame

![Figure: the complete rendering process of one frame](https://img-blog.csdnimg.cn/2020042817102729.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0Njk2MjAz,size_16,color_FFFFFF,t_70)

An Android application window contains many View elements organized as a tree, the so-called view tree.
Before the UI of an application window can be drawn, the size and position of every child View within its parent must be determined. Determining these sizes and positions is known as the measure pass and the layout pass. The UI rendering of an Android application window can therefore be divided into three stages, Measure, Layout, and Draw (initiated by the performTraversals() method of the ViewRootImpl class):

Measure: recursively (depth first) determines the size (height and width) of every view
Layout: recursively (depth first) determines the position of every view
Draw: draws all views of the application window onto the canvas
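
The three passes can be imitated with a toy view tree in plain Java. This is a sketch under simplifying assumptions, not the real ViewRootImpl code: `ToyView` is a hypothetical class, containers simply stack their children vertically, and the "canvas" is just a list of text commands. The method name `performTraversals` is borrowed from the text only to mark where the three passes are triggered.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the Measure, Layout and Draw passes over a view tree.
class TraversalSketch {
    static class ToyView {
        final String name;
        final int fixedWidth;   // used only by leaf views
        final int fixedHeight;  // used only by leaf views
        final List<ToyView> children = new ArrayList<>();
        int measuredWidth;      // result of the measure pass
        int measuredHeight;     // result of the measure pass
        int left;               // result of the layout pass
        int top;                // result of the layout pass

        ToyView(String name, int fixedWidth, int fixedHeight) {
            this.name = name;
            this.fixedWidth = fixedWidth;
            this.fixedHeight = fixedHeight;
        }

        ToyView add(ToyView child) {
            children.add(child);
            return this;
        }

        // Measure: depth first, children are measured before their parent.
        void measure() {
            if (children.isEmpty()) {
                measuredWidth = fixedWidth;
                measuredHeight = fixedHeight;
                return;
            }
            measuredWidth = 0;
            measuredHeight = 0;
            for (ToyView child : children) {
                child.measure();
                measuredWidth = Math.max(measuredWidth, child.measuredWidth);
                measuredHeight += child.measuredHeight; // vertical stacking
            }
        }

        // Layout: the parent is positioned first, then its children are stacked below each other.
        void layout(int left, int top) {
            this.left = left;
            this.top = top;
            int y = top;
            for (ToyView child : children) {
                child.layout(left, y);
                y += child.measuredHeight;
            }
        }

        // Draw: record one drawing command per view, parent before children.
        void draw(List<String> canvas) {
            canvas.add(name + "@(" + left + "," + top + ") " + measuredWidth + "x" + measuredHeight);
            for (ToyView child : children) {
                child.draw(canvas);
            }
        }
    }

    // Runs the three passes in order and returns the recorded drawing commands.
    static List<String> performTraversals(ToyView root) {
        root.measure();
        root.layout(0, 0);
        List<String> canvas = new ArrayList<>();
        root.draw(canvas);
        return canvas;
    }

    public static void main(String[] args) {
        ToyView root = new ToyView("root", 0, 0)
                .add(new ToyView("title", 100, 20))
                .add(new ToyView("body", 80, 50));
        for (String command : performTraversals(root)) {
            System.out.println(command);
        }
    }
}
```

In this example the root ends up 100x70 (the widest child by the sum of the heights), and the body view is placed at top = 20, directly under the title.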

After many draw calls, every view to be shown in this frame has been drawn. Note that all of this View hierarchy drawing takes place in a graphics buffer.
The graphics buffer is then handed over to the SurfaceFlinger service.

SurfaceFlinger service overview:
![Figure: SurfaceFlinger service overview](https://img-blog.csdnimg.cn/20200428174014959.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0Njk2MjAz,size_16,color_FFFFFF,t_70)

Like other system services, the SurfaceFlinger service is started and runs in the System process of the Android system. Its main job is to manage the device's frame buffer in one place (the Frame Buffer can be understood simply as everything shown on the screen; all of it is managed by SurfaceFlinger). While starting up, the SurfaceFlinger service automatically creates two threads: one monitors console events and the other renders the system UI.
For an Android application to draw its UI into the system frame buffer, it has to pass its UI data to the SurfaceFlinger service and tell the service the specifics (such as the area and position of the UI to draw).
Android applications and the SurfaceFlinger service run in different processes, so they communicate through the Binder mechanism.
This can be roughly divided into 3 steps:

  1. Create a connection to the SurfaceFlinger service.
  2. Use this connection to create a Surface.
  3. Request the SurfaceFlinger service to render the Surface (each window of an Android application corresponds to a canvas (Canvas), and a Surface can likewise be understood as a window of the application).
    At the APP layer we cannot optimize this part; this work is done by the ROM.

Simply put, once the Android application layer has drawn the View hierarchy into the graphics buffer, it communicates with SurfaceFlinger through the Binder mechanism and hands the graphics buffer over using a block of anonymous shared memory. Because raw anonymous shared memory is hard to manage when data for several windows is being transferred, it is wrapped in a higher-level data structure, SharedClient. Each SharedClient holds at most 31 SharedBufferStack instances, and each SharedBufferStack corresponds to one Surface, that is, one window.
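
The relationship between SharedClient, SharedBufferStack and Surface can be pictured with a small toy model in plain Java. This is not the real SurfaceFlinger code: the classes `SharedClientSketch` and `BufferStack` and the method `createSurface` are hypothetical stand-ins, and the only fact taken from the text is the limit of 31 stacks per client.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: one client owns at most 31 buffer stacks, one per Surface (window).
class SharedClientSketch {
    static final int MAX_STACKS = 31; // limit quoted in the text

    // Stand-in for one SharedBufferStack: the buffers belonging to a single window.
    static class BufferStack {
        final String windowName;

        BufferStack(String windowName) {
            this.windowName = windowName;
        }
    }

    private final List<BufferStack> stacks = new ArrayList<>();

    // Creating a Surface claims one stack; the request is refused once the limit is reached.
    BufferStack createSurface(String windowName) {
        if (stacks.size() >= MAX_STACKS) {
            throw new IllegalStateException("at most " + MAX_STACKS + " stacks per client");
        }
        BufferStack stack = new BufferStack(windowName);
        stacks.add(stack);
        return stack;
    }

    int surfaceCount() {
        return stacks.size();
    }

    public static void main(String[] args) {
        SharedClientSketch client = new SharedClientSketch();
        client.createSurface("MainActivity window");
        client.createSurface("Dialog window");
        System.out.println("surfaces in use: " + client.surfaceCount());
    }
}
```

The point of the wrapper is bookkeeping: each window gets its own slot, so data for several windows can share one memory region without being mixed up.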

The frame buffer has an address in memory. As we keep writing data to the frame buffer, the display controller automatically fetches the data from it and displays it. All graphics share the same frame buffer in memory.

Four. VSync mechanism

To reduce stutter, Android 4.1 (Jelly Bean) introduced the VSync (vertical synchronization) mechanism.
Simply put, the CPU/GPU receive a VSync signal: the Android system sends one every 16 ms to trigger UI rendering (that is, one frame is displayed every 16 ms).
Two tasks have to be completed within those 16 ms: converting the UI objects into a series of polygons and textures (rasterization), and having the CPU transfer the processed data to the GPU.
Even with vertical synchronization the mechanism is not perfect. If for some reason the CPU and GPU take more than 16 ms to render a frame, VSync makes the display hardware wait until the GPU finishes rasterizing. The current frame therefore stays on screen for an extra 16 ms or more, and to the user the picture appears to pause.
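
The effect of missing a VSync signal can be sketched numerically in plain Java. The model is deliberately simple and is an assumption of this example: a new frame can only appear on a VSync boundary, so a render time is rounded up to a whole number of 60 Hz intervals. The class `VsyncSketch` and its methods are invented names.

```java
// Minimal sketch: a frame can only be shown on a VSync boundary,
// so its render time is rounded up to whole VSync intervals.
class VsyncSketch {
    static final double VSYNC_MS = 1000.0 / 60; // assumes a 60 Hz display

    // Number of VSync intervals that pass before a frame taking renderMs can be shown.
    static int vsyncIntervalsUsed(double renderMs) {
        return Math.max(1, (int) Math.ceil(renderMs / VSYNC_MS));
    }

    // Extra intervals during which the previous frame stays on screen.
    static int repeatedIntervals(double renderMs) {
        return vsyncIntervalsUsed(renderMs) - 1;
    }

    public static void main(String[] args) {
        double[] renderTimesMs = {10.0, 20.0, 40.0}; // made-up values
        for (double t : renderTimesMs) {
            System.out.println(t + " ms render -> previous frame repeated "
                    + repeatedIntervals(t) + " time(s)");
        }
    }
}
```

Under this model a 10 ms frame is shown on time, a 20 ms frame causes the previous frame to be repeated once, and a 40 ms frame causes it to be repeated twice, which is the pause the user notices.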