Front-end performance optimization: when page rendering meets edge computing

Posted May 27, 2020, 15 min read

Introduction: The common front-end performance optimization approaches all have shortcomings. Building on ESI (Edge Side Include), this article proposes a new optimization idea, the Edge Streaming Rendering Scheme (ESR): use the CDN's edge computing capability to return static content and then dynamic content to the user as a single stream. (Bonus at the end of the article: download the e-book "big front-end technology covering full-end business".)



For web pages, performance in first-hop scenarios (the first page a user lands on, for example from SEO or paid traffic acquisition) is generally worse than in second-hop scenarios. There are many reasons, but the main one is that first-hop users are at a great disadvantage in connection reuse and local resource caching. Many client-side optimizations (preloading, pre-execution, prerendering, etc.) also cannot be applied in the first-hop scenario.

When the client's caching capabilities are unavailable, we can take advantage of the CDN's proximity to the user, combined with caching, to optimize performance.


Idea 1: SSR

For performance optimization, we generally use server-side rendering (SSR) to render the first-screen dynamic content directly on the server.


The advantage of this method is that the html already contains the main content of the page when it is returned, so the browser does not need to request an interface and then render with js. The shortcomings are also obvious. When the user is far from the server, or the server takes a long time to process the request, the user sees a white screen for a long time. And even after the html has been fully returned, the user does not see the content immediately, because the page still needs to load the front-end js, css and other resources.

Idea 2: CSR + CDN

To reduce the white screen time, we can use the CDN's edge caching capability and cache the page html directly on the cdn node. However, in most scenarios the main content of the page is dynamic or personalized. Caching the entire html on cdn has a large impact on the business, and few scenarios can accept it. So what about caching only the static part of the html on cdn? This is in fact a very common practice: the static frame of the HTML is cached on cdn so that users quickly see part of the content, and the client then issues an asynchronous request to obtain the dynamic content and render it (CSR). The rendering sequence diagram in CSR + CDN mode is as follows:


The advantage of this method is that the static frame of the page is cached on cdn, so the user quickly sees the page frame, which reduces the anxiety of waiting on a white screen. The disadvantage is that the complete page content can only be rendered after the js has been executed and the asynchronous interface has returned. As a result, the meaningful dynamic content is displayed later than with SSR.

Idea 3: ESI

The CSR + CDN method solves the white screen problem but delays the display of dynamic content. The reason is that the dynamic and static content of the page are split into two stages that run serially, with the download and execution of js in between. Is there a way to combine dynamic and static content on the CDN?

ESI (Edge Side Include) gave us a good starting point. ESI was originally a specification proposed by CDN service providers. Special tags in the html mark the dynamic parts of the page, so the static content can be cached on cdn while the dynamic content is assembled freely at request time. The rendering sequence diagram of ESI is as follows:


This solution looks very good: the static part is cached on the CDN, and the dynamic part is requested and spliced in when the user requests the page. The critical issue is that in ESI mode the first byte returned to the user still has to wait until all the dynamic content has been fetched and stitched together on the CDN. In other words, ESI does not reduce the white screen time. It only reduces the volume of content transmitted between the CDN and the server, so the performance benefit is very small and the final effect is not much different from SSR.

Although ESI did not meet our expectations, it pointed us in a good direction. If ESI could be changed to return the static content first, then fetch the dynamic content on the CDN node and return it to the page, the white screen time would be short and the dynamic content would not be delayed. Such a streaming version of ESI requires fine-grained control over the request on the CDN, plus the ability to stream the response. Do CDN nodes support such complex operations? The answer is yes: edge computing. On the CDN we can do something similar to the browser's service worker and program requests and responses flexibly.
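
The difference in first-byte time can be illustrated with a toy latency model. This is only a sketch under simplifying assumptions: the function name `firstByteMs`, the mode names, the parameters and the sample numbers are all hypothetical, every request is assumed to pass through the CDN node, and connection setup and transfer time are ignored.

```javascript
// Toy model of when the first byte reaches the user, in milliseconds.
//   rttEdge:    round trip between the user and the CDN node
//   rttOrigin:  round trip between the CDN node and the origin server
//   originWork: time the origin needs to produce the dynamic content
function firstByteMs(mode, { rttEdge, rttOrigin, originWork }) {
  switch (mode) {
    case 'SSR': // nothing is cached, the edge only forwards the request
    case 'ESI': // the edge waits for the dynamic part before replying
      return rttEdge + rttOrigin + originWork;
    case 'STREAMING_ESI': // the edge replies with the cached static part at once
      return rttEdge;
    default:
      throw new Error(`unknown mode: ${mode}`);
  }
}

// Hypothetical numbers: 50 ms to the edge, 200 ms edge to origin, 300 ms of work.
const sample = { rttEdge: 50, rttOrigin: 200, originWork: 300 };
console.log(firstByteMs('ESI', sample), firstByteMs('STREAMING_ESI', sample));
```

In this model ESI and SSR share the same first-byte time, which matches the observation above that ESI alone does not shorten the white screen.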

Based on the capabilities of edge computing, we have a new option: the Edge Streaming Rendering Scheme (ESR). The details are as follows.

Rendering process

The core idea of the scheme is to use edge computing to stream static content and then dynamic content to the user in sequence. Compared with the server, the cdn node is closer to the user and has lower network latency. The cdn node quickly returns the cacheable static part of the page to the user. At the same time, it issues the request for the dynamic part, and the dynamic content is appended to the response stream after the static part. The sequence diagram of the final page rendering is as follows:


As the figure above shows, the CDN edge node quickly returns the first byte and the static part of the page, while the dynamic content is requested from the server by the CDN and streamed back to the user. The scheme has the following characteristics:

  • The first-screen TTFB is short, and static content (such as the page header, basic structure and skeleton screen) is visible quickly.
  • The dynamic request is issued by the cdn. Compared with traditional browser rendering, it starts earlier and does not depend on the browser downloading and executing js. In theory, the response finishes at the same time as it would if the user fetched the complete dynamic page directly from the server.
  • Once the static content has been returned, the browser can already start parsing part of the html and downloading and executing js and css. Some page-blocking operations are therefore done in advance, and the dynamic content can be displayed sooner once it has been streamed back.
  • The network between the edge node and the server has more room for optimization than the network between the client and the server. For example, dynamic acceleration and connection reuse between the edge and the server reduce the TCP connection setup and network transmission overhead of dynamic requests. As a result, the final dynamic content can arrive even sooner than if the client accessed the server directly.
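
The flow described above can be sketched as an async generator that yields response chunks in order. This is a minimal illustration, not the actual implementation: `renderEdgeStream`, `getCachedStatic` and `fetchDynamic` are hypothetical names, and a real edge runtime would write each chunk to the response stream instead of collecting them in an array.

```javascript
// Minimal sketch of edge streaming rendering.
// getCachedStatic and fetchDynamic are hypothetical, injected helpers:
//   getCachedStatic() -> { head, tail } strings taken from the edge cache
//   fetchDynamic()    -> Promise<string> resolving to the dynamic html
async function* renderEdgeStream(getCachedStatic, fetchDynamic) {
  // Start the origin request first so it runs while static content is sent.
  const dynamicPromise = fetchDynamic();
  const { head, tail } = getCachedStatic();
  yield head;                 // first bytes leave the edge without waiting
  yield await dynamicPromise; // dynamic content follows in the same response
  yield tail;                 // closing static part, e.g. </body></html>
}

// Usage with stand-in helpers.
async function demo() {
  const chunks = [];
  const stream = renderEdgeStream(
    () => ({ head: '<html><body><header>Shop</header>', tail: '</body></html>' }),
    async () => '<main>items for this user</main>',
  );
  for await (const chunk of stream) chunks.push(chunk);
  return chunks;
}
```

The important design point is the order of operations: the dynamic request is started before the static head is yielded, so the origin round trip overlaps with the delivery of the static part.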

Demo comparison

A demo of the main search page has been built on alicdn. The comparisons below show how it loads against the original page under different network conditions (the speed limit is configured through Charles' network throttle):

No speed limit (wifi)

Speed limit: 4G

Speed limit: 3G

The results show that the slower the network, the earlier the main elements appear with cdn streaming rendering compared with the original ssr. This matches expectations: the slower the network, the longer static resources take to load, and the more the browser benefits from starting to load them early. In addition, whatever the network conditions, the white screen time of cdn streaming rendering is much shorter.

Overall structure

Architecture diagram


Edge stream rendering

1 Template

The template uses a syntax similar to ESI blocks. It marks which content must be requested dynamically and which content can be returned statically and cached. In essence, the template defines the dynamic content and static content of the page.

During streaming rendering, the page template is parsed from top to bottom. Static content is returned directly to the user. When dynamic content is encountered, the fetch logic for that dynamic content is executed. Static and dynamic content may alternate throughout the process.

Several types of templates have been designed.

1) Original HTML

This kind of template is the least intrusive to the existing business. You only need to add certain tags to the existing SSR page content to declare the dynamic part of the page:

    <link rel="stylesheet" type="text/css" href="index.css">
    <script src="index.js"></script>
    <meta name="esr-version" content="0.0.1" />
    <div> static content .... </div>
    <script type="esr/snippet/start" esr-id="111" content="SLICE"></script>
    <div> dynamic content1 .... </div>
    <script type="esr/snippet/end"></script>
    <div> static content .... </div>
    <script type="esr/snippet/start" esr-id="222" content=""></script>
    <div id="222">
      dynamic content2 ....
    </div>
    <script type="esr/snippet/end"></script>
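
To illustrate how such a page could be split along its marker tags, here is a minimal sketch. The function `splitTemplate` and the segment shape are hypothetical, and the regular expressions only handle markers written as in the example above. A production parser would more likely work on the stream incrementally.

```javascript
// Split an "original HTML" page into static and dynamic segments using the
// esr/snippet/start and esr/snippet/end marker tags.
const START = /<script\s+type\s*=\s*"esr\/snippet\/start"\s+esr-id\s*=\s*"([^"]*)"[^>]*>\s*<\/script>/;
const END = /<script\s+type\s*=\s*"esr\/snippet\/end"\s*>\s*<\/script>/;

function splitTemplate(html) {
  const segments = [];
  let rest = html;
  for (;;) {
    const start = START.exec(rest);
    if (!start) break;
    const afterStart = rest.slice(start.index + start[0].length);
    const end = END.exec(afterStart);
    if (!end) break; // unbalanced marker: treat the remainder as static
    if (start.index > 0) {
      segments.push({ type: 'static', html: rest.slice(0, start.index) });
    }
    segments.push({ type: 'dynamic', id: start[1], html: afterStart.slice(0, end.index) });
    rest = afterStart.slice(end.index + end[0].length);
  }
  if (rest) segments.push({ type: 'static', html: rest });
  return segments;
}
```

The static segments are what would be stored in the edge cache, while each dynamic segment records the `esr-id` of the block that has to be fetched at request time.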

2) Static template (no actual scenario uses it for the time being)

This kind of template has to be published to the CDN separately. (If the rendering layer is connected to the FaaS gateway and SSR in the future, the template content can be shared with them. When the template is published in the workflow, it is automatically synchronized to the CDN and the cache on cdn is cleared.) There are two ways to render the dynamic content. One is to use dynamic html fragments produced by the back-end SSR. The other is for the back-end to provide dynamic data, with the dynamic html fragments rendered on the edge node.

The advantage of using SSR dynamic HTML fragments is that no HTML template rendering is needed on the edge, and developers do not have to write two sets of template logic. The disadvantage is that the backend needs SSR capability and the volume of dynamic content transmitted is large.

The advantage of rendering dynamic html on the edge node is that the backend only needs to provide dynamic data and does not need SSR capability (although the frontend needs CSR capability as a fallback), and the volume of dynamic content transmitted is small. The disadvantage is that the dynamic content cannot be passed through the edge node transparently. It has to be downloaded completely to the edge node, processed, and only then returned to the user.

    <link rel="stylesheet" type="text/css" href="index.css">
    <script src="index.js"></script>
    <div> static content .... </div>
    <script type="esr/block" esr-id="111" content=""></script>
    <div> static content .... </div>
    <script type="esr/template" esr-id="222" content=""></script>

2 Static content display

The static content comes from the template, and the way it is obtained depends on the template type. For the "original HTML" template, the static content is extracted from the complete HTML returned by the first dynamic request, according to the annotation tags, and stored in the edge cache. For the "static template", the template file is pulled from the CDN and stored in the edge cache. Static content has a cache expiration time and a version number.
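
A cache entry that carries both an expiration time and a version number could look like the following sketch. The class `StaticCache` is a hypothetical in-memory stand-in. The real edge cache API is platform specific and is not shown in this article.

```javascript
// Minimal in-memory stand-in for the edge cache of static content.
// Each entry carries the template version and an expiry timestamp.
class StaticCache {
  constructor(now = () => Date.now()) {
    this.now = now; // injectable clock, handy for testing
    this.entries = new Map();
  }

  put(pageName, html, version, ttlMs) {
    this.entries.set(pageName, { html, version, expiresAt: this.now() + ttlMs });
  }

  // Returns the entry, or null when it is missing or expired.
  get(pageName) {
    const entry = this.entries.get(pageName);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(pageName);
      return null;
    }
    return entry;
  }
}
```

Keeping the version next to the cached html lets the edge compare it with the version of the dynamic content before combining the two.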

The static content at the beginning of the template is returned to the user directly in the response. There are two ways to handle the static content that follows a dynamic block (such as the closing html and body tags):

One is to wait for the dynamic content to return before writing the following static content to the response stream. This method is more SEO friendly. The disadvantage is that the dynamic content blocks the subsequent static content, and when there are multiple dynamic blocks, a block that returns first cannot be displayed first. Blocks can only be displayed in page order.

The other is to return the static content completely, and then insert the dynamic content into the corresponding placeholder through a script, in a bigpipe-like manner. The advantage is that the static content is displayed completely from the start, and multiple dynamic blocks are displayed on a first-come, first-served basis. The disadvantage is that it is not SEO friendly (because the dynamic content is inserted by js).
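
The bigpipe-like variant can be sketched with two small helpers: one emits the placeholder together with the static content, the other emits the script chunk once the dynamic html is available. The names `placeholder` and `fillScript` and the `esr-` id prefix are assumptions made for this illustration.

```javascript
// Placeholder emitted with the static content for a dynamic block.
function placeholder(id) {
  return `<div id="esr-${id}"></div>`;
}

// Script chunk appended to the stream once the dynamic html is available.
// JSON.stringify produces a valid JS string literal; "<" is escaped so the
// payload cannot terminate the surrounding script element early.
function fillScript(id, html) {
  const target = JSON.stringify(`esr-${id}`).replace(/</g, '\\u003c');
  const payload = JSON.stringify(html).replace(/</g, '\\u003c');
  return `<script>document.getElementById(${target}).innerHTML=${payload};</script>`;
}
```

Because each script chunk addresses its placeholder by id, the chunks can be written in whatever order the dynamic blocks finish.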

3 Dynamic content

When the rendering process reaches an area that must be fetched dynamically, a dynamic content request is issued on the edge. The request can reach the server (the origin) through dynamic acceleration. The interaction between the edge node and the back-end for dynamic content can take three forms:

  • In the first, the back-end returns the full page, and the dynamic content has to be extracted from it through the annotation tags. The advantage is that it is less intrusive to existing services. The disadvantage is that the volume transmitted is large, and the dynamic content can only be cut out after the complete html has been downloaded.
  • In the second, the back-end returns only the content of the dynamic block. The advantage is that the dynamic response can be streamed back to the user. The disadvantage is that the page has to provide a url that returns only the content of the dynamic block.
  • In the third, the back-end returns only data. Using the rendering template in the static template, the dynamic html is rendered on the edge node and returned to the user. The advantage is that little data is exchanged with the backend, and the backend does not need SSR capability. The disadvantage is that developers have to maintain an additional set of template logic, and complex template rendering on edge nodes may run into CPU overhead and limits.

The dynamic content interaction between users and edge nodes can take two forms:

  • Waterfall flow (WATER_FALL in the routing configuration): dynamic content is returned in waterfall order. Although multiple dynamic blocks are loaded in parallel on the edge node, the user sees the page content in order from top to bottom. The advantage is that it is SEO friendly and does not affect the loading order of page modules. The disadvantage is that when there are multiple dynamic modules, the overall page frame is not visible at first, the first dynamic block blocks the display of later dynamic blocks, and the js and css resources at the bottom of the page cannot be loaded and executed in advance.
  • Embedded (ASYNC_INSERT in the routing configuration): the static content is returned all at once, with placeholders reserved for the dynamic parts. Dynamic content is later inserted into those placeholders via innerHTML. The advantage is that the js and css resources at the bottom of the page can be loaded and executed in advance, and the user sees the full page frame first. The disadvantage is that it is not SEO friendly, and the execution order of page modules changes with the return speed of the dynamic blocks, so the page logic in the browser needs some checks and compatibility handling.
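
The difference between the two modes shows up in when each dynamic block can be written to the response. The following sketch simulates this with made-up timings. `emissionSchedule` and the block shape are hypothetical, and only the two mode names come from the routing configuration.

```javascript
// Compute when each dynamic block can be written to the response.
// blocks: [{ id, readyAt }] in template order; readyAt is the time (ms)
// at which the edge node has received the block's content.
function emissionSchedule(blocks, dynamicMode) {
  if (dynamicMode === 'WATER_FALL') {
    // Blocks must be written in template order, so a slow block delays
    // every block after it.
    let previous = 0;
    return blocks.map(({ id, readyAt }) => {
      previous = Math.max(previous, readyAt);
      return { id, emittedAt: previous };
    });
  }
  if (dynamicMode === 'ASYNC_INSERT') {
    // Blocks are written as soon as they are ready, in completion order.
    return [...blocks]
      .sort((a, b) => a.readyAt - b.readyAt)
      .map(({ id, readyAt }) => ({ id, emittedAt: readyAt }));
  }
  throw new Error(`unknown dynamicMode: ${dynamicMode}`);
}
```

With block 111 ready at 300 ms and block 222 ready at 100 ms, waterfall mode holds block 222 back until 300 ms, while embedded mode writes it at 100 ms, ahead of block 111.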

Edge routing

Routing configuration:

  version: '0.0.1', // configuration version number
    pageName: 'seo', // page name identifier
    match: '/abc/efg/.*', // regular expression string matched against the page path
      // Rendering configuration
      renderType: 'ESR', // edge rendering
      templateType: 'FULL_HTML', // template type: use the complete HTML from SSR as the template
      dynamicMode: 'WATER_FALL | ASYNC_INSERT', // how dynamic content is appended: waterfall flow | asynchronous placeholder filling (innerHTML)
      templateUrl: '', // template url
        templateType: 'STATIC', // static template, available through a cdn url
        dynamicMode: 'WATER_FALL | ASYNC_INSERT', // how dynamic content is appended: waterfall flow | asynchronous placeholder filling (innerHTML)
        renderType: 'REDIRECT_302', // 302 redirect
        renderType: 'PROXY_PASS', // reverse proxy

Routing can be regarded as the entry point of edge computing. Only pages listed in the routing configuration go through the corresponding rendering process. Any other page goes straight back to the origin to obtain the complete page content. The json above is the routing configuration file as currently designed. The configuration file is published to the assets cdn as a static resource, overwriting the previous release. To support grayscale release of the configuration, two configurations are kept online: the grayscale version and the full version. The routing code is configured with a fixed ratio that decides whether to load the grayscale or the full configuration.

At present, three rendering modes are designed in routing: streaming rendering, redirection and reverse proxy. The configuration of redirection and reverse proxy is relatively simple. As with nginx configuration, only the target url needs to be specified.
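
Route lookup at the edge could be sketched as follows. The field names follow the routing configuration, but the array layout, the sample paths and the decision to anchor each pattern to the whole path are assumptions made for this illustration.

```javascript
// Hypothetical route list; only the field names come from the configuration.
const routes = [
  { pageName: 'seo', match: '/abc/efg/.*', renderType: 'ESR', templateType: 'FULL_HTML', dynamicMode: 'WATER_FALL' },
  { pageName: 'old', match: '/old/.*', renderType: 'REDIRECT_302' },
];

// Return the first route whose pattern matches the whole path, or null,
// in which case the request goes straight back to the origin.
function matchRoute(pathname, routeList) {
  for (const route of routeList) {
    if (new RegExp(`^(?:${route.match})$`).test(pathname)) return route;
  }
  return null;
}
```

A `null` result corresponds to the default behaviour described above: the page is not in the routing configuration, so the edge does not take part in rendering it.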


Scope of influence control

  • CDN switch: domain names are switched to the CDN by region and by proportion, and traffic can be switched back from cdn to unified access at any time.
  • Edge computing scope switch: the paths covered by edge computing are configured on the CDN, so that edge computing runs only under some paths.
  • Edge computing routing switch: the routing configuration read inside edge computing restricts streaming rendering to certain pages. Any other request goes directly through dynamic acceleration to obtain the complete page content.

Exception handling

  • DNS switch: if there is a serious problem with the cdn, switch directly from dns back to unified access.
  • If the basic functions of edge computing are abnormal, turn off edge computing for all paths on the cdn configuration platform and fall back to the default dynamic acceleration.
  • If an error occurs during edge rendering before any response content has been returned to the client, catch the error and downgrade to fetching the complete page content.
  • If the static part of the response has already been returned to the client and loading the dynamic content on the edge node then fails (timeout, http error code, or a version number that does not match the static content), return a script tag that calls location.reload() and end the response, forcing the page to refresh. The refresh can carry a query parameter that bypasses edge computing, so that edge rendering is not used during the refresh.
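
The last fallback can be sketched as below. Because a plain location.reload() cannot add a query parameter, this sketch navigates with location.replace to the same url plus a bypass flag. The parameter name `esr_bypass` and both function names are hypothetical.

```javascript
// Build the script chunk that forces a refresh without edge rendering.
// The parameter name "esr_bypass" is hypothetical.
function buildReloadScript(currentUrl) {
  const url = new URL(currentUrl);
  url.searchParams.set('esr_bypass', '1');
  // JSON.stringify yields a safe JS string literal; "<" is escaped so the
  // value cannot close the script element.
  const target = JSON.stringify(url.toString()).replace(/</g, '\\u003c');
  return `<script>location.replace(${target});</script>`;
}

// Routing check at the edge: skip edge rendering when the flag is present.
function shouldBypass(requestUrl) {
  return new URL(requestUrl).searchParams.get('esr_bypass') === '1';
}
```

The edge would write the result of `buildReloadScript` as the final chunk and close the response. The follow-up request then carries the flag, and `shouldBypass` sends it straight to the origin.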


1) Grayscale release of edge computing code

The platform itself supports grayscale release of edge computing code.

2) Grayscale release of routing configuration

The edge computing code loads one of two configuration URLs, the grayscale version or the official version, according to a fixed ratio. A grayscale release publishes only the grayscale configuration, and a full release publishes the full configuration. The cdn cache is cleared at publishing time.
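
Choosing between the two configurations by a fixed ratio might look like this minimal sketch. The function name, the field names, the example URLs and the injectable random source are all assumptions. The article does not show how the ratio is actually applied.

```javascript
// Pick which routing configuration to load. grayRatio is the fixed share of
// requests (0 to 1) that load the grayscale configuration.
function pickConfigUrl({ grayUrl, fullUrl, grayRatio }, random = Math.random) {
  return random() < grayRatio ? grayUrl : fullUrl;
}

// Hypothetical settings: 10% of requests load the grayscale configuration.
const configSource = {
  grayUrl: 'https://assets.example.com/esr/route.gray.json',
  fullUrl: 'https://assets.example.com/esr/route.json',
  grayRatio: 0.1,
};
```

Passing the random source as a parameter keeps the choice testable. A deployment that wants a user to stay on one configuration could hash a stable identifier instead of calling `Math.random`.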

3) Grayscale release of page content

Give the grayscale page a special template version number. When the edge encounters this version number, it does not perform edge rendering.

Smooth release

When the front end and back end are separated, there is a common problem: smooth release. When the static resources of the page (js, css) are not published together with the backend, the HTML returned by the backend may not match the js and css of the frontend. If this mismatch is not handled, styles may break or a document selector may fail to find its element.

One way to achieve a smooth release is to keep the code compatible whenever the front end and back end change together, so that the usability of the page is not affected.

Another way is to use a version number. The version number is configured manually on the back-end page. For an incompatible release, publish the front-end resources first and then manually change the version number on the back-end. This ensures that only back-end machines that have been published successfully reference the new version of the static resources in their HTML.

The smooth release problem also exists in batch release and Beta release scenarios. In the ESR scenario, however, the static part is cached on cdn, which makes inconsistencies between front end and back end more likely. To solve this, the developers of the business in question have to identify the risk at release time. If compatibility has already been handled, no special treatment is required. If not, the version number of the page template has to be changed. When the new version of the dynamic content meets static content whose version number does not match, streaming rendering is abandoned for that request, so that the page never combines incompatible dynamic and static content.
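
The version check can be sketched as follows, assuming the version travels in the esr-version meta tag shown in the "original HTML" template. The function names are hypothetical, and the regular expression only covers the attribute order used in that example.

```javascript
// Extract the template version from an html string, using the esr-version
// meta tag. Returns null when the tag is absent.
function extractEsrVersion(html) {
  const m = /<meta\s+name\s*=\s*"esr-version"\s+content\s*=\s*"([^"]*)"/.exec(html);
  return m ? m[1] : null;
}

// Combine static and dynamic content only when the cached static part and
// the fresh dynamic response come from the same template version.
function canStream(cachedStaticVersion, dynamicHtml) {
  const dynamicVersion = extractEsrVersion(dynamicHtml);
  return dynamicVersion !== null && dynamicVersion === cachedStaticVersion;
}
```

When `canStream` returns false, the edge falls back to one of the downgrade paths described under exception handling.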

Edge cdn service providers

At present, the major cdn service providers support edge computing as follows:


  • Supports edge computing in a service worker-like environment, and its functions meet our needs.
  • Overseas nodes are currently limited. Performance in some regions is comparable to akamai, but in other regions it is still slightly worse than akamai because of the lack of nodes.
  • Only supports simple request rewriting, which does not meet the needs of edge rendering.
  • ESI can assemble dynamic and static content but does not support streaming, so dynamic content blocks the first screen.
  • Has many overseas nodes, which gives it a performance advantage over alicdn in some regions.


  • Supports edge computing in a service worker-like environment, and its functions meet our needs.
  • We have no experience using it, and the onboarding process may be more complicated.

Rollout plan

We are experimenting in a typical first-hop scenario, and the scheme has been launched in grayscale. A webpagetest comparison run from Indonesia, between the ESR version and the version without it, shows the optimization effect:
1. TTFB is reduced by 1s
2. White screen time is reduced by 1s
3. The core content display time is reduced by 500ms

webpagetest comparison results:



Bonus | Free download of "big front-end technology covering full-end business"

Youku has many front-end business scenarios and a complex technology stack, and its demands on front-end engineering capability keep rising. In this e-book, Alibaba Entertainment describes in detail the technical challenges the team encountered and how it solved them. We hope that walking through these solutions conveys the technical thinking and experience the Youku front-end team has accumulated while supporting the business, and offers readers some inspiration.
