From Entering a URL to Rendering the Page: Network Protocols

Posted May 29, 2020 · 12 min read

This is the second article in the series "From Entering a URL to Rendering the Page", and it covers network protocols.
We know that the TCP/IP model divides network protocols into four layers: the application layer, the transport layer, the network layer and the link layer.

Below we focus on the application layer, the network layer and the transport layer.

How is data transmitted on the network?

Network layer IP

For data to be transmitted over the Internet, it must comply with the Internet Protocol (IP) standard. Every device connected to the Internet has a unique address identifier, represented as a number.
This is similar to online shopping: a delivery address plays the role of the device's unique identifier, and once we know the delivery address we can send parcels to it. A computer's address is called an IP address, and visiting a website is really just your computer requesting information from another computer.
To send a data packet from host A to host B, the IP address of host B is attached to the packet before transmission so that the packet can be routed correctly along the way. The IP address of host A is attached as well, because only with this information can host B send a reply back to host A. This additional information is packed into a data structure called the IP header.

To keep things easy to follow, let's look at a simplified version of how a data packet travels from host A to host B (a three-layer simplification, not the full four-layer model):

  • The upper layer hands the data packet to the network layer;
  • The network layer attaches the IP header to the data packet to form a new IP data packet, and hands it to the bottom layer;
  • The bottom layer transmits the data packet to host B through the physical network;
  • The data packet reaches the network layer of host B, where host B strips off the IP header and hands the remaining data to the upper layer;
  • Eventually, the data packet containing the information reaches the upper layer of host B.
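
The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `IPPacket` class, the two helper functions and the example addresses are hypothetical, and a real IP header contains many more fields than a source and destination address.

```python
from dataclasses import dataclass


@dataclass
class IPPacket:
    """Hypothetical, heavily simplified IP packet: header fields plus payload."""
    src_ip: str     # address of host A, so that host B can reply
    dst_ip: str     # address of host B, used for addressing in transit
    payload: bytes  # data handed down by the upper layer


def network_layer_send(data: bytes, src_ip: str, dst_ip: str) -> IPPacket:
    """Wrap upper-layer data with an IP header (encapsulation)."""
    return IPPacket(src_ip=src_ip, dst_ip=dst_ip, payload=data)


def network_layer_receive(packet: IPPacket, my_ip: str) -> bytes:
    """Strip the IP header and return the payload for the upper layer."""
    if packet.dst_ip != my_ip:
        raise ValueError("packet is not addressed to this host")
    return packet.payload


# Host A (192.0.2.1) sends to host B (192.0.2.2); both are example addresses.
packet = network_layer_send(b"hello", src_ip="192.0.2.1", dst_ip="192.0.2.2")
received = network_layer_receive(packet, my_ip="192.0.2.2")
```

Because the packet carries the source address as well, host B could build its reply simply by swapping `src_ip` and `dst_ip`.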

Transport layer UDP/TCP

The IP-based transmission we discussed above is a very low-level protocol. It is only responsible for delivering data packets to the other party's computer, but that computer does not know which program the data packet should be handed to. A protocol built on top of IP that can interact with applications is therefore needed: this is the transport layer. The most common transport protocols are UDP and TCP.

By adding a transport layer, we can expand the previous three-layer structure into a four-layer structure, as shown in the following figure.

Let's look at the data transmission route once the transport layer has been added:

  • The upper layer hands the data packet to the transport layer;
  • The transport layer adds a UDP/TCP header in front of the data packet to form a new data packet, and then hands the new data packet to the network layer;
  • The network layer attaches the IP header to the data packet to form a new IP data packet, and hands it to the bottom layer, which transmits it to host B;
  • The data packet reaches the network layer of host B, where host B strips off the IP header and hands the remaining data to the transport layer;
  • At the transport layer, the UDP/TCP header is stripped off, and according to the port number carried in the UDP/TCP header, the data part is handed over to the corresponding upper-layer application;
  • In the end, the data packet arrives at the upper application of host B.
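
The role of the port number can be illustrated with a small sketch. It assumes a made-up 4-byte header containing only a source port and a destination port; real UDP and TCP headers carry more fields (lengths, checksums, sequence numbers and so on), and the function names and port values here are hypothetical.

```python
import struct

# Hypothetical 4-byte transport header: source port and destination port,
# each an unsigned 16-bit integer in network byte order.
HEADER = struct.Struct("!HH")


def transport_send(data: bytes, src_port: int, dst_port: int) -> bytes:
    """Put the transport header in front of the application data."""
    return HEADER.pack(src_port, dst_port) + data


def transport_receive(segment: bytes, applications: dict) -> None:
    """Strip the header and hand the data to the application on dst_port."""
    src_port, dst_port = HEADER.unpack(segment[:HEADER.size])
    handler = applications.get(dst_port)
    if handler is None:
        return  # no application is listening on that port
    handler(segment[HEADER.size:])


inbox = []
applications = {8080: inbox.append}  # one application listening on port 8080
transport_receive(transport_send(b"GET /", 50000, 8080), applications)
transport_receive(transport_send(b"ignored", 50000, 9), applications)
```

Only the first segment reaches the application, because nothing is registered for port 9.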

So what is the difference between UDP and TCP at the transport layer, and which scenarios is each suitable for? Let's take a look.

Comparison of UDP and TCP

When data is sent over UDP, various factors can cause errors in a data packet. UDP can verify whether the data is correct, but it provides no retransmission mechanism for a corrupted packet; it simply discards it. After sending, the sender also has no way of knowing whether the packet reached its destination. UDP does not guarantee reliability, but it is very fast, so it is used in areas that care about speed and do not strictly require data integrity, such as online video and interactive games.

Disadvantages of UDP:

  • Data packets are easily lost during transmission, and there is no retransmission mechanism;
  • Large files are split into many small data packets for transmission. These small packets may take different routes and arrive at the receiving end at different times. UDP does not know how to reassemble them, so it cannot restore these packets into a complete file.
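
To make the "send and forget" nature of UDP concrete, here is a minimal sketch using Python's standard `socket` module over the loopback interface. The payload and the 2-second timeout are arbitrary choices for the example. Note that `sendto` returns as soon as the datagram has been handed to the operating system; nothing in the API tells the sender whether it arrived.

```python
import socket

# Receiver: bind a UDP socket to an ephemeral port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connection setup, just address the datagram and send it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent_bytes = sender.sendto(b"frame-1", address)  # returns the byte count only

try:
    data, _ = receiver.recvfrom(1024)
except socket.timeout:
    data = None  # the datagram was lost; UDP itself will not resend it
finally:
    sender.close()
    receiver.close()
```

On a loopback interface the datagram will normally arrive, but any acknowledgement, retransmission or ordering would have to be implemented by the application itself.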

To address these shortcomings of UDP, the TCP header provides, in addition to the destination port and the local port number, a sequence number used for ordering, so that the receiving end can rearrange the data packets by sequence number.
In addition, TCP guarantees the reliability of the data and provides a retransmission mechanism.
So how does TCP do this? That brings us to the famous "three-way handshake" and "four-way wave".

Let's see what the TCP transmission process is like:

  • First, the connection establishment phase(three-way handshake). TCP provides connection-oriented communication transmission. Connection-oriented refers to making preparations between the two ends before data communication begins. The so-called three-way handshake means that when establishing a TCP connection, the client and the server must send a total of three data packets to confirm the establishment of the connection.
  • Second, the data transmission phase. In this phase, the receiving end must acknowledge each data packet: after receiving a data packet, it sends an acknowledgement packet back to the sender. If the sender does not receive the acknowledgement within a specified time after sending a data packet, it considers the packet lost and triggers its retransmission mechanism. As before, a large file is split into many small data packets during transmission. After these packets arrive at the receiving end, the receiving end sorts them by the sequence number in the TCP header to reconstruct the complete data.
  • Finally, the disconnection phase (four-way wave). After the data transmission is complete, the connection is terminated.
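
The data transmission phase can be approximated with a small simulation. This is a sketch under stated assumptions, not how a real TCP stack is written: loss is simulated with a seeded random number generator, each chunk gets its own sequence number (real TCP numbers bytes, not packets), and the function names and the 30% loss rate are made up for the example.

```python
import random


def send_reliably(chunks, loss_rate=0.3, seed=1, max_attempts=50):
    """Simulate a sender that retransmits each chunk until it is acknowledged."""
    rng = random.Random(seed)
    received = {}      # receiver side: sequence number -> chunk
    transmissions = 0
    for seq, chunk in enumerate(chunks):
        for _ in range(max_attempts):
            transmissions += 1
            if rng.random() < loss_rate:
                continue  # packet lost: no ACK arrives, the timer expires, resend
            received[seq] = chunk  # receiver stores the chunk and sends an ACK
            break
        else:
            raise RuntimeError(f"chunk {seq} was never acknowledged")
    return received, transmissions


def reassemble(received):
    """Sort by sequence number to rebuild the original data."""
    return b"".join(received[seq] for seq in sorted(received))


chunks = [b"From ", b"URL ", b"to ", b"page"]
received, transmissions = send_reliably(chunks)

# Simulate out-of-order arrival by shuffling before reassembly.
items = list(received.items())
random.Random(2).shuffle(items)
message = reassemble(dict(items))
```

However the chunks are shuffled, sorting by sequence number restores the original message, and `transmissions` is at least the number of chunks because lost packets are sent again.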

The following is a schematic diagram, widely circulated on the Internet, of the classic three-way handshake and four-way wave.

  • The client initiates the connection by sending a SYN packet, which indicates that it wants to establish a connection and carries SEQ = X (a random number).
  • After the server receives it, it responds to the client and at the same time sends its own connection request, in a single packet that contains both SYN and ACK. Here ack = X + 1 and SEQ = Y (a random number).
  • After receiving the server's response, the client sees ack = X + 1 and knows that the server has accepted its request. It then responds to the server's connection request with ack = Y + 1.
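
The sequence and acknowledgement numbers in the handshake can be traced with a short sketch. The dictionaries below are a hypothetical stand-in for TCP segments, and the function name is invented for this example; only the arithmetic (ack = peer's seq + 1, modulo 2^32) reflects the steps described above.

```python
import random

MODULUS = 2**32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(rng: random.Random):
    """Return the three segments exchanged while opening a connection."""
    # 1. Client -> server: SYN with a random initial sequence number X.
    x = rng.randrange(MODULUS)
    syn = {"flags": {"SYN"}, "seq": x}

    # 2. Server -> client: SYN + ACK, acknowledging X + 1, with its own Y.
    y = rng.randrange(MODULUS)
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": y, "ack": (syn["seq"] + 1) % MODULUS}

    # 3. Client -> server: ACK, acknowledging Y + 1.
    if syn_ack["ack"] != (x + 1) % MODULUS:
        raise ValueError("server did not acknowledge our SYN")
    ack = {
        "flags": {"ACK"},
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % MODULUS,
    }
    return syn, syn_ack, ack


syn, syn_ack, ack = three_way_handshake(random.Random(0))
```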


  • The client first sends a FIN packet with seq = x, and then enters the FIN_WAIT_1 state.
  • The server replies with ack = x + 1 (the principle is the same as above) and seq = y, indicating that it has received the client's shutdown request. The server then enters the CLOSE_WAIT state, and the client enters FIN_WAIT_2.
  • After the server has finished processing its remaining packets, it sends a FIN packet with ack = x + 1 and seq = y. The server then enters the LAST_ACK state and waits for the client's final acknowledgement.
  • After receiving the FIN packet from the server, the client replies with ack = y + 1 and enters the TIME_WAIT state. The server enters the CLOSED state as soon as it receives this packet. The client waits for 2 MSL (Maximum Segment Lifetime); if nothing further arrives in that time, the server has shut down normally, so the client also enters the CLOSED state and the connection is closed.
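
The state changes on both sides during the four-way wave can be written down as two small transition tables. This is an illustrative sketch that covers only the sequence described above (the client closes first); the event labels and the `run` helper are hypothetical, and real TCP has additional states and transitions.

```python
# (event, state before, state after) for the side that closes first.
CLIENT_STEPS = [
    ("send FIN", "ESTABLISHED", "FIN_WAIT_1"),
    ("receive ACK", "FIN_WAIT_1", "FIN_WAIT_2"),
    ("receive FIN, send ACK", "FIN_WAIT_2", "TIME_WAIT"),
    ("2 MSL timer expires", "TIME_WAIT", "CLOSED"),
]

# The same exchange seen from the server.
SERVER_STEPS = [
    ("receive FIN, send ACK", "ESTABLISHED", "CLOSE_WAIT"),
    ("send FIN", "CLOSE_WAIT", "LAST_ACK"),
    ("receive ACK", "LAST_ACK", "CLOSED"),
]


def run(steps, state="ESTABLISHED"):
    """Apply each transition in order and return the list of visited states."""
    history = [state]
    for event, expected, next_state in steps:
        if state != expected:
            raise RuntimeError(f"{event!r} is not valid in state {state}")
        state = next_state
        history.append(state)
    return history


client_states = run(CLIENT_STEPS)
server_states = run(SERVER_STEPS)
```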

By now you should understand that TCP sacrifices the transmission speed of data packets to ensure the reliability of data transmission.

HTTP protocol

Next we look at the development history of the application layer HTTP protocol

HTTP1 era


First, let's take a look at the earliest version, HTTP/0.9. It was created mainly for academic exchange, and the requirement was very simple: transferring HTML hypertext content across the network, which is why it is called the Hypertext Transfer Protocol. Its implementation is also very simple. It uses a request-response model in which the client sends a request and the server returns the data.

  • Because HTTP is built on TCP, the client must first establish a TCP connection with the server using its IP address and port. Establishing the connection is the TCP three-way handshake.
  • After the connection is established, the client sends a GET request line, such as GET /index.html to fetch index.html.
  • After the server receives the request, it reads the corresponding HTML file and returns the data to the client as a stream of ASCII characters.
  • After the HTML document has been transferred, the connection is closed.

In general, the demand at that time was very simple, that is, it was used to transmit small HTML files, so the implementation of HTTP/0.9 has the following three characteristics.

  • First, there is only a request line, with no HTTP request headers or request body, because a single request line is enough to express what the client needs.
  • Second, the server does not return any header information, because it has nothing extra to tell the client and only needs to return the data.
  • Third, the returned file content is transmitted as an ASCII character stream. Because the files are all HTML, transmitting them as an ASCII byte stream is the most appropriate choice.
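
An HTTP/0.9-style exchange is simple enough to sketch without any networking at all. In the example below the in-memory `DOCUMENTS` store and both function names are hypothetical; the point is only that the request is a single line and the response is the raw document with no status line or headers.

```python
# Hypothetical in-memory document store standing in for files on disk.
DOCUMENTS = {"/index.html": "<html><body>Hello</body></html>"}


def build_request(path: str) -> bytes:
    """An HTTP/0.9 request is a single line: the method and the path."""
    return f"GET {path}\r\n".encode("ascii")


def handle_request(raw: bytes) -> bytes:
    """Return the document as plain ASCII: no status line, no headers."""
    method, _, path = raw.decode("ascii").strip().partition(" ")
    if method != "GET":
        return b""
    return DOCUMENTS.get(path, "").encode("ascii")


response = handle_request(build_request("/index.html"))
```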


As the Internet developed, transmitting HTML alone was no longer sufficient. Pages now also included JavaScript, CSS, images, audio, video and other types of files. Supporting downloads of multiple file types was therefore a core requirement of HTTP/1.0, and files were no longer limited to ASCII encoding; many other encodings had to be supported.

To let the client and server communicate more flexibly, HTTP/1.0 introduced request headers and response headers, both stored as key-value pairs. When an HTTP request is sent, the request headers are sent along with it, and when the server returns data, it sends the response headers first. For example, the following shows part of the request and response header information:

accept: text/html // Expect the server to return HTML files
accept-encoding: gzip, deflate, br // Expect the server to use one of gzip, deflate or br compression
accept-charset: ISO-8859-1, utf-8 // The file encoding expected in the response is UTF-8 or ISO-8859-1
accept-language: zh-CN, zh // Expect the preferred language of the page to be Chinese

content-encoding: br // The server used the br compression method
content-type: text/html; charset=UTF-8 // The server returned an HTML file, and the file is encoded as UTF-8

This is how the browser and the server communicated in the HTTP/1.0 era, much like two people exchanging agreed code words.
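
The key-value nature of headers, and the negotiation they enable, can be shown with a short sketch. Both functions are hypothetical helpers written for this example; in particular, `choose_encoding` ignores quality values (`q=`) and simply picks the first encoding the server supports that the client also listed.

```python
def parse_headers(raw: str) -> dict:
    """Parse 'Name: value' lines into a dict keyed by lower-case name."""
    headers = {}
    for line in raw.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def choose_encoding(accept_encoding: str, supported=("br", "gzip")):
    """Pick the first server-supported encoding that the client accepts."""
    offered = [item.split(";")[0].strip() for item in accept_encoding.split(",")]
    for encoding in supported:
        if encoding in offered:
            return encoding
    return None  # no common encoding: send the body uncompressed


request_headers = parse_headers(
    "accept: text/html\r\naccept-encoding: gzip, deflate, br\r\n"
)
chosen = choose_encoding(request_headers["accept-encoding"])
```

The server would then announce its choice in the `content-encoding` response header, as in the listing above.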

In addition to providing good support for multiple files, HTTP/1.0 also introduces many other features, which are implemented through request headers and response headers.
Let's take a look at some new typical features:

  • The server may be unable to process some requests, or may fail while processing them. In that case it needs to tell the browser how the request was ultimately handled, which led to the introduction of status codes. The status code is delivered to the browser in the response line.
  • To reduce the load on the server, a cache mechanism was provided for caching downloaded data.
  • The server needs to collect basic statistics about clients, such as the number of Windows and macOS users, so the User-Agent field was added to the request headers.


Although HTTP/1.0 could cope with most scenarios, it still had the following defects, most of which HTTP/1.1 addressed:

  • Each HTTP/1.0 communication had to go through three stages: establishing a TCP connection, transmitting the HTTP data and closing the TCP connection. HTTP/1.1 added persistent connections (Connection: keep-alive), which allow multiple HTTP requests to be sent over one TCP connection; the connection stays open as long as neither the browser nor the server explicitly closes it. (Currently, browsers allow up to 6 persistent TCP connections to the same domain name at the same time by default.)
  • Head-of-line blocking. HTTP/1.1 did not resolve this problem.
  • In HTTP/1.0, each domain name was bound to a unique IP address, so one server could only support one domain name. With the development of virtual hosting, however, multiple virtual hosts needed to be bound to one physical host, each with its own domain name, and all of these domain names shared the same IP address. HTTP/1.1 added the Host field to the request headers to indicate the current domain name, so that the server can handle requests differently according to the Host value.
  • In HTTP/1.0, the complete data size had to be set in the response headers, for example Content-Length: 901, so that the browser could receive data according to that size. With the development of server-side technology, however, the content of many pages is generated dynamically, so the final data size is not known before transmission starts, and the browser cannot tell when it has received all of the data. HTTP/1.1 solved this by introducing chunked transfer: the server splits the data into blocks of arbitrary size, each block is sent together with its own length, and a zero-length block marks the end of the data. This provides support for dynamic content.
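
The chunked transfer mechanism can be sketched as a pair of functions. Each chunk is written as its size in hexadecimal, a CRLF, the data and another CRLF, and a zero-size chunk ends the body. The function names are made up for this example, and the decoder is a minimal sketch that assumes well-formed input and no trailer fields.

```python
def encode_chunked(blocks):
    """Encode an iterable of byte blocks using chunked transfer coding."""
    out = bytearray()
    for block in blocks:
        if not block:
            continue  # an empty block would be mistaken for the terminator
        out += f"{len(block):X}\r\n".encode("ascii") + block + b"\r\n"
    out += b"0\r\n\r\n"  # zero-length chunk marks the end of the data
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body; assumes well-formed input without trailers."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are ignored here.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


encoded = encode_chunked([b"Hello, ", b"world"])
decoded = decode_chunked(encoded)
```

The server never needs to know the total length in advance: it can keep emitting chunks as the page is generated and send the terminator when it is done.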


Although HTTP/1.1 adopted many strategies to speed up resource loading, and achieved some success, its bandwidth utilization is not ideal, and this is a core problem of HTTP/1.1. (Bandwidth refers to the maximum number of bytes that can be sent or received per second. The maximum number of bytes that can be sent per second is called the upstream bandwidth, and the maximum number of bytes that can be received per second is called the downstream bandwidth.) This problem has three main causes.

  • TCP slow start. Once a TCP connection is established, it starts sending data. At first, TCP sends data at a very low rate and then gradually speeds up until the sending rate reaches an ideal level. This process is called slow start (similar to a car pulling away from a standstill). Slow start is a TCP strategy for reducing network congestion, and we have no way to change it. Slow start causes performance problems because some of the key resource files in a page, such as HTML, CSS and JavaScript files, are not large. These files are usually requested right after the TCP connection is established, while the connection is still in slow start, so they take much longer to transfer than they otherwise would, which delays the all-important first rendering of the page.
  • Multiple TCP connections open at the same time compete for a fixed amount of bandwidth. When the system establishes several TCP connections at once and bandwidth is sufficient, the sending or receiving rate of each connection gradually increases; once bandwidth runs short, these connections slow down. For example, if a page has 200 files and uses 3 CDNs, then loading the page requires 6 × 3 = 18 TCP connections to download the resources. When bandwidth turns out to be insufficient during the download, each TCP connection has to dynamically reduce its receiving rate. This causes a problem: some TCP connections are downloading key resources, such as CSS and JavaScript files, while others are downloading ordinary resources such as images and videos, but the connections have no way to negotiate which key resources should be downloaded first, so the download speed of those critical resources may suffer.
  • Head-of-line blocking. We know that with persistent connections in HTTP/1.1, although one TCP pipe can be shared, only one request can be processed in the pipe at a time. Until the current request finishes, the other requests are blocked.
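
A rough back-of-the-envelope model shows why slow start hurts small files. The sketch below assumes an idealized slow start in which the congestion window doubles every round trip, with no packet loss and no receiver limit; the initial window of 10 segments and the 1460-byte segment size are common values chosen for illustration, not figures from this article.

```python
import math


def round_trips_in_slow_start(size_bytes, mss=1460, initial_cwnd=10):
    """Estimate the round trips needed to deliver size_bytes.

    Idealized model: the congestion window starts at initial_cwnd segments
    and doubles after every round trip (no loss, no receiver window limit).
    """
    segments_left = math.ceil(size_bytes / mss)
    cwnd = initial_cwnd
    round_trips = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window in this round trip
        cwnd *= 2              # slow start: the window doubles each round trip
        round_trips += 1
    return round_trips


small = round_trips_in_slow_start(14_000)    # fits in the initial window
larger = round_trips_in_slow_start(100_000)  # needs 10 + 20 + 40 segments
```

Under these assumptions a 14 KB file needs one round trip and a 100 KB file needs three, regardless of how much bandwidth is actually available, which is why a fresh connection is a poor way to fetch small critical resources.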

HTTP/2 introduced its well-known multiplexing technique to address the three problems above.

  • Each domain name uses only one long-lived TCP connection, and head-of-line blocking at the HTTP level is eliminated.
  • Each request is split into frames for transmission, so that requests can proceed in parallel.

How does the request proceed after adding the multiplexing technology?

  • First, the browser prepares the request data, including the request line, request header and other information. If it is the POST method, then there must be a request body.
  • After these data are processed by the binary framing layer, they will be converted into frames with request ID numbers, and these frames will be sent to the server through the protocol stack.
  • After receiving all the frames, the server will merge all the frames with the same ID into a complete request message.
  • Then the server processes the request and sends the processed response line, response header and response body to the binary framing layer respectively.
  • Similarly, the binary framing layer converts these response data into frames with request ID numbers and sends them to the browser through the protocol stack.
  • After receiving the response frame, the browser will submit the frame data to the corresponding request according to the ID number.
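
The framing and reassembly steps above can be sketched as follows. Frames are modeled as plain `(stream_id, payload)` tuples, which is a simplification: real HTTP/2 frames have a binary header with a length, type and flags, and headers and body travel in different frame types. The function names, the 4-byte frame size and the stream IDs are assumptions made for the example.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id, message, frame_size=4):
    """Split one message into (stream_id, payload) frames."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]


def interleave(*streams):
    """Mix frames from several streams onto one connection, round-robin."""
    return [
        frame
        for group in zip_longest(*streams)
        for frame in group
        if frame is not None
    ]


def reassemble(frames):
    """Merge frames that share an ID back into complete messages."""
    messages = defaultdict(bytes)
    for stream_id, payload in frames:
        messages[stream_id] += payload
    return dict(messages)


wire = interleave(
    to_frames(1, b"GET /index.html"),
    to_frames(3, b"GET /style.css"),
)
requests = reassemble(wire)
```

Because every frame carries its ID, the two requests share one connection without either having to wait for the other to finish.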

Through the above analysis, we know that multiplexing is the core function of HTTP/2, which can realize the parallel transmission of resources. The multiplexing technology is based on the binary framing layer. Based on the binary framing layer, HTTP/2 also implements many other functions, let's take a brief look below.

  1. Request priority. We know that some data in the browser is very important, but an important request may be sent later than less important ones. If the server replies in the order in which requests arrive, this important data may take a long time to reach the browser, which is bad for the user experience. To solve this problem, HTTP/2 provides request priority: the priority can be marked when the request is sent, so that the server handles higher-priority requests first.
  2. Server push. HTTP/2 can also push data to the browser in advance. When the user requests an HTML page and the server knows that the page references certain JavaScript and CSS files, the server can send those CSS and JavaScript files to the browser along with the HTML. By the time the browser has parsed the HTML file, the CSS and JavaScript files it needs are already available, which plays a crucial role in how quickly the page opens for the first time.
  3. Header compression. HTTP/2 compresses request headers and response headers using HPACK. On the one hand, header strings can be Huffman-encoded before sending; on the other hand, the client and server both maintain a table of header fields: each field stored in the table is given an index number, so a field that has already been sent does not need to be sent again in full, only its index number, which reduces the amount of data transferred.
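
The indexing idea behind header compression can be illustrated with a toy table. This is not HPACK: there is no static table, no eviction, no Huffman coding and no binary encoding. The `HeaderTable` class is a hypothetical sketch that only shows how a repeated field can be replaced by an index once both sides have stored it.

```python
class HeaderTable:
    """Toy header index table; each side of the connection keeps its own copy."""

    def __init__(self):
        self._index_of = {}  # (name, value) -> index
        self._entries = []   # index -> (name, value)

    def _remember(self, field):
        self._index_of[field] = len(self._entries)
        self._entries.append(field)

    def encode(self, headers):
        """Replace fields seen before with their index; send new ones in full."""
        encoded = []
        for field in headers:
            if field in self._index_of:
                encoded.append(self._index_of[field])
            else:
                self._remember(field)
                encoded.append(field)
        return encoded

    def decode(self, encoded):
        """Rebuild the header list, learning new fields along the way."""
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self._entries[item])
            else:
                self._remember(item)
                headers.append(item)
        return headers


headers = [("user-agent", "ExampleBrowser/1.0"), ("accept", "text/html")]
client, server = HeaderTable(), HeaderTable()

first = client.encode(headers)   # full fields on the first request
second = client.encode(headers)  # only index numbers on the second request
decoded_first = server.decode(first)
decoded_second = server.decode(second)
```

The two tables stay in sync because the decoder adds entries in exactly the same order as the encoder.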