Microservices technology stack: traffic shaping algorithm, service fuse and degradation
Posted Jun 28, 2020 • 7 min read
One, traffic shaping algorithms

The core function of flow control is to limit the rate and burstiness of traffic leaving a network, so that packets flow out at a relatively uniform speed and the system stays relatively stable. Typically, requests are placed in a buffer or queue and then processed according to a specific strategy, either at a uniform rate or in batches. This process is also called traffic shaping.
There are two core algorithms for flow control: the leaky bucket algorithm and the token bucket algorithm.
- Leaky bucket algorithm
The leaky bucket algorithm is often used for traffic shaping and rate limiting. Its main purpose is to control the rate at which data is injected into the network and to smooth out bursty traffic: it provides a mechanism by which bursts can be shaped into a stable flow before reaching the network.
The basic idea of the leaky bucket algorithm: requests (the water flow) first enter a container (the leaky bucket), and the bucket releases water at a fixed rate, which corresponds to the outflow strategy. When the inflow rate is too large, the container overflows rather than letting traffic through directly. Through this process, the data transmission rate is limited.
The main elements

From the process above, it is not hard to see that the leaky bucket algorithm involves the following elements:

- Container size: directly determines how much traffic can be absorbed. Once the container approaches saturation, it must either overflow (reject requests) or accelerate the outflow rate;
- Outflow rate: depends on the request-processing capability of the service; the higher the concurrency an interface supports, the higher the outflow rate can be;
- Timing: timestamps are recorded to determine the outflow speed and enforce the uniform-rate mode.

Note: a basic decision strategy is also required. The leaky bucket algorithm does not need to kick in while the system can handle the current concurrent traffic.
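To make the elements above concrete, here is a minimal leaky-bucket sketch in plain Java. The class and method names are my own for illustration, not from any library; time is passed in explicitly so the drain logic is easy to follow and test.

```java
// Minimal leaky-bucket sketch: requests fill the bucket, which drains
// at a constant rate; a request that would overflow the bucket is rejected.
public class LeakyBucket {
    private final long capacity;     // container size: max queued requests
    private final double leakPerSec; // outflow rate (requests per second)
    private double water = 0;        // current water level
    private double lastTime;         // last update time, in seconds

    public LeakyBucket(long capacity, double leakPerSec, double now) {
        this.capacity = capacity;
        this.leakPerSec = leakPerSec;
        this.lastTime = now;
    }

    /** Try to accept one request at time 'now' (seconds). */
    public synchronized boolean tryAccept(double now) {
        // Drain water according to the time elapsed since the last call.
        water = Math.max(0, water - (now - lastTime) * leakPerSec);
        lastTime = now;
        if (water + 1 <= capacity) { // room left: accept the request
            water += 1;
            return true;
        }
        return false;                // bucket full: overflow, reject
    }
}
```

With capacity 2 and an outflow of 1 request/second, a burst of three requests at t=0 gets two accepted and one rejected; one second later, one slot has drained and a new request passes. Whatever the inflow looks like, the outflow never exceeds the configured rate.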
- Token bucket algorithm
The token bucket generates tokens by itself at a constant rate. If tokens are not consumed, or are consumed more slowly than they are generated, they accumulate until the bucket is full; tokens generated after that overflow from the bucket and are discarded.

Although the basic purpose of the token bucket algorithm is also to control the traffic rate, it allows bursts: as long as there are enough tokens in the bucket, traffic may pass through concurrently. Data packets sent through the token bucket must consume tokens, and packets of different sizes consume different numbers of tokens.
The main elements

- Token store: holds tokens generated at a fixed rate, which controls the flow rate.
- Matching rules: these apply mostly to distributed systems. For example, suppose service A handles the system's core transactions; under heavy concurrency, a token bucket rule can be configured so that only transaction requests are allowed through. During events like the Double Eleven shopping festival, major e-commerce platforms announce that, in order to protect core transactions, data from edge services may be delayed or suspended.
Note: although the token bucket algorithm and the leaky bucket algorithm share the same goal, their implementation strategies are opposite. Both carry the same trade-off: to keep most request traffic succeeding, a small portion of requests is sacrificed.
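For symmetry with the leaky-bucket sketch, here is a minimal token bucket in plain Java (again with illustrative, hypothetical names). Note how it differs from the leaky bucket: a full bucket of tokens lets a burst pass at once, instead of pacing every request.

```java
// Minimal token-bucket sketch: tokens refill at a constant rate up to
// the bucket capacity; a request of size n consumes n tokens, so bursts
// are allowed as long as enough tokens have accumulated.
public class TokenBucket {
    private final long capacity;       // max tokens the bucket can hold
    private final double refillPerSec; // token generation rate
    private double tokens;             // current token count
    private double lastTime;           // last refill time, in seconds

    public TokenBucket(long capacity, double refillPerSec, double now) {
        this.capacity = capacity;
        this.refillPerSec = refillPerSec;
        this.tokens = capacity; // start with a full bucket
        this.lastTime = now;
    }

    /** Try to consume 'n' tokens at time 'now' (seconds). */
    public synchronized boolean tryConsume(int n, double now) {
        // Refill; tokens beyond 'capacity' overflow and are discarded.
        tokens = Math.min(capacity, tokens + (now - lastTime) * refillPerSec);
        lastTime = now;
        if (tokens >= n) {
            tokens -= n;
            return true;
        }
        return false;
    }
}
```

With capacity 5 and a refill rate of 1 token/second, a burst consuming all 5 tokens at t=0 is allowed in one go; the next request must wait until enough tokens have been regenerated.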
Second, rate-limiting components
- Nginx proxy component
In reverse proxy mode, Nginx receives client connection requests on the proxy server, forwards them to servers on the internal network, and returns the results obtained from those servers to the client. To the outside world, the proxy server itself behaves as the server.
Traffic limiting is a very practical feature of Nginx as a proxy service. It is configured to limit the number of HTTP requests a user may make in a given time, mainly through the two directives limit_req_zone and limit_req, in order to protect the stability of highly concurrent systems.
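A typical configuration might look like the following; the zone name, rate, and burst values here are illustrative. limit_req_zone defines a shared-memory zone keyed by client IP with an allowed request rate, and limit_req applies that zone inside a location:

```nginx
# Define a 10 MB zone keyed by client IP, allowing 10 requests/second.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    location /api/ {
        # Apply the limit; queue up to 20 excess requests instead of
        # rejecting them immediately (adding "nodelay" would disable pacing).
        limit_req zone=per_ip burst=20;
        proxy_pass http://backend;
    }
}
```

Requests beyond the rate and burst allowance are rejected, by default with status 503 (configurable via limit_req_status).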
- CDN edge node
CDN edge nodes are, strictly speaking, not used to enforce traffic limits; they store static pages. A content-caching node sits at the user access point and serves content to end users. It can cache static web content and streaming media, pushing content to the edge so that users access it nearby. This prevents large numbers of users from hitting the origin data servers on every refresh, saving backbone bandwidth and reducing bandwidth requirements.
In high-concurrency scenarios, especially around countdown flash-sale events, users generate heavy page-refresh traffic before and after the event starts. With CDN nodes in front, these requests never sink down to the data service interfaces. Some request interception can also be done at the page level, for example allowing only a certain number of requests per unit time from page clicks, which likewise achieves a form of rate limiting.
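That "certain number of requests per unit time" idea can be sketched as a simple fixed-window counter. This is a generic illustration (hypothetical names, not tied to any CDN product or page framework):

```java
// Fixed-window counter: allow at most 'limit' requests per time window;
// when the window rolls over, the counter resets.
public class FixedWindowLimiter {
    private final int limit;         // max requests per window
    private final long windowMillis; // window length in milliseconds
    private long windowStart;        // start of the current window
    private int count = 0;           // requests seen in this window

    public FixedWindowLimiter(int limit, long windowMillis, long now) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.windowStart = now;
    }

    public synchronized boolean allow(long nowMillis) {
        if (nowMillis - windowStart >= windowMillis) {
            windowStart = nowMillis; // roll over to a new window
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

The fixed window is the simplest variant; its known weakness is that a burst straddling two windows can briefly pass up to twice the limit, which sliding-window counters address.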
Third, service fuse and degradation

The so-called fuse mechanism works like an electrical fuse: when the current is too high, it trips automatically and thereby protects the circuit. Service protection in a microservice architecture follows the same strategy: when a service is judged to be abnormal, it is disconnected from the service list and reconnected only after it recovers. Several components are commonly used to implement fuse and degradation strategies.
- Hystrix components
Hystrix is currently in maintenance mode, meaning it is no longer actively developed. As the most native fuse component in the Spring Cloud microservices stack, its core ideas are still worth understanding: service fusing to stop the chain reaction of failures, failing fast and recovering quickly, service degradation, and so on.
When a microservice fails, it is necessary to cut off the service quickly and prompt the user; follow-up requests return directly without being released to the service. This is service fusing.
When the server faces high concurrency and pressure surges, some services and pages are strategically downgraded according to the current business situation and traffic (this can be understood as shutting down non-essential services) to relieve pressure on server resources and ensure that core tasks run normally. After the fuse takes effect, a request is attempted after a specified time to test whether the dependency has recovered; once the dependent application recovers, the fuse is closed.
1. First determine the switch state of the fuse. If the fuse is closed, the request is released; if the fuse is open, the call returns directly.
2. Each call invokes one of two functions, markSuccess(duration) or markFailure(duration), to count the successful and failed calls within a time window.
3. Based on those success and failure counts, determine whether the fuse should open: if the error rate exceeds a certain threshold, the fuse mechanism is triggered.
4. The fuse has a life cycle. After the sleep window expires, it enters a half-open state and releases one trial request; if the trial succeeds the fuse closes again, otherwise no further requests are released.
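The four steps above can be condensed into a tiny state machine in plain Java. This is an illustrative reduction of the idea, not Hystrix's actual implementation (Hystrix uses rolling statistical windows and many more knobs); all names are hypothetical.

```java
// Minimal circuit-breaker state machine: CLOSED -> OPEN when the error
// rate over the counted calls exceeds a threshold; OPEN -> HALF_OPEN
// after a sleep window; HALF_OPEN -> CLOSED on a successful trial call.
public class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final double errorThreshold; // e.g. 0.5 = trip at 50% failures
    private final int minCalls;          // min calls before evaluating
    private final long sleepMillis;      // how long to stay OPEN
    private State state = State.CLOSED;
    private int success = 0, failure = 0;
    private long openedAt = 0;

    public CircuitBreaker(double errorThreshold, int minCalls, long sleepMillis) {
        this.errorThreshold = errorThreshold;
        this.minCalls = minCalls;
        this.sleepMillis = sleepMillis;
    }

    /** Step 1: check the switch; release the request or fail fast. */
    public synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= sleepMillis) {
            state = State.HALF_OPEN; // step 4: release one trial request
        }
        return state != State.OPEN;
    }

    /** Step 2: record the outcome of each released call. */
    public synchronized void markSuccess() {
        if (state == State.HALF_OPEN) { reset(); return; } // trial passed
        success++;
    }

    public synchronized void markFailure(long nowMillis) {
        if (state == State.HALF_OPEN) { trip(nowMillis); return; }
        failure++;
        // Step 3: trip the fuse when the error rate crosses the threshold.
        int total = success + failure;
        if (total >= minCalls && (double) failure / total >= errorThreshold) {
            trip(nowMillis);
        }
    }

    private void trip(long nowMillis) { state = State.OPEN; openedAt = nowMillis; }
    private void reset() { state = State.CLOSED; success = 0; failure = 0; }
}
```

With a 50% threshold over at least 4 calls and a 1-second sleep window, two failures after two successes trip the breaker; requests then fail fast until the sleep window elapses, after which a single successful trial closes the fuse again.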
- Sentinel components
In the microservices model, stability between services becomes increasingly important. Sentinel takes traffic as its entry point and protects service stability along multiple dimensions: flow control, fuse degradation, and system load protection.
Sentinel collects real-time metrics for resources (such as QPS, number of concurrent calls, and system load) along different call relationships, stores the invocation paths of these resources in a tree structure, and controls the flow of each resource along its call path.
Traffic Shaping Strategy
- Direct rejection is the default flow control behavior: once requests exceed the threshold of any rule, new requests are rejected immediately.
- Warm-up mode: when traffic surges, the allowed rate is increased slowly, ramping up to the threshold over a configured period; this gives a cold system time to warm up instead of being crushed.
- Uniform-rate queueing strictly controls the interval between passing requests, letting requests through at a uniform speed; this corresponds to the leaky bucket algorithm.
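The warm-up idea can be illustrated with a simple linear ramp. To be clear, this is only a sketch of the concept with hypothetical names; Sentinel's actual warm-up controller is based on a token-bucket model derived from Guava's SmoothWarmingUp rate limiter, which ramps non-linearly.

```java
// Warm-up sketch: the permitted QPS ramps linearly from a cold-start
// fraction of the threshold up to the full threshold over warmUpSeconds.
public class WarmUpLimit {
    private final double maxQps;       // configured QPS threshold
    private final double coldFactor;   // e.g. 3 => start at maxQps / 3
    private final double warmUpSeconds;

    public WarmUpLimit(double maxQps, double coldFactor, double warmUpSeconds) {
        this.maxQps = maxQps;
        this.coldFactor = coldFactor;
        this.warmUpSeconds = warmUpSeconds;
    }

    /** Permitted QPS at 'elapsed' seconds since traffic began surging. */
    public double permittedQps(double elapsed) {
        double coldQps = maxQps / coldFactor;
        if (elapsed >= warmUpSeconds) {
            return maxQps; // fully warmed up
        }
        // Linear ramp from coldQps to maxQps.
        return coldQps + (maxQps - coldQps) * (elapsed / warmUpSeconds);
    }
}
```

With a 300 QPS threshold, a cold factor of 3, and a 10-second warm-up, the limit starts at 100 QPS, reaches 200 QPS at the halfway point, and arrives at the full 300 QPS once warm-up completes.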
Sentinel's degradation support is essentially based on the circuit breaker pattern, for example fusing on exception ratio: when the call volume reaches a certain level and the failure ratio reaches the configured threshold, the resource is automatically fused. All subsequent calls to that resource are then blocked until it is heuristically probed again after the specified time window.
Fourth, source code address
GitHub address: https://github.com/cicadasmile/husky-spring-cloud
GitEE address: https://gitee.com/cicadasmile/husky-spring-cloud
Recommended reading: Microservice Architecture Series