A simple introduction to Kubernetes Ingress

Posted May 26, 2020 (6 min read)

This article is reproduced from Rancher Labs

You may have noticed something strange. Although the Kubernetes Ingress API is still in beta, many companies already use it to expose Kubernetes services. Engineers working on related projects say that the Ingress API is increasingly likely to shed its beta label. In fact, it has been in beta for several years; to be precise, it entered this stage in the autumn of 2015. The lengthy beta phase has, however, given Kubernetes contributors time to refine the specification and align it with the software that already implements it (HAProxy, NGINX, Traefik, etc.), thereby standardizing the API around the most common and most requested features.

With GA of this feature approaching, now is a good time to help newcomers quickly understand how Ingress works. In short, an Ingress is a set of rules that maps out how services inside the cluster can bridge the gap and be exposed to the outside world, where clients can use them. Alongside it, a proxy called an Ingress controller listens at the edge of the cluster network (watching for rules to be added) and maps each service to a specific URL path or domain name for public use. While the Kubernetes maintainers develop the API, other open source projects implement Ingress controllers and add features unique to their own proxies.

In this article, I will introduce these concepts and help you understand the driving force behind the Ingress pattern.

Routing problem

When you create Pods in Kubernetes, you assign them a label that selectors can match. This is typically done in the Pod template of a Deployment manifest.
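The original listing is not available here, so the following is only a minimal sketch of such a Deployment manifest. The image name my-app, the replica count of three, and the label app: foo come from the text; the object name, container name, and container port are assumptions made for illustration.

```yaml
# Sketch of a Deployment that labels its Pods app: foo.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                # hypothetical object name
spec:
  replicas: 3              # three copies of the Pod
  selector:
    matchLabels:
      app: foo             # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: foo           # the label that Services will select on
    spec:
      containers:
      - name: my-app       # hypothetical container name
        image: my-app
        ports:
        - containerPort: 80   # assumed application port
```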

Such a Deployment might create three replicas of the container image my-app and give each one the label app=foo. Rather than being accessed directly, Pods are usually grouped under a Service, which makes them available at a single cluster IP address (but only from within the same cluster). The Service acts as an abstraction layer that hides the ephemeral nature of Pods, which can be scaled up or down or replaced at any time. It can also perform basic round-robin load balancing.

For example, a Service whose selector is app=foo collects all Pods that carry that label and routes traffic among them.
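A Service of this kind could be sketched as follows. Only the selector app: foo is taken from the text; the name foo-service is borrowed from the Ingress discussion later in the article, and the port numbers are assumptions.

```yaml
# Sketch of a cluster-internal Service (the default type is ClusterIP).
apiVersion: v1
kind: Service
metadata:
  name: foo-service        # name borrowed from later in the article
spec:
  selector:
    app: foo               # picks up every Pod labeled app: foo
  ports:
  - name: http
    protocol: TCP
    port: 80               # assumed port exposed on the cluster IP
    targetPort: 80         # assumed port the containers listen on
```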

However, this Service can only be reached from within the cluster, by other Pods running nearby. People who operate Kubernetes clusters have long had to work out how to give clients outside the cluster access. The problem surfaced early, and two mechanisms for handling it were built directly into the Service specification. When writing the Service manifest, you include a field called type and set its value to NodePort or LoadBalancer.
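Setting the type to NodePort might look like the sketch below. It is the same illustrative Service as before with one extra field; the name and port numbers remain assumptions.

```yaml
# Sketch of a NodePort Service.
apiVersion: v1
kind: Service
metadata:
  name: foo-service        # hypothetical name
spec:
  type: NodePort           # ask Kubernetes to open a port on every node
  selector:
    app: foo
  ports:
  - name: http
    protocol: TCP
    port: 80               # assumed
    targetPort: 80         # assumed
    # nodePort is left unset here, so Kubernetes picks one from its
    # NodePort range.
```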

NodePort Services are simple to use. Essentially, they ask the Kubernetes API to assign them a random TCP port and expose it on the nodes of the cluster. This is convenient because a client can target any node in the cluster on that port and its messages will be relayed to the right place. It is as if you could call any phone in the United States and whoever answered would make sure you were transferred to the right person.

The disadvantage is that the port value must fall between 30000 and 32767. Although this range safely avoids commonly used ports, it is clearly nonstandard compared to the familiar ports 80 for HTTP and 443 for HTTPS. The randomness itself is also an obstacle, because you don't know the value in advance, which makes configuring NAT and firewall rules more challenging, especially when a different random port is assigned to each service.

The other option is to set the type to LoadBalancer. This has a prerequisite, however: it only works if you are running in a cloud-hosted environment such as GKE or EKS and are willing to use that cloud provider's load balancer technology, which is selected and configured automatically. The disadvantage is cost, because each Service of this type starts a hosted load balancer with a new public IP address, and both incur additional charges.
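As a sketch, a LoadBalancer Service differs from the NodePort case only in its type field. The name and ports are again assumptions, and what the cloud provider provisions in response depends on the environment.

```yaml
# Sketch of a LoadBalancer Service.
apiVersion: v1
kind: Service
metadata:
  name: foo-service        # hypothetical name
spec:
  type: LoadBalancer       # request an external load balancer from the cloud
  selector:
    app: foo
  ports:
  - name: http
    protocol: TCP
    port: 80               # assumed
    targetPort: 80         # assumed
```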

Ingress routing

Assigning a random port or an external load balancer is easy to do, but it brings its own challenges. Defining many NodePort Services creates a tangle of random ports, and defining many LoadBalancer Services means paying for more cloud resources than you actually need. These mechanisms cannot be avoided entirely, but their use can be reduced, so that you allocate only one random port or one load balancer to expose many internal services. For that, the platform needed a new layer of abstraction that could consolidate many services behind a single entry point.

That is when the Kubernetes API introduced a new kind of manifest called Ingress, which offered a fresh take on the routing problem. It works like this: you write an Ingress manifest that declares how you want clients to be routed to a service. The manifest does not actually do anything on its own. You must deploy an Ingress Controller into your cluster to watch for these declarations and act on them.

Like any other application, Ingress controllers are Pods, so they are part of the cluster and can see other Pods. They are built on reverse proxies that have been established in the market for years, so you can choose among the HAProxy Ingress Controller, the NGINX Ingress Controller, and others. The underlying proxy provides layer 7 routing and load balancing. Different proxies bring their own feature sets to the table. For example, the HAProxy Ingress Controller does not need to reload as often as the NGINX Ingress Controller, because it allocates slots for servers and uses the Runtime API to fill those slots at runtime. This gives the Ingress Controller better performance.

The Ingress Controller itself sits inside the cluster and, like other Kubernetes Pods, is confined to the same walled-in "jail". You need to expose it to the outside through a Service of type NodePort or LoadBalancer. Now, however, you have only one entry point that all traffic passes through: one Service connects to one Ingress Controller, which in turn connects to many internal Pods. The controller can inspect HTTP requests and direct a client to the correct Pod based on characteristics it finds, such as the URL path or the domain name.

Consider an Ingress that defines how the URL path /foo should connect to a backend Service named foo-service, while the URL path /bar is directed to a Service named bar-service.

With such an Ingress, you still need to set up Services for your Pods, but you do not need to set the type field on those Services, because routing and load balancing are handled by the Ingress layer. The role of the Service is reduced to grouping Pods under a common name. In the end, the two paths, /foo and /bar, are served from one public IP address and domain name, such as example.com/foo and example.com/bar. Essentially, this is the API gateway pattern, in which a single address routes requests to multiple backend applications.
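The original listing is not available here. A minimal sketch of such an Ingress follows, written against networking.k8s.io/v1, the API version that became generally available after this article was written. Clusters of the article's era would use networking.k8s.io/v1beta1, which has a slightly different backend syntax. The paths and Service names come from the text; the object name, the host, and the port numbers are assumptions.

```yaml
# Sketch of an Ingress that routes by URL path.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress        # hypothetical object name
spec:
  rules:
  - host: example.com          # assumed host; omit to match any host
    http:
      paths:
      - path: /foo
        pathType: Prefix
        backend:
          service:
            name: foo-service
            port:
              number: 80       # assumed Service port
      - path: /bar
        pathType: Prefix
        backend:
          service:
            name: bar-service
            port:
              number: 80       # assumed Service port
```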

Add Ingress Controller

The declarative approach of the Ingress manifest lets you specify what you want without knowing how it will be fulfilled. Fulfilling it is one of the jobs of the Ingress Controller, which needs to watch for new Ingress rules and configure its underlying proxy with the corresponding routes.

You can install the HAProxy Ingress Controller with Helm, the Kubernetes package manager. First, install Helm by downloading the Helm binary and copying it to a folder listed in your PATH environment variable (e.g. /usr/local/bin/). Next, add the HAProxy Technologies Helm repository and use the helm install command to deploy the Ingress Controller.

Run the command kubectl get service to list all running services to verify that the Ingress Controller has been created.

The HAProxy Ingress Controller runs in a Pod in the cluster and uses a Service resource of type NodePort to publish access to external clients. In the output from the author's cluster, port 31704 was chosen for HTTP and port 32255 for HTTPS, and the HAProxy statistics page was available on port 30347. The HAProxy Ingress Controller provides detailed metrics about the traffic flowing through it, so you can better observe what enters the cluster.

Because the controller creates a Service of type NodePort, a random port is assigned, and the number tends to be high. But now you only have a few such ports to manage, namely the ones connected to the Ingress Controller, rather than one for each service. You can also configure the controller to use the LoadBalancer type, as long as you are running in the cloud.

Overall, there is not much to manage with an Ingress Controller. Once installed, it largely does its work in the background. You only need to define Ingress manifests, and the controller wires them up immediately. An Ingress manifest is defined separately from the Service it references, so you control when a service is exposed.
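One way to express the LoadBalancer setting is a Helm values file passed to helm install with the -f flag. The sketch below assumes the HAProxy Technologies chart exposes the Service type under controller.service.type; check the chart's own values file for your version before relying on it.

```yaml
# values.yaml (hypothetical file name) for the HAProxy Ingress Controller chart.
# Assumes the chart reads the Service type from controller.service.type.
controller:
  service:
    type: LoadBalancer     # use a cloud load balancer instead of NodePort
```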

Conclusion

Ingress resources consolidate how external clients reach services in a Kubernetes cluster by allowing API gateway-style traffic routing. Proxied services are reached through a common entry point. You use intent-driven YAML declarations to control when and how a service is exposed.

Once the Ingress API reaches GA, you will certainly see this pattern become more and more popular. Some subtle changes may still occur, mainly to align the API with features already implemented in existing controllers. Other improvements may guide how controllers continue to evolve to meet the vision of the Kubernetes maintainers. All in all, now is a good time to start using this feature!