Bare metal in the cloud native world

Posted May 25, 2020 · 9 min read

Guide: What comes to mind when you see the words "bare metal"? Whatever it is, you may be wrong: bare metal in the cloud native world is far more exciting than you think.

What do you think of when you hear the term "bare metal"? For me, it conjures very tangible images: rows of servers connected with colored cables, and loud fans spinning away, locked in a basement or warehouse.

As an ambassador for the Cloud Native Computing Foundation, I spend most of my time at the top of the stack, at "layer 7" (application) and "layer 8" (people). I'm used to using APIs and tools such as Terraform to get compute when I need it. A significant part of my career has been about convincing people to forget about servers.

So what does bare metal have to do with "cloud native"? And if you are new to this field, what do you need to know?

I asked my followers on Twitter to share their favorite bare metal tools, which produced a curated list: awesome bare metal. Read on for a comparison between bare metal and the cloud native world, along with some of the concepts involved and the challenges facing administrators.

Cloud Native Horizon

Deploying a piece of code can be as simple as calling a REST API. It hardly gets easier than this: go to your AWS dashboard, paste the code into a text box, and watch it be executed by AWS Lambda, with per-second billing and automatic scaling. The overall interface of Lambda functions has barely changed since launch, making it a relatively stable platform.

(Figure: the CPU count and system uptime reported by a hello-world Lambda function.)
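A hello-world function that reports CPU count and uptime could look roughly like the sketch below. It is not the code behind the figure. The name `handler` and the helper `read_uptime_seconds` are illustrative choices, and reading `/proc/uptime` assumes a Linux environment, so the helper returns `None` elsewhere.

```python
import os


def read_uptime_seconds(path="/proc/uptime"):
    """Return system uptime in seconds, or None where the file is unavailable."""
    try:
        with open(path) as f:
            return float(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def handler(event, context):
    """Entry point in the (event, context) shape used by Python Lambda functions."""
    return {
        "cpu_count": os.cpu_count(),
        "uptime_seconds": read_uptime_seconds(),
    }


if __name__ == "__main__":
    # Local smoke test; on Lambda the platform invokes handler() for you.
    print(handler({}, None))
```

Running it locally prints the values for your own machine, which makes the point of the section: the function has no idea what hardware it will land on.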

AWS Lambda is a closed-source SaaS product. If you are concerned about being locked in to a single vendor, you can consider Kubernetes instead. Unlike AWS Lambda, Kubernetes uses container images in Docker or OCI format to achieve portability between clouds. Once your code is packaged in an image and pushed to a registry, it can move between Kubernetes clusters with relative ease. If you prefer managed products that ease the maintenance burden, there are many managed Kubernetes services that can be set up quickly and easily.

One of the disadvantages of a platform like Kubernetes is that it changes very quickly. Each release may bring major changes, and you may have to rewrite the integrations you maintain, not to mention the third-party tools and code generation tools that change at a similar pace. As a result, managed offerings like Amazon EKS usually run several versions behind the community.

The most popular software of the Cloud Native Computing Foundation (CNCF) is concentrated at the top of the stack. Kubernetes was the foundation's first project, and many subsequent projects complement it at the operational or organizational level. Tools such as Prometheus and Alertmanager improve operational efficiency and let us monitor many services. NATS provides high-speed messaging across local networks and the Internet. Linkerd builds a mesh between containers to add metrics, policy and end-to-end encryption.

As a practitioner and maintainer of a popular serverless project, I am often told "there are still servers in serverless", as if I didn't know. For me, serverless has always been about developer focus: it is not about hardware specifications and network VLANs, but about APIs. Reasoning in terms of APIs is the cloud native way.

In a similar way, bare-metal servers are the foundation of Kubernetes and all cloud native applications, whether they are accessed directly, through a hypervisor, or through the API of an IaaS provider.

For many people, knowing that the servers "are there somewhere" is enough, unless you have specific needs. Some companies, such as Cherry Servers, AWS, and Packet, have struck a good balance between the isolation and performance of bare metal and the powerful APIs usually associated with VMs.

With bare metal, what you see is quite literally what you get. There are no confusing marketing terms like "serverless", just GB of RAM, Gbps of network bandwidth and GHz of CPU. That is refreshing, and it has not changed much since the beginning.

About 20 years ago, I worked with bare metal every day at school. I helped the network administrators by installing operating systems on i386, i486 and newer computers with Intel Pentium processors. Just like today, the machines of the time had hard drives, RAM and network cards.

We had about 5 laboratories, each with more than 30 computers. Sometimes this meant reinstalling the operating system from a CD-ROM, but sometimes it meant booting the computers over the network to deploy an image to a whole laboratory remotely. Compared to passing around a handful of CD-ROMs, this scaled better and took less time.

This is where bare metal begins: with an operating system. Once it is installed, you may not be able to tell the machine apart from an EC2 virtual machine running on AWS.

I mentioned that Kubernetes moves relatively fast. It turns out that hardware does not. We still boot systems over the network with the same tools and techniques used in that laboratory 20 years ago.

Bare Metal Glossary

Just as Kubernetes and the cloud have their own terminology, so does bare metal. I have compiled a quick glossary of concepts and tools:

  1. Network card: Physically connects the computer to the network using a cable, which can be copper or, in some cases, optical fiber. Some computers have multiple network cards or ports.

  2. Management port: This is a server-specific concept. To work efficiently, administrators need to manage computers remotely, without plugging in a keyboard and mouse.

  3. Intelligent Platform Management Interface (IPMI): The management interface is often vendor-specific and is accessed over the network, often through a Java-based client.

  4. Wake on LAN (WoL): Rather than offering full remote management, it allows a computer to be woken up remotely.

  5. PXE (Preboot Execution Environment): Used to boot the computer over the local network; only a network card is required. iPXE can be used to extend existing PXE firmware using TFTP, or it can be flashed directly to some network cards.

  6. iPXE: A newer open source network boot firmware, which also allows booting over HTTP and the Internet.

  7. Netboot: Booting from the network, so that you do not need physical access to the computer to configure it or install an operating system.

  8. DHCP: Assigns IP addresses and other metadata (such as the primary DNS server) to network interfaces.

  9. TFTP (Trivial File Transfer Protocol): A UDP-based file transfer protocol used to fetch firmware when booting over the network.

  10. NFS (Network File System): NFS is one of the most common file systems used for network booting or file sharing. It allows Linux computers to work without disks of their own. Unfortunately, NFS is not compatible with the overlay file system used by containers.

  11. iSCSI (Internet Small Computer System Interface): An alternative to NFS that provides block-level devices instead of a network file system. You can format the disk with a file system such as ext4 as needed, and even run Docker.

  12. Thin client: Projects like the Linux Terminal Server Project (LTSP) let you turn any PC into a thin client without any local storage. This may be useful for IoT devices like the Raspberry Pi, because such devices otherwise rely on flash memory, which has a limited lifespan.

  13. Operating system: Whether you deploy Windows, Linux or something else, the operating system usually has to be installed through an interactive UI, a CLI, or a predefined configuration.

Of course, not all bare metal is created equal. For example, consumer devices such as workstations, home PCs, Intel NUC or Raspberry Pi are unlikely to have IPMI management ports.
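Where there is no management port, Wake on LAN from the glossary above is often the only remote power control available. As a minimal sketch, a WoL "magic packet" is 6 bytes of 0xFF followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. The function names, the MAC address and the choice of port 9 below are illustrative assumptions, and the target machine must have WoL enabled in its firmware.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    if len(mac_bytes) != 6:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common convention)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


if __name__ == "__main__":
    # Placeholder MAC address, for illustration only.
    print(len(build_magic_packet("00:11:22:33:44:55")))
```

Calling `wake("00:11:22:33:44:55")` would send the packet on the local network; building and sending are kept separate so the packet layout can be checked without touching the network.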

Netboot example

This is an example workflow for booting a computer over the network (as shown below):

  1. The bare metal server is turned on.

  2. The network card attempts to boot through the network using PXE.

  3. A DHCP request is sent to obtain an IP address.

  4. A DHCP response is received with an IP address and a hint about where to find the boot firmware.

  5. The PXE process now gets the firmware from the TFTP server and loads it.

At that point, either a file system is mounted over NFS for netbooting and the operating system runs remotely, or a temporary environment is used to install the OS onto the local disk. On subsequent boots, the hard disk is used to load the operating system.

Some concepts, such as netbooting and network interfaces, overlap somewhat with virtual machines. The EC2 dashboard is the AWS equivalent of IPMI: there you can select a disk image (also known as an Amazon Machine Image, or AMI) to boot from and customize the behavior of the machine.
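Steps 3 to 5 of the workflow above can be sketched in a few lines. The DHCP response typically carries the TFTP server (option 66) and the boot file name (option 67), and the client then sends a TFTP read request, whose layout in RFC 1350 is a 2-byte opcode, the file name, a NUL byte, the transfer mode and another NUL byte. The `DhcpOffer` class is a simplified stand-in rather than a real DHCP parser, and the addresses and file name are hypothetical lab values.

```python
import struct
from dataclasses import dataclass

TFTP_OPCODE_RRQ = 1  # read request, per RFC 1350
TFTP_PORT = 69       # well-known TFTP port


@dataclass
class DhcpOffer:
    """Simplified stand-in for the fields a PXE client cares about."""
    ip_address: str     # address leased to the booting machine
    tftp_server: str    # where to fetch the boot firmware (DHCP option 66)
    boot_filename: str  # which file to fetch (DHCP option 67)


def build_tftp_read_request(filename: str, mode: str = "octet") -> bytes:
    """Encode a TFTP RRQ: opcode, filename, NUL, transfer mode, NUL."""
    return (
        struct.pack("!H", TFTP_OPCODE_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def next_boot_step(offer: DhcpOffer) -> tuple:
    """Return (server, port, payload) describing the firmware download request."""
    payload = build_tftp_read_request(offer.boot_filename)
    return (offer.tftp_server, TFTP_PORT, payload)


if __name__ == "__main__":
    # Hypothetical values for a lab network.
    offer = DhcpOffer("192.168.0.50", "192.168.0.2", "undionly.kpxe")
    print(next_boot_step(offer))
```

The sketch only builds the request; a real client would send it over UDP and then acknowledge each 512-byte data block it receives.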


Why do we need bare metal provisioning tools?

I recently asked my followers on Twitter what their favorite bare-metal provisioning tools were. There were many different answers, covering 5-10 projects, some newer and others more mature.

It turns out that although bare metal and low-level tools like DHCP, TFTP and NFS have not changed much in more than 20 years, people are still trying to make them easier to automate. Many data centers contain heterogeneous hardware: some machines have RAID arrays and some do not, some have one disk and some have two, and firmware and features vary.

Provisioning tools need to help us:

  1. Services: Provide software services (or servers) such as DHCP, TFTP, NFS, HTTP, etc.

  2. Inventory: Lists and collects servers and their capabilities.

  3. Image repository: A repository of OS images, ready to be deployed to computers over the network. These images usually need to be customized, so they can be built using tools such as Packer.

  4. Delivery: Chains the older tools together to create a secure way of installing the operating system. Some projects refer to this as a "workflow", while others use state machines.
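To make the inventory and delivery items above more concrete, here is a minimal sketch of an in-memory inventory whose servers move through a small state machine. It is not taken from any of the projects discussed here: the states, the `Server` fields and the `Inventory` class are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    DISCOVERED = "discovered"
    NETBOOTED = "netbooted"
    OS_INSTALLED = "os_installed"
    READY = "ready"


# Allowed transitions: each state may only advance to the next one.
TRANSITIONS = {
    State.DISCOVERED: State.NETBOOTED,
    State.NETBOOTED: State.OS_INSTALLED,
    State.OS_INSTALLED: State.READY,
}


@dataclass
class Server:
    mac: str
    disks: int
    has_raid: bool
    state: State = State.DISCOVERED


@dataclass
class Inventory:
    servers: dict = field(default_factory=dict)

    def register(self, server: Server) -> None:
        """Record a server, keyed by its MAC address."""
        self.servers[server.mac] = server

    def advance(self, mac: str) -> State:
        """Move one server to its next state, refusing to go past READY."""
        server = self.servers[mac]
        if server.state not in TRANSITIONS:
            raise ValueError(f"{mac} is already {server.state.value}")
        server.state = TRANSITIONS[server.state]
        return server.state

    def in_state(self, state: State) -> list:
        """List the MAC addresses of servers currently in the given state."""
        return [s.mac for s in self.servers.values() if s.state is state]


if __name__ == "__main__":
    inventory = Inventory()
    inventory.register(Server("aa:bb:cc:00:00:01", disks=2, has_raid=True))
    inventory.register(Server("aa:bb:cc:00:00:02", disks=1, has_raid=False))
    inventory.advance("aa:bb:cc:00:00:01")
    print(inventory.in_state(State.NETBOOTED))
```

Keeping the transitions in one table is what makes the workflow auditable: a machine cannot be marked ready without having passed through netboot and OS installation.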

You can see the results of that tweet in my awesome bare metal GitHub repository, which covers bare metal provisioning software as well as tools that simplify low-level networking (such as MetalLB and inlets).

Here is a brief introduction to some of the projects mentioned by the community:

  1. Digital Rebar: A data center automation, provisioning and infrastructure-as-code (IaC) platform with a cloud native architecture, which can replace Cobbler, Foreman, MAAS or similar technologies.

  2. MAAS: Canonical's "Metal as a Service".

  3. Ironic: A service for managing and provisioning bare metal servers, from the OpenStack Foundation.

  4. A method to boot various operating system installers or utilities from one place in the BIOS via PXE, without having to retrieve the media to run the tool.

  5. Plunder: A single-binary server, designed to make provisioning servers, platforms and applications easier.

  6. Tinkerbell: A bare-metal provisioning engine, built and maintained with love by the Packet team.

These popular tools help us automate the workflow we saw above: PXE, DHCP, TFTP, NFS, disk preparation, and operating system installation. After installation a Kubernetes cluster can be started, but these tools are intended for more general use. Once the operating system is installed, management tools such as SSH, Ansible, Puppet, or Chef are usually used to manage the computer and its software packages.

Many of the newer tools shared by the community build on these tools with the specific aim of creating a Kubernetes cluster, which brings us full circle between bare metal and Kubernetes. Examples include:

  1. MetalK8s: Scality states that "The release of MetalK8s is to make it easier to run Kubernetes (k8s) on bare metal servers that require persistent storage."

  2. Metal Stack: "We believe that Kubernetes runs best on bare metal. We built an API for managing bare metal hardware and Kubernetes on this basis."

  3. Metal³: "Bare metal host provisioning for Kubernetes". It supports Cluster API and builds on Ironic.

Some of the Kubernetes tools mentioned go a step further and include an abstraction called Cluster API (CAPI). The goal of CAPI is to turn a set of VMs or servers into a functional Kubernetes cluster.

Cluster API is a Kubernetes project that brings declarative, Kubernetes-style APIs to cluster creation, configuration, and management. It provides optional, additive functionality on top of core Kubernetes to manage the lifecycle of a Kubernetes cluster.
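"Declarative" here means that you describe the cluster you want and a controller works out the steps to get there. The sketch below illustrates that reconciliation idea in general terms. It is not Cluster API's actual API or code, and `ClusterSpec`, `reconcile` and the worker naming scheme are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state, as declared by the user."""
    name: str
    worker_count: int


def reconcile(spec: ClusterSpec, actual_workers: list) -> list:
    """Return the actions needed to move the actual state towards the spec."""
    missing = spec.worker_count - len(actual_workers)
    if missing > 0:
        # Too few machines: provision new ones, continuing the numbering.
        start = len(actual_workers)
        return [
            ("provision", f"{spec.name}-worker-{start + i}")
            for i in range(missing)
        ]
    if missing < 0:
        # Too many machines: release the surplus from the end of the list.
        surplus = actual_workers[spec.worker_count:]
        return [("deprovision", name) for name in surplus]
    return []  # actual state already matches the declared state


if __name__ == "__main__":
    spec = ClusterSpec(name="demo", worker_count=3)
    print(reconcile(spec, ["demo-worker-0"]))
```

A real controller would run this comparison repeatedly and carry out the actions against bare metal or an IaaS API; the sketch only computes the difference between declared and observed state.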

Future availability

Modern cloud computing offers a spectrum of control and portability. At one end we have AWS Lambda, a highly proprietary, closed-source SaaS product that gives you very little control but is highly practical and efficient. Moving along, we have Kubernetes, which brings the superpower of horizontal scaling, but at a certain price.

Finally, we have bare-metal servers that require professional tools for careful management. I personally think the intersection of the two is interesting, which is why I am excited about tools like Metal³ and Metal Stack, which are designed to smooth the experience from heterogeneous bare metal hardware to Kubernetes.

I am also closely watching the Tinkerbell and Plunder projects. Plunder takes a holistic approach, bundling as much as possible into a single binary with a simple workflow engine, in effect turning bare metal into a cloud service. Tinkerbell is a set of microservices, based on Packet's work over the past 6 years, and designed with security and heterogeneous hardware in mind. I like how Tinkerbell defines each step in a workflow as a Docker image: you write a Dockerfile, build the image and store it in a registry, which gives you portability, version control, and reproducible artifacts.