Kubernetes is the tip of the spear among tools for handling problems of scale. Lovingly called K8s, it embodies the lessons of years of building web applications at scale.
To understand how K8s is the answer, and what exactly the questions were, we need to go back to the beginning of web services.
Age of Machines
In the beginning, web services were deployed as code or executables hanging off a web server, the most famous being Apache and Tomcat. The assumption was that you had a physical server with all your code installed on it; when things changed, you redeployed only your application binary or code and restarted the web server.
To handle heavy load, companies would lease thousands of these physical servers from service providers. Unfortunately, not all web services used the same stack, so the base machines varied widely. For example, a Windows-based application could not run on a Linux box, and vice versa. So how would a service provider know how many boxes of each hardware type to hold?
Enter VMs
The next step of evolution began with the invention of virtual machines (VMs). This software allowed users to run whichever operating system they wanted on whichever hardware they had, by providing a layer that mimicked the hardware a binary expected to run on.
Executables for different hardware are organized differently and use different instructions, so they do not work out of the box on other machines. VMs solve this problem by acting like the machine the executable was written for and translating its instructions to the actual hardware of the box.
Service providers loved it. Now they could run anything on any box and buy standard hardware for all of them. The forerunner in this technology was VMware.
Containing restart time with Containers
The biggest problem with VMs was that they were slow to restart. To start, a VM had to boot up just like an operating system running directly on hardware. Also, as there was only one VM running on a box at a time, idle CPU cycles were wasted whenever the VM was over-provisioned for the traffic load.
The million-dollar realization was that, in essence, an application server runs one process: the application process. All other processes were either unnecessary or only played a supporting part in running the application.
By bundling together only the parts of the operating system required for an executable to run, it became possible for a container engine to run these bundles on any platform. Furthermore, as these bundles were essentially binaries, they started almost immediately. The bundles were called “container images”, and when they were run they were called “containers”. The most famous containerization platform is Docker, but LXC (Linux Containers) is also still available.
Containment Breach
With containers, things began looking good for web companies.
- As containers were very light compared to VMs, a server could host a number of containers and use its resources, such as CPU and RAM, more effectively.
- The super-fast boot time of containers made it easy to create more containers when more load was seen, and auto-scaling became the default behaviour.
- While containers were less resilient to resource hogging than VMs, they still had enough safeguards in place to handle the odd rogue container that stepped out of its limits.
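To make the auto-scaling point concrete, here is a minimal sketch of how a scaler might decide how many containers to run. It assumes a simple proportional rule: scale the container count by the ratio of observed load to target load. The function name, the percentage-based inputs, and the min/max bounds are all hypothetical and only for illustration; real autoscalers add smoothing and cooldown periods on top of this.

```python
import math


def desired_replicas(current_replicas: int,
                     cpu_percent_per_replica: int,
                     target_cpu_percent: int,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Return how many containers to run so each sits near the target load.

    All names and defaults here are illustrative assumptions.
    """
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    # Proportional rule: grow or shrink the count by observed / target load.
    wanted = math.ceil(
        current_replicas * cpu_percent_per_replica / target_cpu_percent
    )
    # Clamp so a spike or a quiet period cannot push us out of bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Four containers running at 90% CPU each, aiming for 60% each.
    print(desired_replicas(4, 90, 60))  # prints 6
```

Because containers start almost immediately, acting on the answer is cheap: going from four to six containers is a matter of seconds, not a reboot.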
Like all solutions, containers came with their own set of problems. How do you keep track of dying containers? What about logging? If a machine running containers fails, how do we make sure the application can still run? How do we manage the disk space of containers? How do we make sure that containers are secure?
Kubernetes
Kubernetes helps manage container systems. It is not the only container management system (there are others, such as Docker Swarm), but it is arguably the most famous.
Kubernetes is designed to manage containers at scale. The system that a Kubernetes instance manages is called a Kubernetes (or kube) cluster. Physically, it is a group of machines or VMs connected by a network.
You provide a set of machines or VMs along with their hardware specifications, and Kubernetes handles deploying the containers you want to run and managing their lifecycle. Beyond this, Kubernetes provides many features that make building resilient, fault-tolerant systems a breeze.
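One way to picture "managing the lifecycle" is as a loop that keeps comparing what you asked for with what is actually running, and fixing the difference. The sketch below is a toy model of that idea, not Kubernetes code: the Node class, the reconcile function, and the single abstract "CPU unit" per container are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A machine or VM in the cluster (hypothetical model, not a Kubernetes type)."""
    name: str
    cpu_capacity: int                      # abstract CPU units the node offers
    healthy: bool = True
    containers: list = field(default_factory=list)

    def free_cpu(self, cpu_per_container: int) -> int:
        # Assumes every container uses the same number of CPU units.
        return self.cpu_capacity - len(self.containers) * cpu_per_container


def reconcile(nodes, app, desired, cpu_per_container=1):
    """Move the cluster toward `desired` running copies of `app`."""
    # Containers on a failed node are lost, so forget them.
    for node in nodes:
        if not node.healthy:
            node.containers.clear()

    healthy = [n for n in nodes if n.healthy]
    running = sum(n.containers.count(app) for n in healthy)

    # Too few copies: place each missing one on the node with the most free CPU.
    for _ in range(desired - running):
        candidates = [n for n in healthy
                      if n.free_cpu(cpu_per_container) >= cpu_per_container]
        if not candidates:
            break  # cluster is full; a real system would report this
        target = max(candidates, key=lambda n: n.free_cpu(cpu_per_container))
        target.containers.append(app)

    # Too many copies: remove the surplus.
    surplus = running - desired
    for node in healthy:
        while surplus > 0 and app in node.containers:
            node.containers.remove(app)
            surplus -= 1

    return {n.name: list(n.containers) for n in nodes}


if __name__ == "__main__":
    cluster = [Node("a", cpu_capacity=4), Node("b", cpu_capacity=4)]
    print(reconcile(cluster, "web", desired=3))
    cluster[0].healthy = False             # simulate a machine failure
    print(reconcile(cluster, "web", desired=3))
```

In this toy run, the second call notices that node "a" has failed and starts replacement copies on node "b", which answers the earlier question of how the application keeps running when a machine dies. Running the same function again with nothing changed does nothing, which is what makes a loop like this safe to repeat forever.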