All the code for this chapter (and the other chapters) is available at https://github.com/param108/kubernetes101 in the directory 012.
In the last post we created a simple web application that returned a string when you hit its /ping endpoint. In this post, we create a pod with 2 containers: a web application and a postgres db.
Web Application
The web application for this chapter is available in the directory 012/service/web. The code is written in golang and has the following features:
- POST /write accepts a json object
  { "data": "one line of data" }
  and writes it to the database
- GET /read returns all the lines stored in the database
- GET /quit kills the application
- It expects a postgres db to be available to read and write from. It waits until the postgres db is up before it starts serving traffic.
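To make the endpoints concrete, here is a minimal sketch of what such a server could look like in golang. This is only an illustration, not the actual code in 012/service/web: the lines table and the lib/pq driver are assumptions, and error handling is kept to a minimum.
package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"net/http"
	"os"

	_ "github.com/lib/pq" // postgres driver (an assumption; the real code may use another)
)

func main() {
	// Build the connection string from the env variables set in the pod spec.
	dsn := fmt.Sprintf("host=%s user=%s password=%s dbname=%s sslmode=disable",
		os.Getenv("db_host"), os.Getenv("db_user"), os.Getenv("db_pass"), os.Getenv("db_name"))
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		panic(err)
	}

	http.HandleFunc("/write", func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			Data string `json:"data"`
		}
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// "lines" is an assumed table name; the real app would create it on startup.
		if _, err := db.Exec("INSERT INTO lines(data) VALUES($1)", req.Data); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Write([]byte(`{"success": "true"}`))
	})

	// GET /read would query the lines table and return all rows as JSON (omitted here).

	http.HandleFunc("/quit", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("<html>ok</html>"))
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // make sure the response reaches the client before we die
		}
		os.Exit(1) // simulate a crash
	})

	http.ListenAndServe(":8080", nil)
}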
Setup
As before, we first need to point our shell at minikube's docker daemon.
$ eval $(minikube -p minikube docker-env)
To build the image for the web application, go to the directory 012/service and run
$ make docker
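If you want to confirm that the image landed in minikube's docker daemon (and not your host's), list the images from the same shell in which you ran the eval above:
$ docker images | grep web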
There is a pod config called webpod.yml in the same directory. Apply that configuration to create the pod.
$ kubectl apply -f webpod.yml
This is what you should see if everything is working correctly:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
web2 2/2 Running 0 3s
The Pod Spec
apiVersion: "v1"
kind: Pod
metadata:
  name: web2
  labels:
    app: web
    version: v1
    role: backend
spec:
  containers:
  - name: web
    image: web:latest
    imagePullPolicy: Never
    command: ["/web"]
    args: []
    ports:
    - containerPort: 8080
      protocol: TCP
      hostPort: 8080
    env:
    - name: db_name
      value: web
    - name: db_user
      value: web
    - name: db_pass
      value: web
    - name: db_host
      value: 127.0.0.1
  - name: db
    image: postgres:9.6
    env:
    - name: POSTGRES_DB
      value: web
    - name: POSTGRES_USER
      value: web
    - name: POSTGRES_PASSWORD
      value: web
The web part of the spec is pretty much the same as in the previous post, apart from the database credentials being passed as env variables. These variables will be set as environment variables in the container.
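You can confirm that kubernetes injected them by describing the pod and looking at the Environment section listed under the web container:
$ kubectl describe pod web2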
The db is the additional section under containers. POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD are required by the container to set up the db. We don't specify an imagePullPolicy for the db because we want kubernetes to pull it from the public dockerhub.
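To check that the postgres container actually created the database with those credentials, you can exec into it (psql ships in the official postgres image):
$ kubectl exec web2 -c db -- psql -U web -d web -c 'select 1'
If the db is up, this should print a single row.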
Hiding in plain sight
The first question that comes to mind is: why is the web application talking to the database on localhost (127.0.0.1)?
The reason is that containers in a pod share the same network namespace and port space.
Postgres listens on port 5432 while the app listens on 8080, just as if they were running on the same machine.
What else is shared by containers in a pod?
- Network namespace (same ip address and ports)
- IPC namespace (shared memory, semaphores, message queues, etc.)
- Shared volumes (we will learn about volumes later).
In our case we are using the shared Network namespace for communication.
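You can see the shared network namespace in action from the db container's side: nothing inside the postgres container listens on 8080, yet the web application is reachable there on localhost. The official postgres image is debian based and ships bash, so something like this should work:
$ kubectl exec web2 -c db -- bash -c '(echo > /dev/tcp/127.0.0.1/8080) && echo web reachable on localhost'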
Starting up order
Ideally, we would want the postgres database to be up before the web application. Our pod spec does not guarantee this. There are ways to control startup order (init containers are one example); we may take that up later.
For now, the web app waits in code for postgres to come up. This is NOT best practice, but it works for this non-production workload.
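In code, that wait is essentially a retry loop around a ping. A minimal sketch of the idea (it would slot into the earlier sketch in place of the plain sql.Open call; log and time would also need to be imported, and the real code in 012/service/web may do this differently):
// waitForDB keeps pinging postgres until it answers, so the web app
// survives being started before the db container is ready.
func waitForDB(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn) // only validates the DSN, does not connect yet
	if err != nil {
		return nil, err
	}
	for i := 0; i < 30; i++ {
		if err = db.Ping(); err == nil {
			return db, nil
		}
		log.Printf("db not ready yet: %v", err)
		time.Sleep(2 * time.Second)
	}
	return nil, fmt.Errorf("database never came up: %v", err)
}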
Life Cycle
What happens if we kill the web application? Does the whole pod restart? Let's see.
Let's begin by creating the pod.
$ kubectl apply -f webpod.yml
pod/web2 created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
web2 2/2 Running 0 3s
Do you see that 2/2 under READY? This corresponds to the 2 containers in the pod. Let's test the various endpoints.
$ curl 192.168.99.100:8080/write -d '{"data":"firstline"}'
{"success": "true"}
$ curl 192.168.99.100:8080/write -d '{"data":"secondline"}'
{"success": "true"}
$ curl 192.168.99.100:8080/write -d '{"data":"thirdline"}'
{"success": "true"}
$ curl 192.168.99.100:8080/read
{"success":"true",
"Lines":[{"id":1,"data":"firstline"},
{"id":2,"data":"secondline"},
{"id":3,"data":"thirdline"}]
}
Die Web container!
So the data about the lines is persisted in the db.
What happens if the application crashes? The /quit endpoint simulates an application crash.
$ curl 192.168.99.100:8080/quit
<html>ok</html>
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
web2 1/2 Running 0 105m
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
web2 2/2 Running 1 105m
$ curl 192.168.99.100:8080/read
{"success":"true",
"Lines":[{"id":1,"data":"firstline"},
{"id":2,"data":"secondline"},
{"id":3,"data":"thirdline"}]
}
So even though the container restarted, the data still persisted as the database container didn’t die.
Wipe out
Let's delete the pod and recreate it.
$ kubectl delete pods/web2
pod "web2" deleted
$ kubectl apply -f webpod.yml
pod/web2 created
$ curl 192.168.99.100:8080/read
{"success":"true","Lines":[]}
So we lost all the data!!!!
Architectural thoughts
It's impossible to scale this pod horizontally, as the database would be replicated along with the web application. Vertical scaling is still possible.
By editing the pod spec, you could add more web containers, but managing that spec would get tougher with each container you add. There would be a lot of repeated configuration, and updating the web application later would be a pain.
There is a kind of high availability going on: when one container dies, it is restarted. However, if the postgres container dies, its data is lost. The fact that we need to think about HA separately for postgres is a good reason to deploy it separately as well.
Updating the application container is going to be complicated and will definitely involve downtime.
As the containers are part of the same pod, they are always scheduled on the same node. A single node may not have enough resources for every container in the pod, even if each container individually would fit on some node in the cluster. That lack of headroom on any one node makes the whole pod unschedulable.
All in all, this is fine for a college project, but not ready for production.
Learnings
- Containers in a pod share things, which lets them talk to each other efficiently.
- Keeping application pods separate from database pods is a good idea, as these pods have different needs and scale differently.
- Killing one container in a pod does not restart the others; only the container that died is resurrected. It does increment the pod's RESTARTS count though!
- Boot-up order of containers in a pod is not deterministic; containers boot in parallel.
- All containers in a pod are scheduled on the same node. This can be a problem when resources become strained.
Conclusion
While pods give us the ability to run separate containers together and provide an efficient way for those containers to communicate, this may not be the best way of setting up your services. Separating your stateless components (application servers) from your stateful components (databases, redis, etc.) will help you manage each better. We will continue exploring good ways to set this up in the next posts.