All the code for this chapter (and other chapters) is available at https://github.com/param108/kubernetes101, in the directory 019.
Volumes are, essentially, disk space added to a container for it to use. To understand Volumes better, let's look a little closer at Docker volumes.
Docker volumes
Let's create an image with a volume.
Create a volume
Let's create the volume first.
docker volume create new-vol1
This creates the volume we are looking for. Physically, it is just a directory allocated on our host machine.
$ docker inspect new-vol1
[
{
"CreatedAt": "2020-04-10T12:23:58+05:30",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/new-vol1/_data",
"Name": "new-vol1",
"Options": {},
"Scope": "local"
}
]
So our volume is at /var/lib/docker/volumes/new-vol1/_data
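If you are curious, you can peek at that directory directly. This is a quick check that assumes Docker is running natively on a Linux host; on macOS or Windows the path lives inside Docker's VM, so it won't work as-is.
$ sudo ls /var/lib/docker/volumes/new-vol1/_data
It should be empty for now; files will appear once a container writes to the volume.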
Create a docker image that uses a volume
In the file /019/docker/Dockerfile
in the code repository there is a very simple Docker definition. It is based on Ubuntu, with an infinite loop as its main command. It declares a volume to be mounted at /v1.
FROM ubuntu:18.04
EXPOSE 8080
RUN mkdir /v1
VOLUME /v1
CMD /bin/bash -c 'while [ "a" = "a" ]; do echo "Hello"; sleep 10; done'
Build the docker image using docker build . -t plainubuntu
Create the container and bind the volume to it
We use the --mount option of the docker run command to bind the volume to the created container.
$ docker run -d --mount 'target=/v1,source=new-vol1' plainubuntu:latest
29eeb648732769ed580d3a47a9860eed8c28b02085cda2b45b6a477f9675a003
$ docker run -d --mount 'target=/v1,source=new-vol1' plainubuntu:latest
2d8f85a7dff4bd43dcb15e5e01752ef181207f1ecc8054884127f80a1b763ed7
As you can see, I have created two separate containers and mounted the same volume into both.
The mounts point to the same filesystem at the end of the day. This means that what one container writes, the other can read. More importantly, if we delete a container, the volume does not get deleted, and a newly created container will be able to see the old data.
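You can check the sharing yourself with docker exec. A quick sketch, using the container IDs from the runs above (yours will differ, and shared.txt is just a throwaway file name):
$ docker exec 29eeb648 /bin/bash -c 'echo "written by the first container" > /v1/shared.txt'
$ docker exec 2d8f85a7 cat /v1/shared.txt
written by the first container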
One use case for this is database files. If the database container dies, we still retain the data. Let's see how volumes work in Kubernetes and use them to create our first resilient Postgres.
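As a rough sketch of that idea (the volume name pgdata, the password and the image tag here are only illustrative; /var/lib/postgresql/data is where the official postgres image keeps its data files):
$ docker volume create pgdata
$ docker run -d --mount 'target=/var/lib/postgresql/data,source=pgdata' -e POSTGRES_PASSWORD=secret postgres:12
If this container dies, starting a new one with the same --mount picks up the existing database files.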
Volumes on Kubernetes
While volumes in Kubernetes build on the same idea as Docker volumes, their properties are slightly different.
In Docker, the volume survived even after the container died or was removed. In Kubernetes, however, the volume is deleted as soon as the Pod is deleted.
Let's try this out in minikube.
$ eval $(minikube -p minikube docker-env)
$ cd 019/docker
$ docker build . -t plainubuntu
Now let's create a pod with two containers, both using the Docker image above and the same mounts.
apiVersion: "v1"
kind: Pod
metadata:
  name: webgroup
  labels:
    app: web
    version: v1
    role: backend
spec:
  containers:
  - name: web1
    image: plainubuntu:latest
    imagePullPolicy: Never
    volumeMounts:
    - mountPath: /v1
      name: test-volume
  - name: web2
    image: plainubuntu:latest
    imagePullPolicy: Never
    volumeMounts:
    - mountPath: /v1
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      path: /data
Shared Volume
Here we can see that we have mounted test-volume on both containers. The test-volume is actually the directory /data on the host machine. Let's create the pod and test a few things.
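The manifest is saved as dualpod.yml (the same file name we apply later when recreating the pod), so creating it is a single command:
$ kubectl apply -f dualpod.yml
pod/webgroup created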
$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# ls /v1
root@webgroup:/# cd /v1
root@webgroup:/v1# echo "hello from web1" >> chat
root@webgroup:/v1# exit
$ kubectl exec -it webgroup --container web2 /bin/bash
root@webgroup:/#
root@webgroup:/#
root@webgroup:/# cat /v1/chat
hello from web1
root@webgroup:/# echo "hello from web2" >> /v1/chat
root@webgroup:/# exit
exit
$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# cat /v1/chat
hello from web1
hello from web2
Container Death
Now let's kill one container and see if the data is still available.
$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# killall5
command terminated with exit code 137
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
webgroup 2/2 Running 1 15m
$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# cat /v1/chat
hello from web1
hello from web2
Data persists!
Pod Death
Let's kill the pod and recreate it.
$ kubectl delete pods/webgroup
pod "webgroup" deleted
$ kubectl apply -f dualpod.yml
pod/webgroup created
$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# cat /v1/chat
hello from web1
hello from web2
root@webgroup:/# exit
exit
What? The data is still there.
The reason the data is still there is that the configuration says:
volumes:
- name: test-volume
  hostPath:
    path: /data
This volume has been defined as a hostPath, which means the data resides on the node that the pod is scheduled on.
As expected, the volume was indeed deleted, but the data on the host's disk remained. The next time we created the pod, a new volume was created, pointing to the same path. As a result, the data was preserved.
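You can see this for yourself by looking at the node directly. With the single-node minikube setup used here, the "node" is the minikube VM (or container, depending on your driver), so something like this should show the file sitting on the host path:
$ minikube ssh
$ cat /data/chat
hello from web1
hello from web2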
Types of volumes
hostPath is one type of volume; there are many others. You can see the full list in the Kubernetes documentation. Each type has its own behaviour and configuration.
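For example, if we replaced the hostPath in our manifest with an emptyDir, the volume really would live and die with the pod: it starts out empty and is removed for good when the pod is deleted. A minimal sketch of just the volumes section:
  volumes:
  - name: test-volume
    emptyDir: {}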
Learnings
Volumes are disk storage provided externally for a container.
Kubernetes supports many different types of Volumes and each has its own behaviour and characteristics.
The life of a volume is the life of the pod.
If the underlying technology allows persistence, you may be able to get your data back.
Volumes can be mounted on multiple containers in a pod, which can also be used to share data.
Conclusion
The most important feature provided by Volumes is that storage can be handled outside a pod.
What we saw here was direct use of Volumes in the pod. The problem with this approach is that it ties the pod to the technology of the Volume. In truth, the Pod just wants some disk space of a particular size, and Kubernetes should provide whatever Volume it has to fulfil that requirement.
This is exactly what PersistentVolume and PersistentVolumeClaim are used for. We will take a look at this in the next post.
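As a small taste of what is coming (treat this only as a sketch; the claim name pg-data is made up, and the details are covered in the next post), a PersistentVolumeClaim simply asks for storage by size and access mode, without naming any particular volume technology:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi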