All the code for this chapter (and the other chapters) is available at https://github.com/param108/kubernetes101; check the directory 019.

Volumes are, essentially, disk spaces that are attached to a container for it to use. To understand Volumes better, let's look a little closer at Docker volumes.

Docker volumes

Let's create an image with a volume.

Create a volume

Let's create the volume first:

docker volume create new-vol1

This creates the volume we are looking for. Physically, it is just a directory allocated on our host machine.

$ docker inspect new-vol1
[
    {
        "CreatedAt": "2020-04-10T12:23:58+05:30",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/new-vol1/_data",
        "Name": "new-vol1",
        "Options": {},
        "Scope": "local"
    }
]

So our volume is at /var/lib/docker/volumes/new-vol1/_data
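If you need that mountpoint in a script, docker volume inspect can print just the field with --format '{{ .Mountpoint }}'. As a portable sketch of the same extraction, here is the field pulled out of the inspect JSON with grep and sed; the inline JSON below stands in for the real docker inspect output, so this runs without Docker:

```shell
# Stand-in for the `docker inspect new-vol1` output shown above.
json='{"Mountpoint": "/var/lib/docker/volumes/new-vol1/_data"}'

# Find the Mountpoint key, then strip everything except the quoted value.
mountpoint=$(printf '%s' "$json" | grep -o '"Mountpoint": "[^"]*"' | sed 's/.*": "//; s/"$//')
echo "$mountpoint"
```

This prints /var/lib/docker/volumes/new-vol1/_data, ready to be used in a backup or cleanup script.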

Create a docker image that uses a volume

In the file /019/docker/Dockerfile in the code repository there is a very simple Docker definition. It is based on Ubuntu, with an infinite loop as its main command. It declares /v1 as a volume mount point.

FROM ubuntu:18.04

EXPOSE 8080
RUN mkdir /v1
VOLUME /v1
CMD /bin/bash -c 'while [ "a" = "a" ]; do echo "Hello"; sleep 10; done'

Build the Docker image using docker build . -t plainubuntu.

Create the container and bind the volume to it

We use the --mount option of the docker run command to bind the volume to the created container.

$ docker run -d --mount 'target=/v1,source=new-vol1' plainubuntu:latest
29eeb648732769ed580d3a47a9860eed8c28b02085cda2b45b6a477f9675a003

$ docker run -d --mount 'target=/v1,source=new-vol1' plainubuntu:latest
2d8f85a7dff4bd43dcb15e5e01752ef181207f1ecc8054884127f80a1b763ed7

As you can see, I have created 2 separate containers and mounted the same volume into both.

Both mounts talk to the same filesystem at the end of the day, so what one container writes, the other can read. More importantly, the volume does not get deleted when we delete a container; a newly created container will be able to see the old data.
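Because both mounts resolve to the same host directory, the sharing and persistence can be sketched at the filesystem level alone. In the sketch below the temporary directory stands in for the volume's _data directory and each subshell stands in for a container; no Docker is needed:

```shell
# Stand-in for /var/lib/docker/volumes/new-vol1/_data.
vol=$(mktemp -d)

# "Container 1" writes into its mount of the volume...
( echo "written by container 1" >> "$vol/chat" )

# ..."container 2" mounts the same volume, sees the write, and adds its own.
( cat "$vol/chat"; echo "written by container 2" >> "$vol/chat" )

# Both "containers" are long gone (the subshells exited), yet a brand-new
# "container" still sees everything, because the directory outlives them.
( cat "$vol/chat" )

rm -rf "$vol"
```

The final cat shows both lines: the data belongs to the volume, not to any container.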

One use case for this is database files: if the database container dies, we still retain the data. Let's see how volumes work in Kubernetes and use them to create our first resilient Postgres.
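As a sketch of that database use case (the image tag and password below are illustrative, not from the chapter's code), a Compose file would attach a named volume to the Postgres data directory, so removing and recreating the container keeps the data:

```yaml
# Illustrative sketch: a named volume keeps Postgres data across
# container removal and re-creation.
services:
  db:
    image: postgres:12
    environment:
      POSTGRES_PASSWORD: example   # illustrative; use a real secret
    volumes:
      - pgdata:/var/lib/postgresql/data   # named volume survives the container

volumes:
  pgdata: {}
```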

Volumes on Kubernetes

While volumes on Kubernetes utilize the Docker volume functionality, their properties are slightly different.

In Docker, the volume survived even after the container died or was removed. In Kubernetes, however, the volume is deleted as soon as the pod is deleted.

Let's try this out in minikube.

$ eval $(minikube -p minikube docker-env)
$ cd 019/docker
$ docker build . -t plainubuntu

Now let's create a pod with 2 containers, both using the Docker image above and the same mounts.

apiVersion: "v1"
kind: Pod
metadata:
  name: webgroup
  labels:
    app: web
    version: v1
    role: backend
spec:
  containers:
  - name: web1
    image: plainubuntu:latest
    imagePullPolicy: Never
    volumeMounts:
      - mountPath: /v1
        name: test-volume
  - name: web2
    image: plainubuntu:latest
    imagePullPolicy: Never
    volumeMounts:
      - mountPath: /v1
        name: test-volume
  volumes:
    - name: test-volume
      hostPath:
        path: /data

Shared Volume

Here we can see that we have mounted test-volume in both containers. test-volume is actually the directory /data on the host machine. Save the manifest as dualpod.yml (we will use this name again later), create the pod with kubectl apply -f dualpod.yml, and let's test a few things.

$ kubectl exec -it webgroup --container web1 /bin/bash                                                                         
root@webgroup:/# ls /v1                                                                                                                                                                  
root@webgroup:/# cd /v1                                                                                                                                                                  
root@webgroup:/v1# echo "hello from web1" >> chat                                                                                                                                        
root@webgroup:/v1# exit                                                                                                                                                                  
                                                                                                                                                                                     
$ kubectl exec -it webgroup --container web2 /bin/bash                                                                                                                                                                                                                                                                  
root@webgroup:/#                                                                                                                                                                         
root@webgroup:/#                                                                                                                                                                         
root@webgroup:/# cat /v1/chat                                                                                                                                                            
hello from web1                                                                                                                                                                          
root@webgroup:/# echo "hello from web2" >> /v1/chat                                                                                                                                                                                                                   
root@webgroup:/# echo "hello from web2" >> /v1/chat                                                                                                                                      
root@webgroup:/# exit                                                                                                                                                                    
exit                                           
                                                                                                                                          
$ kubectl exec -it webgroup --container web1 /bin/bash                                                                         
root@webgroup:/# cat /v1/chat                                                                                                                                                            
hello from web1                                                                                                                                                                          
hello from web2                                                                                                                                                                          

Container Death

Now let's kill one container and see if the data is still available.

$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# killall5                                                                                                                                                                
command terminated with exit code 137                                                                                                                                                    

$ kubectl get pods                                                                                                             
NAME       READY   STATUS    RESTARTS   AGE                                                                                                                                              
webgroup   2/2     Running   1          15m                                                                                                                                              

$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# cat /v1/chat
hello from web1
hello from web2

Data persists!

Pod Death

Let's kill the pod and recreate it.

$ kubectl delete pods/webgroup
pod "webgroup" deleted

$ kubectl apply -f dualpod.yml
pod/webgroup created

$ kubectl exec -it webgroup --container web1 /bin/bash
root@webgroup:/# cat /v1/chat
hello from web1
hello from web2
root@webgroup:/# exit
exit

What? The data is still there.

The data is still there because the configuration says:

  volumes:
    - name: test-volume
      hostPath: 
        path: /data

This volume has been defined as a hostPath which means the data resides on the node that the pod is scheduled on.

As expected, the volume itself was indeed deleted, but the data on the host's disk remained. The next time we created the pod, a new volume was created pointing at the same path, and so the data was preserved.
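If you want the data to actually disappear with the pod, the manifest can use an emptyDir volume instead of hostPath. This is a sketch of just the volumes stanza of the pod spec above, with everything else unchanged:

```yaml
# An emptyDir volume starts empty when the pod is scheduled and is
# deleted for good when the pod is removed from the node.
  volumes:
    - name: test-volume
      emptyDir: {}
```

With this stanza, repeating the delete-and-recreate experiment above would come back with an empty /v1.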

Types of volumes

hostPath is just one type of volume; there are many. You can see the full list in the Kubernetes documentation. Each type has its own behaviour and configuration.

Learnings

Volumes are disk storage provided externally for a container.

Kubernetes supports many different types of Volumes and each has its own behaviour and characteristics.

The life of a volume is the life of its pod.

If the underlying technology allows persistence you may be able to get back your data.

A volume can be mounted in multiple containers of a pod; this can also be used to share data between them.

Conclusion

The most important feature provided by Volumes is that storage can be handled outside a pod.

What we saw here was direct use of Volumes in the pod. The problem with this approach is that it ties the pod to the technology of the Volume. In truth, the pod just wants some disk space of a particular size, and Kubernetes should provide whatever Volume it has that fulfills that requirement.

This is exactly what PersistentVolume and PersistentVolumeClaim are used for. We will take a look at this in the next post.