All the code for this chapter (and other chapters) is available at https://github.com/param108/kubernetes101 in the directory 017.
Continuing with the theme of scheduling in Kubernetes, let's look a little more deeply at ways to influence the scheduling algorithm.
Whenever we create a pod, we can specify the resources it will need. We also have the flexibility to say whether this is a preference or a hard limit. Let's see how.
Resource Requests
Consider these to be the minimum requirements for your pod to be scheduled by Kubernetes. If Kubernetes does not see this much capacity free on a node (due to pods already scheduled there), it will choose not to schedule your pod on that node.
Resource requests are configured on a per-container basis.
Which resources?
The two main resources that we are looking at are:
- CPU – measured in CPU units, where 1 CPU unit is 1 vCPU/core. Fractional values are allowed.
- RAM or memory – measured in bytes, usually written with suffixes like Mi or Gi.
Other resources do exist, but we won't talk about them here. Further, you can add custom resources as well.
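As a quick illustration of the units (the container name and image below are placeholders, not from the chapter's repo), this snippet asks for a quarter of a CPU and 64 MiB of memory; "250m" and "0.25" mean the same thing:
containers:
- name: app            # hypothetical container
  image: nginx         # placeholder image
  resources:
    requests:
      cpu: "250m"      # 250 millicores = 0.25 of a CPU core
      memory: "64Mi"   # 64 mebibytes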
Choosing Request Sizes
In the diagrams below, the green boxes are the requests and the red shaded regions are the actual usage at a particular point in time.
First (box on the left, below), let's choose requests larger than the peak. The dark green lines are the requests and the dark red is the current utilization. In the diagram we have requested 100% but are probably using less than 50%.
The problem here is wastage: for nearly the whole day, the nodes are under-utilized.
Second (box on the right, above), let's choose requests equal to the average for the whole day. Here, the area surrounded by the dark green border is only 70% of the total. At peak, though, the CPU usage of each pod crosses the requested amount and the utilization on the node hits 100%. This can lead to pods being evicted, and possibly to the node misbehaving because it can't handle the extra load.
Resource requests are not hard limits; pods may use more resources than they request.
The Right Way
There is obviously no single right way that works for all cases, but the following method is better than most.
The first thing to do is to spread your load across a number of pods; lots of small pods should be the intent.
Next, set your requests to the peak of this new, averaged-out per-pod load.
Finally, when the load drops, you can consider deleting a few pods, remembering to scale up again before the load returns. Auto-scaling, which we will cover in a future post, will be key here. A rough sketch of this approach follows.
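As a sketch (the Deployment name, image and numbers here are hypothetical, not part of the chapter's repo), "many small pods" is just a Deployment with several replicas, each with modest requests:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6                  # many small pods rather than one large one
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx           # placeholder image
        resources:
          requests:
            cpu: "250m"        # peak of the averaged-out per-pod load
            memory: "128Mi"
Scaling down for the night is then just a change to replicas, whether done by hand or by an autoscaler.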
No auto-scaling
Even without auto-scaling, it is still better to have numerous smaller pods. This gives Kubernetes more leeway to pack the nodes efficiently.
Effects of exceeding Resource Requests
- If a Container exceeds its memory request, it is likely that its Pod will be evicted whenever the node runs out of memory.
Resource Limits
Resource limits are hard caps on resource usage. If a container exceeds its resource limit, it may be stopped and its pod evicted (a small demo follows the list below). Specifically:
- If a Container exceeds its memory limit, it might be terminated. If it is restartable, the kubelet will restart it, as with any other type of runtime failure.
- A Container might or might not be allowed to exceed its CPU limit for extended periods of time. However, it will not be killed for excessive CPU usage.
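To see the memory-limit behaviour in action, you can run a pod that deliberately allocates more memory than its limit. The sketch below uses the polinux/stress image that the Kubernetes documentation uses for this purpose; the pod name, image and arguments are assumptions and are not part of directory 017:
apiVersion: v1
kind: Pod
metadata:
  name: memory-hog            # hypothetical pod, not in the repo
spec:
  containers:
  - name: stress
    image: polinux/stress     # image assumed from the Kubernetes docs example
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]   # allocate ~150M
    resources:
      limits:
        memory: "100Mi"       # less than what the container tries to allocate
The container should be OOM-killed and restarted repeatedly, and kubectl describe pod memory-hog should show the reason for the last termination.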
Setup
Set up the cluster as in previous posts, with 4 worker nodes. The config kind.yml is in directory 017.
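If you want to write it yourself, a kind config for one control-plane node and four workers looks roughly like this (a sketch of the kind config format, not necessarily byte-for-byte what is in the repo):
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
- role: worker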
Configuration
You can configure limits and requests in your PodSpec or its equivalent (e.g. the template in a ReplicaSet). They are configured on each container, and the limit/request for a pod as a whole is the sum of the limits and requests of its constituent containers.
You can configure the limits and requests for a container as below:
spec:
  containers:
  - name: db
    image: postgres
    env:
    - name: POSTGRES_PASSWORD
      value: "password"
    resources:
      requests:
        memory: "32Mi"
        cpu: "125m"
      limits:
        memory: "64Mi"
        cpu: "250m"
Node resources
To see a node's allocatable resources, you can use the command:
$ kubectl get nodes/realkind-worker -o json | jq ".status.allocatable"
{
"cpu": "4",
"ephemeral-storage": "229646500Ki",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "16212588Ki",
"pods": "110"
}
Please download jq; it is an amazing help when working with JSON output!
The above command picks out the .status.allocatable section from the JSON object returned by kubectl.
As you can see, in kind the allocatable memory is about 16 GiB and the allocatable CPU is 4.
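Another useful view is kubectl describe, which also shows how much of the node's capacity is already reserved by requests:
$ kubectl describe node realkind-worker
Look for the Capacity, Allocatable and Allocated resources sections in its output.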
Scheduling 1 pod per node
Let's go back to our ubuntuPod from a previous chapter. You can find the code in 017/ubuntuPod.yml in the code repository. We will try to get Kubernetes to schedule only one pod per node, relying only on resource requests.
apiVersion: "v1"
kind: Pod
metadata:
name: ubuntu-pod
labels:
app: ubuntu-pod
version: v5
role: backend
spec:
containers:
- name: ubuntu-container
image: ubuntu
command: ["/bin/bash"]
args: ["-c", "while [ \"a\" = \"a\" ]; do echo \"Hi\"; sleep 5; done" ]
resources:
requests:
cpu: 3
As you can see above, I have added a CPU resource request of 3 for the ubuntu pod. Since each node has only 4 allocatable CPUs, two such pods cannot fit on the same node. The files 017/ubuntuPod1.yml … 017/ubuntuPod3.yml have the same config with different names.
Apply
$ kubectl apply -f ubuntuPod.yml
pod/ubuntu-pod created
$ kubectl apply -f ubuntuPod1.yml
pod/ubuntu-pod-1 created
$ kubectl apply -f ubuntuPod2.yml
pod/ubuntu-pod-2 created
$ kubectl apply -f ubuntuPod3.yml
pod/ubuntu-pod-3 created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
ubuntu-pod 0/1 ContainerCreating 0 16s
ubuntu-pod-1 0/1 ContainerCreating 0 12s
ubuntu-pod-2 0/1 ContainerCreating 0 9s
ubuntu-pod-3 0/1 ContainerCreating 0 5s
$ kubectl get pods -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ubuntu-pod 0/1 ContainerCreating 0 22s <none> realkind-worker <none> <none>
ubuntu-pod-1 0/1 ContainerCreating 0 18s <none> realkind-worker4 <none> <none>
ubuntu-pod-2 0/1 ContainerCreating 0 15s <none> realkind-worker2 <none> <none>
ubuntu-pod-3 0/1 ContainerCreating 0 11s <none> realkind-worker3 <none> <none>
So each Pod is on a separate node.
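You can verify that most of each node's CPU is now reserved (the grep simply trims the output; the exact numbers will vary a little because of system pods):
$ kubectl describe node realkind-worker | grep -A 8 "Allocated resources"
The cpu requests row should show roughly 3 of the node's 4 allocatable CPUs already accounted for.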
What happens if we try to create one more pod?
$ kubectl apply -f ubuntuPod4.yml
pod/ubuntu-pod-4 created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
ubuntu-pod 1/1 Running 0 3m39s
ubuntu-pod-1 1/1 Running 0 3m35s
ubuntu-pod-2 1/1 Running 0 3m32s
ubuntu-pod-3 1/1 Running 0 3m28s
ubuntu-pod-4 0/1 Pending 0 12s
Unfortunately our pod is still Pending; let's figure out why!
Events
One of the most useful commands for debugging pods stuck in the Pending state is:
$ kubectl get events --field-selector involvedObject.kind=Pod,involvedObject.name=ubuntu-pod-4
LAST SEEN TYPE REASON OBJECT MESSAGE
0s Warning FailedScheduling pod/ubuntu-pod-4 0/5 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 4 Insufficient cpu.
This says the pod ubuntu-pod-4 could not be scheduled because one node “had taints that the pod didn't tolerate”. That is most likely the control-plane node. The second part says “4 Insufficient cpu”, which means the other 4 nodes didn't have enough free CPU.
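The same information is also available at the bottom of kubectl describe, which is often the quickest place to look:
$ kubectl describe pod ubuntu-pod-4
The Events section at the end of its output contains the same FailedScheduling message.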
Learnings
Setting resource requests and limits correctly will keep your nodes from failing under increased load.
Auto-scaling will allow you to scale up and scale down infrastructure to optimize costs.
kubectl get events will help you debug scheduling failures. Numerous small pods will give Kubernetes the flexibility it needs.
None of this is mandatory, but then neither is brushing your teeth or maintaining hygiene.
Conclusion
This is the last post on Scheduling until a new feature comes around which gives us even more control.
Understanding how scheduling works, and the variables Kubernetes uses to make a scheduling decision, becomes more and more important as your infrastructure grows. As pressure on your cluster grows, an organized way of allocating resources will allow you to use your cluster's resources optimally and also debug contention issues when they crop up.
In the next post we will look at how traffic can enter a Kubernetes cluster.