Kubernetes QoS (Quality of Service) Classes Explained

In Kubernetes, Quality of Service (QoS) classes are a vital mechanism to manage resources efficiently and prioritize pods during scheduling and resource contention. This blog explores QoS classes, their characteristics, examples, and how Kubernetes handles resource allocation.


QoS Classes Overview

Kubernetes defines three QoS classes based on the resource requests and limits specified in a pod’s configuration.


1. Best Effort

  • Characteristics:

    • No CPU or memory requests or limits are specified in the pod's YAML file.

    • Kubernetes tries to provide resources but doesn’t guarantee any minimum allocation.

  • Priority: Lowest.

    • Best Effort pods are the first to be evicted during resource contention.
  • Use Case:

    • For non-critical workloads where performance is not a concern (e.g., log processing or testing).
  • Analogy: A person carrying a heavy load without assurance of help—unreliable and low priority.

Example:

apiVersion: v1
kind: Pod
metadata:
  name: best-effort-pod
spec:
  containers:
  - name: nginx
    image: nginx

Here, no resource requests or limits are defined, making this pod Best Effort.
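You can verify which class Kubernetes assigned by reading the pod's `status.qosClass` field. This assumes a running cluster with the pod above already applied:

```shell
# Print the QoS class the cluster assigned to the pod.
# Assumes the best-effort-pod above has been applied to a running cluster.
kubectl get pod best-effort-pod -o jsonpath='{.status.qosClass}'
# For a pod with no requests or limits, this prints: BestEffort
```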


2. Burstable

  • Characteristics:

    • At least one container sets a CPU or memory request or limit, but the pod does not meet the Guaranteed criteria: typically limits are higher than requests, or only requests are specified.

    • The requested resources are reserved for the pod when it is scheduled.

    • The pod can use up to its limits when additional resources are available on the node.

  • Priority: Medium.

    • Burstable pods are evicted after Best Effort pods but before Guaranteed pods.
  • Use Case:

    • Applications that can run on minimal resources but occasionally require more during peak times (e.g., dynamic APIs).
  • Analogy: A rollercoaster—starts with minimal resources but ramps up during high demand.

Example:

apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: redis
    image: redis
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "1000m"

In this example:

  • The pod is guaranteed 128Mi memory and 250m CPU at startup.

  • It can "burst" up to 512Mi memory and 1000m CPU if needed.


3. Guaranteed

  • Characteristics:

    • Every container in the pod sets CPU and memory requests equal to its limits.

    • The requested resources are reserved when the pod is scheduled, ensuring stable performance.

  • Priority: Highest.

    • Guaranteed pods are evicted last during resource contention.
  • Use Case:

    • Mission-critical applications requiring consistent performance (e.g., databases).
  • Analogy: Cruising in a stable car with predictable performance.

Example:

apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: postgres
    image: postgres
    resources:
      requests:
        memory: "512Mi"
        cpu: "1000m"
      limits:
        memory: "512Mi"
        cpu: "1000m"

In this example:

  • The pod is guaranteed 512Mi memory and 1000m CPU at startup.

  • This pod is of the Guaranteed QoS class because requests and limits are identical. If you specify only limits, Kubernetes defaults the requests to the same values, which also yields Guaranteed.


How Kubernetes Handles Resource Contention

  • Eviction Priority:

    • Best Effort → Burstable → Guaranteed.

    • During node stress, Kubernetes evicts pods starting with the lowest priority class to free up resources.

  • Resource Scheduling and Enforcement:

    • Requests: The Kubernetes scheduler ensures a node has the minimum required resources before scheduling a pod.

    • Limits: Enforced by the kubelet, which manages the upper bounds of resource usage.
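To see this bookkeeping in practice, you can ask a node for the totals the scheduler has reserved. This assumes a running cluster; `worker-1` is a placeholder node name:

```shell
# Show the per-node sum of container requests and limits that the
# scheduler uses for placement decisions. "worker-1" is a placeholder;
# substitute one of your own node names (see: kubectl get nodes).
kubectl describe node worker-1 | grep -A 8 'Allocated resources'
```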


Handling Resource Overload

1. Memory Overload

  • When a pod exceeds its memory limit, the OOM (Out of Memory) Killer terminates the container.

  • The kubelet restarts the container in place (per the pod's restart policy), so the pod keeps its IP address.

Example YAML for Memory Stress:

apiVersion: v1
kind: Pod
metadata:
  name: memory-crash-pod
spec:
  containers:
  - name: stress
    image: polinux/stress
    command: ["stress"]
    args:
    - --vm
    - "1"
    - --vm-bytes
    - "600M"
    - --vm-hang
    - "1"
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"

  • This pod tries to allocate 600M of memory, exceeding its 512Mi limit. The OOM Killer terminates the container, and it exits with code 137 (128 + SIGKILL).

  • To identify OOM logs on the node:

      journalctl | grep -i oom
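Exit code 137 is not OOM-specific: it is the generic exit code for any process killed by SIGKILL (128 + signal number 9), which is the signal the OOM Killer sends. A quick local sketch:

```shell
# 137 = 128 + 9: the exit code of a process terminated by SIGKILL,
# the same signal the OOM Killer uses on a container's process.
bash -c 'kill -9 $$'
echo "exit code: $?"   # prints: exit code: 137
```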
    

2. CPU Overload

  • If CPU usage exceeds its limit, CPU throttling occurs. The container doesn’t crash but operates at a reduced speed.
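Throttling comes from the Linux CFS bandwidth controller: the kubelet converts the millicore limit into a runtime quota per scheduling period (100ms by default). A rough sketch of the arithmetic, assuming the default period:

```shell
# Convert a CPU limit in millicores to a CFS quota, assuming the
# default 100000-microsecond period. A 1000m limit maps to a full
# period's worth of CPU time; once the quota is spent, the container
# is throttled until the next period begins.
limit_millicores=1000
period_us=100000
quota_us=$(( limit_millicores * period_us / 1000 ))
echo "quota: ${quota_us}us per ${period_us}us period"
```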

Monitoring Resource Usage

To monitor node and pod resource usage, the Metrics Server must be installed.

Install Metrics Server:

Use this configuration to set it up:
Metrics Server YAML

Monitor Metrics:

  • For nodes:

      kubectl top nodes
    
  • For pods:

      kubectl top po
    
  • Use watch to view live metrics:

      watch kubectl top po
    

Practical Example: Simulating Resource Contention

Let’s simulate a scenario where memory contention occurs:

  1. Create a pod with the memory-stress YAML shown above (memory-crash-pod).

  2. Trigger Memory Stress:
    The container exceeds its memory limit and is OOM-killed, exiting with code 137.

  3. Verify Logs:

    • Check the logs of the node for OOM events:

        journalctl | grep -i oom
      


Conclusion

Understanding Kubernetes QoS classes is essential for resource management and workload prioritization. By categorizing workloads into Best Effort, Burstable, and Guaranteed, Kubernetes ensures efficient use of resources while maintaining performance for critical applications. Use tools like the Metrics Server to monitor and optimize cluster resource utilization.

Now it’s your turn! Experiment with these examples and simulate different scenarios to deepen your understanding of Kubernetes QoS.

DevOps Journey with M Hassan