Kubernetes Scheduler: Understanding How It Works with Examples

The Kubernetes Scheduler is a critical component of the Kubernetes control plane. It is responsible for determining which node in the cluster will run a newly created pod. Let’s explore how the Kubernetes scheduler works, its two-step process, and the different ways to influence scheduling decisions. Examples are provided for clarity.
How Does the Kubernetes Scheduler Work?
The scheduler operates in the following sequence when scheduling pods:
Request Flow:
- A user or controller issues a request (e.g., using kubectl) to the API server.
- The API server updates the state in etcd, marking the pod's status as Pending.

Scheduler's Role:
- The scheduler watches for pods with the Pending status and takes responsibility for assigning them to a node.
Binding:
- Once the best node is identified, the pod is bound to it, changing the pod's state from Pending to Running.

Two-Step Scheduling Process: The scheduler follows two steps to select the best node:
1. Filtering Nodes: Eliminates nodes that do not meet the pod's requirements (e.g., resource requests, hardware specifications, or taints).
2. Ranking Nodes: Scores the remaining nodes based on available resources and policies, selecting the most suitable node.


Example of Scheduling
Scenario:
We have a cluster with seven nodes (A to G), and a new pod that requests a GPU. Only nodes B, D, and G have GPUs.
Steps:
Filtering:
- Nodes A, C, E, and F are removed because they lack GPUs.
- Remaining nodes: B, D, G.
Ranking:
- Node B: GPU utilization = 50%.
- Node D: GPU utilization = 100%.
- Node G: GPU utilization = 0%.
The scheduler selects Node G because it has the most available resources (100% of its GPU capacity free).
Binding:
- The pod is bound to Node G, changing its status to Running.
If no node satisfies the requirements, the pod remains in the Pending state until a suitable node becomes available.
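The GPU scenario above corresponds to a pod that requests an extended resource. A minimal sketch, assuming a device plugin (such as NVIDIA's) exposes the nvidia.com/gpu resource; pod name and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: cuda-container
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: 1   # filtering removes every node that cannot provide one GPU
```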

Can We Influence Scheduling Decisions?
Yes! Kubernetes provides several ways to influence the scheduler’s behavior:

1. Node Name:
You can specify a particular node for a pod using the nodeName field in the pod specification.
Example:
```yaml
spec:
  nodeName: node-1
```
When you describe the pod, the events will show that no scheduling occurred because the pod was directly assigned to a specific node.
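In context, a complete manifest using this field might look like the following (pod, node, and image names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-direct
spec:
  nodeName: node-1        # bypasses the scheduler entirely
  containers:
  - name: nginx
    image: nginx:1.25
```

Note that nodeName skips filtering and ranking altogether, so the pod can fail to start if the named node lacks the required resources.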
2. Node Selector:
A Node Selector lets you bind pods to nodes with specific labels. This is a simple one-condition matching mechanism.
Steps:
1. Label nodes with specific attributes (e.g., cpu=i7 or disk=ssd):

```shell
kubectl label nodes node-1 disk=ssd
```

2. Use the label in your pod definition:

```yaml
spec:
  nodeSelector:
    disk: ssd
```
Mistakes:
If you apply the wrong label, you can fix it using --overwrite:

```shell
kubectl label nodes node-1 disk=hdd --overwrite
```
Difference Between -l and -L:
- -l: Filters resources by a label selector (e.g., kubectl get nodes -l disk=ssd).
- -L: Adds a column showing the value of the given label for each resource (e.g., kubectl get nodes -L disk).
3. Node Affinity and Anti-Affinity:
Node Affinity supports multi-condition matching with logical operators (AND/OR), providing more flexibility than Node Selectors.
- Node Affinity: Attracts pods to nodes whose labels match the given conditions.
- Node Anti-Affinity: Steers pods away from certain nodes; it is expressed with negative operators such as NotIn or DoesNotExist.
Example:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disk
          operator: In
          values:
          - ssd
```
In this example, the pod will only be scheduled on nodes with the label disk=ssd.
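Affinity rules can also be soft preferences rather than hard requirements. A sketch using preferredDuringSchedulingIgnoredDuringExecution (the weight value is illustrative; weights range from 1 to 100):

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80                  # higher weight = stronger preference during ranking
      preference:
        matchExpressions:
        - key: disk
          operator: In
          values:
          - ssd
```

With a preferred rule, nodes without disk=ssd are still eligible; they simply score lower in the ranking step.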
4. Pod Affinity and Anti-Affinity:
Pod Affinity: Ensures pods are scheduled on the same node as other pods with matching labels.
Pod Anti-Affinity: Prevents pods from being scheduled on the same node as pods with matching labels.
Example:
```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          env: dev
      topologyKey: "kubernetes.io/hostname"
```

Here, the pod will only be scheduled onto a node (topologyKey: kubernetes.io/hostname) that is already running pods labeled env=dev.
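The inverse, podAntiAffinity, is commonly used to spread replicas of the same application across nodes for redundancy (the app=web label is illustrative):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web
      topologyKey: "kubernetes.io/hostname"   # no two app=web pods share a node
```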
5. Taints and Tolerations:
Taints: Used to repel pods from a node.
Tolerations: Allow pods to tolerate a node’s taint.
Example of Taint:

```shell
kubectl taint nodes node-1 key=value:NoSchedule
```
Example of Toleration:
```yaml
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"
```
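A toleration can also match a tainted key regardless of its value by using the Exists operator (the dedicated key is illustrative):

```yaml
tolerations:
- key: "dedicated"
  operator: "Exists"   # matches the taint whatever its value
  effect: "NoSchedule"
```

Keep in mind that a toleration only allows scheduling onto a tainted node; it does not attract the pod to it. Combine taints with node affinity when a workload must land on dedicated nodes.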
Conclusion
The Kubernetes scheduler is a powerful tool that ensures optimal placement of pods across nodes. By understanding and using features like nodeName, Node Selectors, Affinities, and Taints, you can have fine-grained control over where your pods run. Whether you want to co-locate pods for efficiency or separate them for redundancy, Kubernetes provides the flexibility to meet your needs.




