Scheduling Explainer

The Scheduling tab on a Pod detail view answers a single question: which nodes and Karpenter NodePools could this pod land on, and what rules out the others? It is a one-shot simulation, not a watch, so the output is a snapshot from the moment you opened the tab. Refresh to re-run it.

Open it from any Pod by pressing S, clicking the Scheduling tab, or following a deep link. The page is also useful on pods that have already landed: when spec.nodeName is set, the simulation explains why the scheduler picked that node.

What the Explainer Computes

When you open the tab, Buoy fetches the target pod, lists every node, lists every pod cluster-wide (for affinity evaluation), lists Karpenter NodePools and NodeClaims when present, and lists namespaces. A progress strip at the top of the tab shows which phase is running.

For each node, the explainer evaluates the pod’s constraints and produces a list of RuleVerdict records. A verdict carries:

  • A rule source: nodeSelector, nodeAffinityRequired, nodeAffinityPreferred, podAffinityRequired, podAffinityPreferred, podAntiAffinityRequired, podAntiAffinityPreferred, taint, nodeCondition, unschedulable, resources, podCount, topologySpread, hostPortConflict, schedulingGates, schedulerName, boundNode, and the Karpenter-specific nodePoolRequirements, nodePoolTaints, nodePoolLimits, nodePoolWeight, nodePoolNodeClass.
  • An outcome: pass, fail, match, nomatch, or info.
  • An optional weight for preferred-affinity terms.
  • A human-readable message, optionally with a YAML snippet of the failing constraint and references to other pods (for affinity rules).
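As a rough illustration of the record described above, a verdict could be modeled like this. The class layout, field names, and the `is_blocked` helper are assumptions made for this sketch, not Buoy's actual schema; only the rule sources and outcome values come from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional

# Outcome values as listed in the text above.
Outcome = Literal["pass", "fail", "match", "nomatch", "info"]


@dataclass
class RuleVerdict:
    """Hypothetical shape of a verdict; field names are illustrative."""
    source: str                         # e.g. "nodeSelector", "taint", "resources"
    outcome: Outcome
    message: str                        # human-readable explanation
    weight: Optional[int] = None        # only set for preferred-affinity terms
    yaml_snippet: Optional[str] = None  # failing constraint, when available
    pod_refs: List[str] = field(default_factory=list)  # pods referenced by affinity rules


def is_blocked(verdicts: List[RuleVerdict]) -> bool:
    # Assumption: a single "fail" verdict is enough to block a node.
    return any(v.outcome == "fail" for v in verdicts)


verdicts = [
    RuleVerdict("nodeSelector", "pass", "all selector labels present"),
    RuleVerdict("taint", "fail", "untolerated taint dedicated=gpu:NoSchedule"),
    RuleVerdict("nodeAffinityPreferred", "match", "preferred term matched", weight=50),
]
print(is_blocked(verdicts))
```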

Each Karpenter NodePool is evaluated the same way, with verdicts derived from requirement intersection, taint coverage, limits, and an instance-type-fit heuristic.
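Requirement intersection can be sketched as a check of each pinned pod label against the pool's requirement list. This is a deliberately simplified model, not Buoy's or Karpenter's implementation: it handles only the In, NotIn, Exists, and DoesNotExist operators, and it assumes that a key the pool does not mention is compatible. The function names are hypothetical.

```python
def pool_allows(requirements, key, value):
    """Return True when the pool's requirements permit the label key=value.

    `requirements` is a list of dicts with "key", "operator", and "values",
    mirroring the shape of a NodePool requirement entry.
    """
    for req in requirements:
        if req["key"] != key:
            continue
        op = req["operator"]
        values = req.get("values", [])
        if op == "In" and value not in values:
            return False
        if op == "NotIn" and value in values:
            return False
        if op == "DoesNotExist":
            return False
        # "Exists" accepts any value for the key.
    # Assumption: keys the pool does not constrain are treated as compatible.
    return True


def selector_fits_pool(node_selector, requirements):
    """Check every label pinned by the pod's nodeSelector against the pool."""
    return all(pool_allows(requirements, k, v) for k, v in node_selector.items())


pool = [
    {"key": "topology.kubernetes.io/zone", "operator": "In",
     "values": ["us-east-1a", "us-east-1b"]},
    {"key": "kubernetes.io/arch", "operator": "NotIn", "values": ["arm64"]},
]
print(selector_fits_pool({"topology.kubernetes.io/zone": "us-east-1a"}, pool))
print(selector_fits_pool({"kubernetes.io/arch": "arm64"}, pool))
```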

Sub-Tabs

Parameters

The pod’s constraints, organized by category, with a chip strip across the top summarizing what was parsed. Each section expands to show the underlying YAML and a per-term node match count once the simulation completes.

Section | Contents
Resources | CPU, memory, ephemeral storage, and extended resources (GPUs, hugepages) from spec.containers[*].resources.requests. Includes a “capacity available on N of M nodes” badge.
Node Affinity | spec.nodeSelector, plus required and preferred nodeAffinity terms with per-term match counts and weights.
Pod (Anti-)Affinity | Required and preferred pod affinity and anti-affinity, with topology keys, label selectors, and namespace selectors.
Tolerations | Every spec.tolerations entry, paired with the count of tainted nodes it covers.
Topology Spread | Each spec.topologySpreadConstraints entry: maxSkew, topologyKey, whenUnsatisfiable, minDomains.
Host Ports | Container hostPorts and their protocols.

A copy button on each section emits the constraint as YAML so you can paste it into another pod spec.
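The per-toleration node count in the Tolerations row can be approximated with the standard Kubernetes toleration-matching rules: an empty effect matches any effect, an empty key with operator Exists matches any taint, and Equal is the default operator. In this sketch, "covers" is assumed to mean that the toleration matches at least one taint on the node; the helper names are hypothetical.

```python
def tolerates(toleration, taint):
    """Return True when a single toleration matches a single taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    key = toleration.get("key", "")
    op = toleration.get("operator", "Equal")
    if not key:
        # An empty key is only valid with Exists and matches every taint key.
        return op == "Exists"
    if key != taint.get("key"):
        return False
    if op == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def nodes_covered(toleration, taints_by_node):
    """Count nodes carrying at least one taint that the toleration matches."""
    return sum(
        1
        for taints in taints_by_node.values()
        if any(tolerates(toleration, t) for t in taints)
    )


taints_by_node = {
    "node-a": [{"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}],
    "node-b": [{"key": "dedicated", "value": "batch", "effect": "NoSchedule"}],
    "node-c": [],  # untainted nodes are never counted
}
equal_tol = {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}
exists_tol = {"key": "dedicated", "operator": "Exists"}
print(nodes_covered(equal_tol, taints_by_node), nodes_covered(exists_tol, taints_by_node))
```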

Existing Nodes

A sortable list of nodes with their fit verdict. Each node row shows a status pill (preferred, eligible, tolerated, blocked), the score (for preferred matches), key labels (zone, instance type, NodePool, arch, OS), and a stacked CPU and memory bar.

The bars use three colors. Gray is existing usage from pods already on the node. Blue is this pod’s request, shown when it fits in the remaining capacity. Hatched red is overflow: the pod’s request exceeds what is left.
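The three segments reduce to simple arithmetic on a node's allocatable capacity. The following sketch assumes the red segment represents the amount by which the request exceeds the remaining capacity; the function name and the millicore units are illustrative choices, not taken from Buoy.

```python
def bar_segments(allocatable, used, request):
    """Split one resource bar into gray (usage), blue (fits), and red (overflow).

    All values share one unit, for example CPU millicores.
    """
    remaining = max(allocatable - used, 0)
    if request <= remaining:
        # The request fits: show it in blue on top of existing usage.
        return {"gray": used, "blue": request, "red": 0}
    # The request does not fit: show the shortfall as overflow.
    return {"gray": used, "blue": 0, "red": request - remaining}


# A 4000m node with 2500m in use can take a 1000m request.
print(bar_segments(4000, 2500, 1000))
# The same request overflows by 500m once usage reaches 3500m.
print(bar_segments(4000, 3500, 1000))
```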

Click a row to expand the full list of rule verdicts. Each verdict line uses an icon (check, x, star, dot, i) and a message. Verdicts that carry a YAML snippet expose a button to expand it; verdicts that reference other pods show the first three with a “+N more” affordance.

Grouping

The grouping controls live above the table.

Group key | Bucket by
Flat | No grouping.
Pool | karpenter.sh/nodepool label.
Zone | topology.kubernetes.io/zone label.
Type | node.kubernetes.io/instance-type label.
Status | Fit status (preferred, eligible, tolerated, blocked).
Reason | First failing rule plus message.

The default is Flat when the cluster has up to 50 nodes. With more than 50, Buoy picks the first applicable grouping among Pool, Zone, and Status.
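The default selection can be expressed as a short fallback chain. The text does not define "applicable", so this sketch assumes a grouping applies when at least one node carries the corresponding label; the function name and node shape are hypothetical.

```python
POOL_LABEL = "karpenter.sh/nodepool"
ZONE_LABEL = "topology.kubernetes.io/zone"


def default_grouping(nodes, flat_limit=50):
    """Pick the initial grouping for the Existing Nodes table."""
    if len(nodes) <= flat_limit:
        return "Flat"
    # Assumption: a label-based grouping applies if any node has the label.
    if any(POOL_LABEL in n["labels"] for n in nodes):
        return "Pool"
    if any(ZONE_LABEL in n["labels"] for n in nodes):
        return "Zone"
    return "Status"  # always applicable: every node has a fit status


small = [{"labels": {ZONE_LABEL: "us-east-1a"}} for _ in range(50)]
large = [{"labels": {ZONE_LABEL: "us-east-1a"}} for _ in range(51)]
print(default_grouping(small), default_grouping(large))
```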

Each group card shows a stacked status histogram, the total count, and, when every node in the group is blocked by the same rule, the top blocking reason. Expand a card to see the nodes it contains; large groups cap at 200 nodes with a “show all” expander.

Reason grouping is most useful on autoscaling clusters where hundreds of nodes are blocked by the same taint or by insufficient capacity: 200 nodes blocked for the same reason collapse into one row.
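A minimal sketch of Reason grouping follows, bucketing each blocked node by its first failing rule plus message. The dictionary shapes and the function name are assumptions for illustration.

```python
from collections import Counter


def group_by_reason(nodes):
    """Count blocked nodes per (first failing rule, message) pair."""
    counts = Counter()
    for node in nodes:
        first_fail = next(
            (v for v in node["verdicts"] if v["outcome"] == "fail"), None
        )
        if first_fail is not None:
            counts[(first_fail["source"], first_fail["message"])] += 1
    return counts


# 200 nodes blocked by the same taint collapse into a single bucket.
blocked = [
    {"name": f"node-{i}",
     "verdicts": [{"source": "taint", "outcome": "fail",
                   "message": "untolerated taint dedicated=gpu:NoSchedule"}]}
    for i in range(200)
]
groups = group_by_reason(blocked)
print(len(groups), sum(groups.values()))
```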

NodePools

One row per Karpenter NodePool. Each pool shows its fit status, weight, allowed zones and capacity types, architecture, OS, and the configured limits. Expand a row to see the rule verdicts and a grid of compatible instance types (up to twelve rendered).

The instance-type catalog is built from instance types Buoy has observed running on the cluster’s nodes. On a fresh cluster that has not yet provisioned a wide range of types, this catalog is incomplete; the resulting verdicts can report false negatives for instance fit. See Caveats.
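To see why an observed-only catalog can produce false negatives, consider this sketch. It builds a catalog from node labels and checks which types could hold a request. The node shape, the resource field names, and the capacity numbers are illustrative assumptions, not real instance specifications.

```python
INSTANCE_TYPE_LABEL = "node.kubernetes.io/instance-type"


def observed_catalog(nodes):
    """Map each instance type seen on a node to that node's allocatable resources."""
    catalog = {}
    for node in nodes:
        itype = node["labels"].get(INSTANCE_TYPE_LABEL)
        if itype:
            # Keep the first observation of each type.
            catalog.setdefault(itype, node["allocatable"])
    return catalog


def fitting_types(catalog, cpu_m, memory_mi):
    """Return the observed instance types large enough for the request."""
    return sorted(
        t for t, alloc in catalog.items()
        if alloc["cpu_m"] >= cpu_m and alloc["memory_mi"] >= memory_mi
    )


nodes = [
    {"labels": {INSTANCE_TYPE_LABEL: "m5.large"},
     "allocatable": {"cpu_m": 1930, "memory_mi": 7000}},
    {"labels": {INSTANCE_TYPE_LABEL: "m5.xlarge"},
     "allocatable": {"cpu_m": 3920, "memory_mi": 15000}},
]
catalog = observed_catalog(nodes)
print(fitting_types(catalog, 3000, 8000))
# A large request finds nothing, even if the pool could provision a bigger type.
print(fitting_types(catalog, 16000, 32000))
```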

Summary

A roll-up of the per-node and per-pool verdicts:

  • Eligible vs. blocked node counts.
  • Top blocking reasons, sorted by the number of nodes affected.
  • Eligible vs. blocked NodePool counts.

When zero nodes are eligible and the pod is neither bound nor gated, the most common failure rule is highlighted as the top blocker.
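The roll-up amounts to counting failing rules across nodes. The sketch below assumes a node counts as eligible when none of its verdicts fail and that a node contributes once to every rule it fails; the function name and return shape are hypothetical.

```python
from collections import Counter


def summarize(node_verdicts, bound=False, gated=False):
    """Roll per-node verdicts up into counts and a top blocker."""
    rule_counts = Counter()
    eligible = 0
    for verdicts in node_verdicts.values():
        failing = {v["source"] for v in verdicts if v["outcome"] == "fail"}
        if failing:
            rule_counts.update(failing)  # one count per node per failing rule
        else:
            eligible += 1
    top_reasons = rule_counts.most_common()  # sorted by nodes affected
    top_blocker = None
    if eligible == 0 and not bound and not gated and top_reasons:
        top_blocker = top_reasons[0][0]
    return {
        "eligible": eligible,
        "blocked": len(node_verdicts) - eligible,
        "top_reasons": top_reasons,
        "top_blocker": top_blocker,
    }


fail = lambda source: {"source": source, "outcome": "fail"}
summary = summarize({
    "node-a": [fail("taint")],
    "node-b": [fail("taint"), fail("resources")],
    "node-c": [fail("taint")],
})
print(summary)
```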

Headline Chips and Banners

The header above the sub-tabs shows the pod’s phase, the bound node (clickable, when present), the eligible-node count over total (red when zero), and the scheduler name (when non-default).

Banners appear above the sub-tabs for:

  • Scheduling gates: the pod is gated and the default scheduler will not consider it until the gates are removed. The banner lists each gate name.
  • Kueue admission: if a gate’s name starts with kueue.x-k8s.io/, Buoy resolves the controlling Workload via the kueue.x-k8s.io/job-uid label and shows the queue, admission status (awaiting, reserved, admitted, evicted), and a link to the Workload detail view.
  • Custom scheduler: when spec.schedulerName is something other than the default scheduler, a note explains the classification (Kueue, Yunikorn, Volcano, or “other”).
  • Warnings: additional cautions (for example, “cluster has no nodes”).

Scheduling Gates

Entries in spec.schedulingGates block the default scheduler before any constraint evaluation. When gates are present, every node in the Existing Nodes table is marked blocked by a schedulingGates rule, and the headline switches to “Gated, N gates pending”. The Kueue gate (kueue.x-k8s.io/admission) is recognized specifically so Buoy can pull the matching Workload status into the banner.

Custom Schedulers

The explainer classifies the scheduler by looking at spec.schedulerName and spec.schedulingGates:

Detection | Classification
schedulerName empty or default-scheduler | Default.
A kueue.x-k8s.io/ scheduling gate | Kueue, even if schedulerName is the default.
schedulerName contains “yunikorn” | Yunikorn.
schedulerName contains “volcano” | Volcano.
Any other explicit schedulerName | Other.

For non-default schedulers, the per-node verdicts still apply (those constraints are universal) but the placement decision is made by the named scheduler, not the cluster default. The banner makes that explicit.
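The classification table translates into a short function. The Kueue gate check runs first because it wins even when schedulerName is the default. The case-insensitive substring match is an assumption of this sketch, and the function name is hypothetical.

```python
KUEUE_GATE_PREFIX = "kueue.x-k8s.io/"


def classify_scheduler(scheduler_name, gate_names):
    """Classify a pod's scheduler from spec.schedulerName and gate names."""
    # A Kueue gate takes priority over whatever schedulerName says.
    if any(g.startswith(KUEUE_GATE_PREFIX) for g in gate_names):
        return "Kueue"
    name = (scheduler_name or "").lower()  # assumption: match case-insensitively
    if name in ("", "default-scheduler"):
        return "Default"
    if "yunikorn" in name:
        return "Yunikorn"
    if "volcano" in name:
        return "Volcano"
    return "Other"


print(classify_scheduler("default-scheduler", ["kueue.x-k8s.io/admission"]))
print(classify_scheduler("volcano", []))
```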

Caveats

  • Instance type catalog: the Karpenter pool fit heuristic uses the instance types already present on the cluster’s nodes. A fresh cluster that has not yet provisioned a diverse set of instance types may report false-negative pool fits.
  • Pod affinity snapshot: pod-affinity and anti-affinity verdicts are evaluated against currently running pods, not pending ones. A pod that has not yet landed cannot satisfy another pod’s affinity requirement during simulation.
  • Snapshot, not a watch: open the tab again, or click the Refresh button, to re-evaluate.
  • NodePool requirements: pool fit uses requirement-set intersection on selectors. A pod that pins a label the pool cannot honor surfaces as a fail; a pod that pins a label the pool can honor surfaces as a pass even if every concrete instance type would still need provisioning. Read the NodePools sub-tab verdicts together with the compatible instance-type grid for a complete picture.
