Kubernetes Cost Allocation with OpenCost on AKS
A practitioner guide to Kubernetes cost allocation with OpenCost on AKS — namespace chargeback, idle cost, and when Kubecost earns its price.
Every cloud provider's Kubernetes pricing page advertises a small management fee. What it does not tell you is that a single AKS cluster running forty workloads across twelve teams produces one undifferentiated bill. Finance sees "AKS compute: EUR 38,000." Nobody can say which product line, which team, or which feature consumed it. That opacity is the single most common reason Kubernetes cost initiatives stall.
Kubernetes cost allocation is the discipline of turning that one number into a per-namespace, per-team, per-workload breakdown you can defend in a budget review. OpenCost — the CNCF project — and its commercial sibling Kubecost are the standard tools for doing it on AKS. This post is the allocation model we deploy at CC Conceptualise, including the parts the documentation glosses over.
TL;DR / Key takeaways
- OpenCost attributes shared AKS node cost down to individual pods using resource requests, actual usage, and real Azure pricing — making namespace and team chargeback possible.
- Idle cost (the gap between capacity paid for and capacity used) is typically 30-50% of AKS compute spend and must be reported separately, or teams game each other instead of fixing it.
- OpenCost is the free, open allocation engine; Kubecost is the commercial layer (UI, retention, multi-cluster, governance) on top. Choose based on operating model and scale, not allocation accuracy.
- Wire OpenCost to your Azure billing export first, so allocations reflect Reserved Instance and Savings Plan discounts and reconcile to the actual invoice.
- Allocation only creates value when paired with an accountability model — showback to start, chargeback once the numbers are trusted.
Why container cost visibility is hard on AKS
The Azure portal bills you per node, per disk, per load balancer. Kubernetes schedules many pods onto each node, with no native concept of which pod cost what. The mapping between the resource you pay for (a node) and the unit you care about (a workload owned by a team) does not exist in either system. Cost allocation is the layer that builds it.
Three facts make this harder than tagging VMs:
- Pods share nodes. A node's hourly price has to be split across every pod scheduled on it, weighted by what each consumed.
- Requests and usage diverge. A team might request 4 vCPU and use 0.5. You can allocate by request (what they reserved) or by usage (what they burned) — and the answer changes the bill materially.
- Capacity is never fully used. The unscheduled remainder of every node is real spend that belongs to no workload. If you silently spread it across teams, you punish efficient teams for the cluster's overprovisioning.
OpenCost solves the first by proportional attribution, exposes the second by reporting both dimensions, and solves the third by reporting idle cost as its own line.
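As a rough illustration of proportional attribution, the sketch below splits one node's hourly price across its pods for CPU only. It assumes each pod is charged for the larger of its request and its usage and that whatever is left over is idle; the `PodCpu` class, the function name, and the prices are hypothetical and are not OpenCost's code.

```python
from dataclasses import dataclass


@dataclass
class PodCpu:
    name: str
    requested_vcpu: float
    used_vcpu: float


def split_node_cost(node_hourly_price, node_vcpu, pods):
    """Split one node's hourly price across pods plus an explicit idle line (CPU only)."""
    price_per_vcpu = node_hourly_price / node_vcpu
    shares = {}
    for pod in pods:
        # Assumption: bill the larger of reservation and actual consumption.
        billable_vcpu = max(pod.requested_vcpu, pod.used_vcpu)
        shares[pod.name] = billable_vcpu * price_per_vcpu
    # Whatever no pod is billed for stays visible as idle (clamped at zero).
    shares["__idle__"] = max(0.0, node_hourly_price - sum(shares.values()))
    return shares


# Hypothetical 8 vCPU node at EUR 0.40 per hour.
shares = split_node_cost(
    0.40,
    8,
    [PodCpu("api", requested_vcpu=4, used_vcpu=0.5),
     PodCpu("worker", requested_vcpu=1, used_vcpu=2)],
)
for name, cost in shares.items():
    print(f"{name}: EUR {cost:.2f}/h")
```

The `api` pod pays for the 4 vCPU it reserved even though it used 0.5, which is the request-versus-usage gap described above. A real engine repeats this per resource dimension and per time window.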
How OpenCost calculates allocation
OpenCost runs as a lightweight agent alongside Prometheus. For every pod, over every time window, it measures CPU, memory, GPU, persistent volume, and network consumption. It multiplies each dimension by the effective hourly price of the underlying resource — pulled from the Azure Rate Card or, better, from your billing export so discounts apply — and sums them into a per-pod cost.
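The multiply-and-sum step can be sketched in a few lines. The dimension names and all prices below are made-up illustrations, not Azure rates, and the function is a simplification of what a real allocation engine does.

```python
def pod_cost(usage_hours, hourly_prices):
    """Sum quantity-hours times unit price across resource dimensions."""
    return sum(qty * hourly_prices[dim] for dim, qty in usage_hours.items())


# One pod over a 24-hour window: 2 vCPU and 4 GiB held the whole time.
usage_hours = {"cpu_vcpu": 48.0, "memory_gib": 96.0}

# Hypothetical prices: list rates versus rates after reservation discounts.
list_prices = {"cpu_vcpu": 0.040, "memory_gib": 0.005}
effective_prices = {"cpu_vcpu": 0.030, "memory_gib": 0.004}

print(f"list price:      EUR {pod_cost(usage_hours, list_prices):.2f}")
print(f"effective price: EUR {pod_cost(usage_hours, effective_prices):.2f}")
```

The same consumption yields two different costs depending on the price source, which is why the pricing input matters as much as the metering.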
From pod-level cost, it aggregates up any dimension you ask for:
| Aggregation key | Typical use | Chargeback fit |
|---|---|---|
| Namespace | Team or product boundary | Strong — if namespaces map cleanly to cost centres |
| Label / annotation | Cross-cutting (e.g. cost-center, env) | Strongest — survives namespace sprawl |
| Controller / Deployment | Per-service cost | Good for engineering, too granular for finance |
| Cluster | Environment (prod/stage/dev) | Coarse rollup for executive reporting |
The single most important decision is the allocation key, and it should be a label, not the namespace. Namespaces drift; a cost-center label enforced by admission policy does not. We mandate a small set of required labels at cluster onboarding precisely so allocation never breaks later.
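To make the label-as-key idea concrete, here is a minimal sketch that rolls per-pod costs up by a label and keeps unlabelled workloads in their own bucket. The input shape (a list of dicts with `labels` and `cost`) is an assumption for illustration, not an OpenCost response format.

```python
from collections import defaultdict


def aggregate_by_label(pod_costs, label_key, missing="__unlabelled__"):
    """Roll per-pod costs up to one total per label value."""
    totals = defaultdict(float)
    for pod in pod_costs:
        # Pods without the label land in a visible bucket instead of vanishing.
        key = pod["labels"].get(label_key, missing)
        totals[key] += pod["cost"]
    return dict(totals)


pods = [
    {"labels": {"cost-center": "checkout"}, "cost": 120.0},
    {"labels": {"cost-center": "search"}, "cost": 80.0},
    {"labels": {"cost-center": "checkout"}, "cost": 30.0},
    {"labels": {}, "cost": 15.0},
]
print(aggregate_by_label(pods, "cost-center"))
# {'checkout': 150.0, 'search': 80.0, '__unlabelled__': 15.0}
```

A non-empty `__unlabelled__` bucket is a useful signal that label enforcement has a gap.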
The idle cost problem
Idle cost is where most engagements find their savings, and it is what most internal tools hide. If your cluster's nodes provide 100 vCPU and workloads request 60, the 40 vCPU difference (40% of capacity) is paid-for, unused capacity.
OpenCost reports idle as a distinct bucket rather than folding it into team allocations. This matters for behaviour, not just accuracy. When idle is hidden, the platform team's overprovisioning gets billed to product teams who then argue about a number they cannot control. When idle is visible, everyone sees a shared 40% inefficiency and the conversation turns to the actual levers: rightsizing requests, tuning the cluster autoscaler, and consolidating fragmented node pools.
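The arithmetic behind that 40% figure is simple enough to sketch. This version measures idle against requests for CPU only; the function name is hypothetical, and a fuller model would treat memory and other dimensions the same way.

```python
def idle_share(capacity_vcpu, requested_vcpu):
    """Fraction of paid-for CPU capacity that no workload has requested."""
    if capacity_vcpu <= 0:
        raise ValueError("capacity_vcpu must be positive")
    # Clamp at zero so an overcommitted cluster does not report negative idle.
    return max(0.0, capacity_vcpu - requested_vcpu) / capacity_vcpu


share = idle_share(capacity_vcpu=100, requested_vcpu=60)
print(f"idle share: {share:.0%}")
```

Multiplying that share by the compute spend for the period gives the idle line to report alongside team allocations.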
On AKS specifically, three patterns drive idle:
- Oversized requests copied from a template nobody revisited. Requests, not limits, reserve schedulable capacity, so inflated requests directly burn money.
- Node pool fragmentation — too many small pools that each carry system-component overhead and cannot bin-pack tightly.
- Autoscaler headroom held for burst that rarely arrives. Some headroom is legitimate; most clusters carry far more than their actual burst profile justifies.
OpenCost vs Kubecost: which do you actually need
This is the question every platform lead asks, and the honest answer is that they are not competitors — Kubecost is built on OpenCost. The decision is about your operating model.
| Dimension | OpenCost (open source) | Kubecost (commercial) |
|---|---|---|
| Allocation accuracy | Full | Full (same engine) |
| Cost | Free | Per-cluster / per-node licence |
| UI | Basic; you build Grafana | Rich, role-based dashboards |
| Data retention | Prometheus-bound (short by default) | Long-term, durable store |
| Multi-cluster aggregation | Manual | Built-in |
| Savings recommendations | None | Governed, actionable |
| SSO / RBAC for finance users | No | Yes |
| Support | Community | Enterprise SLA |
Choose OpenCost when you run one or two clusters, your team is fluent in Prometheus and Grafana, and finance consumes exported CSVs. Choose Kubecost when you operate many clusters, need long retention for trend analysis and audit evidence, or need non-engineers to self-serve cost data behind SSO. The trigger is scale and stakeholder breadth, not a gap in the allocation maths.
In one recent post-merger consolidation we ran for a manufacturing client, OpenCost was sufficient for the first two clusters; by the time we had unified seven business units onto a shared platform, the retention and multi-cluster requirements made Kubecost the cheaper option once we counted the Grafana engineering it replaced.
A deployment and rollout checklist
- Install OpenCost with Prometheus on each cluster (Helm). Confirm it scrapes node and pod metrics.
- Connect Azure pricing via the billing export, not list pricing, so Reserved Instance and Savings Plan discounts flow into allocations and reconcile to the invoice.
- Enforce allocation labels (cost-center, team, env) through an admission policy, such as Azure Policy for AKS or a Gatekeeper/Kyverno constraint, so unlabelled workloads are rejected.
- Validate against the invoice. Sum all allocations plus idle and confirm it matches the Azure bill within a few percent. If it does not, your pricing source is wrong.
- Start with showback. Publish per-team dashboards for a quarter. Let teams see and trust the numbers before money moves.
- Surface idle separately and assign it to the platform team as an explicit, owned efficiency target.
- Move to chargeback only once allocations are trusted and the accountability model is agreed.
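The invoice validation step in the checklist above can be automated with a small check like the following. The 3% tolerance is an assumed reading of "within a few percent", and the figures are illustrative rather than taken from a real bill.

```python
def reconciles(allocations, idle_cost, invoice_total, tolerance=0.03):
    """Return (ok, drift): whether allocations plus idle match the invoice."""
    if invoice_total <= 0:
        raise ValueError("invoice_total must be positive")
    allocated_total = sum(allocations.values()) + idle_cost
    # Relative gap between what was allocated and what was billed.
    drift = abs(allocated_total - invoice_total) / invoice_total
    return drift <= tolerance, drift


ok, drift = reconciles(
    allocations={"checkout": 14_000.0, "search": 8_000.0},
    idle_cost=15_000.0,
    invoice_total=38_000.0,
)
print(f"reconciles: {ok}, drift: {drift:.1%}")
```

Running this on every billing period and alerting when it fails catches a stale or misconfigured pricing source before the numbers reach finance.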
Allocation tooling is necessary but not sufficient. The number only changes behaviour when it lands on a named owner with a budget. We treat the FinOps Inform, Optimize, Operate phases as the operating frame: OpenCost delivers Inform, rightsizing and autoscaler tuning deliver Optimize, and chargeback with policy guardrails delivers Operate.
Where this fits in your wider Azure cost programme
Kubernetes is one workload class. The same FinOps discipline applies across your estate, and container allocation should plug into it rather than live as a silo. If you are building the broader programme, pair this with Azure cost anomaly detection so a sudden AKS spike is caught the day it happens, GPU and AI workload cost control if you run model training or inference on GPU node pools, and Microsoft Fabric capacity sizing where your data platform shares the same chargeback model.
The end state is one accountability map: every euro of cloud spend traces to an owner, container costs included.
FAQ
What is the difference between OpenCost and Kubecost?
OpenCost is the open-source CNCF project that defines a vendor-neutral specification for Kubernetes cost allocation and ships a free monitoring agent. Kubecost is the commercial product built on the same engine, adding a richer UI, longer data retention, multi-cluster aggregation, savings recommendations, and enterprise support. For most teams, OpenCost provides the allocation data; Kubecost provides the workflow and governance layer on top.
How does OpenCost allocate the cost of a shared AKS node to individual pods?
OpenCost measures each pod's resource requests and actual usage for CPU, memory, GPU, persistent storage, and network, then attributes a proportional share of the underlying node's hourly price to that pod. It pulls Azure pricing from the Azure Rate Card and, when connected to your billing export, applies your reservation and Savings Plan discounts, so allocations reflect what you actually pay, not list price. Capacity that cannot be tied to a workload (unallocated node capacity) is reported as idle.
Can OpenCost show the cost of a single namespace or team on AKS?
Yes. Allocation can be aggregated by namespace, label, annotation, controller, deployment, or any combination, which is what makes namespace-based or label-based chargeback possible. Most enterprises align namespaces or a team label to cost centres, then export those aggregates into showback dashboards or chargeback reports. This is the foundation of container cost visibility in a multi-tenant cluster.
Does OpenCost account for AKS Reserved Instances and Savings Plans?
OpenCost reflects the effective rate you pay when it is configured with your Azure billing export or a custom pricing source, so reservation and Savings Plan discounts flow through to per-namespace allocations. Without that integration it falls back to on-demand list pricing, which overstates cost. We always wire OpenCost to the Azure billing export first so chargeback numbers reconcile with the actual invoice.
What is idle cost in Kubernetes and why does it matter?
Idle cost is the gap between the capacity you pay for on your nodes and the capacity your workloads actually request and use. On real AKS clusters it is frequently 30 to 50 percent of compute spend, driven by oversized requests, fragmented bin-packing, and headroom for autoscaling. Surfacing idle cost separately stops teams from blaming each other for a shared inefficiency and points directly at rightsizing and autoscaler tuning.
Is OpenCost enough or do enterprises need Kubecost?
OpenCost is enough when you have one or two clusters, a team that is comfortable with Prometheus and Grafana, and a finance process that consumes exported data. Kubecost earns its licence when you run many clusters, need long retention for trend analysis and audits, want governed savings recommendations, or need role-based access and SSO for non-engineering stakeholders. The decision is about operating model and scale, not capability gaps in allocation accuracy.
If you are standing up Kubernetes cost allocation on AKS — or your existing chargeback numbers will not reconcile to the invoice — our architects can help you build a model finance trusts. See our cloud architecture and FinOps services.