Reserved Instances vs Savings Plans vs Spot on Azure
A practitioner deep-dive into Azure commitment strategy — when Reserved Instances, Savings Plans, or Spot VMs actually save money, and how to combine them.
Azure gives you three distinct ways to pay less than the pay-as-you-go rate for compute: Reserved Instances, Savings Plans, and Spot. They are routinely confused, frequently mis-applied, and collectively responsible for one of the largest controllable line items in any enterprise Azure bill. Buying the wrong instrument — or the right one for the wrong workload — locks in waste for one to three years.
This is a practitioner deep-dive into Azure commitment strategy: what each instrument actually is, where it wins, where it quietly destroys value, and how to layer all three into a single coherent plan.
TL;DR / Key takeaways
- Reserved Instances give the deepest commitment-based discount but lock you to a VM family and region for one or three years; best for steady, well-understood baselines.
- Savings Plans commit you to a fixed hourly spend, not a resource shape, so the discount follows your compute around — best for predictable spend with evolving architecture.
- Spot VMs offer the largest discounts of all but can be evicted with about 30 seconds' notice; best for interruptible, checkpointable, or stateless work.
- The mature pattern is layered: cover the baseline with RIs/Savings Plans, flex with Savings Plans, burst on Spot. Azure applies the benefits in order, so they do not conflict.
- Commitment is a FinOps Optimize-phase discipline: never commit to capacity you have not first measured and right-sized.
The three instruments, precisely
It is worth being exact, because the marketing names blur the distinctions that matter operationally.
Reserved Instances (RIs)
A Reserved Instance is a commitment to a specific VM family (for example, the Dsv5 series) in a specific region, for a one- or three-year term, paid up front or monthly. In return you get the deepest available discount off the pay-as-you-go rate. RIs are shape commitments: the benefit applies to matching running instances. Azure provides instance size flexibility within a family, so a reservation for one large VM can cover several smaller ones in the same family — useful, but it does not rescue you if you migrate to a different family entirely.
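Instance size flexibility is easier to reason about as arithmetic on normalised units. The sketch below assumes a made-up ratio table for one hypothetical VM family (the size names and ratios are illustrative, not Azure's published values) and shows how one large reservation might be spread over several smaller running VMs.

```python
# Illustrative ratios for one hypothetical VM family. Real ratios come from
# Azure's published instance size flexibility tables.
SIZE_RATIOS = {"small": 1.0, "medium": 2.0, "large": 4.0, "xlarge": 8.0}


def reserved_units(size: str, quantity: int) -> float:
    """Normalised units bought by reserving `quantity` VMs of `size`."""
    return SIZE_RATIOS[size] * quantity


def apply_reservation(reserved: float, running: dict[str, int]) -> dict[str, float]:
    """Spread reserved units over same-family VMs running in one hour."""
    demand = sum(SIZE_RATIOS[size] * count for size, count in running.items())
    covered = min(reserved, demand)
    return {
        "covered_units": covered,
        "uncovered_units": demand - covered,      # billed at pay-as-you-go
        "idle_reserved_units": reserved - covered,  # paid for but unused
    }


# One xlarge reservation against three medium VMs and one small VM.
result = apply_reservation(reserved_units("xlarge", 1), {"medium": 3, "small": 1})
```

Here the reservation buys 8 units and the running VMs need 7, so everything is covered and 1 unit sits idle. VMs from a different family would not appear in `running` at all, which is the stranding risk described above.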
RIs reward you for knowing the shape of your steady-state estate.
Savings Plans
An Azure Savings Plan for compute is fundamentally different: you commit to a fixed hourly spend (say, $12/hour) for one or three years, and Azure automatically applies discounted rates to your eligible compute usage, up to that hourly amount, across VMs, Container Instances, Azure Functions Premium, App Service, and more, in any region. Usage beyond the commitment is billed at pay-as-you-go rates, and any part of the commitment you do not use in a given hour is lost. You are not betting on a VM family; you are betting on a spend floor.
The trade-off is that the maximum Savings Plan discount is typically lower than the maximum RI discount for the same workload. You pay a little for the flexibility.
Savings Plans reward you for knowing the size of your steady-state spend, even when the shape is still moving.
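To make the hourly mechanics concrete, here is a minimal sketch of one billing hour under a Savings Plan. It assumes the commitment is consumed by the most heavily discounted usage first, and the rates and discounts are invented for illustration; check the actual application rules and price sheet for your agreement.

```python
def apply_savings_plan_hour(
    commitment: float, usage: list[tuple[float, float]]
) -> dict[str, float]:
    """Bill one hour. `usage` holds (payg_cost, discount) per eligible meter.

    Assumption: the commitment is applied to the highest discount first.
    """
    remaining = commitment
    overage = 0.0
    for payg_cost, discount in sorted(usage, key=lambda item: item[1], reverse=True):
        discounted = payg_cost * (1 - discount)
        if remaining >= discounted:
            remaining -= discounted  # fully covered by the commitment
        else:
            covered_fraction = remaining / discounted if discounted else 1.0
            overage += payg_cost * (1 - covered_fraction)  # rest at pay-as-you-go
            remaining = 0.0
    return {
        "billed": commitment + overage,   # the commitment is charged regardless
        "payg_overage": overage,
        "unused_commitment": remaining,
    }


# $12/hour commitment, two meters worth $10 each at pay-as-you-go.
hour = apply_savings_plan_hour(12.0, [(10.0, 0.2), (10.0, 0.5)])
```

In this example the hour bills $13.25 against $20.00 at pay-as-you-go. With no usage at all, the same function still bills $12.00, which is why the commitment should track the spend floor rather than the average.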
Spot VMs
Spot VMs sell Azure's unused capacity at a steep, variable discount. The catch: Azure can evict your Spot instance with roughly 30 seconds' notice when it needs the capacity back, and the price varies with supply. Spot is not a commitment and carries no SLA. It is the right tool for anything that can absorb interruption, and a trap for anything that cannot.
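Eviction handling usually starts with noticing the eviction. Azure surfaces it through the Instance Metadata Service Scheduled Events feed as an event of type `Preempt`. The sketch below only parses such a document so that it runs offline; fetching it from the metadata endpoint, the VM name, and the sample payload are all assumptions for illustration.

```python
import json


def pending_preemptions(scheduled_events_json: str) -> list[dict]:
    """Return the Preempt events found in a Scheduled Events document."""
    document = json.loads(scheduled_events_json)
    return [
        event
        for event in document.get("Events", [])
        if event.get("EventType") == "Preempt"
    ]


def should_drain(scheduled_events_json: str, vm_name: str) -> bool:
    """True when this VM is named in a pending Preempt event."""
    return any(
        vm_name in event.get("Resources", [])
        for event in pending_preemptions(scheduled_events_json)
    )


# Hand-written sample payload, not captured from a real VM.
SAMPLE = json.dumps({
    "DocumentIncarnation": 1,
    "Events": [{
        "EventId": "example-id",
        "EventType": "Preempt",
        "ResourceType": "VirtualMachine",
        "Resources": ["batch-worker-0"],
        "EventStatus": "Scheduled",
    }],
})
```

A worker would poll the feed every few seconds and, once `should_drain` returns true, stop taking new work and write a checkpoint. With only about 30 seconds available, the drain path has to be short and rehearsed.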
Side-by-side comparison
| Dimension | Reserved Instances | Savings Plans | Spot VMs |
|---|---|---|---|
| What you commit | A VM family + region | A fixed $/hour spend | Nothing |
| Term | 1 or 3 years | 1 or 3 years | None |
| Discount depth | Deepest of the commitments | Moderate–deep | Largest overall, but volatile |
| Flexibility | Low (shape-locked) | High (spend-locked) | N/A |
| Interruption risk | None | None | High (~30s eviction) |
| Best for | Steady, known baselines | Predictable spend, evolving shape | Batch, CI/CD, stateless, checkpointed |
| Applies to | Matching VM family | Broad eligible compute | Eligible VM families |
| SLA | Full | Full | None |
A decision framework
The question is never "RI or Savings Plan or Spot?" in the abstract. It is "what is the behaviour of this workload?" Answer that, and the instrument follows.
- Is the workload interruptible? If it can checkpoint or is stateless and idempotent, your first lever is Spot. This is where the largest savings live, and it requires no commitment.
- Is the non-interruptible part steady and shape-stable for the term? If the VM family and region are genuinely stable for one to three years, Reserved Instances extract the deepest discount.
- Is spend predictable but the shape still moving? During migration or active modernisation — when families, regions, and even compute services are changing — Savings Plans protect the discount without betting on a shape that may not survive the quarter.
- What is left over? Anything spiky and short-lived beyond your committed baseline stays on pay-as-you-go. Do not commit to peaks.
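The four questions above can be written down as a small decision function. This is a sketch of the framework as stated here, with hypothetical field names; a real assessment would draw these flags from measured usage rather than from someone's opinion of the workload.

```python
from dataclasses import dataclass
from enum import Enum


class Instrument(Enum):
    SPOT = "Spot"
    RESERVED_INSTANCE = "Reserved Instance"
    SAVINGS_PLAN = "Savings Plan"
    PAY_AS_YOU_GO = "Pay-as-you-go"


@dataclass(frozen=True)
class Workload:
    interruptible: bool      # can checkpoint, or is stateless and idempotent
    steady: bool             # runs continuously for the whole term
    shape_stable: bool       # VM family and region will not change
    spend_predictable: bool  # hourly spend floor is known


def choose_instrument(workload: Workload) -> Instrument:
    """Walk the questions in the order given in the framework."""
    if workload.interruptible:
        return Instrument.SPOT
    if workload.steady and workload.shape_stable:
        return Instrument.RESERVED_INSTANCE
    if workload.spend_predictable:
        return Instrument.SAVINGS_PLAN
    return Instrument.PAY_AS_YOU_GO


# A platform mid-migration: steady spend, but the VM family is still changing.
migrating = Workload(interruptible=False, steady=True,
                     shape_stable=False, spend_predictable=True)
```

For `migrating`, the function returns the Savings Plan, which matches the advice to lead with flexible instruments while the shape is still moving.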
In our delivery work at CC Conceptualise, the single most common failure we inherit is a large three-year RI purchase made early in a migration, before the architecture stabilised. Six months later the team has re-platformed onto a different VM family, the reservation is stranded, and the "savings" are now a sunk loss. The fix is almost always to lead with Savings Plans during change and convert to RIs only once the estate is provably stable.
Building coverage: how the layers stack
Azure applies the benefits in a fixed order each hour: Reserved Instance benefit first, then Savings Plan benefit, with any remaining usage billed at pay-as-you-go rates. Spot VMs sit outside this sequence: they are billed at the Spot rate and do not draw down either commitment. This ordering is what makes layering safe, because the instruments do not compete for the same usage.
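The ordering can be sketched as a simple waterfall for a single hour. All amounts below are in pay-as-you-go-equivalent dollars, the figures are invented, and Spot is left out on the assumption that it is billed separately at its own rate.

```python
def layer_hour(
    matching_payg: float,     # usage that matches the reserved family and region
    other_payg: float,        # other eligible compute usage
    ri_capacity_payg: float,  # pay-as-you-go value the reservations can cover
    sp_commitment: float,     # Savings Plan commitment in $/hour
    sp_discount: float,       # Savings Plan discount as a fraction, e.g. 0.25
) -> dict[str, float]:
    """Apply reservation benefit first, then Savings Plan, then pay-as-you-go."""
    ri_covered = min(matching_payg, ri_capacity_payg)
    remaining = (matching_payg - ri_covered) + other_payg

    # A commitment of c dollars absorbs c / (1 - discount) of pay-as-you-go value.
    sp_capacity_payg = sp_commitment / (1 - sp_discount)
    sp_covered = min(remaining, sp_capacity_payg)

    return {
        "ri_covered": ri_covered,
        "sp_covered": sp_covered,
        "payg": remaining - sp_covered,
    }


layers = layer_hour(matching_payg=10.0, other_payg=8.0,
                    ri_capacity_payg=6.0, sp_commitment=6.0, sp_discount=0.25)
```

In this hour the reservation absorbs $6 of usage, the Savings Plan absorbs the next $8, and $4 falls through to pay-as-you-go. No dollar of usage is claimed twice.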
A healthy target profile looks like this:
- Baseline (always-on, shape-stable): Reserved Instances, ideally three-year where confidence is high.
- Variable-but-predictable layer: Savings Plans sized to your reliable spend floor, one- or three-year by confidence.
- Elastic / interruptible layer: Spot, with eviction handling and graceful drain built in.
- True peaks: pay-as-you-go — uncommitted by design.
The metric to watch is commitment coverage: the share of eligible compute hours covered by a commitment. High coverage on the baseline is the goal; chasing 100% coverage across everything is how you end up over-committed.
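Coverage is most useful when tracked alongside utilisation, the share of purchased commitment that was actually consumed. The helper names and figures below are illustrative assumptions.

```python
def coverage(covered_hours: float, eligible_hours: float) -> float:
    """Share of eligible compute hours that a commitment paid for."""
    if eligible_hours <= 0:
        raise ValueError("eligible_hours must be positive")
    return covered_hours / eligible_hours


def utilisation(used_commitment: float, purchased_commitment: float) -> float:
    """Share of the purchased commitment that was actually consumed."""
    if purchased_commitment <= 0:
        raise ValueError("purchased_commitment must be positive")
    return used_commitment / purchased_commitment


# A month with 900 eligible hours, 720 of them covered, and 684 of the
# 720 committed hours actually consumed.
monthly_coverage = coverage(720, 900)
monthly_utilisation = utilisation(684, 720)
```

This example gives 80% coverage and 95% utilisation. The two pull in opposite directions: pushing coverage towards 100% tends to drag utilisation down, and falling utilisation is the early sign of over-commitment.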
A practical rollout sequence
- Inform first. Pull at least 30 days, and ideally 90, of usage from Cost Management. You cannot commit responsibly to capacity you have not measured. Pair this with cost anomaly detection so spend spikes are explained before they become "baseline".
- Right-size before you commit. Committing to over-provisioned VMs locks in the waste. Right-size, then commit.
- Move interruptible work to Spot. Capture the largest discount first, with no commitment risk.
- Cover the stable baseline. Start with Savings Plans if the architecture is still moving; use RIs where the shape is genuinely fixed.
- Guardrail it. Use Azure Policy to prevent un-tagged or un-owned resources from quietly inflating the baseline you are about to commit to.
- Operate and revisit quarterly. Coverage and utilisation drift as the estate changes. Treat commitment as a living portfolio, not a one-time purchase.
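Once the usage history is in hand, sizing the commitment becomes a small optimisation. The sketch below assumes a single blended Savings Plan discount and a list of hourly pay-as-you-go spend samples (both hypothetical), then picks the hourly commitment that would have minimised total cost over that history.

```python
def total_cost(commitment: float, hourly_payg: list[float], discount: float) -> float:
    """Cost of the history if `commitment` $/hour had been in place."""
    capacity = commitment / (1 - discount)  # pay-as-you-go value covered per hour
    return sum(commitment + max(0.0, usage - capacity) for usage in hourly_payg)


def best_commitment(hourly_payg: list[float], discount: float) -> float:
    """Try the commitment matching each observed hour and keep the cheapest."""
    candidates = {0.0} | {usage * (1 - discount) for usage in hourly_payg}
    return min(sorted(candidates),
               key=lambda c: total_cost(c, hourly_payg, discount))


# Four quiet hours and one peak, with an assumed 25% discount.
history = [10.0, 10.0, 10.0, 10.0, 40.0]
recommended = best_commitment(history, discount=0.25)
```

The result is $7.50/hour, which covers the $10 floor and leaves the peak on pay-as-you-go at a total of $67.50. Committing to the peak instead would cost $150.00 for the same five hours, against $80.00 with no commitment at all.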
Common, expensive mistakes
- Committing during migration. The architecture is the least stable it will ever be. Lead with flexible instruments.
- Buying RIs for bursty workloads. Reservations bill whether or not you use them; spiky workloads leave reservations idle.
- Treating Spot as free production capacity. Eviction is not an edge case; it is the contract. A stateful single-instance service on Spot is an outage waiting to happen.
- Ignoring GPU/AI pricing dynamics. GPU capacity is scarce and prices move fast. Long commitments here are riskier; see our note on GPU and AI workload cost control.
- Forgetting non-VM compute. Savings Plans cover far more than VMs, and Fabric capacity has its own commitment logic — covered in Fabric capacity sizing.
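The bursty-workload mistake has a simple break-even behind it. A reservation at discount d costs (1 - d) of the pay-as-you-go rate for every hour of the term, used or not, so it only beats pay-as-you-go when the workload runs for more than (1 - d) of the hours. The discount in the sketch is an assumed figure, not a quoted Azure rate.

```python
def break_even_utilisation(discount: float) -> float:
    """Fraction of hours a VM must run for a reservation to pay off."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


def reservation_saves(utilisation: float, discount: float) -> bool:
    """True when reserving is cheaper than pay-as-you-go at this utilisation."""
    return utilisation > break_even_utilisation(discount)


# Assumed 40% reservation discount; the workload runs half of the time.
threshold = break_even_utilisation(0.40)
worth_it = reservation_saves(utilisation=0.50, discount=0.40)
```

At a 40% discount the threshold is 60% of hours, so a workload that runs half the time loses money on the reservation even though the headline discount looks attractive.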
Where this sits in your FinOps practice
Commitment strategy is squarely a FinOps Optimize-phase activity, and it only works if the Inform phase is healthy: accurate tagging, allocation, and showback so you know what is actually steady-state. It must then move into the Operate phase — commitments are reviewed, renewed, and rebalanced continuously, not bought once and forgotten. A commitment portfolio left unmanaged for a year is almost always misaligned with the estate by the end of it.
The discipline is unglamorous but high-leverage: measure, right-size, then commit deliberately, layer by layer, with the deepest commitment reserved for the most predictable workloads.
Conclusion
Reserved Instances, Savings Plans, and Spot are not competitors — they are complementary tools for three different workload behaviours. RIs reward shape stability with the deepest discount. Savings Plans reward spend predictability with flexibility. Spot rewards interruption tolerance with the largest discount of all. The enterprises that get this right do not pick one; they layer all three onto a baseline they have already measured and right-sized.
If you want a second pair of senior eyes on your Azure commitment portfolio — or a clean-slate strategy before your next big purchase — our cloud architecture and migration practice does exactly this kind of work.
FAQ
What is the difference between Azure Reserved Instances and Savings Plans?
Reserved Instances commit you to a specific VM family in a specific region for one or three years, in exchange for the deepest discount. Azure Savings Plans for compute commit you to a fixed hourly spend (in dollars per hour) across eligible compute services, and that commitment automatically applies to whatever you run — VMs, Container Instances, Azure Functions Premium, and more. RIs reward predictability of shape; Savings Plans reward predictability of spend with far more flexibility.
When should I use Azure Spot VMs instead of a commitment?
Use Spot for workloads that tolerate interruption: batch processing, CI/CD agents, rendering, big-data jobs, and stateless or checkpointed compute. Spot can be evicted with about 30 seconds' notice when Azure needs the capacity back, so it is unsuitable for production databases, stateful single-instance services, or anything with a tight SLA. Spot is a capacity discount, not a commitment, so it stacks conceptually with your baseline coverage rather than replacing it.
Can I combine Reserved Instances, Savings Plans, and Spot in one strategy?
Yes, and you should. The mature pattern is to cover your stable always-on baseline with Reserved Instances or Savings Plans, use Savings Plans to flexibly cover the variable-but-predictable layer, and run interruptible or bursty work on Spot. Azure applies RI benefit first, then Savings Plan benefit, so the layers do not conflict. The goal is high commitment coverage on the baseline without over-committing to capacity you might not use.
What happens to my Savings Plan if I stop using a VM family?
Nothing breaks. Because a Savings Plan is a dollars-per-hour commitment rather than a resource-shape commitment, the discount simply re-applies to other eligible compute you are running. This is the central advantage over Reserved Instances, which are tied to a VM family and region and can end up underutilised if your architecture changes. The trade-off is that the maximum Savings Plan discount is usually lower than the maximum RI discount.
How do I decide between a one-year and a three-year Azure commitment?
Match commitment term to confidence. Three-year terms unlock the deepest discounts but assume your workload, region, and architecture will be stable for that long — a strong assumption during cloud migration or active modernisation. One-year terms cost more per hour but preserve optionality. We generally recommend three-year commitments only for genuinely steady-state platforms and one-year for anything still evolving.
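A rough way to test that confidence is to price both terms against how long the workload might actually live. The sketch below uses invented discounts, assumes commitments are bought back to back and paid in full, and deliberately ignores any exchange or cancellation options, so treat it as a worst-case comparison.

```python
import math


def committed_cost(
    monthly_payg: float, discount: float, term_months: int, months_used: int
) -> float:
    """Total paid when committing in blocks of `term_months`.

    Assumption: every purchased term is paid in full, even if the workload
    is retired part-way through it.
    """
    terms_bought = math.ceil(months_used / term_months)
    return terms_bought * term_months * monthly_payg * (1 - discount)


# $1,000/month at pay-as-you-go; assumed 50% (3-year) and 30% (1-year) discounts.
three_year_if_18_months = committed_cost(1000.0, 0.50, 36, months_used=18)
one_year_if_18_months = committed_cost(1000.0, 0.30, 12, months_used=18)
three_year_if_36_months = committed_cost(1000.0, 0.50, 36, months_used=36)
one_year_if_36_months = committed_cost(1000.0, 0.30, 12, months_used=36)
```

If the workload survives the full 36 months, the three-year term costs $18,000 against $25,200 for rolling one-year terms. If it is retired after 18 months, the three-year term still costs $18,000, no better than pay-as-you-go, while two one-year terms cost $16,800.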
Do Azure commitments apply to GPU and AI workloads?
Some GPU VM families are eligible for Reserved Instances and Savings Plans, but GPU and AI capacity is scarcer and pricing moves faster, so blind three-year commitments are riskier. For training and batch inference that can checkpoint, Spot GPU capacity is often the bigger lever. For steady production inference, a shorter commitment plus Spot for elastic load is usually the better balance.