Azure Policy as Code: Enforcing Governance at Scale Without Blocking Deployments
A comprehensive guide to implementing Azure Policy as Code with lifecycle management, policy definitions in Bicep and Terraform, initiative bundles, exemption management, compliance dashboards, and remediation tasks.
Azure Policy is the guardrail system for your cloud environment. It can enforce tagging standards, block insecure configurations, and ensure compliance with regulatory requirements — all automatically. But most enterprises implement it wrong: they create policies through the portal, forget to test them, and then wonder why a critical deployment fails on a Friday afternoon because someone changed a policy effect from Audit to Deny without telling anyone.
This post covers how to manage Azure Policy as code: authoring, testing, deploying, monitoring, and handling the inevitable exemptions.
The Policy as Code Lifecycle
Policy management follows the same lifecycle as application code:
Author → Review → Test → Deploy (Audit) → Monitor → Enforce (Deny) → Maintain
Each stage has specific practices:
Author — Write policy definitions in your IaC tool (Bicep or Terraform). Store them in a dedicated repository or a policies/ directory in your landing zone repo.
Review — Pull request review with both platform team and security team approvals required. Policy changes affect every team in the organization.
Test — Deploy to a test management group and verify the policy behaves as expected. Confirm it catches violations without false positives.
Deploy (Audit) — Deploy to production management groups in Audit mode. Monitor compliance for 2-4 weeks.
Monitor — Review compliance data. Are there legitimate resources flagged? Adjust the policy rule or add exemptions.
Enforce (Deny) — Switch the effect to Deny once compliance is above 95% and all legitimate exemptions are documented.
Maintain — Review policies quarterly. Remove obsolete policies. Update rules as Azure services evolve.
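The Audit-to-Deny switch in the lifecycle above can be reduced to a reviewed, one-line change by exposing the effect as a variable. The following is a minimal sketch, not a prescribed layout: the variable name public_ip_effect and the assignment name deny-public-ip-staged are hypothetical, and it assumes the deny_public_ip definition and var.management_group_id shown later in this post.

```hcl
# Hypothetical variable: the effect this environment passes to the assignment.
variable "public_ip_effect" {
  type        = string
  description = "Effect passed to the deny-public-ip assignment"
  default     = "Audit" # start in Audit; switch to "Deny" in a reviewed pull request

  validation {
    condition     = contains(["Audit", "Deny"], var.public_ip_effect)
    error_message = "The public_ip_effect value must be \"Audit\" or \"Deny\"."
  }
}

# Hypothetical assignment that consumes the variable.
resource "azurerm_management_group_policy_assignment" "deny_public_ip_staged" {
  name                 = "deny-public-ip-staged" # management group assignment names are limited to 24 characters
  management_group_id  = var.management_group_id
  policy_definition_id = azurerm_policy_definition.deny_public_ip.id

  parameters = jsonencode({
    effect = { value = var.public_ip_effect }
  })
}
```

With this shape, promoting a policy from Audit to Deny shows up in the pull request as a single changed value, which is easy for both approving teams to review.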
Policy Definitions in Bicep
Here are real-world policy definitions for common enterprise governance rules.
Require Tags on Resource Groups
// policies/require-tags-rg.bicep
targetScope = 'managementGroup'

@description('List of required tag names')
param requiredTags array = [
  'Environment'
  'CostCenter'
  'Owner'
  'Application'
]

resource policyDefinition 'Microsoft.Authorization/policyDefinitions@2023-04-01' = {
  name: 'require-tags-on-resource-groups'
  properties: {
    displayName: 'Require specific tags on resource groups'
    description: 'Ensures resource groups have required tags for cost management and ownership tracking'
    policyType: 'Custom'
    mode: 'All'
    metadata: {
      category: 'Tags'
      version: '1.2.0'
    }
    parameters: {
      effect: {
        type: 'String'
        metadata: {
          displayName: 'Effect'
          description: 'Deny or Audit the policy'
        }
        allowedValues: ['Audit', 'Deny']
        defaultValue: 'Audit'
      }
    }
    policyRule: {
      if: {
        allOf: [
          {
            field: 'type'
            equals: 'Microsoft.Resources/subscriptions/resourceGroups'
          }
          {
            anyOf: [for tag in requiredTags: {
              field: 'tags[\'${tag}\']'
              exists: 'false'
            }]
          }
        ]
      }
      then: {
        effect: '[parameters(\'effect\')]'
      }
    }
  }
}

Deny Public IP Addresses (with Exceptions)
// policies/deny-public-ip.bicep
targetScope = 'managementGroup'

resource policyDefinition 'Microsoft.Authorization/policyDefinitions@2023-04-01' = {
  name: 'deny-public-ip-addresses'
  properties: {
    displayName: 'Deny public IP address creation'
    description: 'Prevents creation of public IP addresses except in approved resource groups'
    policyType: 'Custom'
    mode: 'All'
    metadata: {
      category: 'Network'
      version: '2.0.0'
    }
    parameters: {
      effect: {
        type: 'String'
        allowedValues: ['Audit', 'Deny']
        defaultValue: 'Deny'
      }
      excludedResourceGroups: {
        type: 'Array'
        metadata: {
          displayName: 'Excluded Resource Groups'
          description: 'Resource groups where public IPs are allowed (e.g., DMZ, bastion)'
        }
        defaultValue: []
      }
    }
    policyRule: {
      if: {
        allOf: [
          {
            field: 'type'
            equals: 'Microsoft.Network/publicIPAddresses'
          }
          {
            field: 'Microsoft.Network/publicIPAddresses/publicIPAllocationMethod'
            exists: 'true'
          }
          {
            value: '[resourceGroup().name]'
            notIn: '[parameters(\'excludedResourceGroups\')]'
          }
        ]
      }
      then: {
        effect: '[parameters(\'effect\')]'
      }
    }
  }
}

Enforce TLS 1.2 Minimum on Storage Accounts
// policies/enforce-tls-storage.bicep
targetScope = 'managementGroup'

resource policyDefinition 'Microsoft.Authorization/policyDefinitions@2023-04-01' = {
  name: 'enforce-tls12-storage'
  properties: {
    displayName: 'Enforce TLS 1.2 minimum on Storage Accounts'
    description: 'Storage accounts must use TLS 1.2 or higher'
    policyType: 'Custom'
    mode: 'Indexed'
    metadata: {
      category: 'Storage'
      version: '1.0.0'
    }
    parameters: {
      effect: {
        type: 'String'
        allowedValues: ['Audit', 'Deny', 'Modify']
        defaultValue: 'Modify'
      }
    }
    policyRule: {
      if: {
        allOf: [
          {
            field: 'type'
            equals: 'Microsoft.Storage/storageAccounts'
          }
          {
            field: 'Microsoft.Storage/storageAccounts/minimumTlsVersion'
            notEquals: 'TLS1_2'
          }
        ]
      }
      then: {
        effect: '[parameters(\'effect\')]'
        details: {
          roleDefinitionIds: [
            '/providers/Microsoft.Authorization/roleDefinitions/17d1049b-9a84-46fb-8f53-869881c3d3ab'
          ]
          conflictEffect: 'audit'
          operations: [
            {
              operation: 'addOrReplace'
              field: 'Microsoft.Storage/storageAccounts/minimumTlsVersion'
              value: 'TLS1_2'
            }
          ]
        }
      }
    }
  }
}

Policy Definitions in Terraform
If your landing zone uses Terraform, here is the equivalent approach:
# policies/deny_public_ip/main.tf
resource "azurerm_policy_definition" "deny_public_ip" {
  name                = "deny-public-ip-addresses"
  display_name        = "Deny public IP address creation"
  description         = "Prevents creation of public IP addresses except in approved resource groups"
  policy_type         = "Custom"
  mode                = "All"
  management_group_id = var.management_group_id

  metadata = jsonencode({
    category = "Network"
    version  = "2.0.0"
  })

  parameters = jsonencode({
    effect = {
      type = "String"
      metadata = {
        displayName = "Effect"
      }
      allowedValues = ["Audit", "Deny"]
      defaultValue  = "Deny"
    }
    excludedResourceGroups = {
      type = "Array"
      metadata = {
        displayName = "Excluded Resource Groups"
      }
      defaultValue = []
    }
  })

  policy_rule = jsonencode({
    if = {
      allOf = [
        {
          field  = "type"
          equals = "Microsoft.Network/publicIPAddresses"
        },
        {
          field  = "Microsoft.Network/publicIPAddresses/publicIPAllocationMethod"
          exists = "true"
        },
        {
          value = "[resourceGroup().name]"
          notIn = "[parameters('excludedResourceGroups')]"
        }
      ]
    }
    then = {
      effect = "[parameters('effect')]"
    }
  })
}

Terraform Policy Assignment
# assignments/production.tf
resource "azurerm_management_group_policy_assignment" "deny_public_ip" {
  name                 = "deny-public-ip-prod"
  management_group_id  = data.azurerm_management_group.production.id
  policy_definition_id = azurerm_policy_definition.deny_public_ip.id

  parameters = jsonencode({
    effect                 = { value = "Deny" }
    excludedResourceGroups = { value = ["rg-dmz-prod", "rg-bastion-prod"] }
  })

  non_compliance_message {
    content = "Public IP addresses are not allowed. Use Private Endpoints or Azure Front Door instead. Contact platform-team@company.com for exceptions."
  }

  identity {
    type = "SystemAssigned"
  }

  location = "westeurope"
}

Initiative Bundles
Group related policies into initiatives (policy sets) for easier assignment:
# initiatives/security-baseline.tf
resource "azurerm_policy_set_definition" "security_baseline" {
  name                = "security-baseline-initiative"
  display_name        = "Enterprise Security Baseline"
  description         = "Core security policies applied to all subscriptions"
  policy_type         = "Custom"
  management_group_id = var.root_management_group_id

  metadata = jsonencode({
    category = "Security"
    version  = "3.1.0"
  })

  parameters = jsonencode({
    storageEffect = {
      type         = "String"
      defaultValue = "Deny"
    }
    networkEffect = {
      type         = "String"
      defaultValue = "Deny"
    }
  })

  policy_definition_reference {
    policy_definition_id = azurerm_policy_definition.enforce_tls_storage.id
    parameter_values = jsonencode({
      effect = { value = "[parameters('storageEffect')]" }
    })
    reference_id = "enforceTlsStorage"
  }

  policy_definition_reference {
    policy_definition_id = azurerm_policy_definition.deny_public_ip.id
    parameter_values = jsonencode({
      effect = { value = "[parameters('networkEffect')]" }
    })
    reference_id = "denyPublicIp"
  }

  policy_definition_reference {
    policy_definition_id = "/providers/Microsoft.Authorization/policyDefinitions/404c3081-a854-4457-ae30-26a93ef643f9"
    reference_id         = "secureTransferStorage"
  }
}

Assign the initiative once to a management group, and all member subscriptions inherit the policies.
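The initiative assignment itself is not shown above, so here is a rough sketch of what it could look like. It assumes the data.azurerm_management_group.production data source from the earlier assignment example; the assignment name, display name, and message text are illustrative, and both effects start in Audit in line with the lifecycle described earlier.

```hcl
# Sketch: assign the initiative once at a management group.
resource "azurerm_management_group_policy_assignment" "security_baseline" {
  name                 = "security-baseline-prod" # 22 characters; the limit at management group scope is 24
  display_name         = "Enterprise Security Baseline (production)"
  management_group_id  = data.azurerm_management_group.production.id
  policy_definition_id = azurerm_policy_set_definition.security_baseline.id

  # Initiative-level parameters are forwarded to the member policies.
  parameters = jsonencode({
    storageEffect = { value = "Audit" }
    networkEffect = { value = "Audit" }
  })

  # A message can be scoped to one member policy through its reference_id.
  non_compliance_message {
    policy_definition_reference_id = "denyPublicIp"
    content                        = "Public IP addresses are not allowed in this management group."
  }

  # An identity and location are needed when a member policy uses Modify or DeployIfNotExists.
  identity {
    type = "SystemAssigned"
  }

  location = "westeurope"
}
```

Keeping the effects as assignment parameters means the same initiative can run in Audit for one management group and Deny for another without duplicating the definition.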
Exemption Management
Exemptions are inevitable. The key is managing them as code with expiration dates:
# exemptions/payment-team-public-ip.tf
# The exemption targets a resource group, so the resource-group-scoped exemption resource is used.
resource "azurerm_resource_group_policy_exemption" "payment_gateway_public_ip" {
  name                 = "payment-gateway-public-ip-exemption"
  resource_group_id    = data.azurerm_resource_group.payment_gateway.id
  policy_assignment_id = azurerm_management_group_policy_assignment.deny_public_ip.id
  exemption_category   = "Waiver"
  description          = "Payment gateway requires public IP for PCI DSS compliant external endpoint. Approved by security team in JIRA-SEC-1234."
  expires_on           = "2026-06-30T00:00:00Z"

  metadata = jsonencode({
    approvedBy    = "security-team"
    ticketNumber  = "JIRA-SEC-1234"
    reviewDate    = "2026-06-15"
    justification = "PCI DSS requirement for payment processor callback endpoint"
  })
}

Governance rules for exemptions:
- Every exemption must have a JIRA/ADO ticket reference
- Maximum exemption duration: 6 months (renewable with re-review)
- Exemptions require security team approval in the pull request
- A monthly report lists all active exemptions approaching expiration
- Expired exemptions are automatically removed by the pipeline
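Some of these rules can be checked by Terraform itself before a reviewer ever sees the pull request. The sketch below is one possible approach, not an established pattern from this post: the variable names and the exemption name are hypothetical, the ticket format is assumed from the example above, and it needs Terraform 1.5 or later for plantimestamp().

```hcl
# Hypothetical inputs for a single exemption.
variable "exemption_expires_on" {
  type        = string
  description = "RFC 3339 expiry timestamp, for example 2026-06-30T00:00:00Z"
}

variable "exemption_ticket" {
  type        = string
  description = "Approval ticket reference"

  validation {
    # Assumed ticket format (JIRA-SEC-<number>); adjust to your tracker.
    condition     = can(regex("^JIRA-SEC-[0-9]+$", var.exemption_ticket))
    error_message = "The exemption_ticket value must look like JIRA-SEC-1234."
  }
}

resource "azurerm_resource_group_policy_exemption" "guarded" {
  name                 = "guarded-exemption"
  resource_group_id    = data.azurerm_resource_group.payment_gateway.id
  policy_assignment_id = azurerm_management_group_policy_assignment.deny_public_ip.id
  exemption_category   = "Waiver"
  expires_on           = var.exemption_expires_on
  metadata             = jsonencode({ ticketNumber = var.exemption_ticket })

  lifecycle {
    precondition {
      # timeadd has no month unit, so six months is approximated as 184 days (4416h).
      condition     = timecmp(var.exemption_expires_on, timeadd(plantimestamp(), "4416h")) <= 0
      error_message = "Exemptions may not expire more than about six months from the time of the plan."
    }
  }
}
```

The precondition compares the expiry against the time of each plan rather than the original creation date, so a renewal still requires an explicit change to the expiry value and a fresh review.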
# azure-pipelines.yml - exemption cleanup
schedules:
  - cron: '0 8 * * 1'
    displayName: 'Weekly exemption review'
    branches:
      include: [main]
    always: true # run even when the repository has no new commits

steps:
  - script: |
      # List exemptions that expire within the next 14 days (or have already expired).
      # Assumes the agent is already logged in to Azure; the CLI returns expiresOn as a top-level field.
      az policy exemption list \
        --query "[?expiresOn < '$(date -d '+14 days' -u +%Y-%m-%dT%H:%M:%SZ)']" \
        --output table
    displayName: 'Report expiring exemptions'

Compliance Dashboards
Azure Policy provides built-in compliance views, but for enterprise reporting you need more:
Azure Resource Graph Queries
// Overall compliance by management group
PolicyResources
| where type == 'microsoft.policyinsights/policystates'
| where properties.complianceState != 'Compliant'
| summarize NonCompliantCount = count() by
    ManagementGroup = tostring(properties.managementGroupIds),
    PolicyName = tostring(properties.policyDefinitionName),
    Category = tostring(properties.policyDefinitionCategory)
| order by NonCompliantCount desc

// Non-compliant resources with details
PolicyResources
| where type == 'microsoft.policyinsights/policystates'
| where properties.complianceState == 'NonCompliant'
| project
    ResourceId = tostring(properties.resourceId),
    ResourceType = tostring(properties.resourceType),
    PolicyName = tostring(properties.policyDefinitionName),
    Subscription = tostring(properties.subscriptionId),
    Timestamp = todatetime(properties.timestamp)
| order by Timestamp desc
| take 100

Automated Compliance Report
# azure-pipelines.yml - weekly compliance report
# Requires the resource-graph Azure CLI extension and an authenticated CLI session.
- script: |
    az graph query -q "
      PolicyResources
      | where type == 'microsoft.policyinsights/policystates'
      | summarize
          Compliant = countif(properties.complianceState == 'Compliant'),
          NonCompliant = countif(properties.complianceState == 'NonCompliant'),
          Exempt = countif(properties.complianceState == 'Exempt')
      | extend ComplianceRate = round(100.0 * Compliant / (Compliant + NonCompliant), 2)
    " --output table
  displayName: 'Generate compliance summary'

Remediation Tasks
Policies with Modify or DeployIfNotExists effects can auto-remediate non-compliant resources:
# remediation/tls-remediation.tf
resource "azurerm_resource_group_policy_remediation" "tls_remediation" {
  name                           = "remediate-tls-storage"
  resource_group_id              = data.azurerm_resource_group.example.id
  policy_assignment_id           = azurerm_management_group_policy_assignment.security_baseline.id
  policy_definition_reference_id = "enforceTlsStorage"
  resource_discovery_mode        = "ReEvaluateCompliance"
}

Caution with remediation: Always test remediation tasks in a non-production subscription first. A Modify policy that updates the wrong field can break running services. Use resource_discovery_mode = "ReEvaluateCompliance" to get fresh compliance data before remediating.
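Remediation runs under the policy assignment's managed identity, and unlike the portal, an assignment created through Terraform does not grant that identity any roles automatically. A minimal sketch of the missing grant, assuming the security_baseline assignment has a system-assigned identity and that data.azurerm_management_group.production is the assignment scope (the resource name here is hypothetical):

```hcl
# Sketch: allow the assignment's identity to modify storage accounts during remediation.
resource "azurerm_role_assignment" "security_baseline_storage" {
  scope                = data.azurerm_management_group.production.id
  role_definition_name = "Storage Account Contributor" # same role the TLS policy lists in roleDefinitionIds
  principal_id         = azurerm_management_group_policy_assignment.security_baseline.identity[0].principal_id

  # The identity is newly created, so skip the directory lookup that can fail on replication lag.
  skip_service_principal_aad_check = true
}
```

Without a grant like this, the remediation task is likely to be created successfully and then fail on each resource with an authorization error.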
CI/CD Pipeline for Policy Deployment
# azure-pipelines.yml - policy deployment
# Terraform backend configuration and Azure credentials are assumed to be set up for the agent.
trigger:
  branches:
    include: [main]
  paths:
    include: [policies/**, initiatives/**, assignments/**]

stages:
  - stage: Validate
    jobs:
      - job: PolicyTest
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: |
              # Validate all policy JSON is syntactically correct
              shopt -s globstar nullglob # make ** recurse; skip the loop if nothing matches
              for f in policies/**/policy-rule.json; do
                jq empty "$f" || exit 1
              done
            displayName: 'Validate policy JSON'
          - script: |
              terraform init -input=false
              terraform plan -input=false -target=module.policies
            displayName: 'Terraform plan - policies only'

  - stage: DeployTest
    dependsOn: Validate
    jobs:
      - deployment: DeployToTestMG
        environment: policy-test
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self # deployment jobs do not check out the repository by default
                - script: |
                    # Point this run at the test management group (for example via a workspace or variables).
                    terraform init -input=false
                    terraform apply -input=false -target=module.policies -auto-approve
                  displayName: 'Deploy policies to test management group'

  - stage: DeployProd
    dependsOn: DeployTest
    jobs:
      - deployment: DeployToProdMG
        environment: policy-production
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self
                - script: |
                    terraform init -input=false
                    terraform apply -input=false -target=module.policies -auto-approve
                  displayName: 'Deploy policies to production management groups'

Common Pitfalls
1. Starting with Deny effect. Always start with Audit. Measure impact. Then enforce. A Deny policy deployed without testing will block deployments and generate emergency tickets.
2. Not versioning policies. Include a version field in policy metadata. When you update a policy, increment the version. This makes it possible to track which version of a policy a compliance finding relates to.
3. Ignoring evaluation delay. Azure Policy evaluation is not instant. New resources are evaluated within 15 minutes. Existing resources are re-evaluated every 24 hours. Do not expect real-time compliance data.
4. Too many custom policies. Before writing a custom policy, check the 800+ built-in policies. Many common governance rules already exist and are maintained by Microsoft.
5. No non-compliance messages. A generic "Policy denied the request" error wastes developer time. Always include a non_compliance_message that explains what is wrong and how to fix it.
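Pitfalls 4 and 5 can be addressed together: look up a built-in definition instead of rewriting it, and attach a helpful message when assigning it. This is a small sketch under stated assumptions: the assignment name and message text are illustrative, and var.management_group_id is the variable used in the Terraform definition earlier in this post.

```hcl
# Look up a Microsoft-maintained built-in definition by its display name.
data "azurerm_policy_definition" "secure_transfer" {
  display_name = "Secure transfer to storage accounts should be enabled"
}

resource "azurerm_management_group_policy_assignment" "secure_transfer" {
  name                 = "secure-transfer-storage" # 23 characters; the limit at management group scope is 24
  management_group_id  = var.management_group_id
  policy_definition_id = data.azurerm_policy_definition.secure_transfer.id

  # Start in Audit, as recommended in pitfall 1.
  parameters = jsonencode({
    effect = { value = "Audit" }
  })

  non_compliance_message {
    content = "Storage accounts must require secure transfer (HTTPS only). Enable the 'Secure transfer required' setting on the account."
  }
}
```

The data source fails the plan if the display name does not match a definition, which doubles as an early warning if a built-in is renamed or retired.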
Conclusion
Azure Policy as Code transforms governance from an ad-hoc portal activity into a disciplined, reviewable, testable process. The investment in setting up the repository structure, CI/CD pipeline, and exemption process pays for itself the first time a policy change goes through proper review instead of being clicked into production at 4pm on a Friday.
Start with three to five policies that address your highest-risk misconfigurations. Deploy them in Audit mode. Build the compliance dashboard. Then gradually expand coverage and switch to Deny as teams adapt.
If you need help designing your Azure governance framework or implementing Policy as Code for your landing zone, contact us at mbrahim@conceptualise.de. We help enterprises build governance that scales without creating bottlenecks.