GitOps with Flux on AKS: A Practical Implementation Guide
Step-by-step guide to implementing GitOps with Flux v2 on Azure Kubernetes Service — Helm, Kustomize, SOPS, and multi-cluster.
GitOps sounds simple: store your desired cluster state in Git, and let a controller reconcile reality to match. In practice, getting a production-grade GitOps setup working on AKS requires decisions about repository structure, secret management, multi-cluster promotion, and progressive delivery that the "getting started" tutorials never cover.
This guide walks through a real-world Flux v2 implementation on AKS, based on patterns we deploy for enterprise clients.
Why Flux (and Why Not ArgoCD)
Both Flux and ArgoCD are CNCF-graduated GitOps controllers. We use Flux for AKS deployments for several practical reasons:
- Native Helm and Kustomize support without requiring a UI or additional server components.
- Multi-tenancy via namespaces — Flux's namespace-scoped sources and Kustomizations, combined with service account impersonation, map cleanly to team boundaries.
- Azure integration — Flux is supported as a first-party AKS extension (`microsoft.flux`), which means Microsoft handles upgrades and support.
- Pull-based architecture — no need to expose cluster APIs to your CI system.
ArgoCD excels when you need a rich UI for cluster visualization. If that is a priority, ArgoCD is a fine choice. For headless, API-driven GitOps, Flux is more lightweight.
Repository Structure
The single most impactful decision in a GitOps setup is how you structure your Git repositories. We recommend a two-repository pattern:
App Repository (per service)
Contains application source code and a deploy/ directory with raw manifests or Helm chart values.
```
my-service/
  src/
  Dockerfile
  deploy/
    base/
      deployment.yaml
      service.yaml
      kustomization.yaml
    overlays/
      dev/
        kustomization.yaml
        patches.yaml
      staging/
        kustomization.yaml
      prod/
        kustomization.yaml
```
Fleet Repository (one per platform)
Contains the cluster configuration — which applications are deployed where, with what configuration.
```
fleet/
  clusters/
    dev-westeurope/
      flux-system/
      infrastructure/
      apps/
    prod-westeurope/
      flux-system/
      infrastructure/
      apps/
  infrastructure/
    sources/
    cert-manager/
    ingress-nginx/
    external-secrets/
  apps/
    my-service/
      base/
        kustomization.yaml
        helmrelease.yaml
      overlays/
        dev/
        prod/
```
Why two repos? Separation of concerns. Application developers own the app repo and its deploy manifests. The platform team owns the fleet repo and controls what runs on each cluster. This prevents a single commit from accidentally affecting production.
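For concreteness, the app repo's `deploy/base/kustomization.yaml` can be as small as a resource list. A minimal sketch, with file names matching the illustrative tree above:

```yaml
# deploy/base/kustomization.yaml — minimal sketch
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
```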
Bootstrapping Flux on AKS
With the AKS Flux extension, bootstrapping is straightforward:
```shell
az k8s-configuration flux create \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-westeurope \
  --name flux-system \
  --namespace flux-system \
  --scope cluster \
  --url https://dev.azure.com/org/project/_git/fleet \
  --branch main \
  --kustomization name=cluster path=./clusters/prod-westeurope
```
For more control, bootstrap directly with the Flux CLI:
```shell
flux bootstrap git \
  --url=ssh://git@dev.azure.com/v3/org/project/fleet \
  --branch=main \
  --path=./clusters/prod-westeurope \
  --private-key-file=./identity
```
Either way, Flux installs its controllers and begins reconciling the cluster state against the Git repository.
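Under the hood, bootstrap commits a `GitRepository` source plus a root `Kustomization` to the repo and applies them. A minimal sketch of the two objects, with names and paths assumed to match the flags above:

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m
  url: ssh://git@dev.azure.com/v3/org/project/fleet
  ref:
    branch: main
  secretRef:
    name: flux-system   # SSH key created by bootstrap
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/prod-westeurope
  prune: true           # delete cluster objects removed from Git
  sourceRef:
    kind: GitRepository
    name: flux-system
```

Knowing these two objects exist makes debugging easier: `flux get sources git` and `flux get kustomizations` report on exactly them.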
Kustomize Overlays for Environment Promotion
Kustomize overlays are the backbone of environment-specific configuration. The base layer defines the common deployment spec, and overlays patch per environment.
A typical overlay for production might include:
```yaml
# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - path: patches.yaml
```

```yaml
# overlays/prod/patches.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: my-service
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
```

Key practice: Never promote by merging branches (e.g., dev branch into prod branch). Promote by updating the overlay values (image tag, replica count) in the fleet repo via a PR. This keeps your Git history linear and auditable.
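A promotion PR then touches exactly one value. One common way to make that value explicit is Kustomize's `images` transformer in the overlay; the registry name and tag below are illustrative:

```yaml
# overlays/prod/kustomization.yaml — sketch with image pinning
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - path: patches.yaml
images:
  - name: my-service                          # image name used in base deployment.yaml
    newName: myregistry.azurecr.io/my-service # hypothetical ACR repository
    newTag: "1.4.2"                           # the one line a promotion PR changes
```

The diff for a promotion is then a single-line tag bump, which makes review and audit trivial.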
Helm Releases with Flux
For third-party software and complex applications, Flux's HelmRelease CRD manages Helm chart deployments declaratively:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: ingress-system
spec:
  interval: 30m
  chart:
    spec:
      chart: ingress-nginx
      version: "4.9.x"
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
  values:
    controller:
      replicaCount: 2
      service:
        annotations:
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
```

- Pin chart versions with pessimistic constraints (`4.9.x`) to allow patches but prevent breaking changes.
- Set the `interval` to control how frequently Flux checks for chart updates.
- Use `valuesFrom` to pull values from ConfigMaps or Secrets for environment-specific overrides.
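The `sourceRef` in a HelmRelease points at a `HelmRepository` object that must exist in the fleet repo (typically under `infrastructure/sources/`). A sketch, using the public ingress-nginx chart repository:

```yaml
# infrastructure/sources/ingress-nginx.yaml — sketch
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: ingress-system
spec:
  interval: 1h    # how often Flux re-fetches the chart index
  url: https://kubernetes.github.io/ingress-nginx
```

Keeping sources in one directory makes it easy to see, and review, every external dependency a cluster pulls in.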
Secret Management with SOPS
Storing secrets in Git is a requirement for GitOps but a security concern if done naively. Mozilla SOPS with Azure Key Vault solves this elegantly.
Setup
- Create an Azure Key Vault with an RSA key for SOPS encryption.
- Grant the Flux service account (via Workload Identity) `decrypt` permissions on the key.
- Configure a `.sops.yaml` in your fleet repo:

```yaml
creation_rules:
  - path_regex: .*\.enc\.yaml$
    azure_keyvault: https://kv-flux-prod.vault.azure.net/keys/sops-key/abc123
```

- Encrypt secrets before committing:

```shell
sops --encrypt --in-place secrets.enc.yaml
```

Flux automatically decrypts SOPS-encrypted files during reconciliation using the Kustomize controller's built-in SOPS support.
Important: Use separate SOPS keys per environment. A dev cluster should not be able to decrypt production secrets, even if someone accidentally copies the encrypted file.
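Decryption is opt-in per Flux `Kustomization`: the object that applies the encrypted manifests must declare SOPS as its decryption provider. A sketch, assuming Workload Identity gives the kustomize-controller access to the Key Vault key (names and paths are illustrative):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/prod-westeurope/apps
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops   # kustomize-controller decrypts *.enc.yaml with its Azure identity
```

Without the `decryption` block, Flux applies the encrypted file verbatim and the workload sees ciphertext, so this is worth verifying early.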
Multi-Cluster Management
For multi-cluster setups (dev, staging, production; or multiple regions), the fleet repo's directory structure does the heavy lifting.
Each cluster gets its own directory under clusters/, which defines:
- Infrastructure components common to all clusters (cert-manager, ingress) — referenced from a shared `infrastructure/` directory.
- Application deployments specific to that cluster's environment — referenced from environment-specific overlays.
```yaml
# clusters/prod-westeurope/apps/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../apps/my-service/overlays/prod
  - ../../../apps/api-gateway/overlays/prod
```

Promotion flow: A PR that updates the image tag in `apps/my-service/overlays/dev` triggers deployment to dev. Once validated, a second PR updates `overlays/staging`, then `overlays/prod`. Each step is a reviewed, auditable Git commit.
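Ordering matters in a layout like this: apps usually need CRDs and controllers from the infrastructure layer before they can start. Flux's `dependsOn` expresses that ordering between the cluster-level `Kustomization` objects; a sketch:

```yaml
# Apply the infrastructure layer first, then apps — sketch
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/prod-westeurope/infrastructure
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  dependsOn:
    - name: infrastructure   # apps wait until infrastructure is Ready
  interval: 10m
  path: ./clusters/prod-westeurope/apps
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

This removes a whole class of first-boot races, for example an app's Ingress failing because ingress-nginx's CRDs are not installed yet.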
Progressive Delivery with Flagger
For production deployments, combine Flux with Flagger to enable canary releases or blue-green deployments.
Flagger watches a Deployment, creates a canary, gradually shifts traffic, and automatically promotes or rolls back based on metrics:
```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  progressDeadlineSeconds: 600
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
```

This configuration gradually shifts traffic to the canary from 0% up to 50% in 10% increments, rolling back automatically if the success rate drops below 99%.
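Canary analysis only works if the canary actually receives traffic; for low-traffic services, Flagger's load-tester webhook can generate synthetic requests during each analysis step. A fragment to add under the `analysis` block above — the hostnames and the `test` namespace are illustrative:

```yaml
  analysis:
    webhooks:
      - name: load-test
        url: http://flagger-loadtester.test/   # assumes the flagger-loadtester deployment is installed
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://my-service-canary.test:80/"
```

Without some traffic source, the success-rate metric has no samples and the analysis stalls until the progress deadline triggers a rollback.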
Operational Practices
Monitoring Reconciliation
- Use `flux get kustomizations` and `flux get helmreleases` to check reconciliation status.
- Export Flux metrics to Prometheus and create Grafana dashboards for reconciliation latency, failures, and drift events.
- Set up alerts for sustained reconciliation failures — these indicate either a broken manifest in Git or a cluster-side issue.
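Flux's notification-controller can also push these failures to chat directly. A sketch of a `Provider`/`Alert` pair for Microsoft Teams — the webhook secret name is illustrative:

```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: ops-teams
  namespace: flux-system
spec:
  type: msteams
  secretRef:
    name: teams-webhook-url   # hypothetical Secret holding the incoming-webhook address
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: reconciliation-failures
  namespace: flux-system
spec:
  providerRef:
    name: ops-teams
  eventSeverity: error        # only forward failures, not routine events
  eventSources:
    - kind: Kustomization
      name: '*'
    - kind: HelmRelease
      name: '*'
```

This complements Prometheus alerting: chat notifications surface individual failures fast, while metric-based alerts catch sustained degradation.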
Handling Drift
Flux continuously reconciles, which means manual kubectl changes will be reverted. This is a feature, not a bug — but communicate it clearly to your teams.
- Use Flux's field manager annotations to exclude specific fields from reconciliation when manual overrides are genuinely needed (e.g., HPA-managed replicas).
- Audit reverted changes in Flux logs to identify teams or processes that are still making manual changes.
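When a resource genuinely must be hand-managed for a while, Flux honors a per-object opt-out annotation; for HPA-managed replicas specifically, the cleaner fix is simply to drop `spec.replicas` from the manifest in Git so there is nothing for Flux to revert:

```yaml
# Opt a single object out of reconciliation — use sparingly and temporarily
metadata:
  annotations:
    kustomize.toolkit.fluxcd.io/reconcile: disabled
```

Treat this as an escape hatch, not a pattern: every opted-out object is drift your fleet repo no longer describes.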
Disaster Recovery
- Your fleet repo is your disaster recovery plan. To rebuild a cluster, bootstrap Flux against the same repo path.
- Test this regularly. Spin up a fresh AKS cluster, bootstrap Flux, and verify that all workloads come up. If they do not, your fleet repo has implicit dependencies you need to make explicit.
Key Takeaways
GitOps with Flux on AKS is not just a deployment mechanism — it is an operational model that enforces auditability, reproducibility, and declarative infrastructure. The investment is in the repository structure and promotion workflows, not the tooling itself.
Planning a GitOps implementation on AKS? Contact our team — we design and implement GitOps platforms for enterprise Kubernetes environments.