Building Autonomous AI Agents on Azure: Patterns, Guardrails, and When Not To
Three production-ready AI agent patterns on Azure — single-agent, multi-agent orchestration, and human-in-the-loop — with guardrails for cost, security, and governance.
AI agents are the most over-hyped and under-engineered pattern in enterprise AI. The demos are impressive — an agent that books meetings, writes reports, and queries databases through natural language. The production reality is different: unpredictable costs, hallucination-driven failures, and security risks that traditional software does not have.
This post cuts through the hype. It describes three agent patterns that actually work in production, the guardrails that make them enterprise-safe, and — critically — when you should not use agents at all.
When Agents Add Value (and When They Do Not)
Use Agents When:
- The task requires dynamic planning — the steps are not known upfront and depend on intermediate results
- The task involves multiple tools that need to be selected and sequenced based on context
- Iteration and self-correction are valuable — the agent can evaluate its own output and retry
- The cost of human execution is high relative to the cost of agent errors
Do Not Use Agents When:
- The workflow is deterministic — If the steps are always the same, use a pipeline (Durable Functions, Logic Apps). Agents add non-determinism and cost for no benefit.
- Simple retrieval suffices — If the user asks a question and the answer is in a document, use RAG. An agent adds planning overhead without improving the answer.
- Latency matters — Agent planning loops add 2-10 seconds per step. For real-time user interactions, this is too slow.
- Accuracy must be 100% — Agents hallucinate. For financial calculations, regulatory reporting, or safety-critical systems, traditional software is more reliable.
Pattern 1: Single Agent with Tools
The simplest production pattern. One LLM agent with a set of tools it can call.
Implementation with Semantic Kernel
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o",
        endpoint: config["AzureOpenAI:Endpoint"],
        apiKey: config["AzureOpenAI:Key"])
    .Build();

// Register tools as plugins
kernel.Plugins.AddFromType<OrderPlugin>();
kernel.Plugins.AddFromType<InventoryPlugin>();
kernel.Plugins.AddFromType<CustomerPlugin>();

// Create agent with automatic tool calling
var agent = new ChatCompletionAgent
{
    Name = "OrderAssistant",
    Instructions = """
        You are an order management assistant. You can look up orders,
        check inventory, and retrieve customer information.
        Always verify data before making changes.
        Never disclose internal pricing or margin data.
        """,
    Kernel = kernel,
    Arguments = new KernelArguments(
        new OpenAIPromptExecutionSettings
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
            MaxTokens = 2000
        })
};

Tool Definition
public sealed class OrderPlugin
{
    private readonly ILogger<OrderPlugin> _logger;

    public OrderPlugin(ILogger<OrderPlugin> logger) => _logger = logger;

    [KernelFunction, Description("Look up an order by its ID")]
    public async Task<OrderDto> GetOrder(
        [Description("The order ID (format: ORD-XXXXX)")] string orderId,
        IOrderRepository repository)
    {
        // Input validation — never trust the LLM's input
        if (!OrderId.TryParse(orderId, out var parsed))
            return new OrderDto { Error = "Invalid order ID format" };

        var order = await repository.GetByIdAsync(parsed);
        return order ?? new OrderDto { Error = "Order not found" };
    }

    [KernelFunction, Description("Cancel an order — requires confirmation")]
    public Task<string> CancelOrder(
        [Description("The order ID to cancel")] string orderId,
        [Description("Reason for cancellation")] string reason,
        IOrderRepository repository)
    {
        // High-impact action — log everything
        _logger.LogWarning("Agent requesting order cancellation: {OrderId}, Reason: {Reason}",
            orderId, reason);

        return Task.FromResult(
            "Cancellation request logged. A human operator will review and confirm within 1 hour.");
    }
}

When to Use This Pattern
- Internal tools where an LLM selects the right data source based on the question
- Customer support assistants that look up orders, accounts, and knowledge base articles
- Data exploration where the agent queries different APIs based on the user's natural language request
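With the plugins registered, invoking the agent is a single streaming call. A minimal sketch, assuming the Semantic Kernel Agents package (exact API shapes vary between preview versions):

```csharp
// Drive the OrderAssistant agent with a user question.
// Tool selection and invocation happen automatically because
// ToolCallBehavior.AutoInvokeKernelFunctions is set.
var history = new ChatHistory();
history.AddUserMessage("What is the status of order ORD-12345?");

await foreach (ChatMessageContent response in agent.InvokeAsync(history))
{
    Console.WriteLine($"{response.AuthorName}: {response.Content}");
}
```

The agent decides internally whether to call `GetOrder` before answering; the caller only sees the final chat messages.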
Pattern 2: Multi-Agent Orchestration
Multiple specialised agents collaborate on a complex task.
Implementation with Semantic Kernel AgentGroupChat
var researchAgent = new ChatCompletionAgent
{
    Name = "Researcher",
    Instructions = "You research data from internal systems. Report findings factually.",
    Kernel = researchKernel // Has database query tools
};

var analysisAgent = new ChatCompletionAgent
{
    Name = "Analyst",
    Instructions = "You analyse data provided by the Researcher. Identify trends and anomalies.",
    Kernel = analysisKernel // Has calculation tools
};

var writerAgent = new ChatCompletionAgent
{
    Name = "Writer",
    Instructions = "You write executive summaries based on the Analyst's findings.",
    Kernel = writerKernel // Has formatting tools
};

var chat = new AgentGroupChat(researchAgent, analysisAgent, writerAgent)
{
    ExecutionSettings = new()
    {
        TerminationStrategy = new MaximumIterationTerminationStrategy(10),
        SelectionStrategy = new SequentialSelectionStrategy()
    }
};

When to Use This Pattern
- Report generation that requires data gathering, analysis, and writing as distinct skills
- Code review where one agent analyses code quality and another checks security
- Complex research tasks where different agents specialise in different data sources
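Driving the group chat is symmetrical to the single-agent case. A minimal sketch, again assuming the Semantic Kernel Agents package: the chat hands the conversation to each agent in sequence until the termination strategy fires.

```csharp
// Seed the conversation with the task, then let the
// Researcher → Analyst → Writer sequence run to completion.
chat.AddChatMessage(new ChatMessageContent(
    AuthorRole.User,
    "Summarise last quarter's order anomalies for the leadership team."));

await foreach (ChatMessageContent message in chat.InvokeAsync())
{
    Console.WriteLine($"[{message.AuthorName}] {message.Content}");
}
```

Each yielded message carries the name of the agent that produced it, which is useful for the audit logging described below.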
Pattern 3: Human-in-the-Loop
The agent works autonomously on low-risk steps but requests human approval for high-impact actions.
Risk Classification
public enum ActionRisk
{
    Low,      // Read operations, formatting, calculations
    Medium,   // Sending notifications, creating drafts
    High,     // Financial transactions, data deletion, external API calls
    Critical  // Production deployments, access changes, contract modifications
}

// In the tool definition
[KernelFunction]
[ActionRisk(ActionRisk.High)]
public Task<string> ProcessRefund(string orderId, decimal amount)
{
    // This call is intercepted by the guardrail middleware and
    // queued for human approval; it does not execute the refund itself
    return Task.FromResult("Refund request queued for human approval.");
}

Guardrails for Enterprise Agents
Cost Control
public sealed class CostGuardrail : IAgentMiddleware
{
    private readonly ILogger<CostGuardrail> _logger;
    private readonly decimal _maxCostPerSession = 5.00m; // EUR
    private decimal _sessionCost = 0;

    public CostGuardrail(ILogger<CostGuardrail> logger) => _logger = logger;

    public Task OnToolCall(ToolCallContext context)
    {
        _sessionCost += EstimateTokenCost(context);

        if (_sessionCost > _maxCostPerSession)
        {
            context.Cancel("Session cost limit exceeded. Please start a new session.");
            _logger.LogWarning("Agent session cost limit hit: {Cost}", _sessionCost);
        }

        return Task.CompletedTask;
    }

    // Rough estimate from token counts; per-token rates are illustrative
    private static decimal EstimateTokenCost(ToolCallContext context) =>
        context.PromptTokens * 0.0000025m + context.CompletionTokens * 0.00001m;
}

Content Safety
- Input filtering — Azure AI Content Safety to block prompt injection, harmful content, and PII in user inputs
- Output filtering — Validate agent responses do not contain internal data, PII, or harmful content
- Tool input validation — Never trust LLM-generated tool parameters. Validate all inputs in tool implementations.
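Input filtering can sit in front of the agent as a pre-flight check. A minimal sketch using the `Azure.AI.ContentSafety` SDK; the severity threshold is illustrative, and response shapes vary between SDK versions:

```csharp
// Pre-flight content safety check on the raw user input
var safetyClient = new ContentSafetyClient(
    new Uri(config["ContentSafety:Endpoint"]),
    new AzureKeyCredential(config["ContentSafety:Key"]));

var analysis = await safetyClient.AnalyzeTextAsync(
    new AnalyzeTextOptions(userInput));

// Block the request if any harm category exceeds the
// (illustrative) severity threshold of 2
bool blocked = analysis.Value.CategoriesAnalysis
    .Any(c => c.Severity >= 2);

if (blocked)
    return "Your message was blocked by the content safety policy.";
```

Run the same check on agent output before it reaches the user; the output side is where internal-data leakage is caught.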
Audit Logging
Every agent action should be logged:
{
  "sessionId": "abc-123",
  "timestamp": "2026-05-05T14:23:01Z",
  "userId": "user@company.com",
  "agentName": "OrderAssistant",
  "action": "GetOrder",
  "input": { "orderId": "ORD-12345" },
  "output": { "status": "delivered" },
  "tokenCost": 0.002,
  "latencyMs": 450
}

Iteration Limits
Always set maximum iteration counts. An agent without iteration limits can loop indefinitely, consuming tokens and time:
ExecutionSettings = new()
{
    TerminationStrategy = new MaximumIterationTerminationStrategy(maxIterations: 10)
}

Deployment on Azure
Recommended architecture:
- Azure Container Apps — Scale-to-zero for agent workloads, KEDA triggers for queue-based activation
- Azure OpenAI — Managed LLM endpoints with content filtering built in
- Azure Service Bus — For human-in-the-loop approval queues
- Application Insights — Tracing agent execution paths, token usage, and error rates
- Azure Key Vault — API keys and connection strings for agent tools
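For the human-in-the-loop pattern, the approval queue maps naturally onto Service Bus. A minimal sketch using `Azure.Messaging.ServiceBus`; the queue name and message shape are assumptions:

```csharp
// Publish a high-risk action to the approval queue
// ("agent-approvals" is an assumed queue name)
await using var client = new ServiceBusClient(config["ServiceBus:ConnectionString"]);
ServiceBusSender sender = client.CreateSender("agent-approvals");

var approval = new ServiceBusMessage(JsonSerializer.Serialize(new
{
    sessionId,
    action = "ProcessRefund",
    risk = "High",
    requestedAt = DateTimeOffset.UtcNow
}));

await sender.SendMessageAsync(approval);
```

A separate approval service (or a Teams/email integration) consumes the queue, records the human decision, and releases or rejects the pending action.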
Ready to build production AI agents on Azure? Contact us — we help enterprises implement agent patterns with the guardrails that production demands.