Building Autonomous AI Agents on Azure: Patterns, Guardrails, and When Not To
Three production-ready AI agent patterns on Azure — single-agent, multi-agent orchestration, and human-in-the-loop — with guardrails for cost, security, and governance.
AI agents are the most over-hyped and under-engineered pattern in enterprise AI. The demos are impressive — an agent that books meetings, writes reports, and queries databases through natural language. The production reality is different: unpredictable costs, hallucination-driven failures, and security risks that traditional software does not have.
This post cuts through the hype. It describes three agent patterns that actually work in production, the guardrails that make them enterprise-safe, and — critically — when you should not use agents at all.
When Agents Add Value (and When They Do Not)
Use Agents When:
- The task requires dynamic planning — the steps are not known upfront and depend on intermediate results
- The task involves multiple tools that need to be selected and sequenced based on context
- Iteration and self-correction are valuable — the agent can evaluate its own output and retry
- The cost of human execution is high relative to the cost of agent errors
Do Not Use Agents When:
- The workflow is deterministic — If the steps are always the same, use a pipeline (Durable Functions, Logic Apps). Agents add non-determinism and cost for no benefit.
- Simple retrieval suffices — If the user asks a question and the answer is in a document, use RAG. An agent adds planning overhead without improving the answer.
- Latency matters — Agent planning loops add 2-10 seconds per step. For real-time user interactions, this is too slow.
- Accuracy must be 100% — Agents hallucinate. For financial calculations, regulatory reporting, or safety-critical systems, traditional software is more reliable.
Pattern 1: Single Agent with Tools
The simplest production pattern. One LLM agent with a set of tools it can call.
Implementation with Semantic Kernel
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o",
        endpoint: config["AzureOpenAI:Endpoint"],
        apiKey: config["AzureOpenAI:Key"])
    .Build();

// Register tools as plugins
kernel.Plugins.AddFromType<OrderPlugin>();
kernel.Plugins.AddFromType<InventoryPlugin>();
kernel.Plugins.AddFromType<CustomerPlugin>();

// Create agent with automatic tool calling
var agent = new ChatCompletionAgent
{
    Name = "OrderAssistant",
    Instructions = """
        You are an order management assistant. You can look up orders,
        check inventory, and retrieve customer information.
        Always verify data before making changes.
        Never disclose internal pricing or margin data.
        """,
    Kernel = kernel,
    Arguments = new KernelArguments(
        new OpenAIPromptExecutionSettings
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
            MaxTokens = 2000
        })
};

Tool Definition
public sealed class OrderPlugin
{
    private readonly ILogger<OrderPlugin> _logger;

    public OrderPlugin(ILogger<OrderPlugin> logger) => _logger = logger;

    [KernelFunction, Description("Look up an order by its ID")]
    public async Task<OrderDto> GetOrder(
        [Description("The order ID (format: ORD-XXXXX)")] string orderId,
        IOrderRepository repository)
    {
        // Input validation — never trust the LLM's input
        if (!OrderId.TryParse(orderId, out var parsed))
            return new OrderDto { Error = "Invalid order ID format" };

        var order = await repository.GetByIdAsync(parsed);
        return order ?? new OrderDto { Error = "Order not found" };
    }

    [KernelFunction, Description("Cancel an order — requires confirmation")]
    public Task<string> CancelOrder(
        [Description("The order ID to cancel")] string orderId,
        [Description("Reason for cancellation")] string reason,
        IOrderRepository repository)
    {
        // High-impact action — log everything
        _logger.LogWarning("Agent requesting order cancellation: {OrderId}, Reason: {Reason}",
            orderId, reason);

        return Task.FromResult(
            "Cancellation request logged. A human operator will review and confirm within 1 hour.");
    }
}

When to Use This Pattern
- Internal tools where an LLM selects the right data source based on the question
- Customer support assistants that look up orders, accounts, and knowledge base articles
- Data exploration where the agent queries different APIs based on the user's natural language request
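With the plugins registered, invoking the agent is a single streaming call. A minimal sketch, assuming the Semantic Kernel Agents package (exact API shapes vary between preview versions):

```csharp
// Drive the OrderAssistant agent with a user question.
// Tool selection and invocation happen automatically because
// ToolCallBehavior.AutoInvokeKernelFunctions is set.
var history = new ChatHistory();
history.AddUserMessage("What is the status of order ORD-12345?");

await foreach (ChatMessageContent response in agent.InvokeAsync(history))
{
    Console.WriteLine($"{response.AuthorName}: {response.Content}");
}
```

The agent decides internally whether to call `GetOrder` before answering; the caller only sees the final chat messages.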
Pattern 2: Multi-Agent Orchestration
Multiple specialised agents collaborate on a complex task.
Implementation with Semantic Kernel AgentGroupChat
var researchAgent = new ChatCompletionAgent
{
    Name = "Researcher",
    Instructions = "You research data from internal systems. Report findings factually.",
    Kernel = researchKernel // Has database query tools
};

var analysisAgent = new ChatCompletionAgent
{
    Name = "Analyst",
    Instructions = "You analyse data provided by the Researcher. Identify trends and anomalies.",
    Kernel = analysisKernel // Has calculation tools
};

var writerAgent = new ChatCompletionAgent
{
    Name = "Writer",
    Instructions = "You write executive summaries based on the Analyst's findings.",
    Kernel = writerKernel // Has formatting tools
};

var chat = new AgentGroupChat(researchAgent, analysisAgent, writerAgent)
{
    ExecutionSettings = new()
    {
        TerminationStrategy = new MaximumIterationTerminationStrategy(10),
        SelectionStrategy = new SequentialSelectionStrategy()
    }
};

When to Use This Pattern
- Report generation that requires data gathering, analysis, and writing as distinct skills
- Code review where one agent analyses code quality and another checks security
- Complex research tasks where different agents specialise in different data sources
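Driving the group chat is symmetrical to the single-agent case. A minimal sketch, again assuming the Semantic Kernel Agents package: the chat hands the conversation to each agent in sequence until the termination strategy fires.

```csharp
// Seed the conversation with the task, then let the
// Researcher → Analyst → Writer sequence run to completion.
chat.AddChatMessage(new ChatMessageContent(
    AuthorRole.User,
    "Summarise last quarter's order anomalies for the leadership team."));

await foreach (ChatMessageContent message in chat.InvokeAsync())
{
    Console.WriteLine($"[{message.AuthorName}] {message.Content}");
}
```

Each yielded message carries the name of the agent that produced it, which is useful for the audit logging described below.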
Pattern 3: Human-in-the-Loop
The agent works autonomously on low-risk steps but requests human approval for high-impact actions.
Risk Classification
public enum ActionRisk
{
    Low,      // Read operations, formatting, calculations
    Medium,   // Sending notifications, creating drafts
    High,     // Financial transactions, data deletion, external API calls
    Critical  // Production deployments, access changes, contract modifications
}

// In the tool definition
[KernelFunction]
[ActionRisk(ActionRisk.High)]
public Task<string> ProcessRefund(string orderId, decimal amount)
{
    // This call is intercepted by the guardrail middleware and
    // queued for human approval; it does not execute the refund itself
    return Task.FromResult("Refund request queued for human approval.");
}

Guardrails for Enterprise Agents
Cost Control
public sealed class CostGuardrail : IAgentMiddleware
{
    private readonly ILogger<CostGuardrail> _logger;
    private readonly decimal _maxCostPerSession = 5.00m; // EUR
    private decimal _sessionCost = 0;

    public CostGuardrail(ILogger<CostGuardrail> logger) => _logger = logger;

    public Task OnToolCall(ToolCallContext context)
    {
        _sessionCost += EstimateTokenCost(context);

        if (_sessionCost > _maxCostPerSession)
        {
            context.Cancel("Session cost limit exceeded. Please start a new session.");
            _logger.LogWarning("Agent session cost limit hit: {Cost}", _sessionCost);
        }

        return Task.CompletedTask;
    }

    // Rough estimate from token counts; per-token rates are illustrative
    private static decimal EstimateTokenCost(ToolCallContext context) =>
        context.PromptTokens * 0.0000025m + context.CompletionTokens * 0.00001m;
}

Content Safety
- Input filtering — Azure AI Content Safety to block prompt injection, harmful content, and PII in user inputs
- Output filtering — Validate agent responses do not contain internal data, PII, or harmful content
- Tool input validation — Never trust LLM-generated tool parameters. Validate all inputs in tool implementations.
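Input filtering can sit in front of the agent as a pre-flight check. A minimal sketch using the `Azure.AI.ContentSafety` SDK; the severity threshold is illustrative, and response shapes vary between SDK versions:

```csharp
// Pre-flight content safety check on the raw user input
var safetyClient = new ContentSafetyClient(
    new Uri(config["ContentSafety:Endpoint"]),
    new AzureKeyCredential(config["ContentSafety:Key"]));

var analysis = await safetyClient.AnalyzeTextAsync(
    new AnalyzeTextOptions(userInput));

// Block the request if any harm category exceeds the
// (illustrative) severity threshold of 2
bool blocked = analysis.Value.CategoriesAnalysis
    .Any(c => c.Severity >= 2);

if (blocked)
    return "Your message was blocked by the content safety policy.";
```

Run the same check on agent output before it reaches the user; the output side is where internal-data leakage is caught.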
Audit Logging
Every agent action should be logged:
{
  "sessionId": "abc-123",
  "timestamp": "2026-05-05T14:23:01Z",
  "userId": "user@company.com",
  "agentName": "OrderAssistant",
  "action": "GetOrder",
  "input": { "orderId": "ORD-12345" },
  "output": { "status": "delivered" },
  "tokenCost": 0.002,
  "latencyMs": 450
}

Iteration Limits
Always set maximum iteration counts. An agent without iteration limits can loop indefinitely, consuming tokens and time:
ExecutionSettings = new()
{
    TerminationStrategy = new MaximumIterationTerminationStrategy(maxIterations: 10)
}

Deployment on Azure
Recommended architecture:
- Azure Container Apps — Scale-to-zero for agent workloads, KEDA triggers for queue-based activation
- Azure OpenAI — Managed LLM endpoints with content filtering built in
- Azure Service Bus — For human-in-the-loop approval queues
- Application Insights — Tracing agent execution paths, token usage, and error rates
- Azure Key Vault — API keys and connection strings for agent tools
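For the human-in-the-loop pattern, the approval queue maps naturally onto Service Bus. A minimal sketch using `Azure.Messaging.ServiceBus`; the queue name and message shape are assumptions:

```csharp
// Publish a high-risk action to the approval queue
// ("agent-approvals" is an assumed queue name)
await using var client = new ServiceBusClient(config["ServiceBus:ConnectionString"]);
ServiceBusSender sender = client.CreateSender("agent-approvals");

var approval = new ServiceBusMessage(JsonSerializer.Serialize(new
{
    sessionId,
    action = "ProcessRefund",
    risk = "High",
    requestedAt = DateTimeOffset.UtcNow
}));

await sender.SendMessageAsync(approval);
```

A separate approval service (or a Teams/email integration) consumes the queue, records the human decision, and releases or rejects the pending action.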
Ready to build production AI agents on Azure? Contact us — we help enterprises implement agent patterns with the guardrails that production demands.