Microsoft Fabric Data Agents: Architecture & Governance
How to architect and govern Microsoft Fabric Data Agents in 2026 — OneLake, medallion layering, autonomous workflows, and EU-compliant controls.
Microsoft Fabric spent its first two years convincing enterprises to unify their analytics on OneLake. In 2026 the conversation changed shape entirely: the question is no longer where does the data live but who — or what — is allowed to act on it. Fabric Data Agents, autonomous data workflows, and Copilot's 'Cowork' execution mode turn the platform from a place you query into a place that does work on your behalf. That is powerful, and it is exactly where architecture and governance decisions stop being optional.
This is the architecture we deploy at CC Conceptualise when European enterprises ask us to make Fabric Data Agents safe to run in production — grounded in delivery, not in a launch keynote. We work alongside our AI & data platform engineering practice and our certified architects, because an autonomous agent over a regulated data estate is as much a governance system as a technical one.
TL;DR / Key takeaways
- Data Agents are a governed execution layer over OneLake, not a chatbot. They plan, act, and verify across pipelines via Copilot 'Cowork' autonomous execution.
- The medallion architecture is your safety rail. Scope agent reads to silver/gold and route every write through a governed pipeline — never let an agent write straight into gold.
- Eventhouse remote MCP is the real-time boundary. It is where agents query KQL by natural language and where you enforce authorization and audit.
- Governance is three pillars: least-privilege identity, immutable logging, and human oversight gates — and they map directly onto the EU AI Act, GDPR, and NIS2.
- Earn autonomy. Start read-only and human-in-the-loop; widen write autonomy only once evaluation, logging, and rollback are proven.
What changed in Fabric in 2026
Three platform shifts make Data Agents materially different from the Copilot question-answering of 2024–2025.
First, autonomous execution. Copilot 'Cowork' lets an agent decompose a goal into steps, run them across Fabric items, and check its own output — closer to a junior data engineer working a ticket than to an assistant answering one prompt. Data Agents are the named, governable units that carry this out.
Second, real-time reach via MCP. The Eventhouse remote MCP capability lets agents query real-time data through the Model Context Protocol using natural language that resolves to KQL. An agent can now reason over streaming and operational telemetry next to the curated lake, which is transformative for operational analytics — and a new attack and audit surface you must own. We covered the runtime mechanics of this in Fabric Eventhouse MCP for real-time AI.
Third, a unified, version-controllable platform. OneLake remains the single data lake; cross-workspace MLflow logging brings models and experiments from Azure Databricks and Azure Machine Learning into Fabric for end-to-end MLOps; and the Power BI Copilot Tooling Format reached GA in May 2026, giving the semantic layer a Git-friendly, text-based metadata representation. Together these mean the assets an agent reasons over — pipelines, models, semantic definitions — can finally be reviewed and tested like code.
Reference architecture for Fabric Data Agents
A production Data Agent deployment is best modelled as four planes. Treating them as one undifferentiated "Fabric tenant" is how governance gaps appear.
1. The data plane: OneLake and the medallion layers
Everything starts with disciplined medallion architecture. Bronze holds raw ingested data, silver holds cleaned and conformed tables, and gold holds the curated, business-ready models. Agents must interact with these layers through explicit, role-scoped contracts:
- Read access to silver and gold for analytical reasoning.
- Write access only into a dedicated agent-output zone — never directly into gold.
- No standing access to bronze unless a specific pipeline requires it.
The single most common mistake we correct in audits is agents granted broad write access "to keep things simple," which collapses the lineage and review guarantees the medallion model exists to provide.
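The three access rules above can be expressed as an explicit contract that is checked before any agent operation. The following is a minimal sketch, not a Fabric API: the layer names, the `AccessContract` class, and the `default_contract` helper are all hypothetical and stand in for whatever workspace roles or OneLake security settings actually enforce the policy in your tenant.

```python
from dataclasses import dataclass

# Hypothetical layer names; "agent_output" is the dedicated agent-output zone.
LAYERS = frozenset({"bronze", "silver", "gold", "agent_output"})


@dataclass(frozen=True)
class AccessContract:
    """Role-scoped contract: which layers one agent may read or write."""
    agent_id: str
    read_layers: frozenset
    write_layers: frozenset

    def allows(self, action: str, layer: str) -> bool:
        # Unknown layers and unknown actions are denied by default.
        if layer not in LAYERS:
            return False
        if action == "read":
            return layer in self.read_layers
        if action == "write":
            return layer in self.write_layers
        return False


def default_contract(agent_id: str) -> AccessContract:
    # Read silver and gold, write only to the agent-output zone, no bronze.
    return AccessContract(
        agent_id=agent_id,
        read_layers=frozenset({"silver", "gold"}),
        write_layers=frozenset({"agent_output"}),
    )


contract = default_contract("sales-agent")
print(contract.allows("read", "gold"))    # permitted by the contract
print(contract.allows("write", "gold"))   # denied: gold is never a write target
```

The design choice worth noting is deny-by-default: any layer or action the contract does not name is refused, so granting bronze access for a specific pipeline becomes a visible, reviewable change rather than an inherited permission.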
2. The semantic plane: governed meaning
An agent that can query everything but understands nothing returns confident, wrong answers. The semantic model — now expressible in the Copilot Tooling Format — is where business definitions live: what revenue means, which customers are active, how churn is calculated. Putting this metadata under version control lets you review changes in pull requests and test them in CI, so the definitions an agent reasons over are deliberate, not accidental.
3. The execution plane: agents, Cowork, and MCP
This is where Data Agents and 'Cowork' autonomous execution run, and where Eventhouse remote MCP exposes real-time data. Two principles govern it:
- Every tool and data connection is an enforced authorization boundary, not a convenience. The MCP server is the right place to authenticate the agent identity, scope the query, and emit an audit record.
- Treat all retrieved content as untrusted input. An agent acting on data it pulled from a document or a real-time feed can be steered by instructions embedded in that data; least privilege is what turns a prompt-injection attempt into a non-event rather than a breach.
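The first principle, that the boundary authenticates the agent, scopes the query, and emits an audit record, can be sketched as a thin wrapper around whatever actually executes the KQL. This is an illustration under stated assumptions, not the Eventhouse MCP interface: `guarded_query`, `ALLOWED_DATABASES`, `AUDIT_LOG`, and the `run` callback are hypothetical names, and a real deployment would resolve identity from a verified token rather than a plain string.

```python
import json
import time
from typing import Callable

# Stand-in for an append-only audit sink (assumption: a real sink is external).
AUDIT_LOG: list = []

# Hypothetical registry: agent identity -> KQL databases it may query.
ALLOWED_DATABASES = {"ops-agent": {"telemetry"}}


class AuthorizationError(Exception):
    """Raised when an agent queries a database outside its scope."""


def guarded_query(agent_id: str, database: str, kql: str,
                  run: Callable[[str, str], list]) -> list:
    """Check scope, write an audit record, then execute the query."""
    allowed = database in ALLOWED_DATABASES.get(agent_id, set())

    # The audit record is written for denied requests as well as allowed ones.
    record = {
        "ts": time.time(),
        "agent": agent_id,
        "database": database,
        "kql": kql,
        "allowed": allowed,
    }
    AUDIT_LOG.append(json.dumps(record, sort_keys=True))

    if not allowed:
        raise AuthorizationError(f"{agent_id} may not query {database}")
    return run(database, kql)
```

Calling `guarded_query("ops-agent", "telemetry", "Events | count", run)` executes through the supplied `run` function, while the same agent asking for any other database raises `AuthorizationError`. Because the record is appended before the authorization decision is enforced, a blocked prompt-injection attempt still leaves evidence.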
4. The governance plane: identity, logging, oversight
The governance plane wraps the other three. It is covered in depth in OneLake governance and security, and it is non-negotiable for European enterprises.
A decision table: how much autonomy, when
Autonomy is not binary. We classify agent capabilities by blast radius and gate each level on a control maturity bar.
| Mode | What the agent may do | Required controls before go-live | Typical use |
|---|---|---|---|
| Read-only | Query silver/gold and real-time KQL via MCP | Identity, least-privilege read, query audit | Self-service analytics, investigation |
| Suggest | Propose transformations or writes for human approval | Above + evaluation suite + diff preview | Pipeline drafting, data prep |
| Human-in-the-loop write | Write to agent-output zone after explicit approval | Above + immutable action log + rollback | Curated dataset generation |
| Supervised autonomous | Execute multi-step workflows with post-hoc review | Above + anomaly alerting + spend caps | Routine, well-bounded operations |
| Full autonomous | Plan and act end-to-end without per-action approval | All above + proven track record + sign-off | Rare; only mature, low-risk paths |
The mistake is starting near the bottom of this table, where autonomy is highest. Every deployment we run begins at read-only in the top row and moves down a row only when the evidence (evaluation results, clean audit logs, successful rollbacks) justifies the next step.
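Because each row in the table inherits the controls of the rows above it, the gating logic is cumulative and easy to encode. The sketch below is one possible representation, with control names paraphrased from the table; the `Mode` enum and both helper functions are hypothetical and not part of Fabric.

```python
from enum import IntEnum


class Mode(IntEnum):
    READ_ONLY = 1
    SUGGEST = 2
    HITL_WRITE = 3      # human-in-the-loop write
    SUPERVISED = 4      # supervised autonomous
    FULL = 5            # full autonomous


# Controls each mode adds on top of the rows above it.
ADDED_CONTROLS = {
    Mode.READ_ONLY: {"identity", "least_privilege_read", "query_audit"},
    Mode.SUGGEST: {"evaluation_suite", "diff_preview"},
    Mode.HITL_WRITE: {"immutable_action_log", "rollback"},
    Mode.SUPERVISED: {"anomaly_alerting", "spend_caps"},
    Mode.FULL: {"proven_track_record", "sign_off"},
}


def required_controls(mode: Mode) -> set:
    """Union of the controls for this mode and every less autonomous mode."""
    required = set()
    for m in Mode:
        if m <= mode:
            required |= ADDED_CONTROLS[m]
    return required


def highest_permitted_mode(controls_in_place: set):
    """Return the most autonomous mode whose controls are all present, or None."""
    permitted = None
    for m in Mode:  # iterates from least to most autonomous
        if required_controls(m) <= controls_in_place:
            permitted = m
        else:
            break
    return permitted


in_place = {"identity", "least_privilege_read", "query_audit",
            "evaluation_suite", "diff_preview"}
print(highest_permitted_mode(in_place).name)
```

With only the first five controls in place, the agent is capped at the Suggest row; adding spend caps alone does not unlock anything, because the immutable action log and rollback are still missing.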
Governing Data Agents under EU regulation
For a German or wider European enterprise, autonomous action over data is squarely in scope of the regulatory stack, and "the model decided" is not a defence. The board (Geschäftsleitung) remains accountable, which means the controls below must produce retained evidence (Nachweispflichten), not good intentions.
- EU AI Act. Agentic data use cases that fall into a high-risk category must meet documentation, logging, and human-oversight requirements. Map each Data Agent to its risk classification and keep a conformity-assessment-ready record (Konformitätsbewertung) of its behaviour and controls.
- GDPR. Purpose limitation and data minimisation apply to what an agent may read and combine. Least-privilege medallion access is the technical expression of this principle.
- NIS2 and DORA. For essential entities and financial firms, agent actions touching critical systems fall under incident, supply-chain (Lieferkette), and risk-management obligations (Risikomanagementmaßnahmen). Immutable logging and tested rollback are part of operational resilience, not extras.
A practical governance checklist
- Assign every agent an identity with least-privilege access to specific OneLake layers and MCP endpoints — no shared or standing credentials.
- Log every action immutably — prompts, resolved queries, tool calls, MCP hops, and writes — so any decision is reconstructable for an auditor.
- Gate consequential writes behind human approval until the agent has earned higher autonomy in the table above.
- Version and test the semantic model in the Copilot Tooling Format so the definitions agents reason over are reviewed in CI.
- Run an evaluation suite as a release gate, scoring groundedness, correctness, and resistance to injection before any change ships.
- Cap and monitor spend per agent, tying token and capacity cost to delivered value, with alerts on anomalies.
- Document risk classification and oversight for each agent so regulatory evidence exists from day one rather than being retrofitted.
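For the logging item, one common way to make an action log tamper-evident is a hash chain, where each entry commits to the hash of the previous one. The sketch below illustrates that technique only; it is an assumption about how such a log could be built, not a description of Fabric's own audit facilities, and the entry fields are hypothetical. True immutability still depends on where the log is stored, for example write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(prev_hash: str, action: dict) -> str:
    # Canonical JSON so the same action always hashes to the same value.
    payload = json.dumps(action, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, action: dict) -> dict:
    """Append an action, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "action": action,
             "hash": _digest(prev_hash, action)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["action"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"agent": "ops-agent", "tool": "kql", "query": "Events | count"})
append_entry(log, {"agent": "ops-agent", "tool": "write", "target": "agent_output"})
print(verify(log))
```

Editing any recorded action after the fact causes `verify` to return `False`, which gives an auditor a cheap integrity check before reconstructing a decision from the log.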
What we have learned delivering this
The hardest part of a Fabric Data Agent programme is rarely the agent. It is the discipline underneath it: a clean medallion model, a governed semantic layer, an MCP boundary that authenticates and audits, and an organisation willing to start read-only. In our delivery work the engagements that struggle are the ones that flipped on autonomous execution to impress a steering committee and inherited an unaccountable system. The ones that succeed treat the agent as the last thing they turn on, after the governance plane is real.
Fabric in 2026 genuinely lets autonomous agents act on enterprise data at scale. Whether that is an asset or a liability is decided entirely by the architecture and governance you put around it before go-live.
FAQ
What are Microsoft Fabric Data Agents? Fabric Data Agents are autonomous agents that operate over your Fabric data estate — querying OneLake, running medallion transformations, and executing multi-step data workflows on instruction rather than on a fixed schedule. In 2026 they pair with Copilot 'Cowork' autonomous execution, so an agent can plan, act, and verify across a pipeline instead of merely answering a single question. They are best understood as a governed execution layer on top of OneLake, not a chatbot.
How do Fabric Data Agents query real-time data? Through the Eventhouse remote MCP capability, an agent can query real-time KQL data over the Model Context Protocol using natural language. This lets an agent reason over streaming and operational data alongside the curated medallion layers in OneLake, without a human writing the KQL by hand. The MCP boundary is also the right place to enforce authorization and audit on every real-time query.
Where do Fabric Data Agents fit in a medallion architecture? Agents should read from and write to explicit medallion layers rather than roaming freely. A typical pattern grants read access to curated silver and gold tables for analytical reasoning, and write access only through governed pipelines into a dedicated agent-output zone. Letting agents write directly into gold without a review gate is the most common governance mistake we see.
How are Fabric Data Agents governed for the EU AI Act and GDPR? Governance rests on three pillars: identity and least-privilege access on every OneLake and MCP call, immutable logging of every agent action for traceability, and human oversight gates for consequential writes. These controls map directly onto EU AI Act documentation duties, GDPR purpose limitation, and NIS2 accountability. The board remains accountable, so evidence must be retained, not assumed.
What is the Power BI Copilot Tooling Format and why does it matter for agents? The Copilot Tooling Format reached general availability in May 2026 as a Git-friendly, text-based metadata format for Power BI semantic models. It matters for agents because it makes the semantic layer — the business definitions an agent reasons over — versionable, reviewable in pull requests, and testable in CI. A governed semantic model is what stops an agent from confidently returning wrong numbers.
Can Fabric Data Agents work with Azure Databricks and Azure ML assets? Yes. Cross-workspace MLflow logging lets you bring models and experiments from Azure Databricks and Azure Machine Learning into Fabric for end-to-end MLOps, so a Data Agent can use models trained elsewhere without copying data out of OneLake. This keeps the data lake unified while letting teams use the training platform they prefer.
Should we let Data Agents run autonomously in production from day one? No. Start with read-only and human-in-the-loop modes, instrument every action, and only widen autonomy for write paths once your evaluation, logging, and rollback controls are proven. Autonomous execution is a capability you earn through governance maturity, not a switch you flip on go-live.
If you are designing a Fabric Data Agent platform that has to stand up to European regulators as well as your data, our AI & data platform engineering practice can help you get the architecture and governance right before you turn autonomy on.