Cybersecurity · 7 min read

Building a Modern SOC with Microsoft Sentinel: Architecture and Playbooks

How to architect a modern SOC with Microsoft Sentinel — data connectors, KQL analytics rules, SOAR automation, cost control, and alert fatigue reduction.

Building a Security Operations Centre (SOC) used to mean seven-figure investments in on-premises hardware, a team of 20, and months of integration work. Microsoft Sentinel changes that equation dramatically — but only if you architect it correctly. A poorly designed Sentinel deployment leads to runaway costs, alert fatigue, and a false sense of security. This guide covers how to get it right.

Architecture Decisions That Matter

Workspace Design

Your Log Analytics workspace topology is the most consequential architectural decision. Get it wrong and you will spend months restructuring.

Recommended approach for most enterprises:

  • Single workspace for all security data — Sentinel works best with correlated data in one place
  • Use Azure Lighthouse or multi-workspace queries if you must support multiple tenants (MSSP scenarios)
  • Separate operational workspaces (for non-security IT operations) from the Sentinel workspace to control cost and access
  • Implement workspace-level RBAC and table-level RBAC for data access segregation

Avoid the common mistake of creating one workspace per subscription or per team. This fragments your security data and makes cross-correlation nearly impossible.
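If you do inherit multiple workspaces, or operate as an MSSP, cross-workspace queries let you correlate centrally without consolidating data. A minimal sketch, with hypothetical workspace names:

```kql
// Correlate failed sign-ins across two tenant workspaces (names are hypothetical)
union
    (workspace("customer-a-sentinel").SigninLogs),
    (workspace("customer-b-sentinel").SigninLogs)
| where TimeGenerated > ago(1d)
| where ResultType != "0"  // failed sign-ins only
| summarize FailedSignins = count() by UserPrincipalName, IPAddress
| where FailedSignins > 20
```

Cross-workspace queries carry a performance cost and a workspace-count limit, which is another reason a single consolidated workspace remains the default recommendation.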

Data Connector Strategy

Not all data is equally valuable. The key is to ingest what matters for detection and investigation while controlling cost.

Tier 1 — Essential (enable immediately):

  • Microsoft Entra ID sign-in and audit logs
  • Microsoft Defender XDR incidents and raw alerts
  • Microsoft Defender for Cloud security alerts
  • Azure Activity logs
  • Office 365 audit logs (Exchange, SharePoint, Teams)

Tier 2 — High value (enable within 30 days):

  • Microsoft Defender for Endpoint raw events (DeviceProcessEvents, DeviceNetworkEvents)
  • Azure Firewall and NSG Flow Logs (for network threat detection)
  • DNS logs (critical for C2 detection)
  • Azure Key Vault audit logs

Tier 3 — Contextual (enable as detection matures):

  • Syslog/CEF from on-premises firewalls and network devices
  • Threat Intelligence connectors (TAXII feeds, Microsoft TI)
  • AWS CloudTrail or GCP audit logs (for multi-cloud environments)
  • Custom application logs via the Log Analytics agent or Data Collection Rules

Cost principle: Ingest data that you will actively use for detection rules or investigation. If you are ingesting data with no analytics rules or hunting queries referencing it, you are paying for storage, not security.

Building Effective Analytics Rules

Analytics rules are the heart of Sentinel. They transform raw log data into actionable alerts. The difference between a useful SOC and an overwhelmed one comes down to rule quality.

Rule Types and When to Use Them

  • Scheduled rules: Run KQL queries on a defined interval. Use for custom detections specific to your environment.
  • Microsoft incident creation rules: Automatically create Sentinel incidents from Defender XDR, Defender for Cloud, or other Microsoft security products. Use as your baseline — these leverage Microsoft's detection engineering.
  • Fusion rules: ML-based multi-stage attack detection. Enable the built-in Fusion rule — it correlates signals across data sources automatically.
  • Near-real-time (NRT) rules: Run every minute for critical detections that cannot wait for scheduled intervals.

Writing KQL That Works

Good detection rules are specific, tested, and tuned. Here are patterns that work in production:

Impossible travel detection (custom, more flexible than built-in):

```kql
SigninLogs
| where ResultType == "0"  // successful sign-ins (ResultType is a string column)
| summarize Locations = make_set(Location),
            IPs = make_set(IPAddress),
            MinTime = min(TimeGenerated),
            MaxTime = max(TimeGenerated)
  by UserPrincipalName, bin(TimeGenerated, 1h)
| where array_length(Locations) > 1  // multiple distinct locations in the same hour
| extend TimeDiffMinutes = datetime_diff('minute', MaxTime, MinTime)
| where TimeDiffMinutes < 60
```

Anomalous process execution on servers:

```kql
DeviceProcessEvents
| where Timestamp > ago(1h)
// 'has_any' matches whole terms only, so "srv" would miss "srv01" — use substring matching
| where DeviceName contains "srv" or DeviceName contains "server" or DeviceName startswith "dc"
| where FileName !in~ ("svchost.exe", "services.exe", "lsass.exe", "csrss.exe")
| summarize ExecutionCount = count() by FileName, DeviceName
| join kind=leftanti (
    DeviceProcessEvents
    | where Timestamp between (ago(30d) .. ago(1d))  // 30-day baseline of known processes
    | summarize by FileName, DeviceName
) on FileName, DeviceName
| where ExecutionCount < 5
```

Sensitive Azure role assignments:

```kql
AzureActivity
| where OperationNameValue =~ "Microsoft.Authorization/roleAssignments/write"
| where ActivityStatusValue =~ "Success"
// The request body carries a roleDefinitionId GUID, not the role's display name,
// so match against the well-known built-in role definition IDs
| extend RequestBody = tostring(parse_json(Properties).requestbody)
| where RequestBody has_any (
    "8e3af657-a8ff-443c-a75c-2fe8c4bcb635",   // Owner
    "b24988ac-6180-42a0-ab88-20f7382dd24c",   // Contributor
    "18d7d88d-d35e-4fb5-a5c3-7773c20a72d9")   // User Access Administrator
| project TimeGenerated, Caller, ResourceGroup, RequestBody
```

Tuning: The Ongoing Discipline

Every analytics rule should have:

  • A documented threshold that was tuned against at least two weeks of baseline data
  • Entity mapping (account, host, IP) so incidents can be enriched and correlated
  • A severity that reflects business impact, not how technically interesting the detection is
  • Suppression configured to prevent duplicate incidents for the same event

Review your analytics rules monthly. If a rule generates more than 50 alerts per week and fewer than 5% result in true positive investigations, it needs tuning or removal.
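That monthly review can be driven by data rather than memory. A sketch against the SecurityIncident table, using thresholds that mirror the guidance above (roughly 50 alerts per week over 30 days, under 5% true positives):

```kql
SecurityIncident
| where TimeGenerated > ago(30d)
| summarize arg_max(TimeGenerated, *) by IncidentNumber   // latest state of each incident
| summarize Incidents = count(),
            TruePositives = countif(Classification == "TruePositive")
  by Title
| extend TruePositiveRate = round(100.0 * TruePositives / Incidents, 1)
| where Incidents > 200 and TruePositiveRate < 5
| sort by Incidents desc
```

Rules surfaced by this query are your tuning backlog: tighten thresholds, add watchlist exclusions, or retire them.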

SOAR Automation: Playbooks That Save Hours

Automation is what separates a sustainable SOC from a burnout factory. Sentinel integrates with Azure Logic Apps for playbook automation.

High-Impact Playbooks to Implement First

1. Automated enrichment on incident creation:

  • Look up IP reputation via VirusTotal or AbuseIPDB
  • Query Microsoft Graph for user details (department, manager, recent sign-ins)
  • Add enrichment as comments to the incident
  • ROI: Saves 5–10 minutes per incident on manual lookups

2. Automated response to confirmed compromised account:

  • Disable the user account in Entra ID
  • Revoke all refresh tokens
  • Block the user's IP via Conditional Access named location
  • Notify the user's manager via email
  • Create a ServiceNow ticket
  • ROI: Reduces response time from 30+ minutes to under 60 seconds

3. Automated triage for low-severity alerts:

  • Check if the alert entity (IP, user, host) has been seen in previous closed-as-benign incidents
  • If yes, auto-close with a comment referencing the prior investigation
  • If no, escalate to analyst queue
  • ROI: Reduces alert volume by 20–40% for mature environments
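The prior-investigation lookup in playbook 3 can be expressed as a KQL query the playbook runs against Sentinel, sketched here with a hypothetical IP entity taken from the new incident:

```kql
let SuspectIP = "203.0.113.50";  // hypothetical entity extracted from the new incident
SecurityIncident
| where TimeGenerated > ago(90d)
| where Status == "Closed" and Classification == "BenignPositive"
| mv-expand AlertId = AlertIds to typeof(string)
| join kind=inner (
    SecurityAlert
    | where Entities has SuspectIP   // Entities is a JSON string; 'has' does a term match
) on $left.AlertId == $right.SystemAlertId
| project IncidentNumber, Title, ClosedTime
```

If the query returns rows, the playbook closes the new incident with a comment linking the prior incident numbers; otherwise it routes to the analyst queue.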

4. Threat intelligence matching:

  • When a new IOC is received from TI feed, retroactively search Sentinel logs
  • If matches are found, create an incident with full context
  • ROI: Catches threats that entered the environment before the IOC was published
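Playbook 4's retroactive search can be sketched against the classic ThreatIntelligenceIndicator table (newer workspaces may expose indicators under a different table name) joined to firewall traffic in CommonSecurityLog:

```kql
ThreatIntelligenceIndicator
| where TimeGenerated > ago(1d)              // indicators received in the last day
| where Active == true and isnotempty(NetworkIP)
| join kind=inner (
    CommonSecurityLog
    | where TimeGenerated > ago(30d)         // retro-hunt window
) on $left.NetworkIP == $right.DestinationIP
| project IndicatorTime = TimeGenerated, NetworkIP, ThreatType,
          Computer, SourceIP, DestinationIP
```

Any match means the IOC was contacted before it was published — exactly the case a scheduled rule on fresh logs would miss.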

Cost Optimisation

Sentinel costs are driven by data ingestion volume. Here is how to control them without sacrificing detection capability:

  • Use Basic Logs for high-volume, low-security-value tables (e.g., NetFlow data, verbose application logs). Basic Logs cost significantly less but support limited KQL and no analytics rules.
  • Configure Data Collection Rules (DCRs) to filter data at ingestion — drop fields you never query, transform verbose logs into concise records
  • Set retention policies per table: 90 days interactive for security tables, 30 days for operational tables, archive to cold storage for compliance requirements
  • Use the Commitment Tier pricing if your daily ingestion is predictable — 100 GB/day commitment tier saves ~30% vs. pay-as-you-go
  • Monitor ingestion via the Usage table: Usage | summarize sum(Quantity) by DataType | sort by sum_Quantity desc — know exactly which data sources drive your costs
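The inline Usage query above can be extended to billable volume over a full month, which is the view that maps directly to your bill:

```kql
Usage
| where TimeGenerated > ago(30d)
| where IsBillable == true
| summarize IngestedGB = round(sum(Quantity) / 1024, 2) by DataType  // Quantity is in MB
| sort by IngestedGB desc
```

Run this monthly and compare the top tables against the analytics rules that actually reference them.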

Rule of thumb: Your Sentinel cost should be proportional to the security value you extract. If you are spending 60% of your budget ingesting firewall traffic logs but only 10% of your detection rules reference them, restructure.

Combating Alert Fatigue

Alert fatigue is the number one SOC killer. When analysts are overwhelmed, they start ignoring alerts — including real threats.

Strategies that work:

  • Incident grouping: Configure analytics rules to group related alerts into a single incident (by entity, by time window)
  • Automation rules: Auto-close known false positives, auto-assign incidents based on MITRE ATT&CK tactic, auto-set severity based on asset criticality
  • Watchlists: Maintain exception lists (known scanner IPs, service accounts, maintenance windows) and reference them in your KQL queries to exclude noise at the detection layer
  • Severity discipline: Reserve "High" severity for alerts that require human investigation within 4 hours. If everything is high, nothing is.
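The watchlist approach looks like this inside a detection query, assuming a hypothetical watchlist named KnownScanners whose SearchKey column holds IP addresses:

```kql
let KnownScanners = _GetWatchlist('KnownScanners') | project IPAddress = tostring(SearchKey);
SigninLogs
| where ResultType != "0"                 // failed sign-ins
| where IPAddress !in (KnownScanners)     // exclude known scanner noise at the detection layer
| summarize Failures = count() by UserPrincipalName, IPAddress
| where Failures > 10
```

Because the exclusion lives in a watchlist rather than hard-coded in the query, analysts can maintain it without editing the analytics rule.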

Measuring SOC Effectiveness

Track these metrics to understand whether your Sentinel deployment is delivering value:

  • Mean time to detect (MTTD): Time from threat occurrence to alert creation
  • Mean time to respond (MTTR): Time from alert creation to containment action
  • True positive rate: Percentage of incidents that result in genuine security action
  • Automation rate: Percentage of incidents handled entirely by playbooks
  • Coverage: Percentage of MITRE ATT&CK techniques covered by your analytics rules
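Several of these metrics can be computed directly from the SecurityIncident table. A sketch for MTTR:

```kql
SecurityIncident
| where TimeGenerated > ago(30d)
| summarize arg_max(TimeGenerated, *) by IncidentNumber   // final state of each incident
| where Status == "Closed"
| extend MinutesToClose = datetime_diff('minute', ClosedTime, CreatedTime)
| summarize MTTRMinutes = round(avg(MinutesToClose), 0), ClosedIncidents = count()
```

Note this measures alert-to-closure, a proxy for alert-to-containment; MTTD additionally requires knowing when the threat actually occurred, which usually comes from incident post-mortems.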

Final Thought

Microsoft Sentinel gives you enterprise-grade SIEM/SOAR without the infrastructure overhead of traditional solutions. But the technology is only as good as the architecture, detection logic, and operational processes you build around it. Start with high-value data sources, write precise analytics rules, automate aggressively, and treat alert tuning as a continuous discipline — not a one-time setup task.

Ready to build or optimise your Sentinel deployment? Contact our team — we design and operate SOC environments for enterprises that need real security outcomes, not just dashboards.

Microsoft Sentinel · SIEM/SOAR · SOC architecture · KQL analytics rules · security automation playbooks
