From noisy operations data to replayable incident simulations.
MazeLabs turns raw logs, tickets, RCAs, runbooks, and system behavior into compressed evidence, masked AI context, deterministic simulation state, actor-driven war rooms, and measurable readiness scores.
Ingest real operational context.
MazeLabs starts with the operational data teams already have: runbooks, SOPs, RCAs, incident tickets, CloudWatch logs, application logs, topology notes, alert payloads, and historical troubleshooting records.
Raw data stays in the vault.
Raw operational documents and logs are stored inside a controlled MazeLabs vault, which keeps source material, metadata, provenance, and extracted evidence separate. The AI layer never needs unrestricted access to raw sensitive documents.
- Tracks source provenance
- Separates raw data from AI-consumable evidence
- Keeps sensitive operational context controlled
Compress telemetry into operational evidence.
Generic LLMs struggle with high-volume logs: raw telemetry is repetitive, noisy, and full of sensitive identifiers. MazeLabs uses a Log Compressor to turn it into structured incident signals.
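A minimal sketch of what log compression like this could look like (names and normalization rules here are illustrative, not MazeLabs' actual compressor): repeated events are normalized into templates so thousands of lines collapse into a few ranked signals.

```python
import re
from collections import Counter

def _normalize(line: str) -> str:
    """Replace volatile tokens so repeated events share one template."""
    line = re.sub(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", "<IP>", line)   # IPs first
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)               # hex ids
    line = re.sub(r"\d+", "<N>", line)                            # remaining numbers
    return line

def compress_logs(lines: list[str], top_n: int = 5) -> list[dict]:
    """Deduplicate raw log lines into ranked signal templates."""
    counts = Counter(_normalize(line) for line in lines)
    return [{"template": t, "count": c} for t, c in counts.most_common(top_n)]
```

Two timeout lines that differ only in duration and source IP cluster under a single `timeout after <N>s from <IP>` template, ranked by frequency.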
Mask sensitive context before AI reasoning.
Before evidence is passed to AI agents or model providers, MazeLabs applies masking and redaction policies. The AI does not need the raw secret to understand that a timeout occurred.
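A hedged sketch of a masking pass in this spirit (these patterns and placeholders are illustrative assumptions, not MazeLabs' actual redaction policy): sensitive identifiers are replaced with stable placeholders before any text reaches a model.

```python
import re

# Hypothetical masking policy: each pattern maps to a placeholder, so the AI
# can see *that* a timeout hit a host without seeing *which* host.
MASK_RULES = [
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP>"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b(?:token|key|secret)=\S+", re.I), "<SECRET>"),
]

def mask(text: str) -> str:
    for pattern, placeholder in MASK_RULES:
        text = pattern.sub(placeholder, text)
    return text
```

Run over an evidence line, `mask` strips the IP, the credential, and the email while leaving the operational signal (the timeout) intact.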
Build an EvidenceTree from messy inputs.
MazeLabs combines compressed telemetry, incident tickets, RCAs, and runbooks into an EvidenceTree. This graph organizes the incident into symptoms, signals, checks, hypotheses, and valid actions.
This makes simulations replayable and auditable. Instead of relying on uncontrolled LLM memory, the runtime uses structured evidence packs and known state transitions.
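As a rough illustration of such a structure (the node kinds and field names below are assumptions, not MazeLabs' actual schema), each node can carry a provenance link back to its source document so the tree stays auditable:

```python
from dataclasses import dataclass, field

@dataclass
class EvidenceNode:
    """Illustrative EvidenceTree node; kinds mirror the categories above."""
    kind: str          # "symptom" | "signal" | "check" | "hypothesis" | "action"
    summary: str
    source: str        # provenance: ticket id, RCA section, log batch
    children: list["EvidenceNode"] = field(default_factory=list)

    def add(self, child: "EvidenceNode") -> "EvidenceNode":
        self.children.append(child)
        return child

# A symptom links down to a compressed signal, which links to a valid check.
root = EvidenceNode("symptom", "checkout latency p99 > 5s", "ticket:INC-1042")
sig = root.add(EvidenceNode("signal", "db connection pool exhausted", "logs:batch-7"))
sig.add(EvidenceNode("check", "inspect pool metrics before failover", "runbook:db-pool"))
```

Because every node records its source, a replay can show exactly which runbook or ticket justified each step.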
Realistic actor behavior driven by state.
Simulations include actors such as SREs, DBREs, and Stakeholders. Their behavior is not free-form improvisation: it is tightly bounded by scenario state, hidden evidence, and time pressure.
Adversarial Reviewer
Challenges weak assumptions and unsafe actions during the incident.
Customer Stakeholder
Applies business-impact urgency once elapsed time exceeds defined thresholds.
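State-bounded actor behavior like this can be sketched as pure functions of scenario state (the state fields, thresholds, and lines below are hypothetical, not the real engine): given the same state, an actor always makes the same move, which is what makes runs replayable.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class ScenarioState:
    """Illustrative scenario state driving actor turns."""
    elapsed_minutes: int
    revealed_evidence: FrozenSet[str]
    last_action: str

def stakeholder_turn(state: ScenarioState, pressure_threshold: int = 20) -> Optional[str]:
    # Customer Stakeholder: applies pressure only past the time threshold.
    if state.elapsed_minutes >= pressure_threshold:
        return "Customers are reporting failures. What is the ETA to mitigation?"
    return None

def reviewer_turn(state: ScenarioState) -> Optional[str]:
    # Adversarial Reviewer: challenges an unsafe action taken without evidence.
    if state.last_action == "failover" and "replica_lag_checked" not in state.revealed_evidence:
        return "You have not verified replica lag. Failing over now risks data loss."
    return None
```

An early, well-evidenced failover draws no interjection; the same action late and unverified triggers both actors.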
P1 War Room, Scoring & Debrief.
The War Room is an operational decision surface, not just a chat screen. Every action is evaluated, scored, and mapped to a replayable timeline for post-incident learning.
Agent-Driven P1 War Room
Presents evidence, transcript, stakeholder pressure, and action ledgers in one surface.
Readiness Scoring
- Investigation: +12
- Ignored Signal: -8
- Unsafe Failover: -15
- Valid Recovery: +10
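The point values above suggest a simple ledger-to-score mapping; a minimal sketch, assuming the real rubric is richer and weighted, might look like:

```python
# Point values taken from the readiness rubric above; event names are
# illustrative keys, not MazeLabs' actual action taxonomy.
SCORE_RULES = {
    "investigation": +12,
    "ignored_signal": -8,
    "unsafe_failover": -15,
    "valid_recovery": +10,
}

def readiness_score(action_ledger: list[str]) -> int:
    """Sum the scored events recorded in a war-room action ledger."""
    return sum(SCORE_RULES.get(action, 0) for action in action_ledger)
```

A drill that investigates, misses one signal, and recovers cleanly nets 12 - 8 + 10 = 14.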
Debrief & Replay
Generates a structured debrief showing what evidence was missed, correct paths, and team capability gaps.
Why this architecture matters.
Most teams already have the knowledge needed to train better incident responders. Generic LLMs cannot safely reason over raw context. MazeLabs creates a controlled middle layer to compress, mask, and structure evidence before simulation.
1. Ingest
Runbooks, RCAs, tickets, CloudWatch logs, application logs, topology notes
2. Vault
Store raw operational context under customer control
3. Compress
Deduplicate logs, cluster errors, extract entities, build timelines, rank signals
4. Mask
Redact secrets, IPs, hostnames, customer IDs, emails, tokens, internal URLs
5. Evidence
Build source-linked EvidenceTrees and bounded EvidencePacks
6. Simulate
Create scenario phases, hidden evidence, valid checks, wrong paths, recovery criteria
7. Actors
Drive incident commander, SRE, DBRE, support, customer, coach, and reviewer behavior from scenario state
8. Bridge
Run P1/P0 incident drills with transcript, evidence, action ledger, stakeholder pressure, and timeline
9. Score
Measure evidence use, decisions, communication, escalation, recovery, and validation
10. Debrief
Replay missed signals, wrong assumptions, recovery quality, and follow-up labs
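The data-side stages above (ingest through evidence) can be read as a function pipeline; in this sketch every stage body is a trivial placeholder stub, not MazeLabs' implementation, included only to show how context flows from one stage to the next.

```python
from functools import reduce

# Each stage takes the accumulated context dict and returns an enriched one.
STAGES = [
    ("ingest",   lambda ctx: {**ctx, "raw": ctx["inputs"]}),
    ("vault",    lambda ctx: {**ctx, "vault": {"raw": ctx["raw"], "provenance": "customer"}}),
    ("compress", lambda ctx: {**ctx, "signals": sorted(set(ctx["vault"]["raw"]))}),
    ("mask",     lambda ctx: {**ctx, "signals": [s.replace("10.0.0.1", "<IP>") for s in ctx["signals"]]}),
    ("evidence", lambda ctx: {**ctx, "evidence_pack": {"signals": ctx["signals"]}}),
]

def run(inputs: list[str]) -> dict:
    """Fold the raw inputs through every stage in order."""
    return reduce(lambda ctx, stage: stage[1](ctx), STAGES, {"inputs": inputs})
```

The remaining stages (simulate, actors, bridge, score, debrief) consume the resulting evidence pack at runtime rather than transforming stored data.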
Turn your incident history into a simulation engine.
MazeLabs converts the logs, tickets, runbooks, and RCAs you already have into private, evidence-driven war-room simulations.