Operational Simulation Runtime
Compile real incidents into interactive P1/P0 simulations with deterministic state, evidence gating, and branching operational decisions.
What It Is
The Operational Simulation Runtime is the execution layer of MazeLabs. It takes a compressed, redacted EvidencePack produced by the Evidence Compression Engine and compiles it into a fully interactive incident simulation.
Unlike generic scenario builders or quiz-based training tools, the Runtime drives deterministic state transitions based on operator decisions, evidence inspection, and escalation behavior. Every simulation is bounded by real incident data — not scripted dialogues or hallucinated scenarios.
Runtime Architecture Flow
Scenario Compiler
The Scenario Compiler transforms a validated EvidencePack into a structured simulation scenario definition. It extracts causal chains, identifies decision points, plants hidden evidence, maps valid investigation checks, defines wrong paths, and establishes recovery criteria.
{
  "scenario_id": "inc-2026-0510-db-timeout",
  "phases": ["triage", "investigation", "hypothesis", "action", "validation"],
  "evidence_gates": 12,
  "hidden_evidence_count": 4,
  "wrong_paths": ["blame_network", "restart_app_server"],
  "recovery_criteria": {
    "primary": "identify_connection_pool_exhaustion",
    "validation": "confirm_pool_resize_and_monitor"
  }
}
Simulation State Machine
The simulation progresses through deterministic operational states. Each state has entry conditions, available actions, evidence requirements, and transition rules. There are no random outcomes — every result is derived from operator behavior.
Triage
Initial alert assessment and incident classification
Investigation
Evidence collection, log correlation, and pattern identification
Hypothesis
Formulate root-cause theories based on available evidence
Action
Execute mitigation steps, apply fixes, and coordinate response
Validation
Verify that mitigation resolved the issue without regressions
Escalation
Route to specialists, notify stakeholders, and escalate severity
Recovery
Restore full service, confirm stability, and document resolution
Retrospective
Review decisions, missed signals, and process improvements
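One way to picture the deterministic behavior described above is a plain transition table. The sketch below is illustrative only: the state names come from this page, but the action names, the specific transitions, and the helpers `next_state` and `replay` are assumptions, not the Runtime's actual interface.

```python
from enum import Enum


class State(Enum):
    TRIAGE = "triage"
    INVESTIGATION = "investigation"
    HYPOTHESIS = "hypothesis"
    ACTION = "action"
    VALIDATION = "validation"
    ESCALATION = "escalation"
    RECOVERY = "recovery"
    RETROSPECTIVE = "retrospective"


# Hypothetical transition rules: (current state, operator action) -> next state.
TRANSITIONS = {
    (State.TRIAGE, "classify_incident"): State.INVESTIGATION,
    (State.INVESTIGATION, "propose_root_cause"): State.HYPOTHESIS,
    (State.INVESTIGATION, "escalate"): State.ESCALATION,
    (State.ESCALATION, "specialist_joined"): State.INVESTIGATION,
    (State.HYPOTHESIS, "apply_mitigation"): State.ACTION,
    (State.ACTION, "verify_fix"): State.VALIDATION,
    (State.VALIDATION, "fix_confirmed"): State.RECOVERY,
    (State.VALIDATION, "fix_failed"): State.INVESTIGATION,
    (State.RECOVERY, "close_incident"): State.RETROSPECTIVE,
}


def next_state(state: State, action: str) -> State:
    """Look up the next state; no randomness is involved."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not available in state {state.value!r}"
        ) from None


def replay(actions, start: State = State.TRIAGE) -> State:
    """Apply a sequence of operator actions and return the final state."""
    state = start
    for action in actions:
        state = next_state(state, action)
    return state
```

Because the table is a pure lookup, replaying the same sequence of operator actions always ends in the same state, which is the property that makes sessions comparable and reviewable.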
Actor Runtime
Each simulation runs multiple AI-driven actors that behave according to their role, the scenario state, and the available evidence. Actors generate realistic pressure, provide relevant context, and respond to operator decisions. Their behavior is derived from bounded scenario logic, not from fixed scripts.
SRE
Drives investigation, runs checks, correlates system signals
DBA
Investigates database health, queries, replication, and storage
DevOps
Checks deployments, config changes, CI/CD pipeline state
Incident Commander
Coordinates response, manages timeline, and tracks decisions
Support Engineer
Manages customer communication and impact reporting
Stakeholder
Applies pressure, requests ETAs, and demands status updates
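As a rough illustration of "bounded scenario logic", the sketch below models two of the roles above as functions of the scenario state and the evidence revealed so far. Everything here is hypothetical: the `ScenarioView` type, the evidence key `db_connection_errors`, and the message text are invented for the example, and a real actor would presumably be far richer than a pair of rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScenarioView:
    """What an actor is allowed to see (hypothetical shape)."""
    state: str
    revealed_evidence: frozenset


def stakeholder(view: ScenarioView) -> Optional[str]:
    # Applies pressure while the incident is still being diagnosed.
    if view.state in ("triage", "investigation"):
        return "Customers are affected. When will we have an ETA?"
    return None


def dba(view: ScenarioView) -> Optional[str]:
    # Only comments on evidence the operator has actually revealed.
    if "db_connection_errors" in view.revealed_evidence:
        return "Connection pool usage looks saturated on the primary."
    return None


ACTORS = {"Stakeholder": stakeholder, "DBA": dba}


def actor_messages(view: ScenarioView) -> list:
    """Collect (role, message) pairs for the current scenario view."""
    messages = []
    for role, respond in ACTORS.items():
        text = respond(view)
        if text is not None:
            messages.append((role, text))
    return messages
```

The point of the design is that an actor can only react to what the scenario currently exposes, so it cannot leak evidence the operator has not yet uncovered.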
Evidence Inspection
During a simulation, operators interact with evidence panels to inspect real operational data — logs, timelines, alerts, tickets, and topology. Evidence is revealed progressively as the investigation deepens, just like a real P1 incident.
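Progressive reveal can be sketched as a prerequisite map: an evidence item becomes visible only once the items it depends on have been inspected. The item names and the `EVIDENCE_GATES` structure below are assumptions made for illustration, not the Runtime's real schema.

```python
# Hypothetical gates: evidence item -> items that must be inspected first.
EVIDENCE_GATES = {
    "alert_summary": set(),
    "app_error_logs": {"alert_summary"},
    "db_pool_metrics": {"app_error_logs"},
}


def visible_evidence(inspected: set) -> set:
    """Return every item whose prerequisites have all been inspected."""
    return {
        item
        for item, prerequisites in EVIDENCE_GATES.items()
        if prerequisites <= inspected
    }
```

With nothing inspected, only `alert_summary` is visible; inspecting it unlocks `app_error_logs`, which in turn unlocks `db_pool_metrics`.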
Decision Branching
The simulation engine supports full decision branching. Different operator choices lead to different outcomes. Choosing to restart a service before checking logs produces a different simulation path than correlating metrics first. Wrong paths are tracked, scored, and reviewed in the debrief.
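The restart-versus-correlate example can be sketched as a small decision tree that also records which wrong paths were taken. The two wrong-path names are borrowed from the scenario definition shown earlier; the node names, the `BRANCHES` table, and the `walk` helper are hypothetical.

```python
# Hypothetical branch table: node -> {operator decision: next node}.
BRANCHES = {
    "start": {
        "restart_app_server": "errors_return_after_restart",
        "correlate_metrics": "pool_saturation_spotted",
    },
    "errors_return_after_restart": {
        "correlate_metrics": "pool_saturation_spotted",
    },
    "pool_saturation_spotted": {},
}

WRONG_PATHS = {"blame_network", "restart_app_server"}


def walk(decisions):
    """Follow decisions through the tree; return (visited nodes, wrong paths taken)."""
    node = "start"
    visited = [node]
    wrong = []
    for decision in decisions:
        options = BRANCHES[node]
        if decision not in options:
            raise ValueError(f"{decision!r} is not offered at {node!r}")
        if decision in WRONG_PATHS:
            wrong.append(decision)
        node = options[decision]
        visited.append(node)
    return visited, wrong
```

Restarting first still reaches the same finding, but through an extra node and with `restart_app_server` recorded for the debrief; correlating metrics first reaches it directly.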
Debrief Integration
Every simulation session produces a structured debrief. Missed signals, wrong assumptions, evidence usage, escalation timing, and recovery quality are all captured and replayed for organizational learning.
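A debrief record along these lines might look like the sketch below. The field names mirror the items listed above, but the `Debrief` type, the `build_debrief` helper, and the way each value is computed (for example, evidence usage as the fraction of planted items inspected) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Debrief:
    missed_signals: List[str]          # planted evidence never inspected
    wrong_assumptions: List[str]       # wrong paths the operator followed
    evidence_usage: float              # fraction of planted evidence inspected
    escalation_minute: Optional[int]   # None if the operator never escalated
    recovered: bool                    # whether recovery criteria were met


def build_debrief(planted, inspected, wrong_paths, escalation_minute, recovered):
    """Summarize one session from hypothetical session data."""
    planted_set = set(planted)
    used = planted_set & set(inspected)
    usage = len(used) / len(planted_set) if planted_set else 0.0
    return Debrief(
        missed_signals=sorted(planted_set - used),
        wrong_assumptions=list(wrong_paths),
        evidence_usage=usage,
        escalation_minute=escalation_minute,
        recovered=recovered,
    )
```

Keeping the debrief as plain structured data makes it straightforward to store, compare across sessions, and replay later.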