Agentic CI/CD Incident Responder

Select a scenario

Agent pipeline

👁 Monitor
Waiting for CI failure…
🔍 Triage
Idle
🔧 Fix
Idle
🔔 Notify
Idle

Results

Draft Pull Request

-

Slack Notification

-

How it works

ops-pilot runs an agentic pipeline: Monitor → Triage → Fix (or Escalate) → Notify.

MonitorAgent polls GitHub Actions, GitLab CI, or Jenkins every 30 seconds. When a failure lands, it builds a typed Failure model (log tail, diff summary, pipeline metadata) and hands off to triage.
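As a rough illustration of the hand-off described above, here is a minimal sketch of a typed failure record and a single polling pass. The field names, the shape of the run dictionaries, and the helpers `tail` and `poll_once` are all assumptions made for this example, not ops-pilot's actual code.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass(frozen=True)
class Failure:
    """Hypothetical typed failure record; the field names are assumptions."""
    provider: str       # "github_actions", "gitlab_ci", or "jenkins"
    run_id: str
    commit_sha: str
    log_tail: str       # last lines of the failing job's log
    diff_summary: str   # short description of what the commit changed
    metadata: dict = field(default_factory=dict)  # pipeline metadata


def tail(log: str, max_lines: int = 50) -> str:
    """Keep only the last max_lines lines of a log."""
    return "\n".join(log.splitlines()[-max_lines:])


def poll_once(fetch_runs: Callable[[], Iterable[dict]],
              hand_off: Callable[[Failure], None],
              seen: set) -> int:
    """One polling pass: build a Failure for each new failed run and hand it off."""
    handed = 0
    for run in fetch_runs():
        if run["status"] != "failed" or run["id"] in seen:
            continue
        seen.add(run["id"])
        hand_off(Failure(
            provider=run["provider"],
            run_id=run["id"],
            commit_sha=run["sha"],
            log_tail=tail(run["log"]),
            diff_summary=run.get("diff_summary", ""),
            metadata={"branch": run.get("branch"), "job": run.get("job")},
        ))
        handed += 1
    return handed


# Usage with a fake provider response; a real monitor would repeat this every 30 seconds.
runs = [
    {"id": "101", "provider": "github_actions", "status": "success", "sha": "abc", "log": ""},
    {"id": "102", "provider": "github_actions", "status": "failed", "sha": "def",
     "log": "\n".join(f"line {i}" for i in range(200)), "branch": "main", "job": "test"},
]
queue: list = []
seen_ids: set = set()
first = poll_once(lambda: runs, queue.append, seen_ids)
second = poll_once(lambda: runs, queue.append, seen_ids)  # same run again, no duplicate
```

Tracking seen run IDs is one simple way to keep a 30-second poll from handing the same failure to triage twice; the real system may deduplicate differently.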

TriageAgent runs a tool-use loop instead of a single prompt. The model calls get_file to read source at the failing line, get_more_log to fetch earlier log sections (the root cause is often 50–100 lines above the tail), and get_commit_diff to read the actual diff hunks, repeating until it decides it has enough evidence to conclude. You can see those tool calls above as each scenario runs. For complex incidents, a CoordinatorAgent spawns parallel workers (log / source / diff) and aggregates their findings.
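The tool-use loop can be sketched roughly as follows. The tool names come from the text above, but their signatures, the action dictionary format, the `max_steps` cap, and the fall-back to LOW confidence are assumptions for illustration. The `scripted_model` function stands in for the LLM so the example runs without any API.

```python
from typing import Callable


# Stub tools standing in for repository and CI lookups (signatures are assumed).
def get_file(path: str, line: int) -> str:
    return f"{path}:{line}: import requests"


def get_more_log(offset: int) -> str:
    return "ModuleNotFoundError: No module named 'requests'"


def get_commit_diff(sha: str) -> str:
    return "-requests==2.31.0"


TOOLS = {"get_file": get_file, "get_more_log": get_more_log,
         "get_commit_diff": get_commit_diff}


def run_triage(decide: Callable[[list], dict], tools: dict, max_steps: int = 8) -> dict:
    """Ask the model for an action, run the tool, feed the result back, repeat."""
    evidence: list = []
    for _ in range(max_steps):
        action = decide(evidence)
        if action["type"] == "conclude":
            return {"root_cause": action["root_cause"],
                    "confidence": action["confidence"],
                    "evidence": evidence}
        result = tools[action["tool"]](**action["args"])
        evidence.append({"tool": action["tool"], "args": action["args"], "result": result})
    # Assumption: running out of steps without a conclusion counts as LOW confidence.
    return {"root_cause": None, "confidence": "LOW", "evidence": evidence}


def scripted_model(evidence: list) -> dict:
    """Stand-in for the LLM: reads more log, then the diff, then concludes."""
    script = [
        {"type": "tool", "tool": "get_more_log", "args": {"offset": 100}},
        {"type": "tool", "tool": "get_commit_diff", "args": {"sha": "def"}},
    ]
    if len(evidence) < len(script):
        return script[len(evidence)]
    return {"type": "conclude",
            "root_cause": "requests was removed from the requirements file",
            "confidence": "HIGH"}


result = run_triage(scripted_model, TOOLS)
```

The key difference from a single prompt is that each tool result is appended to the evidence the model sees on its next turn, so the model chooses what to inspect next based on what it has already found.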

If triage confidence is HIGH or MEDIUM, FixAgent generates a minimal patch, commits it to a branch, and opens a draft PR; nothing merges without a human. If confidence is LOW, ops-pilot generates an escalation summary instead: what was investigated, what was inconclusive, and the recommended next step. Try the ⚠️ OOM scenario above to see this path.
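The routing rule above is simple enough to show directly. This is a minimal sketch: the `TriageResult` fields and the summary layout are assumptions, and the OOM details are made-up sample data rather than output from the demo.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass
class TriageResult:
    """Hypothetical triage output; field names are assumptions."""
    root_cause: str
    confidence: Confidence
    investigated: list
    inconclusive: list
    next_step: str


def route(result: TriageResult) -> str:
    """HIGH or MEDIUM goes to the fix path; LOW goes to escalation."""
    if result.confidence in (Confidence.HIGH, Confidence.MEDIUM):
        return "fix"
    return "escalate"


def escalation_summary(result: TriageResult) -> str:
    """Three-part summary: what was checked, what stayed unclear, what to do next."""
    return "\n".join([
        "Investigated: " + "; ".join(result.investigated),
        "Inconclusive: " + "; ".join(result.inconclusive),
        "Recommended next step: " + result.next_step,
    ])


# Illustrative sample data only.
oom = TriageResult(
    root_cause="unknown",
    confidence=Confidence.LOW,
    investigated=["job log", "commit diff"],
    inconclusive=["which process exceeded the memory limit"],
    next_step="re-run the job with memory profiling enabled",
)
path = route(oom)
summary = escalation_summary(oom)
```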

NotifyAgent posts either a fix-ready alert with the PR link, or a human-review-required escalation alert to Slack. Every tool call is written to a structured JSONL audit trail. Destructive actions (file commits and PR opens) get a pre-action LLM explanation logged before execution.
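One way to picture the audit trail is a wrapper that writes one JSON object per line for every tool call, and an extra explanation record before any destructive call runs. The tool names `commit_file` and `open_pr`, the record fields, and the `explain` callback (which stands in for the LLM) are assumptions for this sketch. A production trail would likely also carry timestamps and run IDs.

```python
import io
import json
from typing import Callable, TextIO

DESTRUCTIVE_TOOLS = {"commit_file", "open_pr"}  # assumed tool names


def write_record(log: TextIO, record: dict) -> None:
    """Append one JSON object as a single line (JSONL)."""
    log.write(json.dumps(record, sort_keys=True) + "\n")


def audited_call(log: TextIO, tool: str, fn: Callable, args: dict,
                 explain: Callable[[str, dict], str]):
    """Run a tool, logging an explanation first if the tool is destructive."""
    if tool in DESTRUCTIVE_TOOLS:
        write_record(log, {"event": "pre_action_explanation", "tool": tool,
                           "explanation": explain(tool, args)})
    result = fn(**args)
    write_record(log, {"event": "tool_call", "tool": tool, "args": args, "result": result})
    return result


def stub_explain(tool: str, args: dict) -> str:
    """Stand-in for the LLM-generated explanation."""
    return f"About to run {tool} with arguments {sorted(args)}"


# Usage with an in-memory log; a file opened in append mode works the same way.
log = io.StringIO()
audited_call(log, "get_file", lambda path: "print('hi')", {"path": "app.py"}, stub_explain)
audited_call(log, "open_pr", lambda title: "draft-pr-opened",
             {"title": "Fix missing dependency"}, stub_explain)
records = [json.loads(line) for line in log.getvalue().splitlines()]
```

Writing the explanation before calling the tool means the record exists even if the destructive action itself fails partway through.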

Additional capabilities: incident memory with weighted similarity retrieval surfaces past fixes before triage begins; context budgeting compacts tool results when the context window fills; and per-tenant isolation with rate limiting supports enterprise deployments. A weekly consolidation job distills raw incident logs into durable fix patterns.
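The text does not say which features or weights the incident memory uses, so the following is only one plausible reading of "weighted similarity retrieval": score each past incident by a weighted sum of per-field Jaccard overlaps and return the best matches. The fields (`error`, `files`, `job`), the weights, and the `min_score` cutoff are all assumptions.

```python
# Assumed feature weights; they sum to 1 so scores fall between 0 and 1.
WEIGHTS = {"error": 0.6, "files": 0.3, "job": 0.1}


def jaccard(a: set, b: set) -> float:
    """Overlap of two sets: size of intersection divided by size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(a: dict, b: dict) -> float:
    """Weighted sum of per-field Jaccard scores."""
    return sum(w * jaccard(set(a[k]), set(b[k])) for k, w in WEIGHTS.items())


def retrieve(query: dict, memory: list, k: int = 3, min_score: float = 0.2) -> list:
    """Return up to k past incidents scoring at least min_score, best first."""
    scored = sorted(((similarity(query, m), m) for m in memory),
                    key=lambda pair: pair[0], reverse=True)
    return [m for score, m in scored if score >= min_score][:k]


# Illustrative sample data only.
memory = [
    {"error": {"modulenotfounderror", "numpy"}, "files": {"requirements.txt"},
     "job": {"test"}, "fix": "restore the pinned dependency"},
    {"error": {"oomkilled"}, "files": {"Dockerfile"},
     "job": {"build"}, "fix": "raise the container memory limit"},
]
query = {"error": {"modulenotfounderror", "requests"},
         "files": {"requirements.txt"}, "job": {"test"}}
matches = retrieve(query, memory)
```

Here the first incident shares one of three error tokens plus the same file and job, so it clears the cutoff, while the unrelated OOM incident scores zero and is dropped.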

This demo replays pre-recorded runs; click any scenario above to see the full pipeline in action.