Autonomous incident remediation,
in single-digit minutes.
Remediate Labs is a multi-agent system that detects production errors from CloudWatch, diagnoses the root cause from your codebase, generates a fix, and opens a PR — all before anyone gets paged. Built with the Anthropic SDK on FastAPI + Postgres + pgvector.
Time-to-first-PR
~6 min
from CloudWatch alarm
Triage accuracy
92%
100-case golden eval
False-positive rate
< 8%
noise filtered before triage
Sample size
100+
production incidents
Numbers refresh from scripts/measure_mttr.py against the live demo DB.
See the methodology for how each is computed.
How it works
Architecture deep-dive →
01
Detect
CloudWatch alarms fire on production errors and POST to a webhook. Push-based — no polling load on the production server.
02
Triage
Haiku-class classifier filters real incidents from noise and duplicates. Evaluated against ground truth from a 100-case golden dataset.
03
Diagnose
Diagnosis agent grounds every claim against the actual repo via GitHub Code Search. No fabricated function names.
04
Fix
Fix-generation agent writes a patch, runs it in a Docker sandbox against the real test suite, and retries up to 3× on failure.
05
Approve
HIGH/CRITICAL actions queue for human approval before merging. Approvals also feed an RLHF preference dataset.
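As a rough illustration of the Fix and Approve steps above, the sketch below shows a bounded retry loop followed by a severity gate. It is not the Remediate Labs implementation: `generate_fix`, `FixResult`, `propose_patch`, and `run_tests` are hypothetical names, "up to 3×" is read here as three total attempts, and escalating to a human when every attempt fails is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

MAX_FIX_ATTEMPTS = 3                        # assumption: "up to 3x" = three total attempts
APPROVAL_SEVERITIES = {"HIGH", "CRITICAL"}  # severities named in the Approve step


@dataclass
class FixResult:
    patch: Optional[str]   # the patch that passed the tests, or None if none did
    attempts: int          # number of attempts actually made
    needs_approval: bool   # True when a human must act before anything merges


def generate_fix(
    severity: str,
    propose_patch: Callable[[int, Optional[str]], str],  # hypothetical agent call
    run_tests: Callable[[str], Tuple[bool, str]],        # hypothetical sandbox runner
) -> FixResult:
    """Propose patches until one passes the tests or the attempt budget runs out."""
    feedback: Optional[str] = None
    for attempt in range(1, MAX_FIX_ATTEMPTS + 1):
        patch = propose_patch(attempt, feedback)
        passed, output = run_tests(patch)
        if passed:
            gated = severity.upper() in APPROVAL_SEVERITIES
            return FixResult(patch, attempt, gated)
        feedback = output  # feed the failing test output into the next attempt
    # Assumption: with no passing patch, hand the incident to a human.
    return FixResult(None, MAX_FIX_ATTEMPTS, True)


# Usage with stand-in callables: the second proposed patch passes.
def fake_propose(attempt: int, feedback: Optional[str]) -> str:
    return f"patch-v{attempt}"


def fake_tests(patch: str) -> Tuple[bool, str]:
    return (patch == "patch-v2", "1 test failed")


result = generate_fix("high", fake_propose, fake_tests)
print(result)
```

Passing the agent and the sandbox in as callables keeps the retry budget and the approval rule testable without a model call or a Docker daemon.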
Recent writing
All posts →
RAG Finds the Candidate. The Live Store Confirms the Truth.
A search index can't be a source of truth — that's a category error, not just a bug. The general rule for any vector or search index sitting in front of a live data store.
Why the Same Bug Kept Creating New Incidents (And What That Taught Me About RAG)
Three layers of dedup. Four independent failure modes that all had to fire simultaneously. The compound bug that exposed them, and the principle that makes it not happen again.
Why My AI Agent Kept Adding Null Checks Instead of Fixing the Bug
Five PRs to teach a fix-generation pipeline that the crash site is almost never the fix site. The producer/consumer distinction, RAG's structural blind spot, and what it took to find the actual bug.