Production-grade agent platform

Autonomous incident remediation,
in single-digit minutes.

Remediate Labs is a multi-agent system that detects production errors from CloudWatch, diagnoses the root cause from your codebase, generates a fix, and opens a PR — all before anyone gets paged. Built with the Anthropic SDK on FastAPI + Postgres + pgvector.

Request a walkthrough → See the code on GitHub →

Time-to-first-PR

~6 min

from CloudWatch alarm

Triage accuracy

92%

100-case golden eval

False-positive rate

< 8%

noise filtered before triage

Sample size

100+

production incidents

Numbers refresh from scripts/measure_mttr.py against the live demo DB. See the methodology for how each is computed.

How it works

Architecture deep-dive →

Detect

CloudWatch alarms fire on production errors and POST to a webhook. Push-based — no polling load on the production server.

Triage

Haiku-class classifier filters real incidents from noise and duplicates. Ground truth from a 100-case golden dataset.

Diagnose

Diagnosis agent grounds every claim against the actual repo via GitHub Code Search. No fabricated function names.

Fix

Fix-generation agent writes a patch, runs it in a Docker sandbox against the real test suite, retries up to 3× on failure.

Approve

HIGH/CRITICAL actions queue for human approval before merging. Approvals also feed an RLHF preference dataset.

The Three Camps of Retrieval Architecture: A Practical Guide

Improved RAG, GraphRAG, Ragless. Each is right about something. Each is wrong applied to problems it wasn't designed for. A decision framework for picking the right default and extending deliberately.

Read the post →

Recent writing

All posts →

May 2, 2026 · 7 min

RAG Finds the Candidate. The Live Store Confirms the Truth.

A search index can't be a source of truth — that's a category error, not just a bug. The general rule for any vector or search index sitting in front of a live data store.

architectureragagents

May 2, 2026 · 8 min

Why the Same Bug Kept Creating New Incidents (And What That Taught Me About RAG)

Three layers of dedup. Four independent failure modes that all had to fire simultaneously. The compound bug that exposed them, and the principle that makes it not happen again.

debuggingragagents