Give your AI actual evidence. llmdebug captures structured snapshots at failure time — so LLMs diagnose instead of guess.
Three steps. No configuration. Evidence-first debugging.
# test_pipeline.py
def test_transform():
    result = transform(data)  # ← fails here
    assert result.shape == (100, 5)
{
  "exception": "ValueError: shape mismatch",
  "closest_frame": {
    "file": "pipeline.py",
    "line": 47,
    "locals": {
      "data": "ndarray (100,3)",
      "result": "ndarray (100,4)"
    }
  }
}
1. Shape mismatch: expected (100,5), got (100,4)
   → Check transform() dimensions
2. Missing feature column after preprocessing step
   → Verify feature_engineering() returns 5 cols
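Because the snapshot is plain JSON, it can also be inspected from Python. The sketch below is illustrative only: it assumes just the keys shown in the example snapshot (`exception`, `closest_frame`, `file`, `line`, `locals`), and `summarize_snapshot` is a hypothetical helper, not part of llmdebug. Real snapshots may carry more fields.

```python
import json

# Example snapshot copied from the demo above; real files may contain more keys.
EXAMPLE_SNAPSHOT = """
{
  "exception": "ValueError: shape mismatch",
  "closest_frame": {
    "file": "pipeline.py",
    "line": 47,
    "locals": {
      "data": "ndarray (100,3)",
      "result": "ndarray (100,4)"
    }
  }
}
"""


def summarize_snapshot(snapshot: dict) -> str:
    """Render a snapshot as a short plain-text summary (hypothetical helper)."""
    frame = snapshot.get("closest_frame", {})
    lines = [
        f"exception: {snapshot.get('exception', '<unknown>')}",
        f"location:  {frame.get('file', '?')}:{frame.get('line', '?')}",
    ]
    # Sort locals by name so the summary is stable between runs.
    for name, value in sorted(frame.get("locals", {}).items()):
        lines.append(f"  {name} = {value}")
    return "\n".join(lines)


summary = summarize_snapshot(json.loads(EXAMPLE_SNAPSHOT))
print(summary)
```

To summarize the file that a failing pytest run writes, the inline string could be swapped for the contents of `.llmdebug/latest.json`.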
"No more guessing. Evidence first, every time."
Six integration patterns. Choose the one that fits your workflow.
# Zero configuration needed. Just install and run.
$ pip install llmdebug
$ pytest
# ✓ Failures automatically create .llmdebug/latest.json
# ✓ Read with: llmdebug show
# ✓ Hypotheses: llmdebug hypothesize
# ✓ Compare: llmdebug diff
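To give a feel for what comparing two snapshots involves, here is a minimal sketch that diffs the captured locals of two runs. It assumes snapshots shaped like the JSON example earlier on this page; `diff_locals` is a hypothetical helper, the second snapshot is invented sample data, and `llmdebug diff` may report differences in another form.

```python
def diff_locals(before: dict, after: dict) -> dict:
    """Return variables that were added, removed, or changed between two snapshots."""
    old = before.get("closest_frame", {}).get("locals", {})
    new = after.get("closest_frame", {}).get("locals", {})
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "changed": {
            k: (old[k], new[k])
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        },
    }


# Illustrative snapshots: run_a mirrors the page's example, run_b is invented.
run_a = {"closest_frame": {"locals": {
    "data": "ndarray (100,3)",
    "result": "ndarray (100,4)",
}}}
run_b = {"closest_frame": {"locals": {
    "data": "ndarray (100,3)",
    "result": "ndarray (100,5)",
    "mask": "ndarray (100,)",
}}}

delta = diff_locals(run_a, run_b)
print(delta)
```

Here `result` shows up under `changed` and `mask` under `added`, which is the kind of signal that points at a regression between two failures.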
from llmdebug import debug_snapshot

@debug_snapshot()
def process_batch(data: list) -> list:
    return [transform(item) for item in data]

# Snapshot is captured automatically if the function raises.
# Pass config= to customize detail level, PII redaction, etc.
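To make the mechanism concrete, here is a minimal sketch of how a capture-on-raise decorator can work using only the standard library. It is not llmdebug's implementation: `capture_on_raise` and the `captured` list are hypothetical names, and the real decorator presumably records far more detail.

```python
import functools

captured = []  # hypothetical in-memory store; llmdebug writes snapshot files instead


def capture_on_raise(func):
    """Record the innermost frame's locals when func raises, then re-raise."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            tb = exc.__traceback__
            # Walk to the deepest traceback entry: the frame where the error occurred.
            while tb.tb_next is not None:
                tb = tb.tb_next
            captured.append({
                "exception": f"{type(exc).__name__}: {exc}",
                "function": tb.tb_frame.f_code.co_name,
                "line": tb.tb_lineno,
                # repr() keeps the record serializable even for arbitrary objects.
                "locals": {k: repr(v) for k, v in tb.tb_frame.f_locals.items()},
            })
            raise  # the caller still sees the original exception

    return wrapper


@capture_on_raise
def divide(total, count):
    return total / count


try:
    divide(10, 0)
except ZeroDivisionError:
    pass

print(captured[0]["function"], captured[0]["locals"])
```

Re-raising after the capture matters: the decorator observes the failure without changing the function's behavior for callers or test runners.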
from llmdebug import snapshot_section

with snapshot_section("feature_engineering"):
    features = build_features(raw_data)

# Pinpoint exactly where in a pipeline something breaks.
# Locals at the failure boundary are captured for you.
%load_ext llmdebug # Auto-captures on cell errors
%llmdebug # Rich HTML snapshot in notebook
%llmdebug hypothesize # Ranked debugging hypotheses
%llmdebug diff # Compare with previous run
%llmdebug list # List recent snapshots
import llmdebug
llmdebug.install_hooks()
# Captures: sys.excepthook, threading.excepthook,
# sys.unraisablehook
# Includes: rate limiting + PII redaction
# For web apps — WSGI/ASGI middleware:
from llmdebug import LLMDebugWSGIMiddleware
app = LLMDebugWSGIMiddleware(app) # Flask, Django
# (use LLMDebugASGIMiddleware for FastAPI)
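For intuition about the hook-based approach, the sketch below wraps a capture callback in a rate-limited `sys.excepthook`-style function. It is an illustration under stated assumptions, not llmdebug's code: `make_rate_limited_hook`, `max_per_window`, and `window_seconds` are hypothetical names.

```python
import sys
import time


def make_rate_limited_hook(on_capture, max_per_window=5, window_seconds=60.0,
                           previous_hook=sys.__excepthook__, clock=time.monotonic):
    """Build an excepthook that captures at most max_per_window errors per window."""
    timestamps = []

    def hook(exc_type, exc_value, exc_tb):
        now = clock()
        # Drop capture times that have fallen out of the sliding window.
        timestamps[:] = [t for t in timestamps if now - t < window_seconds]
        if len(timestamps) < max_per_window:
            timestamps.append(now)
            on_capture(exc_type, exc_value, exc_tb)
        # Always delegate so the normal traceback handling still runs.
        previous_hook(exc_type, exc_value, exc_tb)

    return hook


seen = []
hook = make_rate_limited_hook(
    on_capture=lambda t, v, tb: seen.append(t.__name__),
    max_per_window=2,
    previous_hook=lambda *args: None,  # silence output for this demo
)

# Simulate three uncaught errors in quick succession; only two are captured.
for _ in range(3):
    hook(ValueError, ValueError("boom"), None)

print(seen)
```

Installing such a hook is a single assignment, `sys.excepthook = hook`. Note that `threading.excepthook` receives one args object rather than three positional arguments, so it would need its own thin wrapper.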
{
  "mcpServers": {
    "llmdebug": {
      "command": "uvx",
      "args": ["llmdebug[mcp]", "serve"]
    }
  }
}
Works with Claude Code, Cursor, and any MCP-compatible IDE. 10 tools: show_snapshot, hypothesize, diff_snapshots, and more.
Built for modern Python development workflows.
Install and run. Failures captured automatically via pytest plugin. No setup required.
10 pattern detectors auto-rank debugging leads — empty arrays, shape mismatches, None values, off-by-one errors.
Compact JSON (~40% smaller). TOON format for ~50% token savings when pasting into an AI chat.
Tensor shapes, NaN/Inf detection, device tracking, requires_grad — captured out of the box for PyTorch & NumPy.
Direct integration with Claude Code, Cursor, and any MCP-compatible IDE. 10 tools ready to use.
Exception hooks with rate limiting and automatic PII redaction — safe to deploy in production environments.
Compare runs to see exactly what changed between two failures — pinpoint regressions instantly.
WSGI/ASGI middleware for Flask, FastAPI, and Django. Zero-config crash capture for web applications.
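The pattern detectors mentioned in the feature list can be pictured as small functions that scan captured locals for suspicious values. The sketch below is a guess at the general idea, not llmdebug's detectors: `parse_shape` and `detect_leads` are hypothetical, and the input assumes locals rendered as strings like those in the example snapshot on this page.

```python
import re

# Matches summaries such as "ndarray (100,4)" as seen in the example snapshot.
SHAPE_RE = re.compile(r"ndarray \(([\d,\s]*)\)")


def parse_shape(text: str):
    """Parse 'ndarray (100,4)' into (100, 4); return None if the text does not match."""
    match = SHAPE_RE.fullmatch(text.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split(",") if part.strip())


def detect_leads(local_vars: dict) -> list:
    """Flag None values and empty arrays among captured locals."""
    leads = []
    for name, value in local_vars.items():
        if value == "None":
            leads.append(f"{name} is None")
        shape = parse_shape(value)
        # A zero in any dimension means the array holds no elements.
        if shape is not None and 0 in shape:
            leads.append(f"{name} is an empty array with shape {shape}")
    return leads


# Invented sample locals for illustration.
leads = detect_leads({
    "rows": "ndarray (0,5)",
    "config": "None",
    "data": "ndarray (100,3)",
})
print(leads)
```

Ranking such leads, for example by how often each pattern explains a given exception type, would be a separate step that this sketch leaves out.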
Three commands. That's it.