Research Software | Open Source | MIT
llmdebug captures structured execution evidence at failure time, including exception
context, stack frames, and local state, and makes this evidence available through CLI,
notebook, and MCP interfaces.
Problem
LLM debugging without runtime evidence is often underdetermined.
Contribution
Reproducible snapshots that prioritize crash-site signal over verbose traces.
Context. LLM-assisted debugging pipelines often receive insufficient runtime evidence, which limits diagnosis quality.
Method. llmdebug captures structured snapshots at exception boundaries with crash-frame prioritization and local-state summaries.
Output. The captured evidence is exposed through machine-readable and human-readable interfaces (CLI, notebook, MCP) to support iterative analysis.
Scope. The project provides evidence transport and inspection; it does not provide formal guarantees of root-cause correctness.
This section summarizes the core capabilities that make snapshot-based debugging reproducible and inspectable.
Pytest failures can emit snapshots without additional project instrumentation.
JSON snapshots preserve exception context, frames, locals, and environment metadata.
The same evidence can be queried from terminal, notebook, production hooks, and MCP.
Snapshot diffing enables run-to-run comparisons for regression diagnosis.
A pattern-based hypothesis engine ranks common failure mechanisms for faster triage.
Redaction policies and rate limiting support safer operation in production contexts.
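The run-to-run comparison mentioned above can be illustrated with a small sketch. It does not use llmdebug's own diff implementation; it assumes only that a snapshot is a JSON object and compares two such objects key by key. The function name `diff_snapshots` and the sample values are hypothetical.

```python
from typing import Any


def diff_snapshots(old: dict[str, Any], new: dict[str, Any], prefix: str = "") -> list[str]:
    """Return human-readable differences between two nested dicts."""
    changes: list[str] = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else key
        if key not in old:
            changes.append(f"added {path} = {new[key]!r}")
        elif key not in new:
            changes.append(f"removed {path}")
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Recurse so nested fields are reported with a dotted path.
            changes.extend(diff_snapshots(old[key], new[key], path))
        elif old[key] != new[key]:
            changes.append(f"changed {path}: {old[key]!r} -> {new[key]!r}")
    return changes


# Hypothetical snapshots from two runs of the same test.
before = {"exception": "ValueError: shape mismatch",
          "closest_frame": {"file": "pipeline.py", "line": 47}}
after = {"exception": "ValueError: shape mismatch",
         "closest_frame": {"file": "pipeline.py", "line": 52}}

for change in diff_snapshots(before, after):
    print(change)
```

Reporting dotted paths rather than whole sub-objects keeps the output short when only the crash location moves between runs.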
This section outlines the failure-triggered pipeline from exception boundary to evidence consumption.
Step 1
def test_transform():
    result = transform(data)
    assert result.shape == (100, 5)
Step 2
{
  "exception": "ValueError: shape mismatch",
  "closest_frame": {
    "file": "pipeline.py",
    "line": 47
  }
}
Step 3
$ llmdebug show
$ llmdebug hypothesize
$ llmdebug diff
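For programmatic consumption, the Step 2 artifact can also be read with a few lines of Python. The sketch below assumes only the two fields shown in Step 2 (`exception` and `closest_frame`); real snapshots may contain more, and the helper name `crash_site` is hypothetical.

```python
import json


def crash_site(snapshot_text: str) -> str:
    """Summarize the exception and crash location from a snapshot JSON string."""
    snapshot = json.loads(snapshot_text)
    frame = snapshot.get("closest_frame", {})
    location = f"{frame.get('file', '?')}:{frame.get('line', '?')}"
    return f"{snapshot.get('exception', 'unknown exception')} at {location}"


# The Step 2 example, inlined so the sketch runs without a snapshot file.
example = """
{
  "exception": "ValueError: shape mismatch",
  "closest_frame": {"file": "pipeline.py", "line": 47}
}
"""

print(crash_site(example))
```

To read a real artifact instead, pass the contents of the snapshot file, for example the text of `.llmdebug/latest.json`.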
This section lists currently available capabilities with links to canonical documentation.
| Capability | Status | Documentation |
|---|---|---|
| Pytest failures produce snapshots by default | Available | README: Quick Start |
| CLI inspection (show, list, frames, diff, hypothesize) | Available | README: CLI |
| Detail levels (crash, full, context) for evidence size control | Available | README: Detail Levels |
| Production hooks with rate limiting and redaction controls | Available | README: Production Hooks |
| MCP server with evidence tools and RCA state tools | Available | README: MCP Server |
This section provides a minimal procedure for capturing and inspecting a failure artifact.
$ pip install "llmdebug[cli]"
$ pytest
$ llmdebug show
This section shows equivalent access patterns for the same snapshot evidence across integration surfaces.
# zero additional setup after installation
$ pip install llmdebug
$ pytest
# failure artifact:
# .llmdebug/latest.json
from llmdebug import debug_snapshot

@debug_snapshot()
def run_job(payload: list[int]) -> list[int]:
    return [transform(x) for x in payload]
from llmdebug import snapshot_section

with snapshot_section("feature_pipeline"):
    features = build_features(raw_data)
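To illustrate the idea behind decorator and context-manager capture, here is a minimal stand-in that records the exception message and the crash frame's locals when a block raises, then re-raises. It is not llmdebug's implementation; `capture_section`, the in-memory `captured` list, and the record layout are hypothetical.

```python
from contextlib import contextmanager

captured: list[dict] = []  # in-memory stand-in for snapshot storage


@contextmanager
def capture_section(name: str):
    """Record exception evidence for the enclosed block, then re-raise."""
    try:
        yield
    except Exception as exc:
        # Walk to the innermost traceback entry, i.e. the crash frame.
        tb = exc.__traceback__
        while tb.tb_next is not None:
            tb = tb.tb_next
        captured.append({
            "section": name,
            "exception": f"{type(exc).__name__}: {exc}",
            "function": tb.tb_frame.f_code.co_name,
            "line": tb.tb_lineno,
            # repr() keeps the record serializable for arbitrary objects.
            "locals": {k: repr(v) for k, v in tb.tb_frame.f_locals.items()},
        })
        raise


def scale(value, factor):
    return value / factor


try:
    with capture_section("feature_pipeline"):
        scale(1, 0)
except ZeroDivisionError:
    pass

print(captured[0]["exception"], "in", captured[0]["function"])
```

Re-raising after capture leaves the program's error handling unchanged, so the evidence is a side effect rather than a change in control flow.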
%load_ext llmdebug
%llmdebug
%llmdebug hypothesize
%llmdebug diff
import llmdebug
llmdebug.install_hooks()
# captures:
# sys.excepthook
# threading.excepthook
# sys.unraisablehook
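As an illustration of how a rate-limited exception hook can work, the sketch below wraps the standard `sys.excepthook` mechanism using only the standard library. It is a generic stand-in rather than a description of what `install_hooks()` does internally; `RateLimiter`, `make_hook`, and the limit of 2 captures per 60 seconds are hypothetical.

```python
import sys
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_events` within a sliding window of `window_s` seconds."""

    def __init__(self, max_events: int, window_s: float, clock=time.monotonic):
        self.max_events = max_events
        self.window_s = window_s
        self.clock = clock
        self.events = deque()  # timestamps of accepted events

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.events and now - self.events[0] >= self.window_s:
            self.events.popleft()
        if len(self.events) < self.max_events:
            self.events.append(now)
            return True
        return False


def make_hook(limiter: RateLimiter, sink: list, previous_hook=sys.__excepthook__):
    """Build an excepthook that records evidence, then chains to the previous hook."""

    def hook(exc_type, exc, tb):
        if limiter.allow():
            sink.append(f"{exc_type.__name__}: {exc}")
        # Always delegate so the normal traceback output is preserved.
        previous_hook(exc_type, exc, tb)

    return hook


records: list[str] = []
limiter = RateLimiter(max_events=2, window_s=60.0)
# To activate: sys.excepthook = make_hook(limiter, records, sys.excepthook)
```

Chaining to the previous hook keeps existing crash reporting intact, while the limiter bounds how many records a crash loop can produce.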
{
  "mcpServers": {
    "llmdebug": {
      "command": "uvx",
      "args": ["llmdebug[mcp]", "serve"]
    }
  }
}
This section summarizes data handling safeguards and operational constraints for practical deployments.
Built-in redaction controls help reduce accidental leakage of sensitive fields in stored snapshots.
Exception hooks apply rate limits to avoid artifact floods during repeated failures.
Snapshots are stored locally by default, enabling offline and air-gapped debugging workflows.
The system provides structured context for diagnosis but does not establish causal correctness.
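A redaction policy of the kind described above can be sketched as a recursive pass over captured locals that masks values whose keys look sensitive. This is an illustration only, not llmdebug's redaction API; the `redact` helper, the mask string, and the key pattern are hypothetical.

```python
import re
from typing import Any

# Hypothetical pattern for keys that should never be stored in clear text.
SENSITIVE_KEY = re.compile(r"(password|secret|token|api_?key)", re.IGNORECASE)


def redact(value: Any, mask: str = "[REDACTED]") -> Any:
    """Return a copy of `value` with sensitive dict entries masked."""
    if isinstance(value, dict):
        return {
            k: mask if isinstance(k, str) and SENSITIVE_KEY.search(k) else redact(v, mask)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with redacted elements.
        return type(value)(redact(v, mask) for v in value)
    return value


frame_locals = {
    "user": "ada",
    "api_key": "sk-123",
    "config": {"db_password": "hunter2", "retries": 3},
}
print(redact(frame_locals))
```

Key-based masking only catches secrets stored under recognizable names; values embedded in free-form strings would need a separate value-level check.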
This section states interpretation limits and known boundaries of the current implementation.
evals/ and are separate from this page.
This section provides a software citation template and release-traceable project references.
@software{vadasz2026llmdebug,
  author = {Vadasz, Nicolas},
  title = {llmdebug: Structured Debug Snapshots for LLM-Assisted Debugging},
  year = {2026},
  url = {https://github.com/nicholasvadasz/llmdebug},
  license = {MIT}
}