Metadata-Version: 2.4
Name: beadloom
Version: 0.5.0
Summary: Context Oracle + Doc Sync Engine for AI-assisted development
Author: Beadloom Contributors
License-Expression: MIT
License-File: LICENSE
Keywords: ai,context,documentation,knowledge-graph,mcp
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Documentation
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: mcp>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: tree-sitter-python>=0.23
Requires-Dist: tree-sitter>=0.23
Provides-Extra: all
Requires-Dist: mypy>=1.13; extra == 'all'
Requires-Dist: pytest-cov>=5.0; extra == 'all'
Requires-Dist: pytest>=8.0; extra == 'all'
Requires-Dist: ruff>=0.8; extra == 'all'
Requires-Dist: tree-sitter-go>=0.23; extra == 'all'
Requires-Dist: tree-sitter-rust>=0.23; extra == 'all'
Requires-Dist: tree-sitter-typescript>=0.23; extra == 'all'
Requires-Dist: types-pyyaml>=6.0; extra == 'all'
Provides-Extra: dev
Requires-Dist: mypy>=1.13; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.8; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0; extra == 'dev'
Provides-Extra: languages
Requires-Dist: tree-sitter-go>=0.23; extra == 'languages'
Requires-Dist: tree-sitter-rust>=0.23; extra == 'languages'
Requires-Dist: tree-sitter-typescript>=0.23; extra == 'languages'
Description-Content-Type: text/markdown

# Beadloom

> Read this in other languages: [Русский](README.ru.md)

**Your architecture shouldn't live in one person's head.**

[![License: MIT](https://img.shields.io/github/license/zoologov/beadloom)](LICENSE)
[![GitHub release](https://img.shields.io/github/v/release/zoologov/beadloom)](https://github.com/zoologov/beadloom/releases)
[![PyPI](https://img.shields.io/pypi/v/beadloom)](https://pypi.org/project/beadloom/)
[![Python](https://img.shields.io/pypi/pyversions/beadloom)](https://pypi.org/project/beadloom/)
[![CI](https://img.shields.io/github/actions/workflow/status/zoologov/beadloom/ci.yml?label=CI)](https://github.com/zoologov/beadloom/actions)
[![mypy: strict](https://img.shields.io/badge/mypy-strict-blue)](https://mypy-lang.org/)
[![code style: ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![coverage: 80%+](https://img.shields.io/badge/coverage-80%25%2B-green)](pyproject.toml)

---

Beadloom is a knowledge management tool for codebases. It turns scattered architecture knowledge into an explicit, queryable graph that lives in your Git repository — accessible to both humans and AI agents.

> Your IDE finds code. Beadloom tells you what that code means in the context of your system.

**Platforms:** macOS, Linux, Windows &nbsp;|&nbsp; **Python:** 3.10+

## Why Beadloom?

Large codebases have a knowledge problem that code search alone doesn't solve:

- **"Only two people understand how this system works."** Architecture knowledge lives in heads, not in the repo. When those people leave, the knowledge goes with them.
- **"The docs are lying."** Documentation goes stale within weeks. Nobody notices until an agent or a new hire builds on top of outdated specs.
- **"AI agents reinvent context every session."** Each agent run starts from scratch — grepping, reading READMEs, guessing which files matter. Most of the context window burns on orientation, not on actual work.

Beadloom solves this with two primitives:

1. **Context Oracle** — a knowledge graph (YAML in Git) that maps your domains, features, services, and their relationships. Query any node and get a deterministic, compact context bundle in <20ms. Same query, same result, every time.

2. **Doc Sync Engine** — tracks which docs correspond to which code. Detects stale documentation on every commit. No more "the spec says X but the code does Y".

### Deterministic context, not probabilistic guessing

IDE indexers use semantic search: embeddings and LLM ranking decide what's relevant. This works for "find similar code" but fails for "explain this feature in the context of the whole system".

Beadloom uses **deterministic graph traversal**: your team defines the architecture graph, and BFS produces the same context bundle every time. The graph is YAML in Git — reviewable in PRs, auditable, version-controlled.

|  | Semantic search (IDE) | Beadloom |
|---|---|---|
| **Answers** | "Where is this class?" | "What is this feature and how does it fit?" |
| **Method** | Embeddings + LLM ranking | Explicit graph + BFS traversal |
| **Result** | Probabilistic file list | Deterministic context bundle |
| **Docs** | Doesn't track freshness | Catches stale docs on every commit |
| **Knowledge** | Dies with the session | Lives in Git, survives team changes |

Beadloom doesn't replace your IDE. It gives your IDE — and your agents — the architectural context they can't infer from code alone.

## Install

```bash
uv tool install beadloom        # recommended
pipx install beadloom            # alternative
```

## Quick start

```bash
# 1. Scan your codebase and generate a knowledge graph
beadloom init --bootstrap

# 2. Review the generated graph (edit domains, rename nodes, add edges)
vi .beadloom/_graph/services.yml

# 3. Build the index and start using it
beadloom reindex
beadloom ctx AUTH-001              # get context for a feature
beadloom sync-check                # check if docs are up to date
```

No documentation required to start — Beadloom bootstraps from code structure alone.

### Connect AI agents via MCP

```bash
beadloom setup-mcp                 # creates .mcp.json automatically
```

Agents can then call `get_context("AUTH-001")` and receive a ready-made bundle, spending zero tokens on search. The generated `.mcp.json` registers Beadloom as an MCP server:

```json
{
  "mcpServers": {
    "beadloom": {
      "command": "beadloom",
      "args": ["mcp-serve"]
    }
  }
}
```

Works with Claude Code, Cursor, and any MCP-compatible tool.
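
If a project already has a `.mcp.json` that lists other servers, the Beadloom entry can also be merged in by hand. The following is a minimal sketch, assuming the file uses the `mcpServers` layout shown above; `add_beadloom_server` and `update_mcp_file` are hypothetical helpers, not part of Beadloom, and `beadloom setup-mcp` remains the supported path.

```python
import json
from pathlib import Path


def add_beadloom_server(config: dict) -> dict:
    """Return a copy of an MCP config with the beadloom server registered.

    Hypothetical helper for illustration; servers already present are kept.
    """
    servers = dict(config.get("mcpServers", {}))
    # Entry copied from the .mcp.json example in this README.
    servers["beadloom"] = {"command": "beadloom", "args": ["mcp-serve"]}
    return {**config, "mcpServers": servers}


def update_mcp_file(path: Path) -> None:
    """Read .mcp.json if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_beadloom_server(config)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Calling `update_mcp_file(Path(".mcp.json"))` from the repository root would leave any other configured servers untouched.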

## Who is it for?

**Tech Lead / Architect** — You want architecture knowledge to be explicit, versionable, and survive team rotation. Beadloom makes the implicit explicit: domains, features, services, dependencies — all in YAML, all in Git.

**Platform / DevEx Engineer** — You build tooling for the team. Beadloom gives your agents structured context out of the box (via MCP), and your CI pipeline a doc freshness check that actually works.

**Individual Developer** — You're tired of spending the first hour on every task figuring out "how does this part of the system work?" `beadloom ctx FEATURE-ID` gives you the answer in seconds.

## Key features

- **Context Oracle** — deterministic graph traversal, compact JSON bundle in <20ms
- **Doc Sync Engine** — tracks code↔doc relationships, detects stale documentation, integrates with git hooks
- **Code-first onboarding** — bootstrap a knowledge graph from code structure alone; no docs needed to start
- **Doc import** — classify and link existing scattered documentation (`init --import`)
- **MCP server** — native integration with Claude Code, Cursor, and other MCP-compatible agents
- **Local-first** — single CLI + single SQLite file, no Docker, no cloud dependencies

## How it works

Beadloom maintains a **knowledge graph** defined in YAML files under `.beadloom/_graph/`. The graph consists of **nodes** (features, services, domains, entities, ADRs) connected by **edges** (part_of, uses, depends_on, etc.).

The indexing pipeline merges three sources into a single SQLite database:

1. **Graph YAML** — nodes and edges that describe the project architecture
2. **Documentation** — Markdown files linked to graph nodes, split into searchable chunks
3. **Code** — source files parsed with tree-sitter to extract symbols and `# beadloom:feature=AUTH-001` annotations

When you request context for a node, the Context Oracle runs a breadth-first traversal, collects the relevant subgraph, documentation, and code symbols, and returns a compact bundle.
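
The traversal step can be pictured as a depth-bounded breadth-first search. The sketch below is illustrative only and makes several assumptions that do not come from Beadloom's implementation: an in-memory edge list, edges followed in both directions, a `max_depth` parameter, and sorted neighbor order as the source of determinism. The `postgres` node is a made-up example.

```python
from collections import deque


def collect_subgraph(edges, start, max_depth=2):
    """Breadth-first traversal that returns node ids in a stable order.

    `edges` is a list of (source, kind, target) tuples. Edges are followed
    in both directions, and neighbors are visited in sorted order, so the
    same graph and start node always produce the same result.
    """
    neighbors = {}
    for src, _kind, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                visited.append(nxt)
                queue.append((nxt, depth + 1))
    return visited


# Hypothetical graph using the edge kinds named above.
edges = [
    ("AUTH-001", "part_of", "auth"),
    ("AUTH-001", "uses", "user-service"),
    ("user-service", "depends_on", "postgres"),
    ("billing", "depends_on", "postgres"),
]
print(collect_subgraph(edges, "AUTH-001", max_depth=2))
# → ['AUTH-001', 'auth', 'user-service', 'postgres']
```

Because neighbors are sorted before they are enqueued, the result does not depend on the order in which edges were loaded, which is the property the "same query, same result" claim relies on.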

The Doc Sync Engine tracks which documentation files correspond to which code files. On every commit (via a git hook), it detects stale docs and either warns or blocks the commit.
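
One way to picture staleness detection is a content-hash comparison: record a hash of the linked code when a doc was last synced, then flag the doc once the current code no longer matches. This README does not describe how the Doc Sync Engine actually decides staleness, so treat the sketch below as an assumption; `SyncRecord` and `find_stale_docs` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


def content_hash(text: str) -> str:
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class SyncRecord:
    doc_path: str
    code_path: str
    code_hash_at_sync: str  # hash of the code when the doc was last reviewed


def find_stale_docs(records, current_code):
    """Return doc paths whose linked code changed since the last sync.

    `current_code` maps code_path -> current file contents. A missing
    code file also marks the doc as stale.
    """
    stale = set()
    for rec in records:
        source = current_code.get(rec.code_path)
        if source is None or content_hash(source) != rec.code_hash_at_sync:
            stale.add(rec.doc_path)
    return sorted(stale)
```

A git hook built on a check like this could print the returned list as a warning, or exit with a non-zero status to block the commit.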

## CLI commands

| Command | Description |
|---------|-------------|
| `init --bootstrap` | Scan code and generate an initial knowledge graph |
| `init --import DIR` | Import and classify existing documentation |
| `reindex` | Rebuild the SQLite index from graph, docs, and code |
| `ctx REF_ID` | Get a context bundle (Markdown or `--json`) |
| `graph [REF_ID]` | Visualize the knowledge graph (Mermaid or JSON) |
| `status` | Project index statistics and documentation coverage |
| `doctor` | Validate the knowledge graph |
| `sync-check` | Check doc↔code synchronization status |
| `sync-update REF_ID` | Review and update stale docs |
| `install-hooks` | Install the beadloom pre-commit hook |
| `setup-mcp` | Configure MCP server for AI agents |
| `mcp-serve` | Run the MCP server (stdio transport) |
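
For scripting, the `--json` output of `ctx` can be consumed from Python. This is a minimal sketch, assuming `beadloom` is on `PATH` and that `ctx REF_ID --json` prints a single JSON document to stdout; it makes no assumption about the bundle's schema. `load_context` is a hypothetical wrapper, not a Beadloom API.

```python
import json
import subprocess


def load_context(ref_id, run=subprocess.run):
    """Run `beadloom ctx REF_ID --json` and return the parsed bundle.

    `run` is injectable so the function can be exercised without the CLI.
    """
    result = run(
        ["beadloom", "ctx", ref_id, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```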

## MCP tools

| Tool | Description |
|------|-------------|
| `get_context` | Context bundle for a ref_id (graph + docs + code symbols) |
| `get_graph` | Subgraph around a node (nodes and edges as JSON) |
| `list_nodes` | List graph nodes, optionally filtered by kind |
| `sync_check` | Check if documentation is up-to-date with code |
| `get_status` | Documentation coverage and index statistics |

## Configuration

All project data lives under `.beadloom/` in your repository root:

- **`.beadloom/config.yml`** — scan paths, languages, sync engine settings
- **`.beadloom/_graph/*.yml`** — knowledge graph definition (YAML, version-controlled)
- **`.beadloom/beadloom.db`** — SQLite index (auto-generated, add to `.gitignore`)

Link code to graph nodes with annotations:

```python
# beadloom:feature=AUTH-001
# beadloom:service=user-service
def authenticate(user_id: str) -> bool:
    ...
```
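
Annotations of this shape are simple enough to recognize with a regular expression. Beadloom itself parses source files with tree-sitter, so the regex below is a stand-in for illustration, and it assumes each annotation sits on its own comment line in the form `# beadloom:<kind>=<ref_id>`. `find_annotations` is a hypothetical name.

```python
import re

# Matches lines such as "# beadloom:feature=AUTH-001".
ANNOTATION = re.compile(r"^\s*#\s*beadloom:(?P<kind>[\w-]+)=(?P<ref_id>[\w.-]+)\s*$")


def find_annotations(source):
    """Return (line_number, kind, ref_id) for each annotation comment."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = ANNOTATION.match(line)
        if match:
            found.append((lineno, match["kind"], match["ref_id"]))
    return found


sample = '''\
# beadloom:feature=AUTH-001
# beadloom:service=user-service
def authenticate(user_id: str) -> bool:
    ...
'''
print(find_annotations(sample))
# → [(1, 'feature', 'AUTH-001'), (2, 'service', 'user-service')]
```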

## Documentation structure

Beadloom uses a domain-first layout:

```text
docs/
  architecture.md
  decisions/
    ADR-001-cache-strategy.md
  domains/
    auth/
      README.md                  # domain overview, invariants
      features/
        AUTH-001/
          SPEC.md
    billing/
      README.md
  _imported/                     # unclassified docs from import
```
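
With this layout, a document's owner can be read off its path. The sketch below assumes exactly the tree shown above and nothing more; `classify_doc` and its return labels are hypothetical and do not reflect how Beadloom classifies files internally.

```python
from pathlib import PurePosixPath


def classify_doc(path):
    """Map a docs/ path to (kind, domain, feature_id) under the layout above."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "docs":
        return ("unknown", None, None)
    if parts[1] == "decisions":
        return ("adr", None, None)
    if parts[1] == "_imported":
        return ("imported", None, None)
    if parts[1] == "domains" and len(parts) >= 4:
        domain = parts[2]
        # docs/domains/<domain>/features/<FEATURE-ID>/<file>
        if len(parts) >= 6 and parts[3] == "features":
            return ("feature", domain, parts[4])
        return ("domain", domain, None)
    return ("general", None, None)


print(classify_doc("docs/domains/auth/features/AUTH-001/SPEC.md"))
# → ('feature', 'auth', 'AUTH-001')
```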

## Beads integration

*A context loom for your [beads](https://github.com/steveyegge/beads).*

Beadloom complements [Beads](https://github.com/steveyegge/beads) by providing structured context to planner/coder/reviewer agents. Beads workers call `get_context(feature_id)` via MCP and receive a ready-made bundle instead of searching the codebase from scratch.

Beadloom works independently of Beads — the integration is optional.

## Development

```bash
uv sync --dev              # install with dev dependencies
uv run pytest              # run tests
uv run ruff check src/     # lint
uv run ruff format src/    # format
uv run mypy                # type checking (strict mode)
```

## Docs

| Document | Description |
|----------|-------------|
| [architecture.md](docs/architecture.md) | System design and component overview |
| [getting-started.md](docs/getting-started.md) | Quick start guide |
| [context-oracle.md](docs/context-oracle.md) | BFS algorithm and context assembly |
| [cli-reference.md](docs/cli-reference.md) | CLI commands reference |
| [mcp-server.md](docs/mcp-server.md) | MCP integration guide |
| [sync-engine.md](docs/sync-engine.md) | Doc sync engine details |
| [graph-format.md](docs/graph-format.md) | YAML graph format specification |

## License

MIT
