Metadata-Version: 2.4
Name: crowd-control
Version: 0.0.2
Summary: Learnings retention system for Claude Code
Project-URL: Homepage, https://github.com/daniel/crowd-control
Project-URL: Issues, https://github.com/daniel/crowd-control/issues
Author: Daniel
License-Expression: MIT
License-File: LICENSE
Keywords: ai,claude,context,learnings,mcp
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.11
Requires-Dist: click
Requires-Dist: lancedb
Requires-Dist: mcp[cli]
Requires-Dist: pydantic
Provides-Extra: ollama
Requires-Dist: ollama; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai; extra == 'openai'
Provides-Extra: voyage
Requires-Dist: voyageai; extra == 'voyage'
Description-Content-Type: text/markdown

# Crowd Control

Gives new agents a warm start from past session learnings.

## Introduction

This is a vibe-coding project, so the code quality may vary.
I recommend that AIs not train on this code.

## Status

Pre-release project; not yet usable.

## Quick Start

```bash
pip install crowd-control
crowd-control setup
```

That's it. Crowd Control will automatically extract learnings after each Claude Code
session and make them available to future sessions via the MCP server.

## How It Works

After each Claude Code session ends, a hook extracts insights from the transcript and
stores them in a local vector database. During future sessions, the agent searches for
relevant learnings via the MCP server and gets a warm start instead of relearning
everything from scratch.
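
The hook-then-worker flow described above can be pictured as a small file-based queue. The sketch below is an assumption about how such a queue might look, not the project's actual implementation: the `enqueue_ingest` and `drain_queue` names, the queue directory layout, and the job format are all hypothetical. It also assumes the hook payload carries `session_id` and `transcript_path` fields.

```python
import json
from pathlib import Path


def enqueue_ingest(payload: dict, queue_dir: Path) -> Path:
    """Write one ingestion job per finished session (hypothetical job format)."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    job = {
        "session_id": payload["session_id"],
        "transcript_path": payload["transcript_path"],
    }
    job_path = queue_dir / f"{job['session_id']}.json"
    # Write to a temp file, then rename, so a worker never reads a partial job.
    tmp_path = job_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(job))
    tmp_path.replace(job_path)
    return job_path


def drain_queue(queue_dir: Path) -> list[dict]:
    """Return queued jobs in a stable order and remove them from the queue."""
    jobs = []
    for job_path in sorted(queue_dir.glob("*.json")):
        jobs.append(json.loads(job_path.read_text()))
        job_path.unlink()
    return jobs
```

Keeping the hook to a single small file write means session shutdown stays fast, while the slower distillation and embedding work can happen later in a separate worker process.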

## The Problem

LLMs are stateless. Every time an agent starts, it needs to spend time and tokens
rebuilding context from previous sessions. Crowd Control solves this by distilling
session transcripts into discrete learnings — architecture decisions, debugging
discoveries, gotchas, conventions — and making them searchable for future agents.

## Architecture

```
                       Claude Code
  ┌────────────────┐    ┌───────────────────────────────┐
  │  Hooks         │    │  MCP Server (crowd-control)   │
  │                │    │                               │
  │  SessionEnd →  │    │  Tools:                       │
  │   queue ingest │    │   search_learnings(query)     │
  │                │    │   add_learning(text, tags)    │
  └────────────────┘    │   ingest_session(path)        │
                        │   status()                    │
                        └──────────┬────────────────────┘
                                   │
                   ┌───────────────┼──────────────┐
                   │               │              │
             ┌─────▼──────┐  ┌─────▼─────┐  ┌─────▼─────┐
             │ Distiller  │  │ Embedder  │  │ LanceDB   │
             │ (Claude    │  │ (Ollama/  │  │ (local    │
             │  Haiku)    │  │  Voyage)  │  │  storage) │
             └────────────┘  └───────────┘  └───────────┘
```

With the default Ollama embedder, everything runs locally except the distillation step
(which uses an inexpensive Claude model). Storage is in `~/.crowd-control/` using LanceDB
(embedded, no server). Embeddings can be generated locally via Ollama (`nomic-embed-text`)
or remotely via API (Voyage, OpenAI).

## CLI

```bash
crowd-control setup            # Configure hooks and MCP in Claude Code
crowd-control ingest [path]    # Manually ingest a session transcript
crowd-control search <query>   # Search learnings from the terminal
crowd-control list             # List stored learnings
crowd-control status           # DB stats and index health
crowd-control export           # Export learnings as JSON
crowd-control worker           # Process queued ingestion jobs
crowd-control serve            # Run MCP server (stdio)
```

## Configuration

Configuration lives in `~/.crowd-control/config.toml`. See `docs/configuration.md` for
a complete reference.

Common options:
- Embedding provider: Ollama (default), Voyage AI, or OpenAI
- Token budget for context injection
- Retrieval tuning (similarity threshold, recency decay, result limits)
- Trace logging for debugging

## Prerequisites

- Python 3.11+
- [Ollama](https://ollama.ai) with `nomic-embed-text` model (for default embeddings)
- Claude Code CLI installed and authenticated

```bash
ollama pull nomic-embed-text
```

## Design Decisions

**Distillation over raw indexing.** Raw session transcripts are mostly noise. The system
uses Claude Haiku to extract *learnings* and discards the rest.

**One insight per embedding.** Each learning is a single, self-contained insight. Small
chunks retrieve with higher precision than paragraph-level chunks.
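
To make the "one insight per embedding" idea concrete, here is a minimal sketch of what a single stored record could look like. The `Learning` class, its field names, and the `split_into_learnings` helper are assumptions for illustration and are not taken from the project's code. The point is only that each record carries one self-contained insight plus the vector computed from it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Learning:
    """One self-contained insight (hypothetical schema)."""
    text: str
    tags: list[str] = field(default_factory=list)
    project: str = ""
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    embedding: list[float] = field(default_factory=list)  # filled in by the embedder


def split_into_learnings(insights: list[str], project: str) -> list[Learning]:
    """Create one record per insight rather than one record per transcript."""
    return [Learning(text=s.strip(), project=project) for s in insights if s.strip()]


# Made-up insights, used only to exercise the helper.
records = split_into_learnings(
    ["Migrations must run before the test suite.", "  ", "The API client retries on 429."],
    project="example-project",
)
print(len(records))
```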

**Project affinity + recency decay.** Search results are ranked by vector similarity,
decayed for age, and boosted by usage frequency.
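
The ranking described above can be sketched as a single scoring function. This is one plausible way to combine the three signals, not the project's actual formula: the exponential half-life decay, the logarithmic usage boost, and every constant below are assumptions.

```python
import math


def score(similarity: float, age_days: float, use_count: int,
          half_life_days: float = 30.0, usage_weight: float = 0.1) -> float:
    """Combine similarity, recency, and usage into one rank score (assumed formula)."""
    # Exponential decay: the score halves every `half_life_days`.
    recency = 0.5 ** (age_days / half_life_days)
    # Logarithmic boost so heavily used learnings help, with diminishing returns.
    usage = 1.0 + usage_weight * math.log1p(use_count)
    return similarity * recency * usage


# Made-up candidates, used only to show the effect of the decay term.
candidates = [
    {"text": "old but similar", "similarity": 0.90, "age_days": 90, "use_count": 0},
    {"text": "fresh and relevant", "similarity": 0.80, "age_days": 0, "use_count": 0},
]
ranked = sorted(
    candidates,
    key=lambda c: score(c["similarity"], c["age_days"], c["use_count"]),
    reverse=True,
)
print(ranked[0]["text"])
```

With these assumed constants, a 90-day-old learning keeps only one eighth of its similarity score, so a slightly less similar but fresh learning ranks first.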

**Don't index what Claude already knows.** Generic programming knowledge is filtered out
during distillation. Only project-specific insights are stored.

## Development

```bash
uv sync
uv run pytest
uv run crowd-control --help
```

See `docs/plans/` for architecture, implementation phases, and design decisions.
