Metadata-Version: 2.4
Name: tokennuke
Version: 0.1.0
Summary: Nuke your token usage. Intelligent code indexer with O(1) symbol retrieval, hybrid search, and call graph analysis.
Project-URL: Homepage, https://github.com/BigJai/tokennuke
Project-URL: Issues, https://github.com/BigJai/tokennuke/issues
Author-email: Jai Dunlop <jaidunlop85@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,code-indexing,code-search,mcp,tokens,tree-sitter
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.11
Requires-Dist: fastembed>=0.6.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: mcp[cli]>=1.6.0
Requires-Dist: pathspec>=0.12.0
Requires-Dist: sqlite-vec>=0.1.0
Requires-Dist: tree-sitter-language-pack>=0.7.0
Description-Content-Type: text/markdown

# TokenNuke

**Nuke your token usage.** Intelligent code indexer MCP server with O(1) symbol retrieval, hybrid search, and call graph analysis.

TokenNuke indexes your codebase using tree-sitter, extracts every function, class, method, constant, and type with byte offsets, then serves them to AI agents via MCP. Instead of reading entire files, agents retrieve exactly the symbols they need — saving 80-95% of tokens.

## Features

- **O(1) Symbol Retrieval** — Byte-offset reads. No file scanning.
- **10 Languages** — Python, JavaScript, TypeScript, Go, Rust, Java, C, C++, C#, Ruby
- **Hybrid Search** — FTS5 keyword search + vector embeddings via Reciprocal Rank Fusion
- **Call Graph** — Trace "who calls this?" and "what does this call?" with depth traversal
- **Incremental Indexing** — SHA-256 per file. Only re-indexes what changed.
- **Full-Text Search** — Search raw file contents (strings, comments, config values)
- **No File Limit** — SQLite backend, not JSON. Index repos with 100K+ files.
- **Dual Transport** — stdio (local) or streamable-http (remote)
- **Security** — Path traversal protection, secret file detection, binary filtering
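
As an illustration of the hybrid search feature above, here is a minimal sketch of Reciprocal Rank Fusion over two ranked result lists. The function name, the sample symbol names, and the constant `k=60` (a commonly used default for RRF) are assumptions for illustration, not TokenNuke's actual implementation or settings.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each item scores 1 / (k + rank) per list it appears in; scores are summed.
    k=60 is an assumed, commonly used default, not a TokenNuke setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["parse_file", "parse_args", "load_config"]
vector_hits = ["load_config", "parse_file", "read_settings"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Symbols that rank well in both lists rise to the top, while a symbol found by only one search still appears further down.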

## Install

```bash
pip install tokennuke
```

## Quick Start

### Claude Code / Claude Desktop

Add to your MCP config (`~/.claude.json` for Claude Code, `claude_desktop_config.json` for Claude Desktop):

```json
{
  "mcpServers": {
    "tokennuke": {
      "command": "tokennuke",
      "args": []
    }
  }
}
```

### CLI

```bash
# Start in stdio mode (default)
tokennuke

# Start as HTTP server
tokennuke --transport streamable-http --port 5100
```

## 13 MCP Tools

| Tool | Description |
|------|-------------|
| `index_folder` | Index a local directory |
| `index_repo` | Clone and index a Git repo (GitHub/GitLab/Bitbucket) |
| `list_repos` | List all indexed repositories |
| `invalidate_cache` | Force full re-index |
| `file_tree` | Directory tree with file counts |
| `file_outline` | All symbols in a single file |
| `repo_outline` | All symbols in a repo (summary) |
| `get_symbol` | Full source of one symbol (O(1) byte seek) |
| `get_symbols` | Fetch multiple symbols in one call |
| `search_symbols` | Hybrid FTS5 + vector search |
| `search_text` | Full-text search in file contents |
| `get_callees` | What does this function call? |
| `get_callers` | Who calls this function? |
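
To give a feel for what `get_callees` and `get_callers` answer, the following sketch runs a depth-limited breadth-first traversal over a small call graph. The edge data, the helper names `reachable` and `invert`, and the `max_depth` parameter are hypothetical; the real tools' arguments and storage may differ.

```python
from collections import deque


def reachable(edges, start, max_depth):
    """Return {symbol: depth} for symbols within max_depth hops of start."""
    depths = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                depths[nxt] = depth + 1
                queue.append((nxt, depth + 1))
    return depths


def invert(edges):
    """Turn caller -> callees edges into callee -> callers edges."""
    callers = {}
    for caller, callees in edges.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)
    return callers


# Hypothetical call edges: caller -> list of callees.
CALLS = {
    "main": ["load_config", "run"],
    "run": ["parse_file", "load_config"],
    "parse_file": ["read_bytes"],
}

callees_of_main = reachable(CALLS, "main", max_depth=2)
callers_of_load_config = reachable(invert(CALLS), "load_config", max_depth=1)
```

The same traversal serves both directions: "what does this call?" follows the edges as stored, and "who calls this?" follows the inverted edges.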

## How It Works

1. **Index**: Tree-sitter parses source files into ASTs. TokenNuke walks each AST and extracts symbols (functions, classes, methods, constants, types) with their exact byte offsets.
2. **Store**: Symbols go into per-repo SQLite databases with FTS5 indexes and sqlite-vec embeddings.
3. **Search**: Hybrid search combines BM25 keyword matching with semantic vector similarity via Reciprocal Rank Fusion.
4. **Retrieve**: `get_symbol` seeks to the exact byte offset in the source file and reads only that symbol's bytes. No wasted tokens.
5. **Graph**: Call expressions are extracted from function bodies and stored as edges. Callee names are resolved to symbol IDs after indexing.
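
The retrieve step can be sketched as a seek followed by a bounded read. This is a simplified illustration, not TokenNuke's code: `read_symbol` is a hypothetical helper, and the offsets are found here with a byte search as a stand-in for the values a tree-sitter pass would store in the index.

```python
import os
import tempfile


def read_symbol(path, start_byte, end_byte):
    """Read only the bytes of one symbol, without scanning the rest of the file."""
    with open(path, "rb") as f:
        f.seek(start_byte)
        return f.read(end_byte - start_byte).decode("utf-8")


source = (
    "import os\n"
    "\n"
    "def greet(name):\n"
    "    return f'hello {name}'\n"
    "\n"
    "def farewell(name):\n"
    "    return f'bye {name}'\n"
).encode("utf-8")

# Write the sample source to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
    tmp.write(source)
    path = tmp.name

# Stand-in for an index lookup: in practice the parser supplies these offsets.
start = source.index(b"def greet")
end = source.index(b"\n\ndef farewell")

snippet = read_symbol(path, start, end)
os.remove(path)
print(snippet)
```

The work done is proportional to the size of the symbol rather than the size of the file, which is where the token savings come from.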

## Token Savings Example

| Method | Tokens | Savings |
|--------|--------|---------|
| Read entire file (500 lines) | ~4,000 | — |
| `file_outline` + `get_symbol` | ~200-400 | **90-95%** |
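
The percentages in the table follow directly from the approximate token counts, as this short calculation shows. `savings_percent` is a hypothetical helper used only for this check.

```python
def savings_percent(baseline_tokens, actual_tokens):
    """Percentage of tokens saved relative to reading the whole file."""
    return 100 * (baseline_tokens - actual_tokens) / baseline_tokens


full_file = 4000  # approximate tokens for the 500-line file in the table

low = savings_percent(full_file, 400)   # upper end of the ~200-400 range
high = savings_percent(full_file, 200)  # lower end of the ~200-400 range
print(f"{low:.0f}-{high:.0f}% saved")
```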

## Development

```bash
git clone https://github.com/BigJai/tokennuke
cd tokennuke
pip install -e ".[dev]"
pytest tests/
```

## License

MIT
