Metadata-Version: 2.4
Name: tokenette
Version: 2.0.0
Summary: The Ultimate All-in-One AI Coding Enhancement MCP - Zero-Loss Token Optimization · Intelligent Model Routing · Quality Amplification
Project-URL: Homepage, https://github.com/itsmeadarsh2008/tokenette
Project-URL: Documentation, https://tokenette.dev
Project-URL: Repository, https://github.com/itsmeadarsh2008/tokenette
Project-URL: Issues, https://github.com/itsmeadarsh2008/tokenette/issues
Author-email: Adarsh Gourab Mahalik <adarshgourabmahalik@outlook.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai-coding,claude,copilot,fastmcp,llm,mcp,model-context-protocol,token-optimization
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: aiofiles>=24.0.0
Requires-Dist: cachetools>=5.3.0
Requires-Dist: diskcache>=5.6.0
Requires-Dist: fastmcp>=2.0.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: orjson>=3.9.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: rich>=13.7.0
Requires-Dist: typer>=0.12.0
Requires-Dist: watchfiles>=0.21.0
Requires-Dist: xxhash>=3.4.0
Provides-Extra: dev
Requires-Dist: mypy>=1.9.0; extra == 'dev'
Requires-Dist: pre-commit>=3.6.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.3.0; extra == 'dev'
Requires-Dist: twine>=5.0.0; extra == 'dev'
Provides-Extra: full
Requires-Dist: mypy>=1.9.0; extra == 'full'
Requires-Dist: numpy>=1.26.0; extra == 'full'
Requires-Dist: pre-commit>=3.6.0; extra == 'full'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'full'
Requires-Dist: pytest-cov>=4.1.0; extra == 'full'
Requires-Dist: pytest>=8.0.0; extra == 'full'
Requires-Dist: ruff>=0.3.0; extra == 'full'
Requires-Dist: sentence-transformers>=2.5.0; extra == 'full'
Requires-Dist: twine>=5.0.0; extra == 'full'
Provides-Extra: vector
Requires-Dist: numpy>=1.26.0; extra == 'vector'
Requires-Dist: sentence-transformers>=2.5.0; extra == 'vector'
Description-Content-Type: text/markdown

# 🪙 Tokenette

**The Ultimate All-in-One AI Coding Enhancement MCP**

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastMCP](https://img.shields.io/badge/FastMCP-2.0+-green.svg)](https://github.com/jlowin/fastmcp)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

> **Zero-Loss Token Optimization · Intelligent Model Routing · Quality Amplification**

Tokenette lets any AI model deliver GPT-4.5-level quality at GPT-4o cost. It achieves **90-99% token reduction** without sacrificing code quality through intelligent caching, semantic compression, and dynamic tool discovery.

## ✨ Key Features

### 🧠 Intelligent Model Routing
- **Complexity Detection**: Automatically routes tasks to the cheapest model that can handle them
- **Budget Tracking**: Tracks premium request usage (300/month for Pro)
- **Auto-Mode Discount**: Applies the 10% discount available when using auto model selection
- **Adaptive Learning**: Improves routing based on past interactions

### 📦 Token Optimization Stack
- **Multi-Layer Cache (L1-L4)**: 99.8% savings on repeated data
- **Minification Engine**: 20-61% savings (JSON/Code/TOON formats)
- **Semantic Compression**: 30-50% additional savings
- **Cross-File Deduplication**: 40-60% savings on shared code

### ✨ Quality Amplification
- **Expert Role Framing**: Makes cheap models think like senior engineers
- **Chain-of-Thought Injection**: Adds reasoning steps automatically
- **Few-Shot Examples**: Injects category-specific examples
- **Structured Output**: Enforces consistent response formats
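
As a rough illustration of how such boosters could compose, the sketch below wraps a prompt with fixed text snippets. The booster names `expert_role_framing` and `chain_of_thought_injection` appear in this README's Python API example; the `structured_output` name, the template text, and the standalone `amplify` function are assumptions made for this example, not the `QualityAmplifier` implementation.

```python
# Template text is hypothetical; only the first two booster names come
# from this README.
BOOSTERS = {
    "expert_role_framing": lambda p: "You are a senior software engineer.\n\n" + p,
    "chain_of_thought_injection": lambda p: (
        p + "\n\nReason step by step before giving the final answer."
    ),
    "structured_output": lambda p: (
        p + "\n\nRespond with sections: Plan, Code, Edge cases."
    ),
}


def amplify(prompt: str, boosters: list[str]) -> str:
    """Apply each named booster in order and return the enhanced prompt."""
    enhanced = prompt
    for name in boosters:
        if name not in BOOSTERS:
            raise KeyError(f"unknown booster: {name}")
        enhanced = BOOSTERS[name](enhanced)
    return enhanced


print(amplify(
    "Write a user authentication service",
    ["expert_role_framing", "chain_of_thought_injection"],
))
```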

### 🔧 Smart File Operations
- **AST-Based Reading**: Extract structure without full content
- **Diff-Based Writing**: 97% savings vs. full file rewrites
- **Batch Operations**: Combine multiple reads/writes into one request
- **Semantic Search**: Find code by meaning, not just text

### 📚 Context7 Integration
- **Up-to-Date Docs**: Fetches current library documentation
- **Intelligent Caching**: Caches docs with appropriate TTLs
- **Semantic Search**: Find relevant docs across libraries

## 📊 Real Model Costs (GitHub Copilot Pro)

| Model | Multiplier | Effective Uses/Month | Best For |
|-------|------------|---------------------|----------|
| GPT-5 mini | 0× (FREE) | ∞ | Quick edits, prototyping |
| GPT-4.1 | 0× (FREE) | ∞ | General coding, boilerplate |
| GPT-4o | 0× (FREE) | ∞ | Multimodal, general tasks |
| Gemini 2.0 Flash | 0.25× | 1,200 | Speed-critical tasks |
| o4-mini | 0.33× | 900 | Cost-efficient reasoning |
| Claude Sonnet 4 | 1× (0.9× auto) | 300 | Complex logic, multi-file |
| Gemini 2.5 Pro | 1× | 300 | Large context, architecture |
| Claude Opus 4.5 | 3× | 100 | Critical reasoning |
| Claude Opus 4 | 10× | 30 | Expert-level tasks |
| GPT-4.5 | 50× | 6 | AVOID |
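
The "Effective Uses/Month" column follows from dividing the monthly premium budget by each model's multiplier. The sketch below reproduces that arithmetic. The `effective_uses` helper is hypothetical, and the 300-request budget is the Pro figure quoted in this README. Note that the table rounds the o4-mini value (about 909) down to 900.

```python
import math

MONTHLY_PREMIUM_REQUESTS = 300  # Pro budget quoted in this README


def effective_uses(multiplier: float, budget: int = MONTHLY_PREMIUM_REQUESTS) -> float:
    """Requests per month a model allows before the budget is exhausted."""
    if multiplier == 0:
        return math.inf  # free models never consume the budget
    return budget / multiplier


for name, multiplier in [("GPT-4.1", 0), ("Gemini 2.0 Flash", 0.25),
                         ("Claude Sonnet 4", 1), ("GPT-4.5", 50)]:
    print(f"{name}: {effective_uses(multiplier)}")
```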

## 🚀 Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/itsmeadarsh2008/tokenette.git
cd tokenette

# Install with pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .
```

### Basic Usage

```bash
# Start the MCP server (stdio transport - default)
tokenette run

# Start with SSE transport
tokenette run --transport sse --port 8000

# View metrics
tokenette metrics

# Analyze code
tokenette analyze src/
```

### VS Code / GitHub Copilot Integration

Add the following to your `mcp.json` (or the equivalent MCP section of your editor settings):

```json
{
  "mcpServers": {
    "tokenette": {
      "command": "tokenette",
      "args": ["run"]
    }
  }
}
```

### Python API

```python
from tokenette import mcp, TaskRouter, QualityAmplifier

# Get optimal model for a task
router = TaskRouter()
decision = router.route("refactor authentication module", {"affected_files": 5})
print(f"Use: {decision.model} ({decision.multiplier}×)")

# Amplify a prompt for cheaper models
amplifier = QualityAmplifier()
result = amplifier.amplify(
    "Write a user authentication service",
    boosters=["expert_role_framing", "chain_of_thought_injection"],
    category="generation",
    context={}
)
print(result.enhanced_prompt)
```

## 🛠️ Available Tools

### Meta Tools (Start Here!)
- `tokenette_discover_tools` - List available tools efficiently (96% token savings)
- `tokenette_get_tool_details` - Get full schema for a specific tool

### File Operations
- `tokenette_read_file` - Smart file reading with multiple strategies
- `tokenette_write_file` - Diff-based file writing (97% savings)
- `tokenette_search_code` - Semantic code search
- `tokenette_get_structure` - AST-based file structure
- `tokenette_batch_read` - Read multiple files with deduplication
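
The idea behind diff-based writing is that a small patch is far shorter than the whole file. The sketch below demonstrates this with the standard library's `difflib.unified_diff`. The `make_patch` helper is hypothetical, and it is an assumption that `tokenette_write_file` uses a comparable unified-diff format.

```python
import difflib


def make_patch(old: str, new: str, path: str = "file.py") -> str:
    """Return a unified diff with one line of context around each change."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        n=1,
    )
    return "".join(diff)


# A 200-line file in which a single line changes.
old = "".join(f"value_{i} = {i}\n" for i in range(200))
new = old.replace("value_100 = 100\n", "value_100 = -1\n")

patch = make_patch(old, new)
print(patch)
print(f"full file: {len(new)} chars, patch: {len(patch)} chars")
```

The savings depend on how much of the file changes; a rewrite that touches most lines gains little from this approach.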

### Code Analysis
- `tokenette_analyze` - Full code analysis (complexity, security, style)
- `tokenette_find_bugs` - Bug and security issue detection
- `tokenette_complexity` - Cyclomatic complexity metrics
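
For readers unfamiliar with the metric, cyclomatic complexity can be approximated as one plus the number of decision points in the code. The sketch below counts those with Python's `ast` module. It is a simplified approximation written for this README, and `tokenette_complexity` may count differently.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
DECISION_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate complexity: 1 + branches + extra boolean operands."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths.
            score += len(node.values) - 1
    return score


EXAMPLE = '''\
def f(x, y):
    if x and y:
        return 1
    for i in range(3):
        if i:
            return 2
    return 0
'''

print(cyclomatic_complexity(EXAMPLE))
```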

### Documentation (Context7)
- `tokenette_resolve_lib` - Resolve library names to Context7 IDs
- `tokenette_get_docs` - Fetch library documentation
- `tokenette_search_docs` - Search across library docs

### Optimization
- `tokenette_optimize` - Apply full optimization pipeline
- `tokenette_route_task` - Get model routing recommendation
- `tokenette_amplify` - Enhance prompts for cheaper models
- `tokenette_metrics` - View session statistics

## ⚙️ Configuration

Create `.tokenette.json` in your project root:

```json
{
  "cache": {
    "l1_max_size": 104857600,
    "l1_ttl": 1800,
    "l2_max_size": 2147483648,
    "l2_ttl": 14400,
    "l2_path": ".tokenette/cache/l2"
  },
  "router": {
    "monthly_budget": 300,
    "prefer_free_models": true,
    "auto_mode_discount": 0.1
  },
  "compression": {
    "min_size": 1000,
    "quality_threshold": 0.95
  }
}
```

Or use environment variables:

```bash
export TOKENETTE_CACHE__L1_MAX_SIZE=104857600
export TOKENETTE_ROUTER__MONTHLY_BUDGET=300
```
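
The variable names above suggest a convention in which the `TOKENETTE_` prefix is stripped and `__` separates nesting levels, so that `TOKENETTE_CACHE__L1_MAX_SIZE` corresponds to `cache.l1_max_size` in the JSON file. The sketch below shows that mapping under this assumption. The `env_to_config` helper is hypothetical, and the actual parsing and type conversion are handled by Tokenette's own configuration code.

```python
def env_to_config(
    environ: dict[str, str],
    prefix: str = "TOKENETTE_",
    delimiter: str = "__",
) -> dict:
    """Turn prefixed, delimiter-nested variables into a nested dict."""
    config: dict = {}
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated variables
        path = name[len(prefix):].lower().split(delimiter)
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = raw  # values stay strings in this sketch
    return config


example_env = {
    "TOKENETTE_CACHE__L1_MAX_SIZE": "104857600",
    "TOKENETTE_ROUTER__MONTHLY_BUDGET": "300",
    "PATH": "/usr/bin",
}
print(env_to_config(example_env))
```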

## 📁 Project Structure

```
tokenette/
├── src/tokenette/
│   ├── __init__.py           # Package exports
│   ├── config.py             # Pydantic configuration
│   ├── server.py             # FastMCP server
│   ├── cli.py                # Typer CLI
│   ├── core/
│   │   ├── cache.py          # Multi-layer cache (L1-L4)
│   │   ├── minifier.py       # JSON/Code/TOON minification
│   │   ├── compressor.py     # Semantic compression
│   │   ├── optimizer.py      # Full pipeline orchestrator
│   │   ├── router.py         # Task routing engine
│   │   └── amplifier.py      # Quality amplification
│   └── tools/
│       ├── meta.py           # Dynamic tool discovery
│       ├── file_ops.py       # File operations
│       ├── analysis.py       # Code analysis
│       └── context7.py       # Documentation integration
├── tests/
├── pyproject.toml
└── README.md
```

## 🔬 How It Works

### The Three Pillars

1. **Route Right**: Assign tasks to the cheapest model that can handle them
2. **Amplify Low**: Make free/cheap models produce premium-quality output
3. **Shrink Everything**: Minify, compress, cache, batch, deduplicate

### Token Optimization Pipeline

```
Input Data
    ↓
┌─────────────────────────────────────┐
│  Stage 1: Cache Check (L1→L4)      │  99.8% savings on cache hit
└─────────────────────────────────────┘
    ↓ (cache miss)
┌─────────────────────────────────────┐
│  Stage 2: Minification             │  20-61% savings
│  • JSON → compact                   │
│  • Code → remove comments/blanks    │
│  • Arrays → TOON format             │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Stage 3: Deduplication            │  40-60% savings
│  • Remove repeated structures       │
│  • Cross-file shared code refs      │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Stage 4: Reference Extraction      │  20-40% savings
│  • Replace repeated objects w/ refs │
│  • Nested object flattening         │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Stage 5: Semantic Compression      │  30-50% savings
│  • Large text summarization         │
│  • Quality threshold: 0.95          │
└─────────────────────────────────────┘
    ↓
Optimized Output (cache & transmit)
```
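
The per-stage percentages do not simply add up. If each stage's savings applies to the output of the previous stage, the remaining fractions multiply. The sketch below works through that arithmetic for the cache-miss path (Stages 2 to 5), using the lower and upper bounds from the diagram. Treating the stages as independent is an assumption of this example, not something the pipeline guarantees.

```python
def combined_savings(stage_savings: list[float]) -> float:
    """Overall savings when each stage shrinks the previous stage's output."""
    remaining = 1.0
    for saving in stage_savings:
        remaining *= 1.0 - saving
    return 1.0 - remaining


# Stages 2-5 from the diagram: minification, deduplication,
# reference extraction, semantic compression.
low = combined_savings([0.20, 0.40, 0.20, 0.30])
high = combined_savings([0.61, 0.60, 0.40, 0.50])
print(f"low bound: {low:.1%}, high bound: {high:.1%}")
```

Under this assumption the lower bounds combine to roughly 73% and the upper bounds to roughly 95%.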

### TOON Format (Token-Optimized Object Notation)

For homogeneous arrays, TOON achieves up to **61% token savings**:

**Before (JSON):**
```json
[{"file":"auth.js","func":"validate","line":45},
 {"file":"auth.js","func":"refresh","line":67}]
```

**After (TOON):**
```
items[2]{file,func,line}:
auth.js,validate,45
auth.js,refresh,67
```
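
A minimal encoder for the tabular form shown above could look like this. The `to_toon` function is a hypothetical helper written for this README. It handles only flat rows with identical keys and does not escape commas or newlines inside values, which a real encoder would need to address.

```python
def to_toon(name: str, rows: list[dict]) -> str:
    """Encode a homogeneous list of flat dicts as a header plus CSV-like rows."""
    if not rows:
        raise ValueError("need at least one row to derive the field list")
    fields = list(rows[0].keys())
    if any(list(row.keys()) != fields for row in rows):
        raise ValueError("all rows must have the same keys in the same order")
    # Header: name[count]{field1,field2,...}:
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    body = [",".join(str(row[field]) for field in fields) for row in rows]
    return "\n".join([header, *body])


rows = [
    {"file": "auth.js", "func": "validate", "line": 45},
    {"file": "auth.js", "func": "refresh", "line": 67},
]
print(to_toon("items", rows))
```

The savings come from stating each key once in the header instead of repeating it in every object.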

## 🧪 Development

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/tokenette

# Linting
ruff check src/tokenette
```

## 📜 License

MIT License - see [LICENSE](LICENSE) for details.

## 🤝 Contributing

Contributions are welcome! Please read the contributing guidelines first.

---

**Made with ❤️ for the AI coding community**

*"Get GPT-4.5-level quality from any model at GPT-4o cost."*
