Metadata-Version: 2.4
Name: echomine
Version: 1.1.0
Summary: Library-first tool for parsing AI conversation exports with search, filtering, and markdown export
Author: Echomine Contributors
License: AGPL-3.0-or-later
Project-URL: Homepage, https://github.com/echomine/echomine
Project-URL: Documentation, https://github.com/echomine/echomine
Project-URL: Repository, https://github.com/echomine/echomine
Project-URL: Issues, https://github.com/echomine/echomine/issues
Keywords: ai,chatgpt,conversation,parser,export,search
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic<3.0.0,>=2.6.0
Requires-Dist: ijson<4.0,>=3.2.0
Requires-Dist: typer>=0.9.0
Requires-Dist: rich>=13.0.0
Requires-Dist: structlog>=23.0.0
Requires-Dist: python-slugify>=8.0.0
Requires-Dist: python-dateutil>=2.8.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-mock>=3.11.0; extra == "dev"
Requires-Dist: pytest-benchmark>=4.0.0; extra == "dev"
Requires-Dist: psutil>=5.9.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: pre-commit>=3.4.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.6.0; extra == "docs"
Requires-Dist: mkdocs-material>=9.0.0; extra == "docs"
Requires-Dist: mkdocstrings[python]>=1.0.0; extra == "docs"
Requires-Dist: pymdown-extensions>=10.0.0; extra == "docs"
Requires-Dist: ruff>=0.1.0; extra == "docs"
Dynamic: license-file

# Echomine

**Library-first tool for parsing AI conversation exports with search, filtering, and markdown export**

[![PyPI Downloads](https://img.shields.io/pepy/dt/echomine)](https://pepy.tech/project/echomine)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![Type Checked](https://img.shields.io/badge/mypy-strict-blue.svg)](https://mypy.readthedocs.io/)
[![Code Style: Ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
[![codecov](https://codecov.io/gh/aucontraire/echomine/graph/badge.svg)](https://codecov.io/gh/aucontraire/echomine)
[![Documentation](https://img.shields.io/badge/docs-mkdocs-blue.svg)](https://aucontraire.github.io/echomine/)

## Overview

Echomine is a Python library and CLI tool for parsing and searching AI conversation exports and converting them to Markdown or JSON. It was initially designed for ChatGPT exports and uses a multi-provider adapter pattern so that other AI platforms (Claude, Gemini, etc.) can be supported in the future.
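To make the adapter idea concrete, here is a minimal sketch of how a provider-agnostic contract could look. It is not Echomine's actual protocol: `Conversation`, `ConversationProvider`, `InMemoryAdapter`, and `list_titles` are hypothetical names invented for illustration, and only the `stream_conversations` method name is taken from the Quick Start in this README.

```python
from collections.abc import Iterator
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


@dataclass(frozen=True)
class Conversation:
    """Hypothetical stand-in for a parsed conversation (not Echomine's model)."""

    id: str
    title: str


class ConversationProvider(Protocol):
    """Hypothetical contract that each provider adapter would satisfy."""

    def stream_conversations(self, file_path: Path) -> Iterator[Conversation]: ...


class InMemoryAdapter:
    """Toy adapter: ignores the file and yields fixed data, one item at a time."""

    def stream_conversations(self, file_path: Path) -> Iterator[Conversation]:
        yield Conversation(id="conv-1", title="First chat")
        yield Conversation(id="conv-2", title="Second chat")


def list_titles(adapter: ConversationProvider, file_path: Path) -> list[str]:
    # The caller depends only on the protocol, never on a concrete provider.
    return [c.title for c in adapter.stream_conversations(file_path)]


titles = list_titles(InMemoryAdapter(), Path("conversations.json"))
```

Because `ConversationProvider` is a structural `Protocol`, a new provider only needs to implement the same method; it does not have to inherit from a shared base class.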

### Key Features

- **Memory Efficient**: Stream-based parsing handles 1GB+ files with constant memory usage
- **Advanced Search**: BM25 relevance ranking with exact phrase matching, boolean logic, role filtering, and keyword exclusion
- **Message Snippets**: Automatic preview generation for search results with match context
- **Type Safe**: Strict typing with Pydantic v2 and mypy --strict compliance
- **Library First**: All CLI capabilities are available as an importable Python library
- **Multi-Provider Ready**: Adapter pattern supports multiple AI export formats
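For readers unfamiliar with BM25, the ranking mentioned above can be sketched in a few lines of standard-library Python. This is a generic Okapi BM25 illustration, not Echomine's implementation: the function name `bm25_scores`, the whitespace tokenization, and the parameter values `k1=1.5` and `b=0.75` (common textbook defaults) are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    if not documents:
        return []
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))

    scores = []
    for doc in documents:
        term_counts = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_counts[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Length normalization penalizes long documents (controlled by b).
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "designing a sorting algorithm".split(),
    "weekend travel plans".split(),
    "algorithm design and algorithm analysis".split(),
]
scores = bm25_scores(["algorithm", "design"], docs)
# Indices of documents ordered from most to least relevant.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy corpus the third document ranks first because it contains both query terms, and the second document scores zero because it contains neither.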

### Design Principles

1. **Library-First Architecture**: CLI built on top of library, not vice versa
2. **Strict Type Safety**: mypy --strict, no `Any` types in public API
3. **Memory Efficiency**: Stream-based parsing, never load entire file into memory
4. **Test-Driven Development**: Every feature is validated test-first
5. **YAGNI**: Simple solutions, no speculative features

See [Constitution](.specify/memory/constitution.md) for complete design principles.

## Installation

### From Source

```bash
# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (optional)
pre-commit install
```

### From PyPI (when published)

```bash
pip install echomine
```

## Quick Start

### Library API (Primary Interface)

```python
from echomine import OpenAIAdapter, SearchQuery
from pathlib import Path

# Initialize adapter (stateless, reusable)
adapter = OpenAIAdapter()
export_file = Path("conversations.json")

# 1. List all conversations (discovery)
for conversation in adapter.stream_conversations(export_file):
    print(f"[{conversation.created_at.date()}] {conversation.title}")
    print(f"  Messages: {len(conversation.messages)}")

# 2. Search with keywords (BM25 ranking)
query = SearchQuery(keywords=["algorithm", "design"], limit=10)
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} (score: {result.score:.2f})")
    print(f"  Preview: {result.snippet}")  # v1.1.0: automatic snippets

# 3. Advanced search with filters (v1.1.0+)
from datetime import date
query = SearchQuery(
    keywords=["refactor"],
    phrases=["algo-insights"],  # Exact phrase matching
    match_mode="all",  # Require ALL keywords (AND logic)
    exclude_keywords=["test"],  # Filter out unwanted results
    role_filter="user",  # Search only user messages
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=5
)
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f"  Snippet: {result.snippet}")

# 4. Get specific conversation by ID
conversation = adapter.get_conversation_by_id(export_file, "conv-abc123")
if conversation:
    print(f"Found: {conversation.title}")
```

### CLI Usage (Built on Library)

```bash
# List all conversations
echomine list export.json

# Search by keywords
echomine search export.json --keywords "algorithm,design" --limit 10

# Search by exact phrase (v1.1.0+)
echomine search export.json --phrase "algo-insights"

# Boolean match mode: require ALL keywords (v1.1.0+)
echomine search export.json -k "python" -k "async" --match-mode all

# Exclude unwanted results (v1.1.0+)
echomine search export.json -k "python" --exclude "django" --exclude "flask"

# Role filtering: search only user/assistant messages (v1.1.0+)
echomine search export.json -k "refactor" --role user

# Combine all filters (v1.1.0+)
echomine search export.json --phrase "api" -k "python" --exclude "test" --role user --match-mode all

# Search by title (fast, metadata-only)
echomine search export.json --title "Project"

# Filter by date range
echomine search export.json --from-date "2024-01-01" --to-date "2024-03-31"

# Get conversation by ID
echomine get conversation export.json conv-abc123

# Get message by ID (with conversation hint for performance)
echomine get message export.json msg-def456 -c conv-abc123

# Export conversation to markdown (default)
echomine export export.json conv-abc123 --output algo.md

# Export as JSON
echomine export export.json conv-abc123 --format json --output algo.json

# JSON to stdout for piping
echomine export export.json conv-abc123 -f json | jq '.messages | length'

# JSON output for search results
echomine search export.json --keywords "python" --json | jq '.results[].title'

# Version info
echomine --version
```

**Search Filter Logic:** Content matching (phrases OR keywords) runs first; the post-filters (`--exclude`, `--role`, `--title`, and the date range) are then applied to the matches. See [CLI Usage](https://aucontraire.github.io/echomine/cli-usage/#how-search-filters-combine) for details.
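The two-stage order described above can be sketched as follows. This is a simplified, message-level illustration under stated assumptions, not Echomine's search code: `Message` and `matches` are hypothetical names, matching is case-insensitive on whitespace-split words, and only the `--role` and `--exclude` post-filters are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical minimal message record for this sketch."""

    role: str
    text: str


def matches(message, keywords=(), phrases=(), exclude=(), role=None):
    """Stage 1: content match (phrases OR keywords). Stage 2: post-filters."""
    text = message.text.lower()
    words = set(text.split())

    # Stage 1: keep the message if any phrase or any keyword matches.
    phrase_hit = any(p.lower() in text for p in phrases)
    keyword_hit = any(k.lower() in words for k in keywords)
    if (keywords or phrases) and not (phrase_hit or keyword_hit):
        return False

    # Stage 2: post-filters can only narrow the stage-1 survivors.
    if role is not None and message.role != role:
        return False
    if any(term.lower() in words for term in exclude):
        return False
    return True


messages = [
    Message("user", "please refactor the python parser"),
    Message("assistant", "refactor with python test helpers"),
    Message("user", "holiday plans"),
]
hits = [
    m.text
    for m in messages
    if matches(m, keywords=["python"], phrases=["holiday plans"],
               exclude=["test"], role="user")
]
```

Here the first message passes via a keyword, the third passes via a phrase, and the second is dropped by the role post-filter even though it matched on content.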

See [Quickstart Guide](docs/quickstart.md) for detailed examples.

## Development

### Prerequisites

- Python 3.12 or higher
- Git

### Setup Development Environment

```bash
# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install
```

### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=echomine --cov-report=html

# Run specific test categories
pytest -m unit           # Unit tests only
pytest -m integration    # Integration tests only
pytest -m contract       # Contract tests only
pytest -m performance    # Performance benchmarks
```

### Code Quality

```bash
# Type checking (strict mode)
mypy src/

# Linting and formatting
ruff check .
ruff format .

# Run pre-commit hooks manually
pre-commit run --all-files
```

### Project Structure

```
echomine/
├── src/echomine/           # Library source code
│   ├── models/             # Pydantic data models
│   ├── adapters/           # Provider adapters (OpenAI, etc.)
│   ├── parsers/            # Streaming JSON parsers
│   ├── search/             # Search and ranking logic
│   ├── exporters/          # Export formatters (markdown, JSON)
│   └── cli/                # CLI commands
├── tests/                  # Test suite
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   ├── contract/           # Protocol contract tests
│   └── performance/        # Performance benchmarks
└── specs/                  # Design documents
    └── 001-ai-chat-parser/ # Feature specification
```

## Documentation

**[Full Documentation](https://aucontraire.github.io/echomine/)** - Comprehensive guides, API reference, and examples

### Quick Links

- [Getting Started](https://aucontraire.github.io/echomine/quickstart/)
- [Library Usage](https://aucontraire.github.io/echomine/library-usage/)
- [CLI Reference](https://aucontraire.github.io/echomine/cli-usage/)
- [API Reference](https://aucontraire.github.io/echomine/api/)

### Spec Documents

- [Feature Specification](specs/001-ai-chat-parser/spec.md)
- [Implementation Plan](specs/001-ai-chat-parser/plan.md)
- [CLI Interface Contract](specs/001-ai-chat-parser/contracts/cli_spec.md)
- [Data Model](specs/001-ai-chat-parser/data-model.md)

## Performance

Echomine is designed for memory efficiency and speed:

- **Memory**: O(1) memory usage regardless of file size (streaming-based)
- **Search**: <30 seconds for 1.6GB files (10K conversations, 50K messages)
- **Listing**: <5 seconds for 10K conversations

See [Performance Requirements](specs/001-ai-chat-parser/spec.md#performance-requirements) for benchmarks.
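To see why streaming keeps memory flat, the sketch below compares peak allocations when items are consumed one at a time versus collected into a list. It uses only the standard library's `tracemalloc`; the generator `fake_conversations` and the helper names are hypothetical stand-ins, and the sketch does not reproduce the benchmark figures above.

```python
import tracemalloc
from collections.abc import Callable, Iterator


def fake_conversations(count: int) -> Iterator[dict[str, str]]:
    """Hypothetical stand-in for a streaming parser: yields one record at a time."""
    for i in range(count):
        yield {"id": f"conv-{i}", "title": f"Conversation {i}" * 20}


def peak_bytes(consume: Callable[[], int]) -> int:
    """Return the peak traced allocation (in bytes) while `consume` runs."""
    tracemalloc.start()
    try:
        consume()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def stream_count(count: int) -> int:
    # Each record becomes garbage as soon as the next one is requested.
    return sum(1 for _ in fake_conversations(count))


def load_all_count(count: int) -> int:
    # Every record stays alive in the list until the function returns.
    return len(list(fake_conversations(count)))


streaming_peak = peak_bytes(lambda: stream_count(20_000))
eager_peak = peak_bytes(lambda: load_all_count(20_000))
print(f"streaming peak: {streaming_peak} B, eager peak: {eager_peak} B")
```

The streaming peak stays roughly the size of a single record, while the eager peak grows with the number of records, which is the behavior the O(1) memory claim describes.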

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for:

- Development setup and prerequisites
- TDD workflow (RED-GREEN-REFACTOR cycle mandatory)
- Testing guidelines (pytest, mypy --strict, ruff)
- Code quality standards and conventions
- Commit message format (conventional commits)
- Pull request process

## License

AGPL-3.0-or-later - See [LICENSE](LICENSE) file for details

## Acknowledgments

Built with:
- [Pydantic](https://docs.pydantic.dev/) - Data validation and type safety
- [ijson](https://github.com/ICRAR/ijson) - Streaming JSON parser
- [Typer](https://typer.tiangolo.com/) - CLI framework
- [Rich](https://rich.readthedocs.io/) - Terminal formatting
- [structlog](https://www.structlog.org/) - Structured logging
