Metadata-Version: 2.4
Name: takeseek
Version: 0.1.0
Summary: Index and search a local video library using metadata, transcripts, and AI-generated visual descriptions
Project-URL: Homepage, https://github.com/ahumanflourish/takeseek
Project-URL: Repository, https://github.com/ahumanflourish/takeseek
Project-URL: Issues, https://github.com/ahumanflourish/takeseek/issues
Author-email: Alexander Antholzner <ahumanflourish@gmail.com>
License-Expression: LicenseRef-Proprietary
License-File: LICENSE
Keywords: indexing,mcp,search,transcription,video,whisper
Classifier: Development Status :: 3 - Alpha
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Multimedia :: Video
Requires-Python: >=3.10
Requires-Dist: faster-whisper>=1.0.0
Requires-Dist: mcp[cli]>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Description-Content-Type: text/markdown

# TakeSeek

An MCP server that indexes and searches a local video library. It extracts metadata with ffprobe, transcribes audio with [faster-whisper](https://github.com/SYSTRAN/faster-whisper), and uses Claude's vision to describe video frames — all searchable via full-text search.

## How it works

TakeSeek gives Claude Code tools to manage a video library:

1. **Index** — Scans a folder for video files, extracts metadata (resolution, codec, duration), pre-extracts frames, and starts audio transcription in the background
2. **See** — Returns pre-extracted frames to Claude Code as images. Claude describes what it sees in each frame — while transcription runs in parallel
3. **Search** — Full-text search across transcripts and frame descriptions using SQLite FTS5

The key design choice: there's no separate vision API call. Claude Code *is* the vision model. The `extract_frames` tool returns actual images that Claude sees natively, then Claude writes descriptions and saves them with `save_frame_descriptions`.

## Requirements

- Python 3.10+
- [ffmpeg](https://ffmpeg.org/) (for metadata extraction, audio extraction, and frame capture)

Install ffmpeg:
```bash
# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt install ffmpeg

# Windows
winget install ffmpeg
```

## Install

```bash
pip install takeseek
```

Or from source:
```bash
git clone https://github.com/ahumanflourish/takeseek.git
cd takeseek
pip install .
```

## Configure with Claude Code

Add the MCP server to Claude Code:

```bash
claude mcp add takeseek -- takeseek-server
```

To set a default video folder and database location:

```bash
claude mcp add takeseek \
  -e FOOTAGE_LIBRARY_PATH=/path/to/your/videos \
  -e FOOTAGE_DB_PATH=/path/to/takeseek.db \
  -- takeseek-server
```

## Environment variables

| Variable | Default | Description |
|---|---|---|
| `FOOTAGE_LIBRARY_PATH` | *(none)* | Default directory to scan for videos |
| `FOOTAGE_DB_PATH` | `takeseek.db` | Path to the SQLite database |
| `FOOTAGE_WHISPER_MODEL` | `base` | Whisper model size (`tiny`, `base`, `small`, `medium`, `large`) |
| `FOOTAGE_FRAME_INTERVAL` | `30` | Seconds between extracted frames |

## Usage

Once configured, just ask Claude Code to work with your videos:

```
> Index the videos in /path/to/footage
> Search my videos for "sunset over the ocean"
> What clips mention machine learning?
> Tag clip 3 as "interview"
> Show me what's in clip 5
```

### MCP tools

| Tool | Description |
|---|---|
| `index_new_files` | Scan a directory, extract metadata, pre-extract frames, and start background transcription |
| `extract_frames` | Return pre-extracted frames from a clip as images for Claude to see |
| `save_frame_descriptions` | Store Claude's descriptions of what it saw in extracted frames |
| `search_footage` | Full-text search across transcripts and frame descriptions |
| `get_clip_details` | Get full metadata, tags, transcript, and frame descriptions for a clip |
| `get_transcript` | Get the full transcript for a clip |
| `browse_folder` | Browse clips in the library, optionally filtered by folder |
| `index_status` | Get indexing status, dependency check, and background transcription progress |
| `list_tags` | List all tags with clip counts |
| `add_tag` | Add a tag to a clip |

### Indexing pipeline

The full indexing pipeline runs automatically when you ask Claude to index a folder:

1. `index_new_files` scans for videos, extracts metadata with ffprobe, pre-extracts frames to disk, and starts Whisper transcription in a background thread
2. For each clip, `extract_frames` returns the pre-extracted frames instantly as images
3. Claude describes each frame (scene, subjects, actions, lighting, text)
4. `save_frame_descriptions` stores the descriptions in the database
5. Background transcription finishes while Claude works on frame descriptions
6. Everything becomes searchable via `search_footage`

Transcription and frame descriptions run in parallel — Claude doesn't wait for Whisper to finish before starting frame descriptions, and Whisper doesn't wait for Claude. This roughly halves the total indexing time.

The first time Whisper runs, it downloads the model (~139 MB for `base`). This is cached for subsequent runs.

## Using with other MCP clients

TakeSeek is a standard MCP server and works with any MCP-compliant client — not just Claude Code. Most tools (search, browse, transcripts, tags) return plain JSON and work everywhere.

The `extract_frames` tool returns images via MCP's image content type. Your client needs to support image content in tool responses for the frame description workflow to work. This is part of the MCP spec, so most compliant clients should handle it.

## CLI

TakeSeek also includes a standalone CLI for indexing without Claude Code:

```bash
# Index a folder (metadata + transcription)
takeseek-index /path/to/videos --db takeseek.db

# Metadata only (no transcription)
takeseek-index /path/to/videos --metadata-only

# Skip already-indexed files
takeseek-index /path/to/videos --new-only

# Use a larger Whisper model for better accuracy
takeseek-index /path/to/videos --whisper-model medium
```

## Development

```bash
git clone https://github.com/ahumanflourish/takeseek.git
cd takeseek
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
```

## License

All rights reserved. See [LICENSE](LICENSE) for details.
