Metadata-Version: 2.4
Name: claudetube
Version: 0.1.1
Summary: Let Claude watch YouTube videos - transcripts + on-demand frame extraction
Author: Daniel Barrett
License: MIT
Keywords: youtube,video,summarization,whisper,transcription
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: yt-dlp
Requires-Dist: faster-whisper>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: build; extra == "dev"
Dynamic: license-file

<p align="center">
  <img src="logo.png" alt="claudetube" width="500">
</p>

<h1 align="center">claudetube</h1>

<p align="center">
  <strong>Let AI watch and understand online videos.</strong>
</p>

<p align="center">
  <a href="https://github.com/thoughtpunch/claudetube/actions/workflows/ci.yml"><img src="https://github.com/thoughtpunch/claudetube/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://pypi.org/project/claudetube/"><img src="https://img.shields.io/pypi/v/claudetube.svg" alt="PyPI"></a>
  <a href="https://pypi.org/project/claudetube/"><img src="https://img.shields.io/pypi/pyversions/claudetube.svg" alt="Python"></a>
  <a href="https://github.com/thoughtpunch/claudetube/blob/main/LICENSE"><img src="https://img.shields.io/github/license/thoughtpunch/claudetube.svg" alt="License"></a>
  <a href="https://github.com/thoughtpunch/claudetube/stargazers"><img src="https://img.shields.io/github/stars/thoughtpunch/claudetube.svg?style=social" alt="Stars"></a>
</p>

---

claudetube downloads online videos, transcribes them with [faster-whisper](https://github.com/SYSTRAN/faster-whisper), and lets AI "see" specific moments by extracting frames on demand. It is built for [Claude Code](https://docs.anthropic.com/en/docs/claude-code) but also works as a standalone Python library with any AI tool.

## Quick Start

### Prerequisites

- **Python 3.10+**
- **ffmpeg** (system package)
  ```bash
  # macOS
  brew install ffmpeg

  # Ubuntu/Debian
  sudo apt install ffmpeg
  ```

### Install

```bash
git clone https://github.com/thoughtpunch/claudetube
cd claudetube
./install.sh
```

Or via pip (once published):

```bash
pip install claudetube
```

The installer does three things:
1. Creates a Python venv at `~/.claudetube/venv/`
2. Installs the `claudetube` package + dependencies (yt-dlp, faster-whisper)
3. Copies slash commands to `~/.claude/commands/` (global to all Claude Code sessions)

Restart Claude Code after installing.

### Works from any Claude Code session

The installer puts slash commands in `~/.claude/commands/`, which is the global commands directory. Every Claude Code instance on your machine will have `/yt` available -- no per-project setup needed.

### Why not a pre-built binary?

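claudetube depends on faster-whisper (a transcription library built on the CTranslate2 C++ inference engine) and ffmpeg (a system media tool). Both rely on platform-specific native code that can't be bundled into a single static binary. The install script sets up the Python side automatically; ffmpeg comes from your system package manager, as listed under Prerequisites.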

## Usage with Claude Code

```
/yt https://youtube.com/watch?v=abc123 how did they make the sprites?
```

Claude will:
1. Download and transcribe the video (about 60 seconds the first time, cached afterwards)
2. Read the transcript
3. If needed, extract frames to "see" specific moments
4. Answer your question

### Commands

| Command | Purpose |
|---------|---------|
| `/yt <url> [question]` | Analyze a video |
| `/yt:see <id> <timestamp>` | Quick frames (general visuals) |
| `/yt:hq <id> <timestamp>` | HQ frames (code, text, diagrams) |
| `/yt:transcript <id>` | Read cached transcript |
| `/yt:list` | List all cached videos |

## Python API

```python
from claudetube import process_video, get_frames_at

# Transcribe a video
result = process_video("https://youtube.com/watch?v=VIDEO_ID")
print(result.transcript_srt.read_text())

# Extract frames at a specific timestamp
frames = get_frames_at("VIDEO_ID", start_time=120, duration=10)
```
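
Once the SRT text is in hand, it can be searched to decide which `start_time` to pass to `get_frames_at`. The following is a minimal sketch that assumes the transcript follows the standard SRT layout (index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then text). `parse_srt` and `find_mentions` are hypothetical helpers written for illustration; they are not part of the claudetube API.

```python
import re

# Matches an SRT timing line such as "00:02:00,000 --> 00:02:04,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(h, m, s, ms):
    """Convert the four captured timestamp fields to seconds as a float."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((_to_seconds(*g[:4]), _to_seconds(*g[4:]), text))
                break
    return cues


def find_mentions(cues, keyword):
    """Return start times (seconds) of cues whose text contains keyword."""
    needle = keyword.lower()
    return [start for start, _end, text in cues if needle in text.lower()]
```

For example, `find_mentions(parse_srt(result.transcript_srt.read_text()), "sprites")` would give candidate start times to inspect visually.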

## How It Works

1. **Download** -- Fetches the lowest-quality video (144p) for speed
2. **Transcribe** -- Runs faster-whisper with batched inference
3. **Cache** -- Stores everything at `~/.claude/video_cache/{VIDEO_ID}/`
4. **Drill-in** -- Extracts frames on demand when visual context is needed
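
To make the drill-in step concrete, here is a rough sketch of how a frame-extraction command could be assembled for ffmpeg. It is an illustration under assumptions, not claudetube's actual implementation: `build_frame_command`, the `video.mp4` input name, the `frame_%03d.jpg` output pattern, and the one-frame-per-second default are all hypothetical. The function only builds the argument list; it does not run ffmpeg.

```python
from pathlib import Path


def build_frame_command(video, out_dir, start_time, duration, height=480, fps=1):
    """Build (but do not run) an ffmpeg argument list for frame extraction."""
    out_pattern = str(Path(out_dir) / "frame_%03d.jpg")
    return [
        "ffmpeg",
        "-ss", str(start_time),  # seek on the input, before decoding starts
        "-i", str(video),
        "-t", str(duration),     # stop after this many seconds
        # Sample `fps` frames per second and scale to the target height,
        # letting ffmpeg pick an even width that keeps the aspect ratio.
        "-vf", f"fps={fps},scale=-2:{height}",
        "-q:v", "2",             # high JPEG quality
        out_pattern,
    ]


# Example: ten seconds of frames starting at the two-minute mark.
command = build_frame_command("video.mp4", "drill", start_time=120, duration=10)
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Placing `-ss` before `-i` makes ffmpeg seek in the input rather than decode and discard everything up to the start time, which matters for long videos.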

### Cache Structure

```
~/.claude/video_cache/
└── dYP2V_nK8o0/
    ├── state.json     # Metadata (title, description, tags, etc.)
    ├── audio.mp3      # Extracted audio
    ├── audio.srt      # Timestamped transcript
    ├── audio.txt      # Plain text transcript
    ├── drill/         # Quick frames (480p)
    └── hq/            # High-quality frames (1280p)
```
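
Because the cache is a plain directory tree, it can be inspected without claudetube itself. The sketch below walks a cache root and reads each `state.json`. It assumes the file is a JSON object with a `title` key, which the layout above suggests but does not guarantee; `list_cached_videos` is a hypothetical helper, not a claudetube function.

```python
import json
from pathlib import Path


def list_cached_videos(cache_root):
    """Return [(video_id, title), ...] for each cached video, sorted by ID."""
    root = Path(cache_root).expanduser()
    videos = []
    if not root.is_dir():
        return videos
    for entry in sorted(root.iterdir()):
        state_file = entry / "state.json"
        if not state_file.is_file():
            continue  # not a video directory, or not processed yet
        try:
            state = json.loads(state_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            continue  # skip unreadable metadata rather than failing
        # Assumption: the metadata is an object with a "title" key.
        title = state.get("title", "(untitled)") if isinstance(state, dict) else "(untitled)"
        videos.append((entry.name, title))
    return videos
```

Calling `list_cached_videos("~/.claude/video_cache")` returns one `(video_id, title)` pair per cached video, or an empty list if the cache does not exist yet.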

## Architecture

claudetube uses a **provider-based architecture**. Video downloading is handled through `yt-dlp`, which currently supports YouTube and [1000+ other sites](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md). The transcription and frame extraction pipeline is provider-agnostic -- it works with any video source that yt-dlp supports, and the architecture is designed to accommodate additional providers in the future.

## Development

```bash
git clone https://github.com/thoughtpunch/claudetube
cd claudetube
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
```

### Linting

```bash
black src/ tests/
isort --profile black src/ tests/
flake8 src/ tests/
```

## Contributing

Contributions are welcome! Please open an issue or submit a pull request.

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/my-feature`)
3. Run tests and linting before committing
4. Open a pull request against `main`

## Legal

This project is **not affiliated with, endorsed by, or associated with YouTube, Google, or Alphabet Inc.** "YouTube" is a trademark of Google LLC. This software is an independent, open-source tool that interacts with publicly available video content through third-party libraries ([yt-dlp](https://github.com/yt-dlp/yt-dlp)). Users are solely responsible for ensuring their use of this software complies with all applicable terms of service and laws.

## License

[MIT](LICENSE) -- free to use, modify, and distribute.
