Metadata-Version: 2.4
Name: ripd
Version: 0.1.0
Summary: GPU frame ripper. Kill ffmpeg.
Author-email: M80AI <contact@m80ai.com>
License: MIT
Project-URL: Homepage, https://github.com/M80AI/ripd
Project-URL: Repository, https://github.com/M80AI/ripd
Project-URL: Issues, https://github.com/M80AI/ripd/issues
Keywords: video,gpu,nvdec,frame-extraction,machine-learning,ffmpeg
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Video
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: opencv-python-headless>=4.8
Requires-Dist: numpy>=1.24
Provides-Extra: deploy
Requires-Dist: modal>=0.60; extra == "deploy"
Provides-Extra: url
Requires-Dist: yt-dlp>=2024.3; extra == "url"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: modal>=0.60; extra == "dev"
Requires-Dist: yt-dlp>=2024.3; extra == "dev"
Dynamic: license-file

# ripd

**GPU frame ripper. Kill ffmpeg.**

Hardware-accelerated video frame extraction using NVIDIA NVDEC. Zero subprocess overhead. Native resolution. 6× faster than ffmpeg on a T4.

```bash
pip install ripd
ripd video.mp4 --output ./frames
```

---

## The problem with ffmpeg

Every ML team extracting training data from video does this:

```python
subprocess.Popen(["ffmpeg", "-i", video, "-vf", "scale=448:256", ...])
```

Three compounding problems:

1. **Subprocess overhead** — one OS process per video clip. At 1,000 clips per tar, that's 1,000 process creations.
2. **CPU decode** — ffmpeg decodes on CPU. Your GPU sits idle while one CPU core maxes out.
3. **Forced downscale** — scripts hardcode a target resolution because that's the only sane way to normalize mixed-resolution datasets when you don't have a proper crop strategy. You permanently discard the original detail.

Result: ~14 hours to extract 100 Kinetics tars. One CPU core saturated. A dataset of 448×256 JPEGs that's already thrown away everything above that resolution.

## The fix

`ripd` uses **PyNvVideoCodec 2.0.2** — NVIDIA's own Python binding for the NVDEC hardware decoder — to decode frames directly on-GPU with zero subprocess overhead and no forced resize.

| Pipeline | Time (100 tars) | CPU impact | Output resolution | Concurrent training |
|---|---|---|---|---|
| ffmpeg subprocess | ~14 hours | 100% one core | Forced 448×256 | Degrades training speed |
| **ripd** | **~2.25 hours** | **~5% (NVDEC only)** | **Native** | **No impact** |

**~6× faster. ~5% CPU load. Full-resolution output.**
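
The headline figure follows directly from the two timings in the table. A minimal sketch of the arithmetic, using only the numbers quoted above (the helper names are illustrative and not part of `ripd`):

```python
# Timings quoted in the comparison table; everything else is derived from them.
FFMPEG_HOURS = 14.0  # ffmpeg subprocess pipeline, 100 tars
RIPD_HOURS = 2.25    # ripd pipeline, 100 tars
NUM_TARS = 100


def speedup(baseline_hours: float, new_hours: float) -> float:
    """Return how many times faster the new pipeline is than the baseline."""
    return baseline_hours / new_hours


def minutes_per_tar(total_hours: float, num_tars: int) -> float:
    """Return the average wall-clock minutes spent on one tar."""
    return total_hours * 60 / num_tars


print(f"speedup: {speedup(FFMPEG_HOURS, RIPD_HOURS):.1f}x")
print(f"ffmpeg:  {minutes_per_tar(FFMPEG_HOURS, NUM_TARS):.2f} min/tar")
print(f"ripd:    {minutes_per_tar(RIPD_HOURS, NUM_TARS):.2f} min/tar")
```

The ratio works out to roughly 6.2, which is where the rounded "6×" comes from.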

The NVDEC engine on any modern NVIDIA GPU is completely separate from the CUDA compute cores. Running it at 100% has zero measurable impact on a concurrently training neural network.

---

## Install

```bash
pip install ripd
```

**Requirements:**
- NVIDIA GPU with NVDEC support (Turing/RTX 20xx or newer recommended)
- NVIDIA driver ≥ 525
- Python 3.10+
- `PyNvVideoCodec`: `pip install PyNvVideoCodec`

---

## Usage

### CLI

```bash
# Extract frames from a single video at 10 FPS
ripd video.mp4 --output ./frames

# Extract at 5 FPS, PNG format
ripd video.mp4 --output ./frames --fps 5 --format png

# Extract training triplets (im1/im2/im3)
ripd video.mp4 --output ./triplets --mode triplet --max_triplets 10

# Extract from a URL (requires yt-dlp: pip install "ripd[url]")
ripd --url "https://example.com/clip.mp4" --output ./frames

# Extract from a directory of videos
ripd --videos_dir ./raw_videos --output ./frames

# Extract from Kinetics .tar.gz archives
ripd --tars_dir ./kinetics_tars --output ./triplets --mode triplet

# Cap resolution (aspect-preserving)
ripd video.mp4 --output ./frames --max_size 720

# Dry run — count frames without writing
ripd video.mp4 --output ./frames --dry_run
```
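
To make "aspect-preserving" concrete, here is a rough sketch of how a size cap can be computed. It assumes the cap applies to the longer side and that frames are never upscaled; how `--max_size` actually behaves in `ripd` is not specified here, and `cap_size` is a hypothetical helper, not part of the package:

```python
def cap_size(width: int, height: int, max_size: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_size.

    Assumption: the cap applies to the longer side. Frames already within
    the cap are returned unchanged, so nothing is ever upscaled.
    """
    longer = max(width, height)
    if longer <= max_size:
        return width, height
    # Apply the same ratio to both sides so the aspect ratio is preserved.
    new_width = max(1, round(width * max_size / longer))
    new_height = max(1, round(height * max_size / longer))
    return new_width, new_height


print(cap_size(1920, 1080, 720))  # landscape 1080p
print(cap_size(1080, 1920, 720))  # portrait 1080p
print(cap_size(640, 360, 720))    # already under the cap
```

Under this assumption a 1920×1080 frame becomes 720×405, and a 640×360 frame passes through untouched.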

### Python API

```python
import ripd

# Extract frames
n = ripd.extract_frames("clip.mp4", "./frames", fps=10)
print(f"Extracted {n} frames")

# Extract triplets for VSR training
n = ripd.extract_triplets("clip.mp4", "./triplets", max_triplets=5)
print(f"Extracted {n} triplets")

# Download from URL first
video_path = ripd.download_url("https://example.com/clip.mp4", "./tmp")
n = ripd.extract_frames(video_path, "./frames")
```

### Triplet output format

Compatible with PyTorch DataLoader with no intermediate processing:

```
triplets/
  clip_00/
    im1.jpg   ← frame at t₀
    im2.jpg   ← frame at t₁  ← training target (center)
    im3.jpg   ← frame at t₂
  clip_01/
    ...
```
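
A minimal sketch of indexing this layout from Python, using only the standard library. `index_triplets` is a hypothetical helper, not part of the `ripd` API, and it assumes the default `.jpg` output shown above:

```python
from pathlib import Path

# File names taken from the layout above.
FRAME_NAMES = ("im1.jpg", "im2.jpg", "im3.jpg")


def index_triplets(root):
    """Return a sorted list of (im1, im2, im3) paths, one tuple per clip folder."""
    samples = []
    for clip_dir in sorted(Path(root).iterdir()):
        if not clip_dir.is_dir():
            continue
        frames = tuple(clip_dir / name for name in FRAME_NAMES)
        # Skip folders where any of the three frames is missing.
        if all(frame.is_file() for frame in frames):
            samples.append(frames)
    return samples
```

A PyTorch `Dataset` could wrap the returned list: `__len__` returns its length, and `__getitem__` loads the three images for one entry, with `im2` as the target.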

---

## Cloud API

Don't have a local NVIDIA GPU? Use the hosted API:

```bash
# Extract triplets from a video file
curl -X POST "https://seiferm80--ripd-fastapi-app.modal.run/v1/extract/triplets?max_triplets=5" \
  -F "file=@clip.mp4" \
  --output triplets.zip

# Extract frames at 5 FPS
curl -X POST "https://seiferm80--ripd-fastapi-app.modal.run/v1/extract/frames?fps=5" \
  -F "file=@clip.mp4" \
  --output frames.zip
```

Interactive docs: https://seiferm80--ripd-fastapi-app.modal.run/docs

---

## Deploy your own instance

```bash
pip install "ripd[deploy]"
modal deploy deploy/modal_app.py
```

---

## Why native resolution matters

Pre-scaling to 448×256 before saving means:
- The dataset captures bicubic interpolation artifacts, not real video content
- You permanently lose the ability to train at higher crop sizes later

With native resolution, a 256×256 random crop from a 1080p frame is a window into genuine high-frequency detail — motion blur, film grain, compression artifacts — that the model learns to resolve.
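
A small sketch of the crop arithmetic behind that argument. `random_crop_box` is an illustrative helper, not part of `ripd`; it only picks the crop window and leaves image loading to whatever library the training code uses:

```python
import random


def random_crop_box(width, height, crop=256, rng=random):
    """Pick a random crop x crop window inside a width x height frame.

    Returns (left, top, right, bottom) in pixels; right and bottom are exclusive.
    """
    if width < crop or height < crop:
        raise ValueError(f"frame {width}x{height} is smaller than crop {crop}")
    left = rng.randint(0, width - crop)  # randint includes both endpoints
    top = rng.randint(0, height - crop)
    return left, top, left + crop, top + crop


def crop_positions(width, height, crop=256):
    """Count the distinct top-left positions available for a crop x crop window."""
    return (width - crop + 1) * (height - crop + 1)


print(crop_positions(1920, 1080))  # native 1080p frame
print(crop_positions(448, 256))    # pre-scaled 448x256 frame
```

A 1080p frame offers 1,373,625 distinct 256×256 windows, while a 448×256 frame offers only 193, all of them spanning the full frame height.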

---

## License

MIT, see [LICENSE](LICENSE)

Built at [M80AI](https://m80ai.com).
