Metadata-Version: 2.4
Name: whisper-wrap
Version: 0.1.0
Summary: Lightweight faster-whisper batch transcription CLI for researchers
Requires-Python: >=3.10
Requires-Dist: faster-whisper>=1.0.0
Requires-Dist: typer>=0.12.0
Description-Content-Type: text/markdown

# whisper-wrap

Lightweight Python CLI wrapper around `faster-whisper` for bulk transcription.

## Features

- Batch transcribe audio/video files in a directory
- Output formats: `txt`, `srt`, `vtt`, `json`
- Always writes per-file output
- Optional merged output via `--merge`
- Metadata header included by default (file, duration if available, model, language, device, compute type); disable with `--no-metadata`
- Progress display, run summary, and per-file error reporting
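
To make the subtitle formats concrete, here is a minimal sketch of how transcript segments could be rendered as SRT. The helper names `format_timestamp` and `to_srt` are hypothetical and not taken from the whisper-wrap source. Segments are assumed to be `(start_seconds, end_seconds, text)` tuples.

```python
def format_timestamp(seconds: float, decimal_marker: str = ",") -> str:
    """Render seconds as HH:MM:SS<marker>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples (assumed shape)."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each SRT cue: a 1-based counter, a time range line, then the cue text.
        time_range = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

VTT cues use the same layout with `.` as the decimal marker and a leading `WEBVTT` line, so the same timestamp helper could serve both formats.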

## Install (uv)

```bash
uv sync
```

Run without installing globally:

```bash
uv run whisper-wrap --help
```

## Usage

```bash
uv run whisper-wrap /path/to/media \
  --model small \
  --language en \
  --device cpu \
  --compute-type int8 \
  --threads 8 \
  --output-dir outputs \
  --format txt \
  --recursive \
  --ext mp3 --ext wav --ext mp4
```
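
For orientation, the flags above correspond roughly to the following direct use of the `faster-whisper` Python API. This is a sketch under assumptions, not the wrapper's actual implementation: `load_model` and `transcribe_to_text` are hypothetical names, and the real CLI may pass additional options.

```python
def load_model(name="small", device="cpu", compute_type="int8", threads=8):
    """Build a faster-whisper model from values like those in the command above."""
    # Imported lazily so this module can be loaded without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(
        name, device=device, compute_type=compute_type, cpu_threads=threads
    )


def transcribe_to_text(model, media_path, language=None):
    """Return (text, detected_language) for one file; language=None auto-detects."""
    segments, info = model.transcribe(str(media_path), language=language)
    # `segments` is a lazy generator: transcription runs while it is iterated.
    text = "\n".join(segment.text.strip() for segment in segments)
    return text, info.language
```

A call such as `transcribe_to_text(load_model(), "talk.mp3", language="en")` would then approximate what `--format txt` produces for a single file, without the metadata header.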

## Publish

```bash
cp .env.example .env
# add PYPI_API_TOKEN to .env
./scripts/publish.sh
```

## Harvard sentences

```bash
uv run whisper-wrap harvard --format text
uv run whisper-wrap harvard --format json --limit 10
```

Attribution: Harvard Sentences (1965 revised list) from
https://www.cs.columbia.edu/~hgs/audio/harvard.html (IEEE Transactions on Audio and
Electroacoustics, Vol. 17, Issue 3, 1969).

## Common flags

- `--model` default: `small`
- `--language` optional; omitted means auto-detect
- `--device` one of `cpu|cuda`
- `--compute-type` one of `float16|int8`
- `--threads` number of CPU threads to use
- `--output-dir` folder where output files are written
- `--merge` also write a single `merged.<format>` file
- `--format` one of `txt|srt|vtt|json`
- `--metadata/--no-metadata` include metadata headers (default: on)
- `--ext` repeatable extension filter (with or without `.`)
- `--recursive` scan subdirectories

## Notes

- `faster-whisper` runtime behavior depends on local FFmpeg and model availability.
- For `cuda`, ensure compatible CUDA runtime/drivers are installed.
