Metadata-Version: 2.4
Name: auto-subs
Version: 0.3.0
Summary: A powerful, local-first library and CLI for video transcription and subtitle generation using Whisper.
Project-URL: Homepage, https://github.com/mateusz-kow/auto-subs
Project-URL: Repository, https://github.com/mateusz-kow/auto-subs
Author-email: Mateusz Kowalski <kowalski.mateusz.1lo1@gmail.com>
License: Copyright 2025 Mateusz Kowalski
        
        Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Multimedia :: Video
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing
Requires-Python: >=3.11
Requires-Dist: pydantic~=2.12.0
Requires-Dist: typer~=0.19.2
Provides-Extra: dev
Requires-Dist: mypy~=1.18.2; extra == 'dev'
Requires-Dist: pre-commit~=4.3.0; extra == 'dev'
Requires-Dist: pytest-cov~=7.0.0; extra == 'dev'
Requires-Dist: pytest~=8.4.2; extra == 'dev'
Requires-Dist: ruff~=0.14.0; extra == 'dev'
Provides-Extra: transcribe
Requires-Dist: openai-whisper; extra == 'transcribe'
Description-Content-Type: text/markdown

<div align="center">
  <p>
    <a href="README.md"><strong>README</strong></a> &nbsp;&middot;&nbsp;
    <a href="CONTRIBUTING.md"><strong>Contributing</strong></a> &nbsp;&middot;&nbsp;
    <a href="LICENSE"><strong>License</strong></a>
  </p>
  <br>
  <img src="https://raw.githubusercontent.com/mateusz-kow/auto-subs/refs/heads/assets/assets/logo.png" alt="Auto-Subs Logo" width="150">
  <h1>Auto-Subs</h1>
  <strong>Effortless Subtitle Generation from Whisper Transcriptions.</strong>
  <p>A powerful, local-first library and CLI for generating subtitles with precise, word-level accuracy.</p>
</div>

<div align="center">

[![PyPI Version](https://img.shields.io/pypi/v/auto-subs?color=blue&logo=pypi&logoColor=white)](https://pypi.org/project/auto-subs/)
[![CI Status](https://github.com/mateusz-kow/auto-subs/actions/workflows/ci.yml/badge.svg)](https://github.com/mateusz-kow/auto-subs/actions/workflows/ci.yml)
[![Code Coverage](https://codecov.io/gh/mateusz-kow/auto-subs/graph/badge.svg)](https://codecov.io/gh/mateusz-kow/auto-subs)
<br />
[![Code style: ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Types: Mypy](https://img.shields.io/badge/Types-Mypy-blue.svg)](https://mypy-lang.org/)
[![License: MIT](https://img.shields.io/pypi/l/auto-subs)](https://opensource.org/licenses/MIT)

</div>

---

**Auto-Subs** bridges the gap between raw transcription data and perfectly formatted subtitles. Whether you're a developer integrating transcription into your application or a content creator needing quick subtitles, `auto-subs` provides a robust, simple, and reliable solution.

## Key Features

- **🎯 Intelligent Word Segmentation**: Automatically splits word-level transcriptions into perfectly timed subtitle lines based on character limits and natural punctuation breaks.
- **⚙️ Simple & Powerful API**: Use it as a library with a clean, dictionary-based input that requires no complex objects, or as a feature-rich command-line tool.
- **🛡️ Robust Validation**: Automatically handles common data issues, such as inverted timestamps (`start > end`), so that imperfect input is less likely to interrupt your pipeline.
- **📄 Multiple Formats**: Generate subtitles in the most popular formats: **SRT**, **ASS**, and plain **TXT**.
- **✅ High Quality & Tested**: Strictly typed with Mypy, linted with Ruff, and rigorously tested to ensure reliability.
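
To make the segmentation idea concrete, here is a minimal sketch of grouping timed words into lines by a character limit and sentence-ending punctuation. It is an illustration only, not the library's implementation: `Word`, `segment_words`, and `SENTENCE_END` are hypothetical names, and the real rules used by `auto-subs` may differ.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A single timed word (hypothetical type, for illustration only)."""

    text: str
    start: float
    end: float


# Assumption: a line is closed after sentence-ending punctuation.
SENTENCE_END = (".", "!", "?")


def segment_words(words: list[Word], max_chars: int = 35) -> list[list[Word]]:
    """Group words into lines of at most max_chars characters.

    A single word longer than max_chars still gets a line of its own.
    """
    lines: list[list[Word]] = []
    current: list[Word] = []
    current_len = 0
    for word in words:
        token = word.text.strip()
        if not token:
            continue
        # Line length if this word were appended (one space between words).
        needed = len(token) if not current else current_len + 1 + len(token)
        if current and needed > max_chars:
            # The word does not fit: close the current line and start a new one.
            lines.append(current)
            current = []
            needed = len(token)
        current.append(Word(token, word.start, word.end))
        current_len = needed
        if token.endswith(SENTENCE_END):
            # Natural break: close the line after sentence-ending punctuation.
            lines.append(current)
            current, current_len = [], 0
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    sample = "Hello world. This is a test of segmentation".split()
    timed = [Word(w, i * 0.5, i * 0.5 + 0.4) for i, w in enumerate(sample)]
    for line in segment_words(timed, max_chars=12):
        print(line[0].start, line[-1].end, " ".join(w.text for w in line))
```

Each resulting line takes its start time from its first word and its end time from its last word, which is what keeps the cue aligned with the speech.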

## Installation

```bash
pip install auto-subs
```

Transcription relies on the optional `transcribe` extra, which pulls in `openai-whisper`; install it with `pip install "auto-subs[transcribe]"`.

## Quickstart

### As a Command-Line Tool (CLI)

The CLI is the fastest way to generate a subtitle file from a Whisper-compatible JSON file.

```bash
# Generate an SRT file with default settings
auto-subs generate path/to/transcription.json

# Generate a styled ASS file with a custom character limit
auto-subs generate input.json -f ass -o styled.ass --max-chars 42
```

**CLI Options:**
- `--output, -o`: Specify the output file path. (Defaults to the input filename with a new extension)
- `--format, -f`: Choose the output format (`srt`, `ass`, `txt`). (Defaults to `srt`)
- `--max-chars`: Set the maximum characters per subtitle line. (Defaults to `35`)
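
If you do not yet have a transcription JSON, one way to produce it is with `openai-whisper` directly. The sketch below assumes `openai-whisper` and `ffmpeg` are installed; `save_transcription` and `transcribe_to_json` are hypothetical helper names, not part of `auto-subs`. Whether `auto-subs` needs every field Whisper emits is not stated here, so treat the result as a starting point.

```python
import json
from pathlib import Path
from typing import Any


def save_transcription(result: dict[str, Any], json_path: str) -> Path:
    """Write a Whisper result dictionary to disk as UTF-8 JSON."""
    path = Path(json_path)
    path.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


def transcribe_to_json(media_path: str, json_path: str, model_name: str = "base") -> Path:
    """Transcribe a media file with Whisper and save the word-level result."""
    # Imported lazily so this module loads even without openai-whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    # word_timestamps=True asks Whisper to attach a "words" list to each segment.
    result = model.transcribe(media_path, word_timestamps=True)
    return save_transcription(result, json_path)
```

The saved file can then be passed to `auto-subs generate` as in the commands above.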

### As a Python Library

Integrate `auto-subs` directly into your application for full control.

```python
import json
from auto_subs import generate

# 1. Load your Whisper-compatible transcription data (as a dict)
with open("path/to/transcription.json", "r", encoding="utf-8") as f:
    transcription_data = json.load(f)

try:
    # 2. Generate SRT content with a 40-character limit per line
    srt_content = generate(transcription_data, "srt", max_chars=40)

    # 3. Save the content to a file
    with open("output.srt", "w", encoding="utf-8") as f:
        f.write(srt_content)

    print("Successfully generated subtitles!")

except ValueError as e:
    # Handle validation errors for malformed input data
    print(f"Error: {e}")
```
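
For readers unfamiliar with SRT, the sketch below shows the standard SubRip layout that such a file follows: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line between cues. The helpers `format_srt_timestamp` and `to_srt` are hypothetical and only illustrate the format; the exact string returned by `generate()` may differ in details.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as SRT text, numbering cues from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        time_range = f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    # Cues are separated by a single blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.2, "Hello world."), (1.5, 3.0, "This is a")]))
```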

## API Design: Simplicity First

The public API of `auto-subs` is designed to be as simple as possible. All functions, like `auto_subs.generate()`, accept a standard Python dictionary (`dict`).

This approach was chosen intentionally to:
- **Reduce Friction:** You can directly use the JSON output from Whisper after loading it into a dictionary, without needing to import and instantiate our internal Pydantic models.
- **Decouple Your Code:** Your project doesn't need to depend on our internal data structures, making your code more resilient to future updates.

While the input is a simple dictionary, `auto-subs` performs robust internal validation to ensure the data is well-formed, giving you the best of both worlds: a simple API and the safety of strong data validation.
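
As a rough illustration of validating a plain dictionary, the sketch below extracts words from a Whisper-style structure (`segments` containing `words` with `word`, `start`, and `end`), raises `ValueError` on malformed entries, and repairs inverted timestamps by swapping them. It uses only the standard library rather than Pydantic, `normalize_words` is a hypothetical name, and the swap is an assumption about how inverted timestamps could be handled, not a description of the library's internals.

```python
from typing import Any


def normalize_words(transcription: dict[str, Any]) -> list[dict[str, Any]]:
    """Return a flat list of cleaned word entries from a Whisper-style dict."""
    segments = transcription.get("segments")
    if not isinstance(segments, list):
        raise ValueError("transcription must contain a 'segments' list")

    words: list[dict[str, Any]] = []
    for segment in segments:
        if not isinstance(segment, dict):
            raise ValueError(f"malformed segment: {segment!r}")
        for word in segment.get("words", []):
            try:
                text = str(word["word"]).strip()
                start = float(word["start"])
                end = float(word["end"])
            except (KeyError, TypeError, ValueError) as exc:
                raise ValueError(f"malformed word entry: {word!r}") from exc
            if start > end:
                # Assumption: an inverted pair is repaired by swapping the values.
                start, end = end, start
            words.append({"word": text, "start": start, "end": end})
    return words


if __name__ == "__main__":
    data = {
        "segments": [
            {
                "words": [
                    {"word": " Hello", "start": 0.5, "end": 0.0},  # inverted
                    {"word": " world.", "start": 0.5, "end": 1.0},
                ]
            }
        ]
    }
    print(normalize_words(data))
```

Callers keep working with ordinary dictionaries, while malformed input surfaces as a single `ValueError`, matching the error handling shown in the Quickstart.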

## Contributing

Contributions are welcome! If you find a bug or have a feature request, please open an issue. If you'd like to contribute code, please open a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
