Metadata-Version: 2.1
Name: youscribe
Version: 0.0.1
Summary: 
Author: Mat Bettinson
Author-email: mat.bettinson@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: beautifulsoup4 (>=4.12.2,<5.0.0)
Requires-Dist: faster-whisper (>=0.10.0,<0.11.0)
Requires-Dist: requests (>=2.31.0,<3.0.0)
Requires-Dist: yt-dlp (>=2023.11.16,<2024.0.0)
Description-Content-Type: text/markdown

# Transcribe YouTube videos using Whisper models

Adopts [faster_whisperer](https://github.com/SYSTRAN/faster-whisper), a cTransformer's based model for faster transcription.

## Usage

```python
from youtescribe import transcribe

transcript = transcribe(url="https://www.youtube.com/watch?v=9bZkp7q19f0")

transcript.text()
```

### Prompting

By default, the video title and description are used as prompts to the transcription model. But you can also specify your own prompt:

```python
transcript = transcribe(
    url="https://www.youtube.com/watch?v=9bZkp7q19f0",
    prompt="Enter prompt here"
)
```

You can also choose not to include prompt by setting `prompt=False`.

```python
transcript = transcribe(
    url="https://www.youtube.com/watch?v=9bZkp7q19f0",
    prompt=False
)
```

### Working with `WhisperTranscript` objects

The `transcribe()` function, if executed successfully, will return a `WhisperTranscript` object. You can view the transcript as plain text, SRT-formatted text, or a Python dictionary.

```python
transcript = transcribe(
    url="https://www.youtube.com/watch?v=9bZkp7q19f0",
    prompt=False
)

transcript.text()
transcript.srt()
transcript.json()
transcript.segment
```

### Customise Whisper model

In the transcribe function, you can pass your own custom Whisper model:

```python
from youtescribe import WhisperTranscriber
from youtescribe import models

custom_transcriber = WhisperTranscriber(model_size = models.TINY_EN, cpu_threads=6, device="auto")

transcript = transcribe(
    url="https://www.youtube.com/watch?v=9bZkp7q19f0",
    transcriber=custom_transcriber
)
transcript.text()
```



