Metadata-Version: 2.4
Name: ytdebunk
Version: 1.0.3
Summary: A CLI tool to download audio from a YouTube video, transcribe it, and refine the transcription using AI.
Home-page: https://github.com/hissain/youtuber-debunked
Author: Md. Sazzad Hissain Khan
Author-email: hissain.khan@gmail.com
Keywords: youtube,transcription,audio,refinement,ai,bangla,bengali,geminai,librosa
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<2
Requires-Dist: python-dotenv==1.0.1
Requires-Dist: google-generativeai==0.8.4
Requires-Dist: yt-dlp==2025.3.21
Requires-Dist: torch==2.1.0
Requires-Dist: torchaudio==2.1.0
Requires-Dist: librosa==0.11.0
Requires-Dist: transformers==4.36.2
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license-file
Dynamic: requires-dist
Dynamic: summary

# ytdebunk  

## Overview  
`ytdebunk` is a command-line tool designed to:  
- Download audio from YouTube videos.  
- Transcribe the audio content.  
- Optionally enhance the transcription using the **Gemini API**.  
- Optionally detect logical faults in the transcription using the **Gemini API**.  

This tool is particularly useful for analyzing transcriptions to identify **logical fallacies** and **incorrect claims** made by YouTubers.  

## Installation 

To avoid dependency conflicts, create and activate a virtual environment first:

```sh
python3.11 -m venv .venv
source .venv/bin/activate
```

Then install the package from PyPI:

```sh
pip install ytdebunk
```

Alternatively, to get the latest changes, install directly from GitHub:

```sh
pip install git+https://github.com/hissain/youtuber-debunked.git
```

## Usage  

The `ytdebunk.py` script provides a **command-line interface (CLI)** with several options.  

### **Arguments**  
- `video_url` (**str**) – URL of the YouTube video to download audio from.  

### **Options**  
| Option                  | Description |
|-------------------------|-------------|
| `-e, --enhance` (bool) | Enhance the transcription using the **Gemini API**. *(Default: False)* |
| `-d, --detect` (bool) | Detect logical faults in the transcription using **Gemini API**. *(Default: False)* |
| `-v, --verbose` (bool) | Increase output verbosity. |
| `-t, --token` (str) | API token for the **Gemini API** *(Required if `--enhance` or `--detect` is enabled)*. |
| `-st, --start_time` (float) | Start time of the audio clip, in seconds. |
| `-et, --end_time` (float) | End time of the audio clip, in seconds. |
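
The options above map naturally onto an `argparse` parser. The following is a minimal sketch, not the project's actual source: the helper name `build_parser`, the help strings, and the `None` defaults for `--token`, `--start_time`, and `--end_time` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="ytdebunk",
        description="Download, transcribe, and analyze YouTube audio.",
    )
    # Positional argument: the video to process.
    parser.add_argument("video_url", type=str, help="URL of the YouTube video")
    # Boolean flags default to False and become True when present.
    parser.add_argument("-e", "--enhance", action="store_true",
                        help="Enhance the transcription using the Gemini API")
    parser.add_argument("-d", "--detect", action="store_true",
                        help="Detect logical faults using the Gemini API")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Increase output verbosity")
    # Assumed default: None when the option is not given.
    parser.add_argument("-t", "--token", type=str, default=None,
                        help="Gemini API token")
    parser.add_argument("-st", "--start_time", type=float, default=None,
                        help="Start time of the audio clip in seconds")
    parser.add_argument("-et", "--end_time", type=float, default=None,
                        help="End time of the audio clip in seconds")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, `-st 10 -et 42.5` would be parsed into the floats `args.start_time` and `args.end_time`, which a downloader could use to trim the clip.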

### **Example Usage**  

```bash
ytdebunk "https://www.youtube.com/watch?v=example" -e -d -v -t YOUR_GEMINI_API_TOKEN
```


```bash
export GEMINI_API_TOKEN="your_api_key"
ytdebunk "https://www.youtube.com/watch?v=example" -e -d -v  # token is read from GEMINI_API_TOKEN
```

See the [Example Notebook](experiment/exp.ipynb) for detailed usage.  

## **Environment Variables**  
If preferred, you can set the **Gemini API token** as an environment variable instead of passing it as a CLI argument:

```sh
export GEMINI_API_TOKEN="your_api_key"
```
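
In code, choosing between the CLI argument and the environment variable might look like the sketch below. The function name `resolve_token` is hypothetical, and the precedence (the CLI value wins over `GEMINI_API_TOKEN`) is an assumption; the project may resolve the token differently.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    cli_token: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the Gemini API token, or None if none is configured.

    Assumed precedence: an explicit --token value first, then the
    GEMINI_API_TOKEN environment variable.
    """
    if cli_token:
        return cli_token
    # Treat an empty environment value the same as a missing one.
    return env.get("GEMINI_API_TOKEN") or None


if __name__ == "__main__":
    token = resolve_token(None)
    print("token configured" if token else "no token configured")
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.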

## **Detailed Process**  

1. **Download Audio**  
   - Uses the `download_audio` function from `ytdebunk.downloader` to download audio from the given YouTube URL.  

2. **Transcribe Audio**  
   - Uses the `transcribe_audio` function from `ytdebunk.transcriber` to generate a text transcription.  

3. **Enhance Transcription** *(Optional)*  
   - If `--enhance` is enabled, the script uses `enhance_transcription` from `ytdebunk.refiner` to refine the transcription using the **Gemini API**.  
   - The API token must be provided via `--token` or as an **environment variable**.  

4. **Detect Logical Faults** *(Optional)*  
   - If `--detect` is enabled, the script uses `detect_logical_faults` from `ytdebunk.philosopher` to detect logical faults, fallacies, bias, irony, and similar issues in the transcription using the **Gemini API**.  
   - The API token must be provided via `--token` or as an **environment variable**.  

5. **Save Transcription**  
   - The final transcription (raw or enhanced) and any detected logical faults are saved to the `./download` folder.  
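
The steps above can be sketched as a single function. Because the exact signatures of `download_audio`, `transcribe_audio`, `enhance_transcription`, and `detect_logical_faults` are not documented here, this sketch takes each step as a callable instead of importing it. The name `run_pipeline` and the output file names `transcription.txt` and `logical_faults.txt` are hypothetical.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def run_pipeline(
    video_url: str,
    download: Callable[[str], str],
    transcribe: Callable[[str], str],
    enhance: Optional[Callable[[str], str]] = None,
    detect: Optional[Callable[[str], str]] = None,
    out_dir: str = "download",
) -> Tuple[str, Optional[str]]:
    """Run download -> transcribe -> (enhance) -> (detect) -> save."""
    audio_path = download(video_url)      # step 1: fetch the audio
    text = transcribe(audio_path)         # step 2: raw transcription
    if enhance is not None:               # step 3: optional refinement
        text = enhance(text)
    faults = detect(text) if detect is not None else None  # step 4: optional

    # Step 5: save the results (file names are assumptions).
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "transcription.txt").write_text(text, encoding="utf-8")
    if faults is not None:
        (out / "logical_faults.txt").write_text(faults, encoding="utf-8")
    return text, faults
```

Injecting the steps as callables also makes it possible to exercise the flow with stand-in functions, without network access or an API token.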

## **Error Handling**  
- If `--enhance` or `--detect` is enabled but no **Gemini API token** is provided, the script prints an **error message** and exits.
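
A minimal sketch of such a check follows. The function name `require_token`, the message text, the use of standard error, and the exit status `1` are all assumptions; the source only states that an error message is printed and the script exits.

```python
import sys
from typing import Optional


def require_token(enhance: bool, detect: bool, token: Optional[str]) -> None:
    """Exit early if a Gemini-backed step is requested without a token."""
    if (enhance or detect) and not token:
        # Assumed behavior: report on stderr and exit with a non-zero status.
        print(
            "Error: a Gemini API token is required for --enhance or --detect. "
            "Pass --token or set GEMINI_API_TOKEN.",
            file=sys.stderr,
        )
        sys.exit(1)
```

Running the check before any download starts avoids spending time on audio processing that cannot be completed.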

## **License**  
This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.  


## Contribution and Contact

You can fork this project and submit a pull request.
To contact the author, email hissain.khan@gmail.com.
