Metadata-Version: 2.4
Name: podangelex_JustAnotherCoderTheThird
Version: 1.0.2
Summary: Automatically remove profanity and toxic content from audio files using Whisper and Detoxify
Author-email: Dante Edmiston <dante.edmiston@gmail.com>
License: CC0
Project-URL: Homepage, https://github.com/IDKCoding-commits/PodangelEX
Project-URL: Repository, https://github.com/IDKCoding-commits/PodangelEX
Project-URL: Issues, https://github.com/IDKCoding-commits/PodangelEX/issues
Keywords: audio,profanity,toxicity,whisper,detoxify,speech-to-text
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: whisper-timestamped>=1.15.9
Requires-Dist: openai-whisper>=20231114
Requires-Dist: detoxify>=0.5.1
Requires-Dist: torch>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: isort>=5.0; extra == "dev"
Requires-Dist: flake8>=6.0; extra == "dev"
Dynamic: license-file

# PodangelEX

**Automatically clean audio files by removing profanity and toxic content.**

PodangelEX uses OpenAI's Whisper speech-to-text model combined with the Detoxify toxicity detector to automatically identify and remove profane, toxic, and harmful content from audio files.

## Installation

Requirements: Python 3.8 or later, FFmpeg

```bash
pip install podangelex_JustAnotherCoderTheThird
```

### System Requirements

- **FFmpeg**: Install via:
  - **macOS**: `brew install ffmpeg`
  - **Ubuntu/Debian**: `sudo apt-get install ffmpeg`
  - **Windows**: Download from [ffmpeg.org](https://ffmpeg.org/download.html)

## Quick Start

1. **Install the package** (see above)
2. **Run the setup wizard**:
   ```bash
   podangel
   ```
3. **First run**: The app will auto-create:
   - Configuration directory at `~/.podangelex/`
   - Workspace at `~/podangelex_data/` with folders:
     - `Input/` - Place audio files here
     - `Output/` - Cleaned audio files appear here
     - `.bridge/` - Temporary processing files
4. **Add audio files** to the `Input/` folder
5. **Run again**: run `podangel` a second time to clean your files
6. **Get results** from the `Output/` folder

## How It Works

### Step 1: Transcription
Whisper transcribes your audio file to text with word-level timestamps.
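
As a rough illustration of what word-level timestamps look like, the sketch below uses `whisper-timestamped` (one of the package's declared dependencies) to transcribe a file and flatten the result into a list of words. The helper names `flatten_words` and `transcribe_words`, the `small` model choice, and the `cpu` device are assumptions made for this example, not PodangelEX's internal code.

```python
from typing import Any, Dict, List


def flatten_words(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect every word entry (text, start, end) from a transcription result."""
    words = []
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            words.append(
                {"text": word["text"], "start": word["start"], "end": word["end"]}
            )
    return words


def transcribe_words(path: str, model_size: str = "small") -> List[Dict[str, Any]]:
    """Transcribe an audio file and return its words with timestamps in seconds."""
    # Imported lazily so flatten_words can be used without the model installed.
    import whisper_timestamped as whisper

    audio = whisper.load_audio(path)
    model = whisper.load_model(model_size, device="cpu")
    result = whisper.transcribe(model, audio, language="en")
    return flatten_words(result)
```

Each returned entry carries the word text plus its `start` and `end` time in seconds, which is the information the later steps need to locate and cut a flagged word.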

### Step 2: Toxicity Detection
Two-phase approach:
- **Exact matching**: Checks transcribed words against a built-in profanity list
- **Context-aware**: Uses machine learning to detect toxic phrases even if not on the word list
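
The two phases could be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the function names, the five-word window, and the rule "flag when any score is at or above its threshold" are all assumptions, and PodangelEX may interpret its thresholds differently. The scorer is passed in as a function so the logic can be tried without loading a model; `detoxify_scorer` shows how Detoxify's `predict` method could fill that role.

```python
import string
from typing import Callable, Dict, List, Set, Tuple

Word = Dict[str, object]  # {"text": str, "start": float, "end": float}
Interval = Tuple[float, float]


def flag_exact(words: List[Word], blocklist: Set[str]) -> List[Interval]:
    """Phase 1: flag words whose normalized text is on the blocklist."""
    flagged = []
    for w in words:
        token = str(w["text"]).strip().strip(string.punctuation).lower()
        if token in blocklist:
            flagged.append((float(w["start"]), float(w["end"])))
    return flagged


def flag_context(
    words: List[Word],
    score_fn: Callable[[str], Dict[str, float]],
    thresholds: Dict[str, float],
    window: int = 5,
) -> List[Interval]:
    """Phase 2: score consecutive windows of words and flag the toxic ones."""
    flagged = []
    for i in range(0, len(words), window):
        chunk = words[i:i + window]
        text = " ".join(str(w["text"]) for w in chunk)
        scores = score_fn(text)
        # Assumed rule: a window is flagged if any label reaches its threshold.
        if any(scores.get(label, 0.0) >= limit for label, limit in thresholds.items()):
            flagged.append((float(chunk[0]["start"]), float(chunk[-1]["end"])))
    return flagged


def detoxify_scorer() -> Callable[[str], Dict[str, float]]:
    """Return Detoxify's predict method, loading the 'original' model."""
    from detoxify import Detoxify  # imported lazily; downloads weights on first use

    return Detoxify("original").predict
```

A call such as `flag_context(words, detoxify_scorer(), {"toxicity": 0.8})` would then return the time ranges of windows the model considers toxic under these assumptions.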

### Step 3: Audio Cutting
FFmpeg extracts only the clean portions of audio and concatenates them.
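
One way to picture this step: invert the flagged time ranges into "keep" ranges, then ask FFmpeg to trim each kept range and join them. The sketch below does that with FFmpeg's `atrim`, `asetpts`, and `concat` filters. The function names and the choice of a single `-filter_complex` call are assumptions for illustration; the package may drive FFmpeg differently.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def clean_intervals(flagged: List[Interval], duration: float) -> List[Interval]:
    """Return the parts of [0, duration] not covered by any flagged interval."""
    keep = []
    cursor = 0.0
    for start, end in sorted(flagged):
        # Clamp each flagged range to the bounds of the file.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > cursor:
            keep.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


def ffmpeg_command(src: str, dst: str, keep: List[Interval]) -> List[str]:
    """Build an ffmpeg call that trims each kept interval and concatenates them."""
    if not keep:
        raise ValueError("nothing left to keep; the whole file was flagged")
    parts = []
    labels = ""
    for i, (start, end) in enumerate(keep):
        parts.append(f"[0:a]atrim=start={start}:end={end},asetpts=PTS-STARTPTS[a{i}]")
        labels += f"[a{i}]"
    parts.append(f"{labels}concat=n={len(keep)}:v=0:a=1[out]")
    return ["ffmpeg", "-y", "-i", src, "-filter_complex", ";".join(parts),
            "-map", "[out]", dst]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Overlapping flagged ranges are merged implicitly, because the cursor only ever moves forward.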

## Configuration

On first run, you'll be asked to configure:

### Model Size
- **tiny** (~1 GB VRAM) - Fastest, ~60% accuracy
- **base** (~1 GB VRAM) - Fast, ~70% accuracy
- **small** (~2 GB VRAM) - Balanced, ~75% accuracy (recommended)
- **medium** (~5 GB VRAM) - Better, ~80% accuracy
- **large** (~10 GB VRAM) - Best, ~85% accuracy
- **turbo** (~6 GB VRAM) - Latest, ~80% accuracy

### Workers
Number of files to process in parallel. Raise it if you have plenty of VRAM and many files to clean.
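
To make the idea of a worker count concrete, here is a minimal sketch using Python's standard `concurrent.futures`. The names `process_all` and `clean_one` are hypothetical, and whether PodangelEX uses threads, processes, or something else internally is not stated here, so treat the thread pool as an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_all(files: Iterable[T], clean_one: Callable[[T], R], workers: int = 2) -> List[R]:
    """Run clean_one on every file, with at most `workers` files in flight."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(clean_one, files))
```

With `workers=1` the files are handled one at a time. Each extra worker holds its own file in progress, which is why memory use grows with the worker count.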

### Thresholds
Fine-tune what gets flagged as toxic (0-1 scale, higher = stricter):
- **toxicity (t)**: General profanity
- **severe_toxicity (st)**: Severe language
- **obscene (o)**: Obscene content
- **threat (th)**: Threats
- **insult (i)**: Insults
- **identity_attack (id)**: Slurs/hate speech

## Environment Variables

Optional customization:
- `PODANGELEX_HOME` - Custom config directory (default: `~/.podangelex/`)
- `PODANGELEX_WORKSPACE` - Custom workspace location (default: `~/podangelex_data/`)
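
The lookup order implied above (environment variable first, documented default otherwise) could be expressed like this. The function name `resolve_dirs` is hypothetical, and this is only a sketch of the described behavior, not the package's own code.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_dirs(env: Optional[Mapping[str, str]] = None) -> Tuple[Path, Path]:
    """Return (config_dir, workspace_dir), honoring the optional overrides."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the documented default.
    home = Path(env.get("PODANGELEX_HOME") or Path.home() / ".podangelex")
    workspace = Path(env.get("PODANGELEX_WORKSPACE") or Path.home() / "podangelex_data")
    return home.expanduser(), workspace.expanduser()
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out, for example `resolve_dirs({"PODANGELEX_HOME": "/tmp/cfg"})`.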

## Troubleshooting

### "ffmpeg: command not found"
Install FFmpeg using the commands listed under System Requirements.

### "ModuleNotFoundError: No module named 'whisper'"
Reinstall the package: `pip install --upgrade podangelex_JustAnotherCoderTheThird`

### Audio not being cleaned properly
Adjust toxicity thresholds by re-running `podangel` and selecting option (1) to reconfigure.

## License

CC0 1.0 Universal - Public Domain

## AI Declaration

I used some AI to help debug the code, write commit messages on GitHub, and organize the files for package uploading.
