Metadata-Version: 2.4
Name: fonadalabs
Version: 1.0.1
Summary: Unified Python SDK for FonadaLabs Text-to-Speech, Automatic Speech Recognition, and Audio Denoising APIs
Home-page: https://github.com/fonadalabs/fonadalabs-sdk
Author: FonadaLabs
Author-email: FonadaLabs <support@fonadalabs.com>
License: MIT
Project-URL: Homepage, https://github.com/fonadalabs/fonadalabs-sdk
Project-URL: Documentation, https://github.com/fonadalabs/fonadalabs-sdk#readme
Project-URL: Repository, https://github.com/fonadalabs/fonadalabs-sdk
Project-URL: Bug Tracker, https://github.com/fonadalabs/fonadalabs-sdk/issues
Keywords: text-to-speech,speech-recognition,audio-denoising,tts,asr,denoise,fonadalabs,speech,audio,ai,machine-learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx<1.0,>=0.24
Requires-Dist: websockets<13,>=11
Requires-Dist: loguru<1.0,>=0.7
Requires-Dist: requests<3.0,>=2.28
Requires-Dist: numpy<2.0,>=1.21
Provides-Extra: ws
Requires-Dist: soundfile<0.14,>=0.12; extra == "ws"
Requires-Dist: websocket-client<2.0,>=1.5; extra == "ws"
Provides-Extra: denoise
Requires-Dist: soundfile<0.14,>=0.12; extra == "denoise"
Requires-Dist: librosa<1.0,>=0.10; extra == "denoise"
Requires-Dist: websocket-client<2.0,>=1.5; extra == "denoise"
Provides-Extra: dev
Requires-Dist: pytest<8.0,>=7.0; extra == "dev"
Requires-Dist: black<24.0,>=23.0; extra == "dev"
Requires-Dist: isort<6.0,>=5.0; extra == "dev"
Requires-Dist: python-dotenv<2.0,>=1.0; extra == "dev"
Requires-Dist: nest-asyncio<2.0,>=1.5; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Provides-Extra: all
Requires-Dist: soundfile<0.14,>=0.12; extra == "all"
Requires-Dist: librosa<1.0,>=0.10; extra == "all"
Requires-Dist: websocket-client<2.0,>=1.5; extra == "all"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# FonadaLabs SDK

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Version 1.0.1](https://img.shields.io/badge/version-1.0.1-green.svg)](https://github.com/fonadalabs/fonadalabs-sdk)

Unified Python SDK for FonadaLabs **Text-to-Speech (TTS)**, **Automatic Speech Recognition (ASR)**, and **Audio Denoising** APIs.

## Table of Contents

- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
  - [Text-to-Speech (TTS)](#text-to-speech-tts)
  - [Automatic Speech Recognition (ASR)](#automatic-speech-recognition-asr)
  - [Audio Denoising](#audio-denoising)
- [Authentication](#authentication)
- [Advanced Features](#advanced-features)
- [Error Handling](#error-handling)
- [Security Features](#security-features)
- [Documentation](#documentation)
- [Examples](#examples)
- [Package Structure](#package-structure)
- [Importing](#importing)
- [Requirements](#requirements)
- [Contributing](#contributing)
- [License](#license)
- [Support](#support)
- [Version](#version)

## Features

### Text-to-Speech (TTS)
- 🎙️ High-quality text-to-speech generation with multiple voices
- 🚀 HTTP POST and WebSocket support
- 📊 Real-time progress tracking
- ⚡ Async support for concurrent requests
- 🎵 Audio streaming with chunk callbacks
- 🔒 Secure API key authentication
- ⚠️ Built-in error handling for rate limits and credit exhaustion

### Automatic Speech Recognition (ASR)
- 🎤 Audio file transcription
- 🌐 WebSocket streaming for real-time transcription
- 🔄 Concurrent batch processing
- 🌍 Multi-language support (50+ languages)
- 🔒 Secure API key authentication
- ⚠️ Comprehensive error handling

### Audio Denoising
- 🔇 High-quality audio denoising (DeepFilterNet + CMGAN)
- 🎯 Full audio and streaming chunk processing
- ⚡ Real-time WebSocket streaming with progress callbacks
- 📦 Batch processing support
- 🔒 Secure API key authentication
- ⚠️ Built-in rate limit and credit management

## Installation

### From PyPI (Recommended)

```bash
pip install fonadalabs
```

### From Source (Development)

```bash
git clone https://github.com/fonadalabs/fonadalabs-sdk.git
cd fonadalabs-sdk
pip install -e .
```

### With Optional Dependencies

```bash
# For WebSocket support (TTS + ASR streaming)
pip install fonadalabs[ws]

# For audio denoising features
pip install fonadalabs[denoise]

# Install everything
pip install fonadalabs[all]

# For development
pip install fonadalabs[dev]
```
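
If you are unsure whether an extra was installed, a small check like the one below can help. It is a sketch using only the standard library; the `EXTRA_MODULES` mapping and the `missing_modules` helper are illustrative names, not part of the SDK. The import names are taken from the extras listed in the package metadata (note that `websocket-client` is imported as `websocket`).

```python
import importlib.util

# Import names for the packages listed in the optional extras.
# The `websocket-client` distribution is imported as `websocket`.
EXTRA_MODULES = {
    "ws": ["soundfile", "websocket"],
    "denoise": ["soundfile", "librosa", "websocket"],
}


def missing_modules(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for extra, modules in EXTRA_MODULES.items():
        missing = missing_modules(modules)
        status = "ok" if not missing else f"missing: {', '.join(missing)}"
        print(f"fonadalabs[{extra}]: {status}")
```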

## Quick Start

### Text-to-Speech (TTS)

```python
from fonadalabs import TTSClient, TTSError, TTSCreditsExhaustedError, TTSRateLimitError

# Initialize with API key (or set FONADALABS_API_KEY env variable)
client = TTSClient(api_key="your-api-key-here")

try:
    # Generate audio
    audio_data = client.generate_audio(
        text="Hello! Welcome to FonadaLabs TTS.",
        voice="Anuradha",
        output_file="output.mp3"
    )
    print(f"✓ Generated {len(audio_data)} bytes")
    
except TTSCreditsExhaustedError:
    print("⚠️ API credits exhausted. Please add more credits.")
except TTSRateLimitError:
    print("⚠️ Rate limit exceeded. Please try again later.")
except TTSError as e:
    print(f"❌ TTS Error: {e}")
```

### Automatic Speech Recognition (ASR)

```python
from fonadalabs import ASRClient, ASRCreditsExhaustedError, ASRRateLimitError

# Initialize with API key (or set FONADALABS_API_KEY env variable)
asr_client = ASRClient(api_key="your-api-key-here")

try:
    # Transcribe audio file
    result = asr_client.transcribe(
        audio_path="audio.wav",
        language="en"
    )
    print(f"✓ Transcription: {result.text}")
    
except ASRCreditsExhaustedError:
    print("⚠️ API credits exhausted. Please add more credits.")
except ASRRateLimitError:
    print("⚠️ Rate limit exceeded. Please try again later.")
except Exception as e:
    print(f"❌ ASR Error: {e}")
```

### Audio Denoising

```python
from fonadalabs import (
    DenoiseHttpClient, 
    DenoiseStreamingClient,
    DenoiseCreditsExhaustedError,
    DenoiseRateLimitError
)

try:
    # Full audio denoising (HTTP)
    http_client = DenoiseHttpClient(api_key="your-api-key-here")
    denoised = http_client.denoise_file("noisy.wav", "clean.wav")
    print("✓ Denoised audio saved to clean.wav")
    
    # Streaming denoising with progress
    streaming_client = DenoiseStreamingClient(api_key="your-api-key-here")
    
    def progress_callback(current, total):
        # Guard against a zero total so the callback never divides by zero
        percent = (current / total) * 100 if total else 0.0
        print(f"Progress: {current}/{total} chunks ({percent:.1f}%)")
    
    denoised = streaming_client.denoise_file(
        "noisy.wav", 
        "clean.wav",
        progress_callback=progress_callback
    )
    print("✓ Streaming denoising complete!")
    
except DenoiseCreditsExhaustedError:
    print("⚠️ API credits exhausted. Please add more credits.")
except DenoiseRateLimitError:
    print("⚠️ Rate limit exceeded. Please try again later.")
except Exception as e:
    print(f"❌ Denoise Error: {e}")
```

## Authentication

All FonadaLabs APIs require API key authentication. You can obtain your API key from the [FonadaLabs Dashboard](https://fonadalabs.com/dashboard).

### Method 1: Environment Variable (Recommended)

```bash
# Set environment variable
export FONADALABS_API_KEY=your-api-key-here

# Or add to .env file
echo "FONADALABS_API_KEY=your-api-key-here" >> .env
```

Then use the SDK without passing the key:

```python
from fonadalabs import TTSClient, ASRClient, DenoiseHttpClient

# API key is automatically loaded from environment
tts_client = TTSClient()
asr_client = ASRClient()
denoise_client = DenoiseHttpClient()
```
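
This README does not say how the clients behave when the variable is missing, so you may want to fail fast before creating one. The helper below is a minimal sketch of that check; `load_api_key` is a hypothetical name and is not provided by the SDK.

```python
import os


def load_api_key(env=None, var_name="FONADALABS_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before creating a client")
    return key
```

The returned value can then be passed explicitly, for example `TTSClient(api_key=load_api_key())`.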

### Method 2: Pass Directly in Code

```python
from fonadalabs import TTSClient, ASRClient, DenoiseHttpClient

tts_client = TTSClient(api_key="your-api-key")
asr_client = ASRClient(api_key="your-api-key")
denoise_client = DenoiseHttpClient(api_key="your-api-key")
```

**⚠️ Security Note:** Avoid hardcoding real API keys in source code that is committed or shared. Prefer environment variables or a secure key management system; the literal keys in these examples are placeholders.

## Advanced Features

### WebSocket Streaming (TTS)

Stream audio with real-time progress updates:

```python
from fonadalabs import TTSClient

client = TTSClient(api_key="your-api-key")

def on_progress(progress_data):
    print(f"Progress: {progress_data['percent']}%")

def on_chunk(audio_chunk):
    print(f"Received chunk: {len(audio_chunk)} bytes")

audio = client.generate_audio_ws(
    text="Long text for streaming...",
    voice="Anuradha",
    output_file="output.wav",
    on_progress=on_progress,
    on_chunk=on_chunk
)
```
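
An `on_chunk` callback does not have to be a plain function. The sketch below shows a callable object that accumulates chunks, assuming each chunk is a bytes-like object (as the `len(audio_chunk)` call above suggests). `ChunkCollector` is an illustrative name, not an SDK class, and the loop uses stand-in data in place of a real stream.

```python
class ChunkCollector:
    """Accumulates streamed audio chunks and tracks how many bytes arrived."""

    def __init__(self):
        self._buffer = bytearray()
        self.chunk_count = 0

    def __call__(self, audio_chunk):
        # Called once per chunk; append the raw bytes in arrival order.
        self._buffer.extend(audio_chunk)
        self.chunk_count += 1

    @property
    def total_bytes(self):
        return len(self._buffer)

    def as_bytes(self):
        return bytes(self._buffer)


collector = ChunkCollector()
for chunk in (b"\x00\x01", b"\x02\x03\x04"):  # stand-in for streamed chunks
    collector(chunk)
print(collector.chunk_count, collector.total_bytes)  # prints "2 5"
```

With the SDK, the same object could be passed as `on_chunk=collector`.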

### Async Operations (TTS)

Use async methods for concurrent requests:

```python
import asyncio
from fonadalabs import TTSClient

client = TTSClient(api_key="your-api-key")

async def generate_multiple():
    tasks = [
        client.generate_audio_async("Text 1", "Anuradha", "output1.mp3"),
        client.generate_audio_async("Text 2", "Ravi", "output2.mp3"),
        client.generate_audio_async("Text 3", "Anuradha", "output3.mp3"),
    ]
    results = await asyncio.gather(*tasks)
    return results

audio_files = asyncio.run(generate_multiple())
```
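
Launching many requests at once with `asyncio.gather` can run into rate limits. One common way to cap the number of in-flight requests is an `asyncio.Semaphore`, sketched below. `gather_limited` and `fake_generate` are illustrative names; `fake_generate` stands in for a call such as `client.generate_audio_async(...)` so that the example runs without network access.

```python
import asyncio


async def gather_limited(coroutine_factories, limit=2):
    """Run the coroutines produced by the factories, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run_one(f) for f in coroutine_factories))


async def fake_generate(text):
    # Stand-in for a network call such as client.generate_audio_async(...).
    await asyncio.sleep(0.01)
    return text.encode("utf-8")


factories = [lambda t=t: fake_generate(t) for t in ("Text 1", "Text 2", "Text 3")]
results = asyncio.run(gather_limited(factories, limit=2))
print([len(r) for r in results])  # prints "[6, 6, 6]"
```

Factories are used instead of coroutine objects so that each coroutine is only created once a semaphore slot is free.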

### WebSocket Streaming (ASR)

Real-time transcription with WebSocket:

```python
import asyncio
from fonadalabs import ASRWebSocketClient

# Initialize with token (or set FONADALABS_API_KEY env variable)
ws_client = ASRWebSocketClient(
    url="wss://your-websocket-endpoint/v1/asr/stream",
    token="your-api-key",
    use_ssl=True
)

# Transcribe using async method
async def transcribe():
    result = await ws_client.transcribe_file(
        file_path="audio.wav",
        language_id="en"
    )
    print(f"Transcription: {result}")

asyncio.run(transcribe())
```

### Batch Processing (ASR)

Process multiple audio files concurrently:

```python
from fonadalabs import ASRClient

client = ASRClient(api_key="your-api-key")

# List of audio files to transcribe
file_paths = ["audio1.wav", "audio2.wav", "audio3.wav"]

# Batch transcribe with custom concurrency
results = client.batch_transcribe(
    file_paths=file_paths,
    language_id="en",
    concurrency=3
)

# Process successful transcriptions
for result in results.successful:
    print(f"✓ {result.file_path}: {result.text}")

# Handle failed transcriptions
for failed in results.failed:
    print(f"✗ {failed.file_path}: {failed.error}")
```
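
For intuition, the general pattern behind a concurrency-limited batch with per-file error capture can be sketched with a thread pool. This is not the SDK's implementation, which is not shown in this README; `run_batch` and `fake_transcribe` are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(worker, items, concurrency=3):
    """Apply `worker` to each item in a thread pool, separating successes from failures."""

    def safe_call(item):
        try:
            return item, worker(item), None
        except Exception as exc:  # capture per-item errors instead of aborting the batch
            return item, None, exc

    successful, failed = [], []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map yields results in the same order as the input items.
        for item, value, error in pool.map(safe_call, items):
            if error is None:
                successful.append((item, value))
            else:
                failed.append((item, error))
    return successful, failed


def fake_transcribe(path):
    # Stand-in worker: pretend that files whose name contains "bad" fail.
    if "bad" in path:
        raise ValueError(f"cannot decode {path}")
    return f"transcript of {path}"


ok, bad = run_batch(fake_transcribe, ["audio1.wav", "bad.wav", "audio3.wav"])
print(len(ok), len(bad))  # prints "2 1"
```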

## Error Handling

The SDK provides specific exception types for different error scenarios:

### TTS Exceptions

```python
from fonadalabs import (
    TTSError,                    # Base exception
    TTSCreditsExhaustedError,    # Credits exhausted (402)
    TTSRateLimitError            # Rate limit exceeded (429)
)
```

### ASR Exceptions

```python
from fonadalabs import (
    ASRSDKError,                 # Base exception
    AuthenticationError,         # Invalid API key
    ValidationError,             # Invalid parameters
    HTTPRequestError,            # HTTP request failed
    ServerError,                 # Server error (500+)
    ASRRateLimitError,          # Rate limit exceeded
    ASRTimeoutError,            # Request timeout
    ASRCreditsExhaustedError    # Credits exhausted
)
```

### Denoise Exceptions

```python
from fonadalabs import (
    DenoiseError,                    # Base exception
    DenoiseCreditsExhaustedError,    # Credits exhausted
    DenoiseRateLimitError            # Rate limit exceeded
)
```
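
Rate-limit errors are usually worth retrying after a pause, while credit exhaustion is not. The sketch below shows one way to retry with exponential backoff. `call_with_retry` and `RateLimited` are illustrative names, not SDK APIs; `RateLimited` stands in for an exception such as `TTSRateLimitError` so that the example runs without the SDK installed.

```python
import time


class RateLimited(Exception):
    """Stand-in for an SDK rate-limit exception such as TTSRateLimitError."""


def call_with_retry(func, retry_on=(RateLimited,), attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**n and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * (2 ** attempt))


calls = {"count": 0}


def flaky():
    # Fails twice with a rate-limit error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited("slow down")
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)  # prints "ok [1.0, 2.0]"
```

With the SDK, a call might look like `call_with_retry(lambda: client.generate_audio(...), retry_on=(TTSRateLimitError,))`.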

## Security Features

### 🔒 Base URL Lockdown

The SDK clients use **hardcoded, secure base URLs** that cannot be overridden from application code. This helps guard against:
- URL injection attacks
- Data exfiltration attempts
- Man-in-the-middle attacks

```python
# ✅ SECURE: Base URLs are locked
client = TTSClient(api_key="your-key")

# ❌ PREVENTED: Cannot override base URL
# client = TTSClient(api_key="key", base_url="http://malicious.com")  # Not allowed
```

The only supported way to change a base URL is an environment variable, which is intended for authorized administrators:
```bash
export FONADALABS_API_URL=https://your-secure-endpoint.com
```
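
If you set this variable in deployment scripts, it can be worth validating the value first. The sketch below rejects anything that is not an `https://` URL. Whether the SDK performs a similar check is not stated here; `resolve_base_url` is a hypothetical helper and `DEFAULT_API_URL` is a placeholder, not the real endpoint.

```python
import os
from urllib.parse import urlparse

DEFAULT_API_URL = "https://api.example.invalid"  # placeholder, not the real endpoint


def resolve_base_url(env=None, var_name="FONADALABS_API_URL"):
    """Return the override from the environment if it is a valid HTTPS URL, else the default."""
    env = os.environ if env is None else env
    candidate = env.get(var_name, "").strip()
    if not candidate:
        return DEFAULT_API_URL
    parsed = urlparse(candidate)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"{var_name} must be an https:// URL, got {candidate!r}")
    return candidate.rstrip("/")
```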

### 🔐 API Key Validation

All API requests are validated:
- API keys are required for all endpoints
- Invalid keys return `401 Unauthorized`
- Keys are transmitted securely via HTTPS
- Keys are never logged or exposed in error messages
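
The same care applies to your own application logs. A small masking helper, sketched below, keeps full keys out of log lines and support tickets. `mask_api_key` is an illustrative function, not part of the SDK.

```python
def mask_api_key(key, visible=4):
    """Return a display-safe form of the key showing only the last `visible` characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


print(mask_api_key("abcdefgh"))  # prints "****efgh"
```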

## Documentation

- **TTS Documentation:** See [TEXT_TO_SPEECH_QUICKSTART.md](tts_sdk/TEXT_TO_SPEECH_QUICKSTART.md)
- **ASR Documentation:** See [ASR_AUTHENTICATION.md](ASR_AUTHENTICATION.md)
- **Denoise Documentation:** See [denoise_sdk/README.md](denoise_sdk/README.md)
- **Security Audit:** See [SECURITY_AUDIT_REPORT.md](SECURITY_AUDIT_REPORT.md)

## Examples

### TTS Examples
Located in `tts_sdk/examples/`:
- `basic_usage.py` - Simple HTTP generation
- `websocket_usage.py` - WebSocket with progress tracking
- `async_usage.py` - Concurrent requests
- `streaming_usage.py` - Audio chunk streaming
- `auth_usage.py` - Authentication examples

### ASR Examples
Located in `asr_sdk/examples/`:
- `single_transcribe.py` - Single file transcription
- `concurrent_transcribe.py` - Batch processing
- `ws_transcribe.py` - WebSocket streaming
- `cli.py` - Command-line interface

### Denoise Examples
Located in `denoise_sdk/`:
- `sdk_test.py` - Quick start examples for HTTP and WebSocket denoising

## Package Structure

```
fonadalabs/
├── __init__.py                    # Unified package exports
├── tts/                           # TTS submodule
│   ├── __init__.py
│   └── client.py                 # TTSClient
├── asr/                           # ASR submodule
│   ├── __init__.py
│   ├── client.py                 # ASRClient
│   ├── ws_client.py              # ASRWebSocketClient
│   ├── config.py                 # Configuration
│   ├── exceptions.py             # ASR exceptions
│   ├── languages.py              # Language utilities
│   ├── utils.py                  # Utility functions
│   └── models/                   # Data models
│       └── types.py
└── denoise/                       # Denoise submodule
    ├── __init__.py
    ├── http_client.py            # DenoiseHttpClient
    ├── streaming_client.py       # DenoiseStreamingClient
    └── exceptions.py             # Denoise exceptions
```

## Importing

### All Three SDKs
```python
from fonadalabs import (
    TTSClient,
    ASRClient,
    DenoiseHttpClient,
    DenoiseStreamingClient
)

tts = TTSClient(api_key="your-key")
asr = ASRClient(api_key="your-key")
denoise = DenoiseHttpClient(api_key="your-key")
```

### TTS Only
```python
from fonadalabs import TTSClient, TTSError, TTSCreditsExhaustedError
# or explicitly from submodule
from fonadalabs.tts import TTSClient, TTSError
```

### ASR Only
```python
from fonadalabs import ASRClient, ASRWebSocketClient
# or explicitly from submodule
from fonadalabs.asr import ASRClient, ASRWebSocketClient
```

### Denoise Only
```python
from fonadalabs import DenoiseHttpClient, DenoiseStreamingClient
# or explicitly from submodule
from fonadalabs.denoise import DenoiseHttpClient, DenoiseStreamingClient
```

## Requirements

### Core Dependencies
- **Python** >= 3.9
- **httpx** >= 0.24, < 1.0 (HTTP client)
- **websockets** >= 11, < 13 (WebSocket support)
- **loguru** >= 0.7, < 1.0 (Logging)
- **requests** >= 2.28, < 3.0 (HTTP requests)
- **numpy** >= 1.21, < 2.0 (Audio processing)

### Optional Dependencies

**For WebSocket features (`pip install fonadalabs[ws]`):**
- soundfile >= 0.12, < 0.14
- websocket-client >= 1.5, < 2.0

**For Audio Denoising (`pip install fonadalabs[denoise]`):**
- soundfile >= 0.12, < 0.14
- librosa >= 0.10, < 1.0
- websocket-client >= 1.5, < 2.0

**For all optional features (`pip install fonadalabs[all]`):**
- soundfile >= 0.12, < 0.14
- librosa >= 0.10, < 1.0
- websocket-client >= 1.5, < 2.0

**For Development (`pip install fonadalabs[dev]`):**
- pytest >= 7.0, < 8.0
- black >= 23.0, < 24.0
- isort >= 5.0, < 6.0
- python-dotenv >= 1.0, < 2.0
- nest-asyncio >= 1.5, < 2.0
- build >= 0.10.0
- twine >= 4.0.0

## Contributing

We welcome contributions! Please see our contributing guidelines and feel free to submit pull requests.

## License

MIT License - see [LICENSE](LICENSE) file for details.

Copyright (c) 2025 FonadaLabs

## Support

- 📧 **Email:** support@fonadalabs.com
- 🐛 **Issues:** [GitHub Issues](https://github.com/fonadalabs/fonadalabs-sdk/issues)
- 📖 **Documentation:** 
  - [TTS Quickstart](tts_sdk/TEXT_TO_SPEECH_QUICKSTART.md)
  - [ASR Authentication](ASR_AUTHENTICATION.md)
  - [Denoise SDK](denoise_sdk/README.md)
- 🌐 **Website:** https://fonadalabs.com
- 💬 **Community:** [Discord](https://discord.gg/fonadalabs) (if available)

## Version

**Current version:** 1.0.1 (Unified SDK)

### Version History
- **v1.0.0** (2025-10-16): Unified package with TTS, ASR, and Denoise
  - Base URL security lockdown
  - Required API key authentication for all endpoints
  - Comprehensive error handling with specific exception types
  - WebSocket streaming support for all services
  - Async/await support
  - Batch processing capabilities

---

**Made with ❤️ by FonadaLabs**


