Metadata-Version: 2.4
Name: soundhub_utils
Version: 0.0.6
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: tf_keras
Requires-Dist: numpy>=1.26.0
Requires-Dist: pandas>=2.3.3
Requires-Dist: boto3<2,>=1.40.65
Requires-Dist: matplotlib<4,>=3.10.8
Requires-Dist: google-cloud-storage<4,>=3.4.1
Requires-Dist: librosa<0.12,>=0.11.0
Requires-Dist: typing_extensions<5,>=4.15.0
Requires-Dist: referencing<0.38,>=0.37.0
Requires-Dist: tensorflow<3,>=2.16.0
Requires-Dist: duckdb<2,>=1.4.1
Provides-Extra: dev
Requires-Dist: pycodestyle<3,>=2.14.0; extra == "dev"
Requires-Dist: ipykernel<7,>=6.30.1; extra == "dev"
Requires-Dist: jupyterlab<5,>=4.4.9; extra == "dev"
Requires-Dist: pytest<10,>=9.0.2; extra == "dev"
Dynamic: license-file

# Soundhub Utils

Audio processing utilities library for SoundHub model integration. This package provides core functionality for:

1. Reading audio files from AWS S3, Google Cloud Storage, HTTPS URLs, and local filesystems
2. Standard audio preprocessing (resampling, segmentation, format conversion)
3. Spectrogram generation using multiple backends (TensorFlow, PyTorch, librosa)
4. Unified I/O interface across different storage platforms

**Note**: This is a utilities library. For running models (like OWL), see [`soundhub_model_runner`](https://github.com/your-org/soundhub_model_runner).

## Table of Contents

- [Installation](#installation)
- [Core Modules](#core-modules)
- [Usage Examples](#usage-examples)
- [Style Guide](#style-guide)
- [License](#license)

---

## Installation

### Using Pixi (Recommended for Development)

Dependencies are managed with [Pixi](https://pixi.sh/latest). After installing Pixi, run:

```bash
# Run commands in the pixi environment
pixi run python -c "import soundhub_utils; print(soundhub_utils.__version__)"

# Launch jupyter for development
pixi run jupyter lab .
```

The `pyproject.toml` includes `soundhub_utils = { path = ".", editable = true }`, so Pixi installs the package in editable mode and no separate installation step is needed.

### Using pip

```bash
# Install from local directory
pip install -e .

# Or install from PyPI (once published); append ==<version> to pin a specific release
pip install soundhub-utils
```

---

## Core Modules

### I/O (soundhub_utils.io)

Unified interface for reading/writing audio files across multiple storage platforms:

- **`soundhub_utils.io.aws`**: AWS S3 integration with partial FLAC download support
- **`soundhub_utils.io.gcs`**: Google Cloud Storage operations
- **`soundhub_utils.io.local`**: Local filesystem operations
- **`soundhub_utils.io.url`**: HTTP/HTTPS streaming and downloads
- **`soundhub_utils.io.io`**: Unified interface that auto-routes based on URI scheme
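
The scheme-based routing described above can be pictured with a minimal sketch. It is not the library's implementation: `pick_backend` and the scheme-to-backend table are hypothetical names chosen for illustration, and the real `soundhub_utils.io.io` module may dispatch differently.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to the backend module that would
# handle it. The labels mirror the module list above but are assumptions.
_SCHEME_TO_BACKEND = {
    "s3": "aws",
    "gs": "gcs",
    "http": "url",
    "https": "url",
    "file": "local",
    "": "local",  # plain paths such as /path/to/audio.flac have no scheme
}


def pick_backend(uri: str) -> str:
    """Return the (hypothetical) backend name that would handle `uri`."""
    scheme = urlparse(uri).scheme.lower()
    # A one-letter "scheme" is a Windows drive letter (e.g. C:\audio.flac),
    # so treat it as a local path.
    if len(scheme) == 1:
        scheme = ""
    try:
        return _SCHEME_TO_BACKEND[scheme]
    except KeyError:
        raise ValueError(f"Unsupported URI scheme: {scheme!r}") from None


print(pick_backend("s3://bucket/path/audio.flac"))
print(pick_backend("/path/to/audio.flac"))
```

Keeping the mapping in one table means a new storage platform only needs a new entry, not a change to the dispatch logic.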

### Audio Processing (soundhub_utils.audio)

Audio processing backends for spectrogram generation and preprocessing:

- **`soundhub_utils.audio.tensorflow_audio`**: TensorFlow-native audio processing

### Utilities (soundhub_utils.utils)

Helper functions for audio processing:

- **`soundhub_utils.utils.audio`**: Format conversion and validation
- **`soundhub_utils.utils.flac`**: FLAC metadata extraction and time-range processing
- **`soundhub_utils.names`**: File naming conventions for audio and spectrograms

---

## Usage Examples

### Reading Audio from Different Sources

```python
from soundhub_utils.io import io

# Read from S3
audio_data = io.read_flac("s3://bucket/path/audio.flac")

# Read from Google Cloud Storage
audio_data = io.read_flac("gs://bucket/path/audio.flac")

# Read from HTTPS URL
audio_data = io.read_flac("https://example.com/audio.flac")

# Read from local file
audio_data = io.read_flac("/path/to/audio.flac")

# Read partial FLAC (time range)
audio_data = io.read_partial_flac(
    "s3://bucket/audio.flac",
    start_time=10.0,  # seconds
    duration=30.0,    # seconds
)
```
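
A time-range read ultimately has to be expressed as a range of samples. The sketch below shows only that arithmetic. The helper `time_range_to_samples` is hypothetical, and the sample rate is assumed to be known (for FLAC it is stored in the file's metadata); how `read_partial_flac` maps the range to bytes is not covered here.

```python
def time_range_to_samples(
    start_time: float, duration: float, sample_rate: int
) -> tuple[int, int]:
    """Convert a time range in seconds to a half-open sample range [first, last).

    Hypothetical helper for illustration; not part of soundhub_utils.
    """
    if start_time < 0 or duration <= 0 or sample_rate <= 0:
        raise ValueError("start_time must be >= 0; duration and sample_rate must be > 0")
    first = int(round(start_time * sample_rate))
    last = int(round((start_time + duration) * sample_rate))
    return first, last


# 30 seconds starting at 10 seconds, assuming an 8 kHz recording
first, last = time_range_to_samples(10.0, 30.0, 8000)
print(first, last)  # 80000 320000
```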

### Generating Spectrograms

```python
from soundhub_utils.audio import tensorflow_audio

# Generate spectrograms from audio file
spectrograms = tensorflow_audio.generate_spectrograms(
    audio_path="/path/to/audio.flac",
    sample_rate=8000,
    segment_duration=12.0,
    spectrogram_shape=[257, 1000]
)
```
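
To make the parameters above concrete, here is a minimal sketch of how fixed-length segmentation and spectrogram dimensions relate. It is arithmetic only, under stated assumptions: `segment_bounds` and `stft_shape_padded` are hypothetical helpers, incomplete trailing segments are assumed to be dropped, and the FFT size (512) and hop length (96) are merely one combination consistent with a `[257, 1000]` shape, not the library's documented settings.

```python
def segment_bounds(
    total_samples: int, sample_rate: int, segment_duration: float
) -> list[tuple[int, int]]:
    """Half-open (start, end) sample ranges of non-overlapping segments.

    Assumption: a trailing partial segment is dropped.
    """
    seg_len = int(round(segment_duration * sample_rate))
    return [
        (start, start + seg_len)
        for start in range(0, total_samples - seg_len + 1, seg_len)
    ]


def stft_shape_padded(num_samples: int, n_fft: int, hop_length: int) -> tuple[int, int]:
    """(frequency bins, frames) of a one-sided STFT whose last frame is zero-padded."""
    freq_bins = n_fft // 2 + 1              # one-sided spectrum of a real signal
    frames = -(-num_samples // hop_length)  # ceiling division
    return freq_bins, frames


# 37.5 s of audio at 8 kHz yields three full 12 s segments
print(segment_bounds(300_000, 8000, 12.0))
# [(0, 96000), (96000, 192000), (192000, 288000)]

# One 12 s segment at 8 kHz is 96,000 samples
print(stft_shape_padded(96_000, n_fft=512, hop_length=96))  # (257, 1000)
```

The frame count uses the end-padded convention (as with `pad_end=True` in `tf.signal.stft`); centered or unpadded framing gives slightly different counts for the same input.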

---

## Style Guide

The code follows [PEP 8](https://peps.python.org/pep-0008/). See [setup.cfg](./setup.cfg) for exceptions. Run `pycodestyle .` to check compliance.

---

## License

BSD 3-Clause. See [LICENSE.md](./LICENSE.md) for the full text.
