Metadata-Version: 2.4
Name: streamsad
Version: 0.1.4
Summary: A light-weight, streaming speech activity detection (SAD) module, designed for real-time speech-only classification
Author-email: Mohammad Hassan Sohan Ajini <mohamad.hasan.sohan.ajini@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/mohamad-hasan-sohan-ajini/streamsad
Project-URL: Repository, https://github.com/mohamad-hasan-sohan-ajini/streamsad
Project-URL: Issues, https://github.com/mohamad-hasan-sohan-ajini/streamsad/issues
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Multimedia :: Sound/Audio :: Analysis
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy==1.26.3
Requires-Dist: onnxruntime==1.20.1
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: soundfile; extra == "test"
Dynamic: license-file

# streamsad

`streamsad` is a streaming-oriented Speech Activity Detection (SAD) module that operates frame by frame, without requiring access to the full audio signal (unlike batch processing). Unlike simple energy-based Voice Activity Detection (VAD), it accurately detects human speech while ignoring music, background noise, and silence. Powered by an efficient ONNX model and a post-processing algorithm inspired by WebRTC (using ring buffer smoothing), it runs entirely on the CPU with minimal overhead, making it ideal for real-time voice interfaces, ASR frontends, and low-resource deployments.

# Dependencies

This module has been tested and works correctly with Python 3.10 through 3.13.

The following third-party dependencies are required to use `streamsad`:

- `numpy`
- `onnxruntime`

# How to Use

Here is an example of how to use the `streamsad` module:

```python
import numpy as np
from streamsad import SAD

# Initialize the SAD model
sad = SAD()

# Create an example audio stream (e.g., 2 seconds of random audio at 16kHz)
audio_np_array = np.random.randn(32000).astype(np.float32)

# Detect speech segments
segments = sad(audio_np_array)

# Print the detected segments
print(segments)
```

# Installation

You can install `streamsad` using pip:

```bash
pip install streamsad
```

# Testing

After installing the module, you can run unit tests using `pytest`:

```bash
pytest -s tests/test_sad.py
```
