Metadata-Version: 2.4
Name: msig
Version: 0.1.3
Summary: Statistical Significance Criteria for Multivariate Time Series Motifs
Author-email: "Miguel G. Silva" <mmsilva@ciencias.ulisboa.pt>
Maintainer-email: "Miguel G. Silva" <mmsilva@ciencias.ulisboa.pt>
License-Expression: MIT
Project-URL: Homepage, https://github.com/MiguelGarcaoSilva/msig
Project-URL: Repository, https://github.com/MiguelGarcaoSilva/msig
Project-URL: Documentation, https://github.com/MiguelGarcaoSilva/msig#readme
Project-URL: Bug Tracker, https://github.com/MiguelGarcaoSilva/msig/issues
Keywords: time-series,motif-discovery,statistical-significance,multivariate
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.20.0
Requires-Dist: scipy>=1.7.0
Provides-Extra: experiments
Requires-Dist: pandas>=1.3.0; extra == "experiments"
Requires-Dist: matplotlib>=3.4.0; extra == "experiments"
Requires-Dist: stumpy>=1.11.0; extra == "experiments"
Requires-Dist: librosa>=0.9.0; extra == "experiments"
Requires-Dist: statsmodels>=0.13.0; extra == "experiments"
Requires-Dist: jinja2>=3.0.0; extra == "experiments"
Requires-Dist: leitmotif; extra == "experiments"
Requires-Dist: psutil>=5.8.0; extra == "experiments"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0.0; extra == "dev"
Provides-Extra: all
Requires-Dist: msig[dev,experiments]; extra == "all"
Dynamic: license-file

# MSig

**Statistical Significance Testing for Multivariate Time Series Motifs**

MSig tests whether a discovered motif occurs more often than expected by chance: it estimates the probability of the motif's pattern under a null model of the data and derives a p-value from the observed number of matches.

## Installation

### From PyPI (recommended)

```bash
# Core package only
pip install msig

# With experiment dependencies (includes STUMPY and LAMA)
pip install "msig[experiments]"
```

### From source with uv

```bash
# Clone the repository
git clone https://github.com/MiguelGarcaoSilva/msig.git
cd msig

# Sync dependencies (includes STUMPY and LAMA)
uv sync

# Optional: Install MOMENTI (for MOMENTI experiments - Linux/Windows only)
uv pip install git+https://github.com/aidaLabDEI/MOMENTI-motifs
```

### From source with pip

```bash
# Clone repository
git clone https://github.com/MiguelGarcaoSilva/msig.git
cd msig

# Install with experiment dependencies
pip install -e ".[experiments]"
```

**Notes**:
- MOMENTI has platform-specific dependencies and may not install on macOS.
- Audio experiments require **ffmpeg** for MP3 processing: `brew install ffmpeg` (macOS) or `apt-get install ffmpeg` (Linux).
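
Before running the audio experiments, it can help to confirm that ffmpeg is actually reachable. The snippet below is a minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and not part of MSig.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the current PATH."""
    # shutil.which returns the full path to the executable, or None if absent
    return shutil.which("ffmpeg") is not None


if ffmpeg_available():
    print("ffmpeg found; audio experiments can decode MP3 files.")
else:
    print("ffmpeg not found; install it before running the audio experiments.")
```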

## Quick Start

```python
from msig import Motif, NullModel
import numpy as np

# Create sample multivariate time series (3 sensors × 100 time points)
np.random.seed(42)
t = np.linspace(0, 10, 100)
sensor1 = 10 + 2 * np.sin(2 * np.pi * t) + np.random.randn(100) * 0.5
sensor2 = 5 + 1.5 * np.cos(2 * np.pi * t) + np.random.randn(100) * 0.3
sensor3 = 15 + 3 * np.sin(2 * np.pi * t + np.pi/4) + np.random.randn(100) * 0.7
data = np.stack([sensor1, sensor2, sensor3])

# Create null model (assumes Gaussian distributions)
model = NullModel(data, dtypes=[float, float, float], model="gaussian_theoretical")

# Define a motif: length 10, all 3 sensors, 8 occurrences
motif_length = 10
motif_pattern = data[:, 5:5 + motif_length]  # Extract the pattern starting at position 5
motif_vars = np.array([0, 1, 2])  # Use all sensors
delta_thresholds = np.array([0.3, 0.3, 0.3])  # Tolerance for matching

# Create motif and test significance
motif = Motif(motif_pattern, motif_vars, delta_thresholds, n_matches=8)
prob = motif.set_pattern_probability(model, vars_indep=True)
pvalue = motif.set_significance(
    max_possible_matches=data.shape[1] - motif_length + 1,  # number of sliding windows
    data_n_variables=data.shape[0],
    idd_correction=False
)

print(f"Pattern probability: {prob:.6e}")
print(f"P-value: {pvalue:.6e}")
print(f"Significant at α=0.01? {pvalue <= 0.01}")
```

See the `examples/` folder for more examples (`simple_example.py` and `example.ipynb`).
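
To build intuition for how a pattern probability and a match count can be turned into a p-value, the sketch below computes a binomial tail probability with SciPy: the chance of seeing at least `n_matches` occurrences among `max_possible_matches` windows when each window matches independently with the given probability. This is an illustration under stated assumptions, not MSig's implementation. Whether `set_significance` computes exactly this quantity is an assumption, the sketch ignores the `data_n_variables` and `idd_correction` arguments, and the helper name `binomial_tail_pvalue` is hypothetical.

```python
from scipy.stats import binom


def binomial_tail_pvalue(pattern_probability: float,
                         n_matches: int,
                         max_possible_matches: int) -> float:
    """P(X >= n_matches) for X ~ Binomial(max_possible_matches, pattern_probability).

    Assumption: each candidate window matches the pattern independently
    with the same probability.
    """
    # binom.sf(k, n, p) is P(X > k), so shift by one to get P(X >= n_matches)
    return float(binom.sf(n_matches - 1, max_possible_matches, pattern_probability))


# Illustrative values only: 8 matches among 91 windows, pattern probability 1e-3
pvalue = binomial_tail_pvalue(1e-3, n_matches=8, max_possible_matches=91)
print(f"Binomial tail p-value: {pvalue:.6e}")
```

The smaller the pattern probability and the larger the match count, the smaller the tail probability, which is the behaviour one would expect from the p-value in the Quick Start.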

## Running Experiments

The repository includes case studies on three datasets (audio, population density, washing machine), each run with three motif discovery methods (STUMPY, LAMA, MOMENTI):

```bash
# Run individual experiments
uv run python experiments/audio/run_stumpy.py
uv run python experiments/audio/run_lama.py
uv run python experiments/audio/run_momenti.py

uv run python experiments/populationdensity/run_stumpy.py
uv run python experiments/populationdensity/run_lama.py
uv run python experiments/populationdensity/run_momenti.py

uv run python experiments/washingmachine/run_stumpy.py
uv run python experiments/washingmachine/run_lama.py
uv run python experiments/washingmachine/run_momenti.py
```

Results are saved to `results/<dataset>/<method>/`.
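
The nine commands above follow one naming scheme, so they can be generated rather than typed. The following sketch enumerates them from the dataset and method names visible in the paths; the helper name `experiment_commands` is hypothetical, and the script only prints the commands.

```python
from itertools import product

# Dataset and method names taken from the experiment paths listed above
DATASETS = ("audio", "populationdensity", "washingmachine")
METHODS = ("stumpy", "lama", "momenti")


def experiment_commands() -> list[list[str]]:
    """Build one `uv run python ...` command per (dataset, method) pair."""
    return [
        ["uv", "run", "python", f"experiments/{dataset}/run_{method}.py"]
        for dataset, method in product(DATASETS, METHODS)
    ]


# Print the commands; nothing is executed here
for command in experiment_commands():
    print(" ".join(command))
```

To execute them, each command list could be passed to `subprocess.run(command, check=True)` from the repository root, skipping the MOMENTI runs on platforms where MOMENTI is not installed.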

## Citation

```bibtex
@article{silva2024msig,
  title={On Why and How Statistical Significance Criteria Can Guide Multivariate Time Series Motif Analysis},
  author={Silva, Miguel G. and Henriques, Rui and Madeira, Sara C.},
  year={2024}
}
```

## License

MIT License. See the `LICENSE` file.


