Metadata-Version: 2.4
Name: monsoonbench
Version: 0.1.2
Summary: A reproducible benchmarking framework for Indian monsoon onset prediction
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: pandas~=2.1
Requires-Dist: python-dotenv~=1.0.0
Requires-Dist: jupyterlab~=4.0.0
Requires-Dist: ipykernel~=6.16
Requires-Dist: ruff==0.7.2
Requires-Dist: numpy>=2.0
Requires-Dist: pre-commit~=3.5
Requires-Dist: coverage~=7.3
Requires-Dist: pytest~=7.4
Requires-Dist: xarray>=2025.9.1
Requires-Dist: dask>=2025.9.1
Requires-Dist: h5netcdf>=1.6.4
Requires-Dist: geopandas>=1.1.1
Requires-Dist: matplotlib>=3.10.6
Requires-Dist: zarr>=3.1.3
Requires-Dist: scipy>=1.11.0
Provides-Extra: netcdf4
Requires-Dist: netCDF4>=1.7.2; extra == "netcdf4"
Requires-Dist: cftime>=1.6.4; extra == "netcdf4"

# MonsoonBench
*A unified, reproducible benchmarking framework for Indian monsoon onset prediction.*

MonsoonBench provides a standardized workflow for loading rainfall and forecast datasets, computing monsoon onset, and evaluating forecasting skill across space and time.  
It is designed for climate researchers, forecasters, and data scientists aiming to compare deterministic, probabilistic, and climatology-based onset models using consistent methods.

The framework follows WeatherBench-style principles: **clean APIs, reproducible configuration, modular components, and shareable outputs.**

---

## Documentation Overview

MonsoonBench includes detailed module-specific guides. Use the links below to navigate the documentation.

### Core Package Overview & Pipeline
High-level explanation of the evaluation pipeline, CLI interface, onset metrics, and NetCDF outputs.  
**Path:** `monsoonbench/README.md`  
[Open Metrics & Pipeline README](monsoonbench/README.md)

---

### Data Loading Guide
How to load IMD rainfall, deterministic/probabilistic forecasts, and threshold datasets using the unified API.  
**Path:** `monsoonbench/data/dataloader_quickstart.md`  
[Open DataLoader QuickStart](monsoonbench/data/dataloader_quickstart.md)

---

### Visualization & Metric Export Tools
How to generate spatial scorecards and export skill metrics in NetCDF, CSV, Parquet, or JSON formats.  
**Path:** `monsoonbench/visualization/README.md`  
[Open Visualization README](monsoonbench/visualization/README.md)

---

### Examples (Configs, Scripts, Notebooks)
Example YAML configs, runnable scripts, and tutorial notebooks demonstrating end-to-end usage.  
**Path:** `examples/README.md`  
[Open Examples README](examples/README.md)

---

## Installation

MonsoonBench is available on TestPyPI for pre-release testing. The `--extra-index-url` flag lets pip resolve the package's dependencies from the main PyPI index:

```bash
pip install -i https://test.pypi.org/simple/ \
    --extra-index-url https://pypi.org/simple/ \
    monsoonbench==0.1.2
```

### Verify Installation

```bash
monsoonbench --help
```
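
The installed version can also be checked from Python using the standard library's `importlib.metadata`. The snippet below is a minimal sketch; the helper name `installed_version` is hypothetical and not part of MonsoonBench.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is not installed.
print(installed_version("monsoonbench"))
```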

## Python API Example

```python
from monsoonbench.metrics import DeterministicOnsetMetrics
from monsoonbench.visualization import create_model_comparison_table

# Initialize metrics calculator
metrics = DeterministicOnsetMetrics()

# Compute metrics for multiple years
df, onset_data = metrics.compute_metrics_multiple_years(
    years=[2019, 2020, 2021, 2022],
    model_forecast_dir="data/model_forecast_data/fuxi/...",
    imd_folder="data/imd_rainfall_data/4p0",
    thres_file="data/imd_onset_threshold/mwset4x4.nc4",
    tolerance_days=3,
    verification_window=1,
    forecast_days=15,
)

# Create spatial metrics
spatial = metrics.create_spatial_far_mr_mae(df, onset_data)

# Generate comparison table
comparison = create_model_comparison_table({"FuXi": spatial})
print(comparison)
```
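
To give a feel for what a tolerance of a few days means for onset errors, here is an illustrative sketch that is independent of MonsoonBench. It assumes onset dates are given as day-of-year values and that a forecast counts as "within tolerance" when its absolute error is at most `tolerance_days`. The function `onset_error_summary` and its output keys are hypothetical; the package's own definitions of FAR, MR, and MAE may differ, so consult `monsoonbench/README.md` for the actual metrics.

```python
import numpy as np


def onset_error_summary(forecast_onset_doy, observed_onset_doy, tolerance_days=3):
    """Summarize onset-date errors (illustrative only, not the package's method)."""
    forecast = np.asarray(forecast_onset_doy, dtype=float)
    observed = np.asarray(observed_onset_doy, dtype=float)

    # Skip pairs where either the forecast or the observation has no onset (NaN).
    valid = ~(np.isnan(forecast) | np.isnan(observed))
    abs_err = np.abs(forecast[valid] - observed[valid])

    if abs_err.size == 0:
        return {"n": 0, "mae_days": float("nan"), "within_tolerance": float("nan")}

    return {
        "n": int(abs_err.size),
        "mae_days": float(abs_err.mean()),
        "within_tolerance": float((abs_err <= tolerance_days).mean()),
    }


# Made-up onset dates (day of year) for four grid cells; one forecast is missing.
result = onset_error_summary(
    forecast_onset_doy=[152, 160, 158, np.nan],
    observed_onset_doy=[153, 155, 158, 150],
    tolerance_days=3,
)
print(result)
```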


## Repository Structure

```
monsoon-bench/
│
├── monsoonbench/               # Core package
│   ├── data/                   # Dataloaders
│   │   └── dataloader_quickstart.md
│   ├── metrics/                # Onset detection + evaluation pipeline
│   ├── visualization/          # Scorecards + metric downloaders
│   │   └── README.md
│   ├── README.md               # Module-level pipeline documentation
│   └── ...
│
├── examples/                   # Configs, scripts, tutorial notebooks
│   └── README.md
│
├── tests/                      # Unit tests
├── Dockerfile
├── Makefile
└── pyproject.toml
```

## Development Process with Branches

Each team member worked on a dedicated branch for a specific fix or feature, such as the data loader, data downloader, or visualizations. We merged these branches regularly during TA meetings, which kept the codebase consistent and the team aligned on progress and design decisions.
