Metadata-Version: 2.4
Name: nomad-hpc
Version: 1.2.2
Summary: A lightweight HPC monitoring and predictive analytics tool
Author-email: Joao Tonini <jtonini@richmond.edu>
Maintainer-email: Joao Tonini <jtonini@richmond.edu>
Project-URL: Homepage, https://nomad-hpc.com
Project-URL: Documentation, https://jtonini.github.io/nomad-hpc/
Project-URL: Repository, https://github.com/jtonini/nomad-hpc
Project-URL: Issues, https://github.com/jtonini/nomad-hpc/issues
Keywords: hpc,monitoring,slurm,cluster,predictive-analytics,machine-learning,anomaly-detection,graph-neural-network
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: System Administrators
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: System :: Monitoring
Classifier: Topic :: System :: Systems Administration
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: click>=8.0
Requires-Dist: toml>=0.10
Requires-Dist: numpy>=1.21
Requires-Dist: pandas>=1.3
Requires-Dist: scipy>=1.7
Provides-Extra: ml
Requires-Dist: scikit-learn>=1.0; extra == "ml"
Requires-Dist: torch>=2.0; extra == "ml"
Requires-Dist: torch-geometric>=2.0; extra == "ml"
Provides-Extra: dashboard
Requires-Dist: jinja2>=3.0; extra == "dashboard"
Provides-Extra: alerts
Provides-Extra: all
Requires-Dist: nomad[dashboard,ml]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: pre-commit>=3.0; extra == "dev"

# NØMAD-HPC

**NØde Monitoring And Diagnostics** — Lightweight HPC monitoring, visualization, and predictive analytics.

> *"Travels light, adapts to its environment, and doesn't need permanent infrastructure."*

[![PyPI](https://img.shields.io/pypi/v/nomad-hpc.svg)](https://pypi.org/project/nomad-hpc/)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.18614517.svg)](https://doi.org/10.5281/zenodo.18614517)

---

📖 **[Full Documentation](https://jtonini.github.io/nomad-hpc/)** — Installation guides, configuration, CLI reference, network methodology, ML framework, and more.

---

## Quick Start

```bash
pip install nomad-hpc
nomad demo                    # Try with synthetic data
```

For production:
```bash
nomad init                    # Configure for your cluster
nomad collect                 # Start data collection
nomad dashboard               # Launch web interface
```

---

## Features

| Feature | Description | Command |
|---------|-------------|---------|
| **Dashboard** | Real-time multi-cluster monitoring with partition views | `nomad dashboard` |
| **Educational Analytics** | Track computational proficiency development | `nomad edu explain <job>` |
| **Alerts** | Threshold + predictive alerts (email, Slack, webhook) | `nomad alerts` |
| **ML Prediction** | Job failure prediction using similarity networks | `nomad predict` |
| **Community Export** | Anonymized datasets for cross-institutional research | `nomad community export` |
| **Interactive Sessions** | Monitor RStudio/Jupyter sessions | `nomad report-interactive` |
| **Derivative Analysis** | Detect accelerating trends before thresholds | Built into alerts |

---

## Architecture

```
┌────────────────────────────────────────────────────────────┐
│                         NØMAD                              │
├──────────────┬──────────────┬──────────────┬───────────────┤
│  Collectors  │   Analysis   │     Viz      │    Alerts     │
├──────────────┼──────────────┼──────────────┼───────────────┤
│ disk         │ derivatives  │ dashboard    │ thresholds    │
│ iostat       │ similarity   │ network 3D   │ predictive    │
│ slurm        │ ML ensemble  │ partitions   │ email/slack   │
│ gpu          │ edu scoring  │ edu views    │ webhooks      │
│ nfs          │              │              │               │
└──────────────┴──────────────┴──────────────┴───────────────┘
                              │
                    ┌─────────┴─────────┐
                    │  SQLite Database  │
                    └───────────────────┘
```

---

## CLI Reference

### Core Commands
```bash
nomad init                    # Setup wizard
nomad collect                 # Start collectors
nomad dashboard               # Web interface
nomad demo                    # Demo mode
nomad status                  # System status
```

### Educational Analytics
```bash
nomad edu explain <job_id>    # Job analysis with recommendations
nomad edu trajectory <user>   # User proficiency over time
nomad edu report <group>      # Course/group report
```

### Analysis & Prediction
```bash
nomad disk /path              # Filesystem trends
nomad jobs --user <user>      # Job history
nomad similarity              # Network analysis
nomad train                   # Train ML models
nomad predict                 # Run predictions
```

### Community & Alerts
```bash
nomad community export        # Export anonymized data
nomad community preview       # Preview export
nomad alerts                  # View alerts
nomad alerts --unresolved     # Unresolved only
```

---

## Installation

### From PyPI
```bash
pip install nomad-hpc
```

### From Source
```bash
git clone https://github.com/jtonini/nomad.git
cd nomad && pip install -e .
```

### Requirements
- Python 3.9+
- SQLite 3.35+
- sysstat package (`iostat`, `mpstat`)
- Optional: SLURM, nvidia-smi, nfsiostat

### System Check
```bash
nomad syscheck
```

---

## Documentation

📖 **[jtonini.github.io/nomad-hpc](https://jtonini.github.io/nomad-hpc/)**

- [Installation & Configuration](https://jtonini.github.io/nomad-hpc/installation/)
- [System Install (`--system`)](https://jtonini.github.io/nomad-hpc/system-install/)
- [Dashboard Guide](https://jtonini.github.io/nomad-hpc/dashboard/)
- [Educational Analytics](https://jtonini.github.io/nomad-hpc/edu/)
- [Network Methodology](https://jtonini.github.io/nomad-hpc/network/)
- [ML Framework](https://jtonini.github.io/nomad-hpc/ml/)
- [Proficiency Scoring](https://jtonini.github.io/nomad-hpc/proficiency/)
- [CLI Reference](https://jtonini.github.io/nomad-hpc/cli/)
- [Configuration Options](https://jtonini.github.io/nomad-hpc/config/)

---

## License

Dual-licensed:
- **AGPL v3** — Free for academic, educational, and open-source use
- **Commercial License** — Available for proprietary deployments

---

## Citation

```bibtex
@software{nomad2026,
  author = {Tonini, João Filipe Riva},
  title = {NØMAD: Lightweight HPC Monitoring with Machine Learning-Based Failure Prediction},
  year = {2026},
  url = {https://github.com/jtonini/nomad},
  doi = {10.5281/zenodo.18614517}
}
```

---

## Contributing

See [CONTRIBUTING.md](docs/CONTRIBUTING.md) for guidelines.

---

## Contact

- **Author**: João Tonini
- **Email**: jtonini@richmond.edu
- **Issues**: [GitHub Issues](https://github.com/jtonini/nomad/issues)
