Metadata-Version: 2.4
Name: gpuma
Version: 0.5.1
Summary: GPUMA - Geometry optimization toolkit using Fairchem UMA models and Torch-Sim
Author-email: Niklas Hölter <niklas.hoelter@uni-muenster.de>
License-Expression: MIT
Project-URL: Source, https://github.com/NiklasHoelter/gpuma
Project-URL: Documentation, https://niklashoelter.github.io/gpuma/
Keywords: chemistry,geometry-optimization,uma,fairchem,ase,mlip,gpuma
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: ==3.12.*
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: ase==3.27.0
Requires-Dist: torch-sim-atomistic==0.5.1
Requires-Dist: fairchem-core<2.10,>=2.7
Requires-Dist: morfeus-ml>=0.7
Requires-Dist: rdkit>=2022.9.5
Requires-Dist: tables>=3.10.2
Requires-Dist: scipy<1.15.0
Requires-Dist: hf_xet
Provides-Extra: yaml
Requires-Dist: pyyaml>=6.0; extra == "yaml"
Provides-Extra: dev
Requires-Dist: pytest>=9.0.0; extra == "dev"
Requires-Dist: build>=1.2.1; extra == "dev"
Requires-Dist: twine>=5.0.0; extra == "dev"
Requires-Dist: pip-tools>=7.5.2; extra == "dev"
Requires-Dist: ruff>=0.14.14; extra == "dev"
Requires-Dist: mkdocs>=1.6.1; extra == "dev"
Requires-Dist: mkdocs-material>=9.7.1; extra == "dev"
Requires-Dist: mkdocstrings[python]>=1.0.2; extra == "dev"
Dynamic: license-file

# GPUMA

<div align="center">
  <img src="docs/logo_bg.png" alt="GPUMA Logo"/>
</div>

---

GPUMA is a minimalist Python toolkit for facile and rapid high-throughput molecular geometry optimization 
based on the [UMA/OMol25 machine-learning interatomic potential](https://arxiv.org/abs/2505.08762).  

GPUMA is especially designed for batch optimizations of many structures (conformer ensembles, datasets) on GPU,
ensuring efficient parallelization and maximum GPU utilization by leveraging the [torch-sim library](https://arxiv.org/abs/2508.06628).
It wraps Fairchem UMA models and torch-sim functionality to provide both a simple command-line 
interface (CLI) and a small but expressive Python API for single- and multi-structure optimizations.

If conformer sampling is desired, GPUMA can generate conformer ensembles on the fly from SMILES strings 
using the [morfeus library](https://digital-chemistry-laboratory.github.io/morfeus/). Alternative input formats
are described in the CLI section below.

Feedback and improvements are always welcome!

## Installation

> ⚠️ **Required for UMA models:**<br>
> To access the UMA models on Hugging Face, **you must provide a token** either via the `HUGGINGFACE_TOKEN` environment variable or via the config (direct token string or path to a file containing the token).
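
The token sources named above (a direct string, a token file, or the `HUGGINGFACE_TOKEN` environment variable) could be resolved along the following lines. This is a minimal sketch only: `resolve_hf_token` is a hypothetical helper, and the lookup order it uses is an assumption that may differ from what GPUMA does internally.

```python
import os
from pathlib import Path


def resolve_hf_token(config_value=None):
    """Return a Hugging Face token from a config value or the environment.

    Hypothetical helper; GPUMA's own lookup order may differ.
    """
    if config_value:
        candidate = Path(config_value).expanduser()
        # A config value that names an existing file is read as a token file.
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8").strip()
        # Otherwise the value is taken to be the token string itself.
        return config_value.strip()
    # Fall back to the environment variable named in the note above.
    token = os.environ.get("HUGGINGFACE_TOKEN", "").strip()
    return token or None
```

Returning `None` when nothing is found lets the caller decide whether to fail early with a clear message before any model download is attempted.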

### Option 1: Install from PyPI (recommended)

This installs `gpuma` together with its core dependencies. 
At the moment, installation and tests have only been
validated under Python 3.12; using other Python versions is currently
considered experimental.

- **Using a `uv` virtual environment**
  ```powershell
  # create a fresh environment
  uv venv .venv

  # activate the environment (Linux/macOS: source .venv/bin/activate)
  .venv\Scripts\activate
  # install gpuma from PyPI inside the environment
  uv pip install gpuma
  ```

- **Using a `conda` environment**
  ```powershell
  # create and activate a fresh environment with Python 3.12
  conda create -n gpuma-py312 python=3.12
  conda activate gpuma-py312

  # install gpuma from PyPI inside the environment
  pip install gpuma
  ```


### Option 2: Install from source

```bash
# clone the repository
git clone https://github.com/niklashoelter/gpuma.git
cd gpuma

# install using (uv) pip
uv pip install .
# or, without uv:
pip install .
```

## Documentation

Full documentation is available at [https://niklashoelter.github.io/gpuma/](https://niklashoelter.github.io/gpuma/).

For local browsing of the Markdown sources, see in particular:
- [docs/index.md](docs/index.md) – overview and getting started
- [docs/install.md](docs/install.md) – installation details
- [docs/cli.md](docs/cli.md) – CLI options and input formats
- [docs/config.md](docs/config.md) – configuration file schema and examples
- [docs/reference.md](docs/reference.md) – API and configuration reference

Using a configuration file is highly recommended for reproducibility and ease of use.

Also check the [examples/](examples) folder in the repository for sample config files and usage examples:
- [examples/config.json](examples/config.json) – minimal example configuration
- [examples/example_single_optimization.py](examples/example_single_optimization.py) – single-structure optimization from Python
- [examples/example_ensemble_optimization.py](examples/example_ensemble_optimization.py) – ensemble/multi-structure optimization from Python

## CLI Usage

The CLI is provided via the command `gpuma`. For best results, create a
config file (JSON or YAML) and reference it in all CLI calls (see [examples/config.json](examples/config.json) for a minimal example).
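
When a custom script needs to read the same JSON or YAML config, a loader that dispatches on the file extension is one option. The sketch below is an assumption-laden illustration, not GPUMA's own loader: `load_config` is a hypothetical name, and the only project-specific detail it relies on is that YAML support comes from the optional `yaml` extra (`pyyaml`).

```python
import json
from pathlib import Path


def load_config(path):
    """Load a JSON or YAML config file into a plain dict (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    text = path.read_text(encoding="utf-8")
    if suffix == ".json":
        return json.loads(text)
    if suffix in {".yaml", ".yml"}:
        try:
            import yaml  # provided by the optional "yaml" extra (pyyaml)
        except ImportError as exc:
            raise RuntimeError(
                'YAML configs need pyyaml; try: pip install "gpuma[yaml]"'
            ) from exc
        # safe_load returns None for an empty file, so fall back to {}.
        return yaml.safe_load(text) or {}
    raise ValueError(f"Unsupported config format: {suffix!r}")
```

Importing `yaml` lazily keeps JSON-only setups working without the extra dependency.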

### Examples: Batch optimization of multiple XYZ structures

Optimize all XYZ files in a directory (each file containing a single structure):

```bash
gpuma optimize --config examples/config.json --xyz-dir examples/example_input_xyzs/multi_xyz_dir/
```

Optimize multiple structures contained in a single multi-XYZ file:

```bash
gpuma optimize --config examples/config.json --xyz examples/example_input_xyzs/multi_xyz_file.xyz
```
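
Before submitting a large multi-XYZ file, it can help to check how many structures it contains. The sketch below relies only on the standard XYZ layout (an atom-count line, a comment line, then one line per atom); `count_xyz_frames` is a hypothetical helper and is not part of GPUMA.

```python
def count_xyz_frames(text):
    """Count structures in XYZ-formatted text (single- or multi-structure)."""
    lines = text.splitlines()
    frames = 0
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i].split()[0])
        # One frame = count line + comment line + n_atoms coordinate lines.
        end = i + 2 + n_atoms
        if end > len(lines):
            raise ValueError(f"frame {frames + 1} is truncated")
        frames += 1
        i = end
    return frames
```

A truncated final frame raises `ValueError`, which surfaces a damaged input file before any GPU time is spent.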

Refer to the [CLI documentation](docs/cli.md) for details on configuration options, supported input formats (SMILES, XYZ, directories, multi-XYZ files), and additional CLI examples.
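
To drive the CLI from a workflow script, the two invocations above can be wrapped with `subprocess`. This is a rough sketch: it uses only the `optimize` subcommand and the `--config`, `--xyz`, and `--xyz-dir` flags shown in this section, the function names are hypothetical, and requiring exactly one input source is a simplifying assumption rather than documented CLI behavior.

```python
import subprocess


def build_optimize_command(config, xyz=None, xyz_dir=None):
    """Build the argument list for `gpuma optimize` from the flags shown above."""
    # Assumption: exactly one input source is passed per call.
    if (xyz is None) == (xyz_dir is None):
        raise ValueError("pass exactly one of xyz or xyz_dir")
    cmd = ["gpuma", "optimize", "--config", str(config)]
    if xyz is not None:
        cmd += ["--xyz", str(xyz)]
    else:
        cmd += ["--xyz-dir", str(xyz_dir)]
    return cmd


def run_optimize(config, **inputs):
    """Run the command; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_optimize_command(config, **inputs), check=True)
```

Keeping command construction separate from execution makes the argument list easy to log or test without launching an optimization.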

## Python API

A minimalistic and high-level Python API is provided for easy integration into custom scripts and workflows.

For example usage, see:
- [examples/example_single_optimization.py](examples/example_single_optimization.py)
- [examples/example_ensemble_optimization.py](examples/example_ensemble_optimization.py)

Please refer to the documentation for the full API reference and to the example scripts for detailed usage.

## Known limitations

When a run is started from SMILES, an RDKit force field (via the morfeus library) is used to generate the initial structure. Spin is not taken into account during this step, so the initial geometry estimate can be incorrect. When the UMA/OMol25 models are subsequently applied, the structure can sometimes be optimized to a maximum rather than a minimum because the model is not provided with Hessian matrices. This behavior only affects runs originating from SMILES; it does not occur with better starting geometries (e.g., when starting from XYZ files).

## Troubleshooting
- Missing libraries: install optional dependencies like `pyyaml` if you use YAML configs.
- Fairchem/UMA: ensure network access for model downloads and optionally set or provide `huggingface_token` (e.g., via a token file) to access the UMA model family.

## License
MIT License (see LICENSE.md)
