Metadata-Version: 2.4
Name: gp-tempest
Version: 0.1.0
Summary: Gaussian Process Temporal Embedding for Protein Simulations and Transitions
Author: Georg Diez
License: MIT License
        
        Copyright (c) 2024 Georg Diez
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        Academic Citation Request: If you use this software in work that leads to a
        publication, presentation, or report, we kindly ask that you cite the following
        paper:
        
            G. Diez, N. Dethloff, G. Stock,
            "Recovering Hidden Degrees of Freedom Using Gaussian Processes,"
            J. Chem. Phys. 163, 124105 (2025).
            https://doi.org/10.1063/5.0282147
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Paper, https://doi.org/10.1063/5.0282147
Project-URL: Repository, https://github.com/moldyn/GP-TEMPEST
Keywords: molecular dynamics,dimensionality reduction,variational autoencoder,gaussian process,machine learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: matplotlib>=3.4.0
Requires-Dist: prettypyplot>=0.10.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: click>=8.0.0
Requires-Dist: tqdm>=4.60.0
Dynamic: license-file

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="docs/hero_dark.png">
    <source media="(prefers-color-scheme: light)" srcset="docs/hero_light.png">
    <img alt="GP-TEMPEST" src="docs/hero_light.png" width="800">
  </picture>
</p>

# GP-TEMPEST

**Gaussian Process Temporal Embedding for Protein Simulations and Transitions**

<p align="center">
  <a href="https://doi.org/10.1063/5.0282147"><img src="https://img.shields.io/badge/DOI-10.1063%2F5.0282147-blue" alt="DOI"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-green" alt="License: MIT"></a>
  <a href=".github/workflows/pytest.yml"><img src="https://github.com/moldyn/GP-TEMPEST/actions/workflows/pytest.yml/badge.svg" alt="Tests"></a>
  <a href="https://codecov.io/gh/moldyn/GP-TEMPEST"><img src="https://codecov.io/gh/moldyn/GP-TEMPEST/branch/main/graph/badge.svg" alt="Coverage"></a>
  <img src="https://img.shields.io/badge/python-3.9%2B-blue" alt="Python 3.9+">
  <img src="https://img.shields.io/badge/PyTorch-2.0%2B-EE4C2C?logo=pytorch" alt="PyTorch">
</p>

<p align="center">
  <a href="#features">Features</a> •
  <a href="#installation">Installation</a> •
  <a href="#usage">Usage</a> •
  <a href="#citation">Citation</a>
</p>

---

GP-TEMPEST is a PyTorch implementation of the Gaussian Process Variational Autoencoder (GP-VAE) framework for time-aware dimensionality reduction of molecular dynamics (MD) simulations. The method leverages physics-informed Gaussian Process priors to capture temporal correlations in the latent space, enabling the recovery of hidden or kinetically relevant degrees of freedom in complex biomolecular systems.

## Features

- **Physics-informed dimensionality reduction** using Gaussian Processes as temporal priors
- **Flexible kernel selection** with support for the Matérn kernel (ν = 0.5, 1.5, 2.5)
- **Sparse GP inference** with inducing points for scalability to large molecular trajectories
- **Batch-wise training** that scales to large MD datasets
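
As a rough illustration of the kernel and inducing-point ideas listed above, the sketch below evaluates the standard closed-form Matérn correlations for ν = 0.5, 1.5, 2.5 and builds a Nyström-style low-rank approximation from a set of inducing times. It is a plain NumPy sketch, not the package's implementation: the function name `matern`, the length scale of `0.1`, and the normalized time axis are assumptions made only for this example.

```python
import numpy as np


def matern(r, lengthscale, nu):
    """Closed-form Matérn correlation for nu in {0.5, 1.5, 2.5}.

    Hypothetical helper for illustration; not part of the gptempest API.
    """
    s = np.abs(r) / lengthscale
    if nu == 0.5:
        return np.exp(-s)
    if nu == 1.5:
        a = np.sqrt(3.0) * s
        return (1.0 + a) * np.exp(-a)
    if nu == 2.5:
        a = np.sqrt(5.0) * s
        return (1.0 + a + a**2 / 3.0) * np.exp(-a)
    raise ValueError("nu must be 0.5, 1.5, or 2.5")


# Assumed setup: 500 frame times and 50 inducing times on a normalized axis.
t = np.linspace(0.0, 1.0, 500)
z = np.linspace(0.0, 1.0, 50)

# Covariance among inducing points (small jitter for numerical stability)
K_mm = matern(z[:, None] - z[None, :], 0.1, 1.5) + 1e-8 * np.eye(z.size)
# Cross-covariance between frame times and inducing points
K_nm = matern(t[:, None] - z[None, :], 0.1, 1.5)

# Nyström approximation of the full 500 x 500 covariance using only
# a 50 x 50 linear solve: Q_nn = K_nm K_mm^{-1} K_mn
Q_nn = K_nm @ np.linalg.solve(K_mm, K_nm.T)
```

With ν = 0.5 the Matérn kernel reduces to the exponential (Ornstein-Uhlenbeck) kernel, which yields rough sample paths; larger ν gives smoother ones. The inducing points keep the linear algebra at the size of `K_mm` rather than the full trajectory length.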

## Installation

**1. Clone the repository:**
```bash
git clone https://github.com/moldyn/GP-TEMPEST.git
cd GP-TEMPEST
```

**2. Install dependencies:**
```bash
pip install torch --index-url https://download.pytorch.org/whl/cpu  # CPU build; use .../whl/cu118 for CUDA 11.8
pip install -r requirements.txt
```

## Usage

### Command-line interface

**Fully-connected variant:**
```bash
# Generate a default config file
python tempest_main.py --generate_config

# Run with your config
python tempest_main.py --config my_config.yaml
```

### Python API

```python
import numpy as np
import torch
from gptempest import TEMPEST, MaternKernel, load_prepare_data

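# NOTE: `dataset`, `dim_input`, and `N_data` used below are placeholders
# that must be defined beforehand from your own trajectory data.

# Set up kernel and model
kernel = MaternKernel(scale=10.0, nu=1.5, dtype=torch.float64)
inducing_points = np.linspace(0, 1, 50)  # 50 evenly spaced inducing points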

model = TEMPEST(
    cuda=False,
    kernel=kernel,
    dim_input=dim_input,
    dim_latent=2,
    layers_hidden_encoder=[128, 64],
    layers_hidden_decoder=[64, 128],
    inducing_points=inducing_points,
    beta=1.0,
    N_data=N_data,
    dtype=torch.float64,
)

# Train
model.train_model(dataset, train_size=1.0, learning_rate=1e-3,
                  weight_decay=1e-5, batch_size=512, n_epochs=100)

# Extract latent space
embedding = model.extract_latent_space(dataset, batch_size=512)
```

### Configuration file

GP-TEMPEST is configured via YAML files. Generate a template with `--generate_config` and adjust the key parameters listed below.
These parameters are discussed in the paper.

| Parameter | Description |
|-----------|-------------|
| `dim_latent` | Dimensionality of the latent space (typically 2) |
| `layers_hidden` | Hidden layer sizes for encoder/decoder |
| `kernel_nu` | Matérn kernel smoothness (0.5, 1.5, or 2.5) |
| `kernel_scale` | Time-scale of the GP prior |
| `beta` | Weight of the GP regularization term |
| `inducing_points` | Path to inducing point time coordinates |
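
To make the table concrete, here is a minimal sketch of sanity-checking these parameters in Python before a run. The key names come from the table above; the values, the path `inducing_points.txt`, the flat layout, and the helper `check_config` are hypothetical. The template written by `--generate_config` remains the authoritative reference for the actual file structure.

```python
# Hypothetical values; only the key names are taken from the table above.
config = {
    "dim_latent": 2,
    "layers_hidden": [128, 64],
    "kernel_nu": 1.5,
    "kernel_scale": 10.0,
    "beta": 1.0,
    "inducing_points": "inducing_points.txt",  # hypothetical path
}


def check_config(cfg):
    """Return a list of problems found in cfg (empty if none).

    Illustrative helper only; not part of the gptempest API.
    """
    problems = []
    dim = cfg.get("dim_latent")
    if not (isinstance(dim, int) and dim > 0):
        problems.append("dim_latent must be a positive integer")
    layers = cfg.get("layers_hidden")
    if not (isinstance(layers, list) and layers
            and all(isinstance(n, int) and n > 0 for n in layers)):
        problems.append("layers_hidden must be a non-empty list of positive integers")
    if cfg.get("kernel_nu") not in (0.5, 1.5, 2.5):
        problems.append("kernel_nu must be 0.5, 1.5, or 2.5")
    scale = cfg.get("kernel_scale")
    if not (isinstance(scale, (int, float)) and scale > 0):
        problems.append("kernel_scale must be a positive number")
    beta = cfg.get("beta")
    if not (isinstance(beta, (int, float)) and beta >= 0):
        problems.append("beta must be a non-negative number")
    path = cfg.get("inducing_points")
    if not (isinstance(path, str) and path):
        problems.append("inducing_points must be a non-empty path string")
    return problems


problems = check_config(config)
```

Catching an unsupported `kernel_nu` or a missing key up front is cheaper than discovering it partway through training.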

## Citation

If you use GP-TEMPEST in your research, please cite:

```bibtex
@article{diez2025gptempest,
  title   = {Recovering Hidden Degrees of Freedom Using Gaussian Processes},
  author  = {Diez, Georg and Dethloff, Nele and Stock, Gerhard},
  journal = {J. Chem. Phys.},
  volume  = {163},
  pages   = {124105},
  year    = {2025},
  doi     = {10.1063/5.0282147}
}
```

> G. Diez, N. Dethloff, G. Stock,
> "Recovering Hidden Degrees of Freedom Using Gaussian Processes,"
> *J. Chem. Phys.* **163**, 124105 (2025), https://doi.org/10.1063/5.0282147

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.
