Metadata-Version: 2.1
Name: csrk
Version: 0.1.1
Summary: Python tools for CSRK rust Gaussian Process crate
Keywords: gaussian-process,information analysis,machine learning,kernel
Author-Email: "V. Delfavero" <xevra86@gmail.com>
Maintainer-Email: "V. Delfavero" <xevra86@gmail.com>
License: BSD 3-Clause License
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Project-URL: Homepage, https://gitlab.com/xevra/csrk-py
Project-URL: Bug tracker, https://gitlab.com/xevra/csrk-py/issues
Requires-Python: >=3.9
Requires-Dist: numpy<2.4.0,>=2.0.0
Requires-Dist: h5py>=2.7.0
Requires-Dist: matplotlib>=2.0.0
Description-Content-Type: text/markdown

<img src=csrk_Nighthawks.jpg>
(image interpolated from scan of Nighthawks by Edward Hopper -- 1942 -- public domain)

# Gaussian Process Regression with Compactly Supported Radial Kernel
<a href=https://crates.io/crates/csrk target="_blank">csrk</a> is a Rust crate for large-scale Gaussian Process regression using compactly supported Wendland kernels, spatial hashing, and sparse LDL^T factorization. It enables training on tens of thousands of points and fast sampling and evaluation of GP realizations with near-constant per-query cost.

It is a CPU-based code implementing the Wendland kernels (piecewise polynomial kernels with compact support) using the <a href=https://github.com/sparsemat/sprs target="_blank">sprs</a> crates for sparse matrix operations (<a href=https://crates.io/crates/sprs target="_blank">sprs</a>) and sparse Cholesky decomposition (<a href=https://crates.io/crates/sprs-ldl>sprs-ldl</a>).

This Python API calls into that Rust crate and instantiates a
    hybrid Python/Rust Gaussian Process that supports most of the methods
    and features of the Rust implementation.

## Installation
Installing from the source distribution may require the Rust toolchain (the rustc compiler and Cargo) to be installed.
#### From PyPI
```bash
pip install csrk
```

#### From source
```bash
git clone https://gitlab.com/xevra/csrk-py
cd csrk-py
pip install .
```

## Example
```python
from csrk import HybridGP

gp = HybridGP(x_train, y_train, y_err, scale, whitenoise, order)

y_evals = gp.predict_mean(x_evals)
```

## Features
- compactly supported Wendland kernels
- sparse kernel construction via spatial hashing
- scalable sparse LDL^T training
- serialization in HDF5
- no dependency on scikit-sparse

For more on the performance of the Rust GP and an overview of the
    modules and algorithms, see the 
    <a href=https://gitlab.com/xevra/csrk target="_blank">Rust crate gitlab</a>.
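
To give a feel for the spatial hashing feature listed above, here is a minimal pure-Python sketch of the general idea. It is an illustration under my own assumptions, not the crate's implementation: the function names `build_grid` and `neighbor_pairs` are hypothetical, and the grid cell size is taken equal to the kernel support radius, so that any two points closer than that radius must lie in the same or adjacent cells.

```python
from collections import defaultdict
from itertools import product

import numpy as np


def build_grid(points, cell_size):
    """Bin point indices by integer grid cell (hypothetical helper)."""
    cells = np.floor(points / cell_size).astype(int)
    grid = defaultdict(list)
    for idx, cell in enumerate(cells):
        grid[tuple(cell)].append(idx)
    return grid, cells


def neighbor_pairs(points, cell_size):
    """Return index pairs (i, j), i < j, closer than cell_size.

    Only the 3^d cells around each point are searched, so no
    (n x n) distance matrix is ever formed.
    """
    grid, cells = build_grid(points, cell_size)
    offsets = [np.array(o) for o in product((-1, 0, 1), repeat=points.shape[1])]
    pairs = []
    for i, cell in enumerate(cells):
        for off in offsets:
            for j in grid.get(tuple(cell + off), ()):
                if j > i and np.linalg.norm(points[i] - points[j]) < cell_size:
                    pairs.append((i, j))
    return pairs


rng = np.random.default_rng(0)
pts = rng.random((200, 2))          # 200 random points in the unit square
pairs = neighbor_pairs(pts, 0.1)    # candidate nonzero kernel entries
```

Each pair found this way corresponds to one potentially nonzero off-diagonal entry of a compactly supported kernel matrix; all other entries are known to be zero without being computed.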

## Motivation
Because the Wendland kernels vanish beyond a finite radius, they can be used for training and evaluating GPR interpolators in O(n * m), for n evaluation points and m nearest neighbors. This sparsity also benefits the Cholesky decomposition of the kernel matrix.

When done properly, this process conserves both compute time and computer memory, as large (n x n) arrays are never allocated.
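
As a small illustration of why compact support leads to sparsity, the sketch below evaluates one member of the Wendland family, (1 - r)^4 (4r + 1) for r < 1 and zero otherwise, which is positive definite in up to three dimensions. The choice of this particular order, the helper name `wendland_c2`, and the toy data are my assumptions for the example and are not tied to the `order` argument of the package. The dense matrix is built here only so the zeros can be counted; a sparse implementation would never allocate it.

```python
import numpy as np


def wendland_c2(r):
    """Wendland C2 function of the scaled distance r = distance / scale."""
    r = np.asarray(r, dtype=float)
    # Exactly zero at and beyond the support radius (r >= 1)
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 100.0, size=500)    # toy 1-D training inputs
scale = 1.0                              # support radius of the kernel

# Dense pairwise distances, for demonstration only
dist = np.abs(x[:, None] - x[None, :])
K = wendland_c2(dist / scale)

# Fraction of kernel entries that are not exactly zero
nonzero_fraction = np.count_nonzero(K) / K.size
print(f"nonzero fraction of K: {nonzero_fraction:.4f}")
```

With the support radius much smaller than the extent of the data, only a small fraction of the entries are nonzero, which is what a sparse factorization can exploit.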

Previously, I developed a separate Python library for this (<a href=https://gitlab.com/xevra/gaussian-process-api target="_blank">gaussian-process-api</a>). However, despite the use of sparse matrices in the Cholesky decomposition (using scikit-sparse), and despite the C extension backend for the kernel evaluations, that library still constructs dense intermediate arrays, which may strain memory resources.

By writing a new module in Rust, I would like to remove the dependency on scikit-sparse and avoid storing large dense matrices in memory at any point throughout the computation, allowing for a lightweight and fast Gaussian Process regression implementation.

## Contributing

I am open to suggestions and pull requests.

## Acknowledgements
My background in Gaussian Processes and the Wendland kernels comes from <a href=https://direct.mit.edu/books/oa-monograph/2320/Gaussian-Processes-for-Machine-Learning>Rasmussen and Williams (2005)</a>. I thank the authors for publishing openly without charge.

My own work in implementing sparse Gaussian Processes for signal to noise estimation for binary black hole merger detectability with a single LIGO detector is briefly summarized in Appendix A of <a href=https://journals.aps.org/prd/abstract/10.1103/PhysRevD.108.043023 target="_blank">Delfavero et al. (2023)</a>.

I would like to acknowledge <a href=https://www.sciencedirect.com/science/article/abs/pii/S0020025525004384 target="_blank">Esmaeilbeigi et al. (2025)</a> for putting the advantages and limitations of the Wendland kernel, which I had stumbled through in practical implementations, into the vocabulary of higher mathematics.

I would also like to thank Nicolas Posner, who has accompanied my introduction to Rust, and whose <a href=https://nrposner.com/ target="_blank">blog</a> and contributions to other modules encouraged me to learn the language.

I would also like to thank Nick Fotopoulos for a thorough and constructive code review of the Rust crate!
