Metadata-Version: 2.1
Name: gprob
Version: 1.0.1
Summary: Probabilistic programming with arrays of Gaussian variables.
Author-email: Sergey Fedorov <fedorov.s.a@outlook.com>
License: MIT License
        
        Copyright (c) 2024 Sergey A. Fedorov
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Source, https://github.com/SAFedorov/gprob
Keywords: Gaussian distribution,Noise,Random variables,Stochastic processes,Gaussian processes,Probabilistic programming,Python,Numpy,Scipy
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.25
Requires-Dist: scipy

# gprob
gprob is a Python package that implements a probabilistic programming language for Gaussian random variables with exact conditioning. It is built around the idea that arrays of Gaussian random variables can be handled in the same way as numerical numpy arrays.

To give a flavor of it, the first example shows a few operations on scalar variables, followed by conditioning,
```python
>>> import gprob as gp
>>> x = gp.normal()
>>> y = gp.normal()
>>> z = x + 0.2 * y + 3
>>> z
Normal(mean=3, var=1.04)
>>> z | {y - 0.5 * x: 1}  # conditioning
Normal(mean=2.76, var=0.968)
```
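
The conditioned result above can be cross-checked by hand with the standard formulas for conditioning jointly Gaussian scalars. The following is a minimal sketch in plain numpy, independent of gprob; the names `a_z`, `a_c` and `observed` are introduced here for illustration only.

```python
import numpy as np

# Coefficients of z = x + 0.2 * y + 3 and of the observed combination
# c = y - 0.5 * x in the basis of the independent standard normals (x, y).
a_z = np.array([1.0, 0.2])
a_c = np.array([-0.5, 1.0])
mean_z, mean_c = 3.0, 0.0
observed = 1.0  # the value that c is conditioned on

var_z = a_z @ a_z    # variance of z before conditioning
var_c = a_c @ a_c    # variance of the observed combination
cov_zc = a_z @ a_c   # covariance between z and c

# Standard Gaussian conditioning formulas for two scalars.
cond_mean = mean_z + cov_zc / var_c * (observed - mean_c)
cond_var = var_z - cov_zc**2 / var_c

print(round(cond_mean, 6), round(cond_var, 6))  # prints 2.76 0.968
```

The numbers agree with the `Normal(mean=2.76, var=0.968)` shown above.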

The second example constructs a random walk of a Brownian particle that is observed at x=0 at the start and at x=1 midway through its motion,
```python
>>> nstep = 5 * 10**3
>>> dx = gp.normal(0, 1/nstep, size=(nstep,))
>>> x = gp.cumsum(dx, 0)  # unconditional particle positions
>>> xc = x | {x[nstep//2]: 1}  # positions conditioned on x[nstep//2] == 1
>>> samples = xc.sample(10**2)  # sampling 100 trajectories
```
```python
>>> import matplotlib.pyplot as plt
>>> plt.plot(samples.T, alpha=0.1, color='gray')
>>> plt.show()
```
![brownian readme](./assets/brownian_readme.png)
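
As a sanity check of what the conditioned trajectories should look like, the conditional mean and variance of the walk can be computed directly from its covariance matrix. This sketch uses plain numpy rather than gprob, assumes that the second argument of `gp.normal` is the variance (as the `var=` in the output of the first example suggests), and uses a smaller `nstep` to keep the dense matrix small.

```python
import numpy as np

nstep = 200            # smaller than in the example above
step_var = 1 / nstep   # assumption: variance of each increment dx[i]

# x[i] is the sum of the first i + 1 increments, therefore
# cov(x[i], x[j]) = step_var * (min(i, j) + 1).
k = np.arange(nstep)
cov = step_var * (np.minimum.outer(k, k) + 1)

m = nstep // 2                   # index of the observed position
gain = cov[:, m] / cov[m, m]     # regression coefficients on x[m]

cond_mean = gain * 1.0                       # unconditional mean is zero
cond_var = np.diag(cov) - gain * cov[m, :]   # pointwise conditional variance
```

Under these assumptions the mean rises linearly to 1 at the midpoint and stays at 1 afterwards, while the variance vanishes at the observed point and grows again after it.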

## Requirements
* python >= 3.9
* [numpy](https://numpy.org/) >= 1.25
* [scipy](https://scipy.org/)

## Installation
The package can be installed from PyPI,
```
pip install gprob
```

or from this repository (to get the latest version),

```
pip install git+https://github.com/SAFedorov/gprob.git
```

## Getting started
Have a look at the notebooks in the [examples](examples) folder, starting from the tutorials on
1. [Random variables](examples/1-random-variables.ipynb)
2. [Array operations](examples/2-array-operations.ipynb)
3. [Sparse arrays](examples/3-sparse-arrays.ipynb)
4. [Likelihood fitting](examples/4-likelihood-fitting-fisher.ipynb)

roughly in this order.

## How it works
There is a supplementary [note](https://safedorov.github.io/gprob-note/) that presents some of the underlying theory, especially the theory of inference.

## How many variables it can handle
General multivariate Gaussian distributions of *n* variables require memory quadratic in *n* for their storage, and computational time cubic in *n* for their exact conditioning. My laptop can typically handle arrays with sizes in the thousands.
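
To make the scaling concrete, here is a rough sketch of exact conditioning on a dense covariance matrix in plain numpy. It is not how gprob is implemented internally; the function `condition` and its arguments are hypothetical and only illustrate where the quadratic memory and the cubic time come from.

```python
import numpy as np

def condition(mean, cov, idx, values):
    """Condition N(mean, cov) on the entries `idx` being equal to `values`.

    Hypothetical helper for illustration, not part of gprob.
    """
    rest = np.setdiff1d(np.arange(mean.size), idx)
    c_oo = cov[np.ix_(idx, idx)]     # observed-observed block, k x k
    c_ro = cov[np.ix_(rest, idx)]    # remaining-observed block
    c_rr = cov[np.ix_(rest, rest)]   # remaining-remaining block

    # Solving with the k x k block costs O(k^3) operations.
    # c_oo is symmetric, so this equals c_ro @ inv(c_oo).
    sol = np.linalg.solve(c_oo, c_ro.T).T

    new_mean = mean[rest] + sol @ (values - mean[idx])
    new_cov = c_rr - sol @ c_ro.T
    return new_mean, new_cov

# Storage of a dense float64 covariance matrix grows as n**2.
n = 5000
print(f"dense covariance for n={n}: {8 * n**2 / 1e6:.0f} MB")  # prints 200 MB
```

For example, conditioning two unit-variance variables with covariance 0.5 on the second one being equal to 2 gives a mean of 1 and a variance of 0.75 for the first one.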

If the Gaussian variables are such that their joint distribution is a direct product, they can be packed into sparse arrays. For those, memory and computational requirements grow linearly with the number of independent distributions, and the total number of variables can be larger. 
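
The saving can be illustrated with block-diagonal covariances in plain numpy. This is only a sketch of the idea, not the sparse array interface of gprob; the sizes `m` and `b` are arbitrary.

```python
import numpy as np

m, b = 200, 20   # hypothetical: m independent groups of b variables each
rng = np.random.default_rng(0)
a = rng.normal(size=(m, b, b))
# One positive definite b x b covariance matrix per independent group.
blocks = a @ a.transpose(0, 2, 1) + np.eye(b)

dense_entries = (m * b) ** 2   # entries of the full joint covariance
block_entries = blocks.size    # entries actually needed: m * b**2

# Condition every group on its first variable being equal to 1
# (zero means assumed). The groups never mix, so the cost is linear in m.
gain = blocks[:, 1:, 0] / blocks[:, :1, 0]   # shape (m, b - 1)
cond_mean = gain * 1.0
cond_cov = blocks[:, 1:, 1:] - gain[:, :, None] * blocks[:, None, 0, 1:]

print(dense_entries // block_entries)  # storage ratio, equal to m
```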

## Acknowledgements
gprob was inspired by (but works differently from) [GaussianInfer](https://github.com/damast93/GaussianInfer). See the corresponding paper,

D. Stein and S. Staton, "Compositional Semantics for Probabilistic Programs with Exact Conditioning," 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), Rome, Italy, 2021, pp. 1-13, doi: 10.1109/LICS52264.2021.9470552.

gprob uses the subscript parser from [opt-einsum](https://github.com/dgasmith/opt_einsum). Some linearization tricks and choices of tooling follow [autograd](https://github.com/HIPS/autograd).

