Metadata-Version: 2.1
Name: dolomite-matrix
Version: 0.1.0a1
Summary: Save and load matrices in the dolomite framework
Home-page: https://github.com/ArtifactDB/dolomite-matrix
Author: LTLA
Author-email: infinite.monkeys.with.keyboards@gmail.com
License: MIT
Project-URL: Documentation, https://artifactdb.github.io/dolomite-matrix/
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
License-File: LICENSE.txt
Requires-Dist: importlib-metadata; python_version < "3.8"
Requires-Dist: dolomite-base>=0.2.0-alpha5
Requires-Dist: h5py
Requires-Dist: delayedarray>=0.3.3
Requires-Dist: numpy
Requires-Dist: filebackedarray
Provides-Extra: testing
Requires-Dist: setuptools; extra == "testing"
Requires-Dist: pytest; extra == "testing"
Requires-Dist: pytest-cov; extra == "testing"
Requires-Dist: scipy; extra == "testing"
Requires-Dist: dask; extra == "testing"

[![Project generated with PyScaffold](https://img.shields.io/badge/-PyScaffold-005CA0?logo=pyscaffold)](https://pyscaffold.org/)

# Read and save matrices in Python

## Introduction

The **dolomite-matrix** package is the Python counterpart to the [**alabaster.matrix**](https://github.com/ArtifactDB/alabaster.matrix) R package,
providing methods for saving and reading arrays and matrices within the [**dolomite** framework](https://github.com/ArtifactDB/dolomite-base).
Dense arrays are stored as a regular HDF5 dataset, while sparse matrices are saved inside an HDF5 file in compressed sparse format.
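
To make "compressed sparse format" concrete, here is a minimal sketch using **scipy** on a tiny matrix.
It only illustrates the general compressed sparse column layout (values, row indices and column pointers); it does not show the group or dataset names that this package writes to disk, which are not covered here.

```python
import numpy
import scipy.sparse

# A small matrix that is mostly zeros.
dense = numpy.array([
    [0, 5, 0],
    [7, 0, 0],
    [0, 0, 0],
    [1, 0, 9],
], dtype=numpy.int16)

# Compressed sparse column (CSC) form keeps only the non-zero entries.
csc = scipy.sparse.csc_matrix(dense)

print(csc.data)     # non-zero values, column by column
## [7 1 5 9]
print(csc.indices)  # row index of each non-zero value
## [1 3 0 3]
print(csc.indptr)   # offset at which each column starts in the arrays above
## [0 2 3 4]
```

Three short vectors are enough to rebuild the matrix, which is why this layout suits data with few non-zero entries.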

## Quick start

Let's save a dense matrix to an HDF5 file with some accompanying metadata:

```python
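import tempfile

import numpy
import dolomite_base
import dolomite_matrix  # makes stage_object() aware of arrays and matrices

# A dense 1000 x 200 matrix of random doubles.
x = numpy.random.rand(1000, 200)

# Stage the matrix inside a fresh directory, then write its metadata.
dir = tempfile.mkdtemp()
meta = dolomite_base.stage_object(x, dir, "whee")
dolomite_base.write_metadata(meta, dir)
print(meta["path"])
## whee/array.h5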
```
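
Before transferring the directory, it can be useful to check what was written into it.
The helper below is a hypothetical convenience function built on the standard library only; `list_staged_files` is not part of **dolomite-base** or **dolomite-matrix**.

```python
import os

def list_staged_files(root):
    """Return the sorted paths of all files under 'root', relative to 'root'."""
    found = []
    for current, _subdirs, files in os.walk(root):
        for name in files:
            full = os.path.join(current, name)
            # Report paths with forward slashes, matching meta["path"] above.
            found.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(found)
```

Calling `list_staged_files(dir)` on the staging directory should include the `whee/array.h5` path reported above, alongside any metadata files written next to it.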

Now we can transfer the directory and reload the matrix in a new session.
This produces a `Hdf5DenseArray` from the [**filebackedarray**](https://github.com/BiocPy/filebackedarray) package.

```python
import dolomite_base

# 'dir' is the staged directory from above, or wherever it was transferred to.
info = dolomite_base.acquire_metadata(dir, "whee/array.h5")
obj = dolomite_base.load_object(info, dir)
obj
## <1000 x 200> Hdf5DenseArray object of type 'float64'
## [[0.58444226, 0.82595149, 0.7214525 , ..., 0.32493652, 0.58206044,
##   0.73770346],
##  [0.96398317, 0.73200292, 0.16410134, ..., 0.31626547, 0.11499628,
##   0.19768697],
##  [0.82350911, 0.48012452, 0.65221052, ..., 0.94989611, 0.15422992,
##   0.77173718],
##  ...,
##  [0.71715436, 0.19266116, 0.52316388, ..., 0.23104537, 0.935654  ,
##   0.51663007],
##  [0.38585049, 0.26709808, 0.70358993, ..., 0.91822795, 0.66144925,
##   0.42465112],
##  [0.08535589, 0.00144712, 0.51411921, ..., 0.84546122, 0.35001404,
##   0.53644868]]
```
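
As a rough illustration of what a file-backed array buys us, the sketch below uses **h5py** directly to read a small block from a dense HDF5 dataset without loading the whole matrix.
The file name `example.h5` and the dataset name `data` are assumptions made for this example; they are not the names that **dolomite-matrix** uses internally.

```python
import os
import tempfile

import h5py
import numpy

x = numpy.random.rand(1000, 200)
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write the matrix as an ordinary HDF5 dataset (hypothetical name "data").
with h5py.File(path, "w") as handle:
    handle.create_dataset("data", data=x)

# Reopen the file and pull out only the top-left 5 x 10 block.
with h5py.File(path, "r") as handle:
    dset = handle["data"]
    shape = dset.shape      # available without reading any values
    block = dset[:5, :10]   # only this slice is read into memory

print(shape)
print(block.shape)
```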

## Sparse matrices

We can also save and load a sparse matrix using an HDF5 file:

```python
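import tempfile

import numpy
import scipy.sparse  # import the submodule explicitly; a bare "import scipy" may not load it
import dolomite_base
import dolomite_matrix

# A 1000 x 200 compressed sparse column matrix of int16 values with density 0.2.
x = scipy.sparse.random(1000, 200, density=0.2, dtype=numpy.int16, format="csc")

dir = tempfile.mkdtemp()
meta = dolomite_base.stage_object(x, dir, "whee")
dolomite_base.write_metadata(meta, dir)
print(meta["path"])
## whee/matrix.h5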
```

Again, we can load it back in a new session.
This produces a `Hdf5CompressedSparseMatrix` from the [**filebackedarray**](https://github.com/BiocPy/filebackedarray) package.

```python
import dolomite_base

# 'dir' is the staged directory from above, or wherever it was transferred to.
info = dolomite_base.acquire_metadata(dir, "whee/matrix.h5")
obj = dolomite_base.load_object(info, dir)
obj
## <1000 x 200> sparse Hdf5CompressedSparseMatrix object of type 'int16'
## [[     0,      0, -28638, ...,      0,      0,  26194],
##  [     0,      0,      0, ...,      0, -30829,      0],
##  [     0,      0,      0, ...,      0,      0,      0],
##  ...,
##  [ 10895,      0,      0, ...,      0,      0,      0],
##  [     0,  32539,      0, ...,      0,   2780, -12106],
##  [     0,      0,      0, ...,   1452,      0, -26314]]
```
