Metadata-Version: 2.4
Name: chi-som
Version: 1.0
Summary: Fast self-organizing maps for cheminformatics using numba
Keywords: som,self-organizing map,machine learning,rdkit,cheminformatics,drug discovery,numba
Author: Johannes Kaminski, Oliver Koch
Author-email: Johannes Kaminski <j.kaminski@uni-muenster.de>, Oliver Koch <okoch@uni-muenster.de>
License-Expression: LGPL-3.0-or-later
License-File: LICENSES/GPL-3.0-only
License-File: LICENSES/LGPL-3.0-only
License-File: LICENSES/copyright
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: GPU
Classifier: Environment :: GPU :: NVIDIA CUDA
Classifier: Environment :: GPU :: NVIDIA CUDA :: 12
Classifier: Environment :: GPU :: NVIDIA CUDA :: 13
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.13
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)
Requires-Dist: numba>=0.61.2
Requires-Dist: pandas>=2.2.3
Requires-Dist: rdkit>=2024.9.6
Requires-Dist: tables>=3.10.2
Requires-Dist: tqdm>=4.65
Requires-Dist: zarr>=3.0.8
Requires-Dist: scipy>=1.15.2
Requires-Dist: pyqtgraph>=0.13.7
Requires-Dist: pyside6>=6.10.0
Requires-Dist: matplotlib>=3.10.8
Requires-Dist: numpy<2.4
Requires-Dist: numba-cuda[cu12]>=0.20.0,<0.25.0 ; extra == 'cu12'
Requires-Dist: numba-cuda[cu13]>=0.20.0,<0.25.0 ; extra == 'cu13'
Maintainer: Johannes Kaminski
Maintainer-email: Johannes Kaminski <j.kaminski@uni-muenster.de>
Requires-Python: >=3.13
Project-URL: Homepage, https://kochlab.org
Project-URL: Documentation, https://kochgroup.github.io/ChI-SOM
Project-URL: Repository, https://github.com/kochgroup/ChI-SOM
Project-URL: Issues, https://github.com/kochgroup/ChI-SOM/issues
Provides-Extra: cu12
Provides-Extra: cu13
Description-Content-Type: text/markdown

# &#7521;-SOM

> **Ch**em**I**nformatics SOM Toolkit

&#7521;-SOM is a high-performance framework for training emergent self-organizing maps (ESOMs) with a focus on cheminformatics. It includes on-disk, low-latency data storage and a GUI.  
It was developed for visualising the chemical space of million-scale molecular datasets and for exploring it interactively.


![Overview of the ChI-SOM GUI](images/gui_screenshot.png "The GUI")

## Installation
Currently, __ChI-SOM__ is available only for Linux and for Windows via _WSL2_.

It can be installed directly from PyPI:
```sh
pip install chi-som
```  
  
For the CUDA compute backend, `numba-cuda` is required.
On systems running CUDA, ChI-SOM can be installed with CUDA support via
```sh
pip install "chi-som[cu12]"
```
for CUDA 12 or 
```sh
pip install "chi-som[cu13]"
```
for CUDA 13.  
  
Please refer to the [numba-cuda](https://nvidia.github.io/numba-cuda/) documentation for more complex setups.
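
After installing one of the CUDA extras, a quick check can confirm whether Numba detects a usable CUDA device. The snippet below is a minimal sketch: the helper name `cuda_backend_status` is hypothetical and not part of ChI-SOM, and it relies only on `numba.cuda.is_available()`.

```python
import importlib.util


def cuda_backend_status() -> str:
    """Report whether Numba can see a CUDA device (hypothetical helper)."""
    # Avoid an ImportError if numba is not installed in this environment.
    if importlib.util.find_spec("numba") is None:
        return "numba is not installed"
    try:
        from numba import cuda

        # is_available() returns True only if a supported device and driver are found.
        if cuda.is_available():
            return "CUDA is available"
        return "no CUDA device detected"
    except Exception as exc:  # e.g. a broken driver or toolkit installation
        return f"CUDA check failed: {exc}"


status = cuda_backend_status()
print(status)
```

If the check reports that no device was detected, the driver and toolkit setup described in the numba-cuda documentation is the first thing to verify.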

## CAVEAT
This software should be considered beta. While the user-facing API is expected to remain stable until a 2.0 release, the internal API may change in any release and cannot be considered stable.  
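
Code that depends on the user-facing API may want to guard against a future major release. The following is a minimal sketch using only the standard library; the helper `major_version` is hypothetical, and it assumes the distribution is installed under the name `chi-som` with a plain `MAJOR.MINOR` style version string.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(dist_name: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None if absent."""
    try:
        # Assumes a version string that starts with an integer, such as "1.0".
        return int(version(dist_name).split(".")[0])
    except PackageNotFoundError:
        return None


major = major_version("chi-som")
if major is None:
    print("chi-som is not installed")
elif major >= 2:
    print("chi-som >= 2.0 detected: the user-facing API may have changed")
else:
    print("chi-som < 2.0: the user-facing API is expected to be stable")
```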

## Usage example

```python
import numpy as np
import pandas as pd

from chisom import Som, start_chisom_viewer
from chisom.utils import decay_linear, lattice_size

data = np.random.random((600, 400))

# Set up with ESOM rules
n_datapoints, n_features = data.shape
rows, columns = lattice_size(n_datapoints)
SIGMA = rows // 2

# Create a SOM object
# The high and low parameters should be chosen according to the dataset values
som = Som(
    rows,
    columns,
    n_features,
    low=data.min(),
    high=data.max(),
)

N_EPOCHS = 30

# The training loop
for epoch in range(N_EPOCHS):
    # Calculate the current sigma and alpha values using decay functions
    current_sigma = decay_linear(epoch, SIGMA, total_iterations=N_EPOCHS)
    current_alpha = decay_linear(epoch, 0.8, total_iterations=N_EPOCHS)

    # Train one epoch
    som.train(data, epoch, current_sigma, current_alpha)

# Calculate the U-Matrix
umx = som.get_umatrix()

# Predict the best matching units and quantization errors for all data points
bmus, qe = som.predict(data)

# The GUI needs a table with information to overlay on the data points
dataset = pd.DataFrame.from_dict(
    {"Type:": ["A"] * len(data)}
)

# Start the GUI
start_chisom_viewer(umx, bmus, dataset)
```  
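
The training loop above shrinks the neighbourhood radius (sigma) and the learning rate (alpha) from epoch to epoch. To illustrate the idea, here is a minimal sketch of a linear decay schedule. It is an assumption about the general technique, not the implementation of `chisom.utils.decay_linear`; the function name `linear_decay_sketch` and its `end` parameter are hypothetical.

```python
def linear_decay_sketch(
    iteration: int,
    start: float,
    total_iterations: int,
    end: float = 0.0,
) -> float:
    """Interpolate linearly from `start` (iteration 0) to `end` (last iteration).

    Hypothetical stand-in for illustration; the library function may differ.
    """
    if total_iterations <= 0:
        raise ValueError("total_iterations must be positive")
    # Fraction of training completed, clamped to the interval [0, 1].
    fraction = min(max(iteration / total_iterations, 0.0), 1.0)
    return start + (end - start) * fraction


# Print the learning rate at a few epochs of a 30-epoch run.
for epoch in (0, 10, 20, 29):
    print(epoch, linear_decay_sketch(epoch, 0.8, total_iterations=30))
```

A large sigma early in training lets the map organise globally, while the smaller values towards the end refine local structure.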

## Development Setup
ChI-SOM is developed, built, and packaged using [Astral uv](https://docs.astral.sh/uv/).

To set up a development environment, run
```sh
uv sync
```  

To build the package, run
```sh
uv build
```  


## Meta
Authors: Johannes Kaminski, Oliver Koch @ [AG Koch](https://www.uni-muenster.de/Chemie.pz/forschen/ag/koch/index.html)  
Contact: j.kaminski[at]uni-muenster.de

ChI-SOM is distributed under the LGPLv3. See the `LICENSES` directory for more information.