Metadata-Version: 2.4
Name: lazyqsar
Version: 2.2.2
Summary: A library to quickly build QSAR models
License: GPLv3
License-File: LICENSE
Keywords: qsar,machine-learning,chemistry,computer-aided-drug-design
Author: Ersilia Open Source Initiative
Author-email: hello@ersilia.io
Requires-Python: >=3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Provides-Extra: descriptors
Requires-Dist: h5py (==3.14.0)
Requires-Dist: joblib (==1.5.1)
Requires-Dist: loguru (==0.7.3)
Requires-Dist: numpy (==2.1.3)
Requires-Dist: onnxconverter-common (==1.16.0)
Requires-Dist: onnxruntime (==1.20.1)
Requires-Dist: optuna (==4.4.0)
Requires-Dist: pandas (==2.3.0)
Requires-Dist: psutil (==7.0.0)
Requires-Dist: rdkit (==2025.9.1) ; extra == "descriptors"
Requires-Dist: rich (==14.1.0)
Requires-Dist: scikit-learn (==1.6.1)
Requires-Dist: skl2onnx (==1.19.1)
Project-URL: Source Code, https://github.com/ersilia-os/lazy-qsar
Description-Content-Type: text/markdown

# Ersilia's LazyQSAR

A library to quickly build supervised models for chemistry.

## Installation

Install LazyQSAR from source:

```bash
git clone https://github.com/ersilia-os/lazy-qsar.git
cd lazy-qsar
python -m pip install -e .
```

To use the default LazyQSAR descriptors, install the `descriptors` extra:
```bash
python -m pip install -e ".[descriptors]"
```

This command enables descriptor (featurizer) calculation. The first time you run LazyQSAR, it will download the Chemeleon and CDDD checkpoints and install other dependencies. To complete this setup upfront, run:

```bash
lazyqsar-setup
```

## Binary Classification

LazyQSAR's binary classifier can run either with default descriptors or with custom descriptors passed by the user.

### Built-in descriptors

Instantiate the LazyBinaryQSAR class with a mode of choice (`fast`, `default`, `slow`):

```python
from lazyqsar.qsar import LazyBinaryQSAR

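# smiles_train / smiles_test: lists of SMILES strings; y_train: binary labels
model = LazyBinaryQSAR(mode="fast")
model.fit(smiles_list=smiles_train, y=y_train)
# Keep column 1 of the predicted probabilities (the positive class)
y_hat = model.predict_proba(smiles_list=smiles_test)[:, 1]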
```
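
The snippet above assumes that `smiles_train`, `smiles_test`, and `y_train` already exist. As a minimal sketch of one way to prepare them, the following splits a list of SMILES strings and binary labels into stratified train and test sets using only the standard library. The helper `stratified_split`, the toy molecules, and the labels are illustrative assumptions and are not part of LazyQSAR.

```python
import random


def stratified_split(smiles, y, test_fraction=0.25, seed=0):
    """Split SMILES and binary labels, keeping the class ratio in both parts.

    Hypothetical helper for illustration; not part of the LazyQSAR API.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in (0, 1):
        idx = [i for i, value in enumerate(y) if value == label]
        rng.shuffle(idx)
        # Hold out at least one example of each class that is present.
        n_test = max(1, round(len(idx) * test_fraction)) if idx else 0
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    train_idx.sort()
    test_idx.sort()

    def pick(indices, values):
        return [values[i] for i in indices]

    return (
        pick(train_idx, smiles),
        pick(test_idx, smiles),
        pick(train_idx, y),
        pick(test_idx, y),
    )


# Toy data: the labels are made up and carry no chemical meaning.
smiles = ["CCO", "CCN", "CCC", "c1ccccc1", "CC(=O)O", "CCCl", "CCBr", "CO"]
y = [1, 0, 1, 0, 1, 0, 1, 0]

smiles_train, smiles_test, y_train, y_test = stratified_split(smiles, y)
```

The resulting `smiles_train` and `y_train` can be passed to `model.fit`, and `smiles_test` to `model.predict_proba`, as in the snippet above. A real dataset would need far more molecules than this toy list.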

### Custom-made descriptors

Pre-calculate your descriptors with your preferred method. We recommend the [Ersilia Model Hub](https://github.com/ersilia-os/ersilia) for this purpose. The `.h5` format generated by Ersilia can be passed directly to the LazyQSAR pipeline. Alternatively, pass the descriptors as an in-memory array.

```python
from lazyqsar.agnostic import LazyBinaryClassifier

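# X_train / X_test: pre-calculated descriptors; y_train: binary labels
model = LazyBinaryClassifier()
model.fit(X=X_train, y=y_train)
# Keep column 1 of the predicted probabilities (the positive class)
y_hat = model.predict_proba(X=X_test)[:, 1]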
```
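
Once `y_hat` is available, a common next step is to compare it against held-out labels. The sketch below computes the area under the ROC curve (AUROC) in plain Python by counting how often a positive example is scored above a negative one, and then applies a 0.5 cutoff to obtain class labels. It assumes that `y_hat` holds positive-class scores and that labels are coded as 0 and 1. The function `roc_auc`, the example values, and the 0.5 threshold are illustrative choices and are not part of LazyQSAR.

```python
def roc_auc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of samples,
    which is fine for a quick check on a small test set.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and scores standing in for y_test and y_hat.
y_test = [0, 0, 1, 1]
y_hat = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_test, y_hat)  # 3 of the 4 positive/negative pairs are ordered correctly
labels = [int(p >= 0.5) for p in y_hat]  # hard class calls at a 0.5 cutoff
```

For larger test sets, `sklearn.metrics.roc_auc_score` from scikit-learn (already a LazyQSAR dependency) computes the same quantity more efficiently.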

### Using saved models at inference time

By default, models are saved as ONNX files. Once a model has been trained and saved, you can load it through an artifact class. In that case, the only essential dependency is the ONNX runtime.

To save a model, simply run:

```python
model.save(model_dir)
```

This creates a folder containing the ONNX files. You can then load the saved model with the artifact class:

```python
from lazyqsar.artifacts import LazyBinaryClassifierArtifact

# Load through the artifact class imported above
model = LazyBinaryClassifierArtifact.load(model_dir)
y_hat = model.predict_proba(X=X)[:, 1]
```

## Tests and benchmarks

### Quick testing

The `/tests` folder contains a quick implementation of the methods described above, useful for checking that the code works. It uses the Bioavailability dataset and Chemeleon descriptors as an example.

```bash
python test/test_binary_classification.py
python test/test_binary_classification.py --agnostic
```

### Benchmarking

In the [benchmark repository](https://github.com/ersilia-os/zaira-chem-tdc-benchmark) you will find the performance of the default estimators and descriptors on the TDCommons ADMET dataset. This is a provisional benchmark. The team is working on a more exhaustive one.

## Disclaimer

This library is intended only for quick-and-dirty QSAR modeling. For more complete automated QSAR modeling, please refer to [Zaira Chem](https://github.com/ersilia-os/zaira-chem).

## About us

Learn about the [Ersilia Open Source Initiative](https://ersilia.io)!

