Metadata-Version: 2.1
Name: drexml
Version: 1.1.2
Summary: (DRExM³L) Drug REpurposing using eXplainable Machine Learning and Mechanistic Models of signal transduction
Author-Email: Carlos Loucera <carlos.loucera@juntadeandalucia.es>, Marina Esteban-Medina <marina.esteban@juntadeandalucia.es>, Maria Pena-Chilet <maria.pena.chilet.ext@juntadeandalucia.es.com>, =?utf-8?q?V=C3=ADctor_Manuel_de_la_Oliva_Roque?= <victorm.oliva@juntadeandalucia.es>, =?utf-8?q?Sara_Herr=C3=A1iz-Gil_?= <sherraiz@ing.uc3m.es>
License: MIT
Requires-Python: <3.11,>=3.8
Requires-Dist: scikit-learn>=1.3.0
Requires-Dist: numpy<2.0,>=1.24.4
Requires-Dist: scipy>=1.10.1
Requires-Dist: pandas>=2.0.3
Requires-Dist: seaborn>=0.12.2
Requires-Dist: shap==0.42.0
Requires-Dist: matplotlib>=3.7.2
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pyarrow>=12.0.1
Requires-Dist: statsmodels>=0.14.0
Requires-Dist: pystow>=0.5.0
Description-Content-Type: text/markdown

[![DOI](https://img.shields.io/badge/DOI-10.1016/j.csbj.2024.02.027-FAB70C?logo=doi)](https://doi.org/10.1016/j.csbj.2024.02.027)
[![DOI](https://zenodo.org/badge/362395439.svg)](https://zenodo.org/badge/latestdoi/362395439) 
[![PyPI version](https://badge.fury.io/py/drexml.svg)](https://badge.fury.io/py/drexml)
[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)

# Drug REpurposing using eXplainable Machine Learning and Mechanistic Models of signal transduction

Repository for the `drexml` Python package: (DRExM³L) Drug REpurposing using eXplainable Machine Learning and Mechanistic Models of signal transduction.

## Citation

Find the associated publication [here](https://doi.org/10.1016/j.csbj.2024.02.027):


Esteban-Medina M, de la Oliva Roque VM, Herráiz-Gil S, Peña-Chilet M, Dopazo J, Loucera C. drexml: A command line tool and Python package for drug repurposing. Computational and Structural Biotechnology Journal 2024;23:1129–43. https://doi.org/10.1016/j.csbj.2024.02.027.


Part of the [Intelligent Biology and Medicine](https://www.sciencedirect.com/science/journal/20010370/vsi/10XRHM1G1LS) special issue:

https://www.sciencedirect.com/journal/computational-and-structural-biotechnology-journal/special-issue/10XRHM1G1LS


The corresponding BibTeX entry:

```
@article{EstebanMedina2024,
  title = {drexml: A command line tool and Python package for drug repurposing},
  volume = {23},
  ISSN = {2001-0370},
  url = {http://dx.doi.org/10.1016/j.csbj.2024.02.027},
  DOI = {10.1016/j.csbj.2024.02.027},
  journal = {Computational and Structural Biotechnology Journal},
  publisher = {Elsevier BV},
  author = {Esteban-Medina,  Marina and de la Oliva Roque,  Víctor Manuel and Herráiz-Gil,  Sara and Peña-Chilet,  María and Dopazo,  Joaquín and Loucera,  Carlos},
  year = {2024},
  month = dec,
  pages = {1129--1143}
}
```

The article was written using `drexml` version `v1.1.0`. Install it using:
```
pip install drexml==1.1.0
```
Version `v1.1.1` improves the documentation and `README` by including a reference to the published article for easier access.

## Setup

To install the `drexml` package in a new conda environment, run:

```
conda create -n drexml python=3.10
conda activate drexml
pip install drexml
```

If a CUDA-compatible device is available (CUDA 10.2 or 11.x, i.e. below version 12), use:

```
conda create -n drexml --override-channels -c "nvidia/label/cuda-11.8.0" -c conda-forge cuda cuda-nvcc cuda-toolkit gxx=11.2 python=3.10
conda activate drexml
pip install --no-cache-dir --no-binary=shap drexml
```

To install `drexml` in an existing environment, activate it and use:

```
pip install drexml
```

Note that by default the `setup` tries to compile the `CUDA` modules. If that is not possible, it falls back to the `CPU` modules.

## Run

To run the program for a disease map that uses circuits from the preprocessed `KEGG` pathways and the `KDT` standard list, construct an environment file (e.g. `disease.env`):

- using the following template if you have a set of seed genes (comma-separated):

```
seed_genes=2175,2176,2189
```

- using the following template if you want to use the DisGeNET [1] curated gene-disease associations as seeds:

```
disease_id="C0015625"
```

- using the following template if you know which circuits to include (the disease map):

```
circuits=circuits.tsv.gz
```

The `TSV` file `circuits.tsv` (gzip-compressed as `circuits.tsv.gz` in the template above) has the following format (tab delimited):

```
index	in_disease
P-hsa03320-37	0
P-hsa03320-61	0
P-hsa03320-46	0
P-hsa03320-57	0
P-hsa03320-64	0
P-hsa03320-47	0
P-hsa03320-65	0
P-hsa03320-55	0
P-hsa03320-56	0
P-hsa03320-33	0
P-hsa03320-58	0
P-hsa03320-59	0
P-hsa03320-63	0
P-hsa03320-44	0
P-hsa03320-36	0
P-hsa03320-30	0
P-hsa03320-28	1
```

where:

- `index`: Hipathia circuit id
- `in_disease`: (boolean) True/1 if a given circuit is part of the disease
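
As an illustration of the format above, the following sketch writes a gzip-compressed, tab-delimited circuits file using only the Python standard library. The helper name `write_circuits` and the temporary output folder are assumptions made for this example; they are not part of the `drexml` API.

```python
import csv
import gzip
import tempfile
from pathlib import Path


def write_circuits(path, circuit_flags):
    """Write a tab-delimited, gzip-compressed circuits table.

    `circuit_flags` maps a Hipathia circuit id to True/False
    (whether the circuit is part of the disease).
    This helper is hypothetical and only mirrors the format shown above.
    """
    path = Path(path)
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["index", "in_disease"])
        for circuit_id, in_disease in circuit_flags.items():
            # Booleans are stored as 0/1, as in the example table.
            writer.writerow([circuit_id, int(bool(in_disease))])
    return path


# Two circuit ids taken from the example table.
flags = {"P-hsa03320-37": False, "P-hsa03320-28": True}

# Written to a temporary folder here; in practice, use the disease folder.
out_path = write_circuits(Path(tempfile.mkdtemp()) / "circuits.tsv.gz", flags)

# Read the file back to check its contents.
with gzip.open(out_path, "rt", encoding="utf-8") as handle:
    print(handle.read())
```

The `circuits` entry of the `env` file would then point to the resulting file.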

Note that in all cases you can restrict the circuits to the physiological list by setting `use_physio=true` in the `env` file.
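
The templates above can also be generated programmatically. The sketch below renders the text of an `env` file from the keys described in this section (`seed_genes`, `disease_id`, `circuits`, `use_physio`). The function `render_env` is a hypothetical helper, not part of `drexml`, and it does not check which combinations of keys the program accepts.

```python
def render_env(seed_genes=None, disease_id=None, circuits=None, use_physio=False):
    """Return the text of a disease env file (hypothetical helper).

    Only the keys that are provided are written, one per line,
    following the templates shown above.
    """
    lines = []
    if seed_genes:
        # Seed genes are written as a comma-separated list.
        lines.append("seed_genes=" + ",".join(str(gene) for gene in seed_genes))
    if disease_id:
        lines.append(f'disease_id="{disease_id}"')
    if circuits:
        lines.append(f"circuits={circuits}")
    if use_physio:
        lines.append("use_physio=true")
    return "\n".join(lines) + "\n"


env_text = render_env(seed_genes=[2175, 2176, 2189], use_physio=True)
print(env_text)
```

Saving `env_text` as `disease.env` gives a file that can be passed to the `run` command described next.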

To run the experiment using 10 CPU cores and 0 GPUs, run the following command within an activated environment:

```
drexml run --n-gpus 0 --n-cpus 10 $DISEASE_PATH
```

where:

- `--n-gpus` indicates the number of GPU devices to use in parallel (-1 -> all, 0 -> none)
- `--n-cpus` indicates the number of CPU cores to use in parallel (-1 -> all)
- `DISEASE_PATH` indicates the path to the disease env file (e.g. `/path/to/disease/folder/disease.env`)
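
When launching runs from a Python script, for example to loop over several diseases, the command line above can be assembled as an argument list. This is a minimal sketch: `build_run_command` is a hypothetical helper that only uses the flags documented above, and the command is printed rather than executed.

```python
import shlex


def build_run_command(disease_path, n_gpus=0, n_cpus=10):
    """Return the `drexml run` command as an argument list (hypothetical helper)."""
    return [
        "drexml",
        "run",
        "--n-gpus",
        str(n_gpus),
        "--n-cpus",
        str(n_cpus),
        str(disease_path),
    ]


cmd = build_run_command("/path/to/disease/folder/disease.env")
print(shlex.join(cmd))
# To execute it inside an activated environment:
# import subprocess; subprocess.run(cmd, check=True)
```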

Use the `--debug` option to check that everything works by running only a few iterations.

Note that the first full run takes longer because the program downloads the latest version of each background dataset from Zenodo:

https://doi.org/10.5281/zenodo.6020480

## Contribute to development

The recommended setup is:

- set up `pipx`
- set up `miniforge`
- use `pipx` to install `pdm`
- ensure that `pdm` is version >=2.1, otherwise update it with `pipx`
- use `pipx` to inject `pdm-bump` into `pdm`
- use `pipx` to install `nox`
- run `pdm config venv.backend conda`
- run `make`; to use a CUDA-enabled GPU, run `make gpu=1` instead
- (Recommended) For GPU development, clear the cache using `pdm cache clear` first

## Documentation

The documentation can be found here:

https://loucerac.github.io/drexml/


## References
[1] Janet Piñero, Juan Manuel Ramírez-Anguita, Josep Saüch-Pitarch, Francesco Ronzano, Emilio Centeno, Ferran Sanz, Laura I Furlong. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucl. Acids Res. (2019) doi:10.1093/nar/gkz1021
