Metadata-Version: 2.1
Name: novana
Version: 0.1.2
Summary: NovAna (Novelty Analysis) is a cheminformatics tool that decomposes molecules into their scaffolds and shapes using a method similar to that described by Wills and Lipkus in ACS Med. Chem. Lett. 2020, 11, 2114-2119.
Author: Gian Marco Ghiandoni
Author-email: ghiandoni.g@gmail.com
License: MIT
Description-Content-Type: text/markdown
License-File: LICENSE

![Novana logo](https://github.com/ghiander/novana/blob/main/docs/static/logo.png?raw=true)


## Introduction
**Novana** (Novelty Analysis) is a cheminformatics tool that decomposes *molecules* into their *scaffolds* and *shapes* in milliseconds using a method similar to that described by Wills and Lipkus in *ACS Med. Chem. Lett. 2020, 11, 2114-2119* (https://dx.doi.org/10.1021/acsmedchemlett.0c00319). The tool can be used to analyse molecule data sets across multiple levels of generalisation, as an alternative to conventional structural and similarity methods. It can potentially also be used for drug repurposing and for creating train/validation data sets for machine learning. Here is an example of structural decomposition from *molecule* to *scaffold* and finally *shape*.

![Example of usage](https://github.com/ghiander/novana/blob/main/docs/static/example.png?raw=true)

## Method
For a given SMILES input:
- A *scaffold* is generated by recursively removing all terminal atoms until only atoms bonded to at least two different atoms are retained (e.g., cycles and chains). Double-bonded terminal atoms (e.g., the oxygens of a carbonyl/sulfonyl group) are removed as well. Any invalid valences on the retained atoms are fixed by adding or removing hydrogens according to heuristics that can be found in `valence.py`. The charges of atoms that have been modified by the algorithm are neutralised, whereas the charges of other atoms in the input remain unchanged.
- A *shape* is produced as a further decomposition of a *scaffold* by converting all its atoms into single-bonded, non-aromatic, neutral carbons.

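The terminal-atom removal described above can be illustrated on a toy molecular graph. The following is only a minimal sketch and not Novana's implementation: the function name `prune_terminal_atoms` and the adjacency-dictionary representation are assumptions made for illustration, and valence fixing and charge handling are left out entirely.

```python
def prune_terminal_atoms(adjacency):
    """Repeatedly drop atoms that have fewer than two neighbours.

    `adjacency` maps an atom index to the set of atom indices it is
    bonded to (hypothetical toy representation, not Novana's data model).
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {atom: set(nbrs) for atom, nbrs in adjacency.items()}
    terminal = [atom for atom, nbrs in graph.items() if len(nbrs) < 2]
    while terminal:
        atom = terminal.pop()
        if atom not in graph:
            continue  # already removed via an earlier queue entry
        for nbr in graph.pop(atom):
            graph[nbr].discard(atom)
            # Removing an atom may turn its neighbour into a new terminal atom.
            if len(graph[nbr]) < 2:
                terminal.append(nbr)
    return graph


# Ethylbenzene as a toy graph: atoms 0-5 form the ring, 6 and 7 the ethyl chain.
ethylbenzene = {
    0: {1, 5, 6}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4},
    4: {3, 5}, 5: {4, 0}, 6: {0, 7}, 7: {6},
}
print(sorted(prune_terminal_atoms(ethylbenzene)))  # [0, 1, 2, 3, 4, 5]
```
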
Novana also handles mixtures automatically by extracting the largest fragment containing rings and using it as the input to the decomposition. If no structures with rings are found in the input SMILES, Novana raises an error.

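The fragment selection can be sketched on the same kind of toy graph. This is an illustration under stated assumptions rather than Novana's code: the function name `largest_ring_fragment` is hypothetical, "largest" is taken to mean the highest atom count, and `ValueError` merely stands in for whatever error Novana actually raises.

```python
def largest_ring_fragment(adjacency):
    """Return the atoms of the largest connected fragment that contains a ring.

    `adjacency` maps an atom index to the set of atom indices it is
    bonded to (hypothetical toy representation).
    """
    seen = set()
    best = None
    for start in adjacency:
        if start in seen:
            continue
        # Collect one connected fragment with a depth-first walk.
        fragment, stack = set(), [start]
        while stack:
            atom = stack.pop()
            if atom in fragment:
                continue
            fragment.add(atom)
            stack.extend(adjacency[atom] - fragment)
        seen |= fragment
        # Every bond appears twice in an adjacency mapping.
        n_bonds = sum(len(adjacency[atom]) for atom in fragment) // 2
        # A connected fragment without rings has exactly n_atoms - 1 bonds.
        has_ring = n_bonds >= len(fragment)
        if has_ring and (best is None or len(fragment) > len(best)):
            best = fragment
    if best is None:
        # Assumption: the real error type used by Novana is not shown here.
        raise ValueError("no ring-containing fragment found")
    return best


# Toy mixture: a lone atom (0), a three-membered ring (1-3), a five-atom chain (4-8).
mixture = {
    0: set(),
    1: {2, 3}, 2: {1, 3}, 3: {1, 2},
    4: {5}, 5: {4, 6}, 6: {5, 7}, 7: {6, 8}, 8: {7},
}
print(sorted(largest_ring_fragment(mixture)))  # [1, 2, 3]
```
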
## How to install the tool
```bash
pip install novana
```

## Usage
```python
# Example of input SMILES
from rdkit import Chem
smiles = "COC(=O)N1CC2(C1)CS(=O)(=Nc1cc(C)c3c(Nc4ccc(F)cc4O[C@H](C)C(=O)NCC(F)(F)F)ncnc3c1)C2"
Chem.MolFromSmiles(smiles)
```
![Example of molecule](https://github.com/ghiander/novana/blob/main/docs/static/example_molecule.png?raw=true)

```python
# Decompose into scaffold
from novana.api import scaffold_from_smiles
scaffold_from_smiles(smiles)
```
![Example of scaffold](https://github.com/ghiander/novana/blob/main/docs/static/example_scaffold.png?raw=true)

```python
# Decompose into shape
from novana.api import shape_from_smiles
shape_from_smiles(smiles)
```
![Example of shape](https://github.com/ghiander/novana/blob/main/docs/static/example_shape.png?raw=true)

```python
# Molecule, scaffold, and shape can also be obtained efficiently in one run
# (This function can be particularly useful for processing large sets)
from novana.api import molecule_scaffold_shape_from_smiles
mol, sfl, shp = molecule_scaffold_shape_from_smiles(smiles)
```
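When processing a large set, some inputs may fail (for example, SMILES without rings). The sketch below shows one possible way to collect results and failures separately. The helper `decompose_many` is hypothetical and not part of Novana; it takes the decomposition function as an argument, and catching the generic `Exception` is an assumption because the specific error type is not documented here.

```python
def decompose_many(smiles_list, decompose):
    """Apply `decompose` to each SMILES, separating results from failures.

    `decompose` is any callable taking a SMILES string, e.g.
    `molecule_scaffold_shape_from_smiles` from `novana.api`.
    """
    results = {}
    failures = {}
    for smi in smiles_list:
        try:
            results[smi] = decompose(smi)
        except Exception as exc:  # assumption: exact error type is unknown
            failures[smi] = str(exc)
    return results, failures
```

It could then be called as `results, failures = decompose_many(my_smiles, molecule_scaffold_shape_from_smiles)`, where `my_smiles` is a list of SMILES strings.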

## License
Distributed under the terms of the `MIT license`. *Novana* is free and open-source software.

## For developers
- The package can be installed from the wheel in the `dist/` folder. When a new version needs to be released, a new wheel must be built. To do so, change the version of the package inside `setup.py`, then call `python setup.py bdist_wheel` and `python setup.py sdist`, which create a new wheel and a new source distribution.
- The code can be automatically tested using `python setup.py test`, which requires `pytest` to be installed.
- The `Makefile` can also be used for building (`make build`) or testing (`make test`).
- Before committing new code, please always check that the formatting is consistent using `flake8`.
