Metadata-Version: 2.1
Name: synformer
Version: 0.1.0
Summary: Synformer: Generative Model for Synthesizable Molecule Generation
Home-page: https://github.com/wenhao-gao/synformer
Author: Wenhao Gao, Shitong Luo, Connor W. Coley
Author-email: gaowh19@gmail.com
License: Apache-2.0
Platform: UNKNOWN
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: rdkit (>=2023.09)
Requires-Dist: biopython
Requires-Dist: pytdc
Requires-Dist: torch
Requires-Dist: tensorboard
Requires-Dist: einops
Requires-Dist: scikit-learn
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: lmdb
Requires-Dist: pyyaml
Requires-Dist: types-pyyaml
Requires-Dist: joblib
Requires-Dist: omegaconf
Requires-Dist: gitpython
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: tqdm
Requires-Dist: jupyterlab
Requires-Dist: ipywidgets
Requires-Dist: click

# synformer

This repo contains the main code for the SynFormer model. To run the model, you first need to download the list of building blocks. We provide only a 5k building block list for testing purposes.

# TO-DOs

- [ ] Clean the repo and prepare for the release
- [ ] Make a demo on Huggingface

# Instructions

0. Start (for development)

```bash
pip install -e .
```

1. Preprocess data

Download the building block data in SDF format to `data/building_blocks/`.

```bash
python scripts/preprocess.py --model-config configs/dev_smiles_diffusion.yml
```
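
Before running the preprocessing script, it can help to confirm that the building block files are in place. The following is a minimal sketch, not part of the repo: `count_sdf` is a hypothetical helper, and it assumes the downloaded files carry a `.sdf` extension and sit directly under `data/building_blocks/`.

```bash
# Hypothetical helper (not provided by the repo): count the .sdf files
# located directly under the given directory. Prints 0 if the directory
# does not exist.
count_sdf() {
    find "$1" -maxdepth 1 -type f -name '*.sdf' 2>/dev/null | wc -l | tr -d ' '
}

# Directory named in the instructions above; the .sdf extension is an assumption.
bb_dir="data/building_blocks"
n_sdf=$(count_sdf "$bb_dir")

if [ "$n_sdf" -eq 0 ]; then
    echo "No .sdf files found in $bb_dir; download the building blocks first." >&2
else
    echo "Found $n_sdf .sdf file(s) in $bb_dir"
fi
```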

2. Train the model

```bash
python scripts/train.py configs/dev_smiles_diffusion.yml --debug --devices 1
# python scripts/train_ed.py configs/dev_ed.yml --debug --devices 1
```

3. Model inference

Work in progress

4. Use for molecular optimization

```bash
python scripts/molopt.py \
    --model <decoder-only-model-checkpoint> \
    --use-replay-buffer \
    --use-prior \
    --oracle <domain>:<task>  # Example: "tdc:osimertinib_mpo"
```
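
The `--oracle` value pairs a domain with a task, separated by a colon. As a small illustration of that format, the sketch below splits the example value into its two parts using shell parameter expansion. How `scripts/molopt.py` parses the value internally is not shown here, and the variable names are illustrative only.

```bash
# Example value taken from the command above.
oracle="tdc:osimertinib_mpo"

# Illustrative split on the first colon (an assumption about the format,
# not the script's actual parsing code).
domain="${oracle%%:*}"   # text before the first colon
task="${oracle#*:}"      # text after the first colon

echo "domain=$domain task=$task"   # prints "domain=tdc task=osimertinib_mpo"
```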


