Metadata-Version: 2.1
Name: senseppi
Version: 0.5.0
Summary: SENSE-PPI: Sequence-based EvolutioNary ScalE Protein-Protein Interaction prediction
Home-page: 
Author: Konstantin Volzhenin
Author-email: konstantin.volzhenin@sorbonne-universite.fr
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE

SENSE-PPI
=======================================

SENSE-PPI is a deep learning model that predicts physical protein-protein interactions from amino acid sequences.
It uses embeddings generated by ESM2 and a Siamese RNN architecture to perform binary classification.

## Installation

SENSE-PPI requires Python 3.9 or higher. To install the package, run:

```bash
pip install senseppi
```

**N.B.**: if you intend to use the `create_dataset` command to generate new datasets from STRING,
you must also install the MMseqs2 software (instructions can be found at: https://github.com/soedinglab/MMseqs2).
The `mmseqs` command must be available in your PATH.
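
Before running `create_dataset`, it can be useful to confirm that the `mmseqs` executable is actually visible on the PATH. The snippet below is a minimal sketch of such a check using only the Python standard library; the helper name `find_mmseqs` is hypothetical and is not part of the senseppi package.

```python
import shutil
from typing import Optional


def find_mmseqs(executable: str = "mmseqs") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None.

    Hypothetical helper for illustration; senseppi does not provide it.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_mmseqs()
    if path is None:
        print("mmseqs was not found on PATH; install MMseqs2 before using create_dataset")
    else:
        print(f"mmseqs found at {path}")
```

If the check reports that `mmseqs` is missing, follow the MMseqs2 installation instructions linked above and make sure the install directory is added to PATH.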

## Usage

The package provides five commands:

- `train`: trains SENSE-PPI on a given dataset
- `test`: computes test metrics (AUROC, AUPRC, F1, MCC, Precision, Recall, Accuracy) on a given dataset
- `predict`: predicts interactions for a given dataset
- `predict_string`: predicts interactions using the STRING database:
the candidate interactions are taken from STRING (based on seed proteins),
and the predictions are compared with the STRING annotations. Optionally, interaction graphs can be constructed.
- `create_dataset`: creates a dataset from the STRING database based on the taxonomic ID of the organism.
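
To make the threshold-based metrics reported by `test` concrete, the sketch below computes precision, recall, F1, accuracy, and MCC from binary labels in plain Python. It is an illustration of the standard definitions, not the package's own implementation, and the function names are hypothetical. AUROC and AUPRC are left out because they require predicted scores rather than hard labels. Returning 0.0 when a denominator is zero is a convention chosen here, not something stated by the package.

```python
import math
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F1, accuracy and MCC from hard labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy example: 3 interacting pairs and 5 non-interacting pairs.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    predictions = [1, 1, 0, 1, 0, 0, 0, 0]
    for name, value in binary_metrics(labels, predictions).items():
        print(f"{name}: {value:.3f}")
```

MCC is often the most informative of these on PPI data, where non-interacting pairs usually far outnumber interacting ones and accuracy alone can look deceptively high.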


The original SENSE-PPI repository contains two pretrained models (checkpoints with weights): `senseppi.ckpt` and `dscript.ckpt`, pretrained on the SENSE-PPI and DSCRIPT human datasets, respectively.

- `senseppi.ckpt` (preferred): download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/senseppi.ckpt)
- `dscript.ckpt`: download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/dscript.ckpt)

**N.B.**: Both pretrained models are intended for proteins between 50 and 800 amino acids in length.
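
Given this length range, it may help to pre-filter a FASTA file before passing it to a pretrained model. The following is a minimal sketch using only the standard library; `read_fasta` and `filter_by_length` are hypothetical helper names, not part of the senseppi package, and treating both bounds as inclusive is an assumption.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

MIN_LEN = 50   # lower bound from the note above (assumed inclusive)
MAX_LEN = 800  # upper bound from the note above (assumed inclusive)


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    name: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-separated token of the header as the identifier.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]],
    min_len: int = MIN_LEN,
    max_len: int = MAX_LEN,
) -> Dict[str, str]:
    """Keep only sequences whose length lies within [min_len, max_len]."""
    return {name: seq for name, seq in records if min_len <= len(seq) <= max_len}


if __name__ == "__main__":
    example = [">short", "MKT", ">ok", "M" * 120, ">long", "M" * 900]
    kept = filter_by_length(read_fasta(example))
    print(sorted(kept))  # ['ok']
```

To apply it to a file on disk, pass an open file handle to `read_fasta`, for example `with open("proteins.fasta") as handle: kept = filter_by_length(read_fasta(handle))`, where `proteins.fasta` is a placeholder path.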
