Metadata-Version: 2.1
Name: gpt2_prot
Version: 0.1
Summary: Single NT/AA resolution biological GPT2 language modelling
Project-URL: Homepage, https://github.com/JBwdn/gpt2-prot
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: biopython
Requires-Dist: jsonargparse[signatures]>=4.27.7
Requires-Dist: lightning
Requires-Dist: numpy
Requires-Dist: requests
Requires-Dist: tensorboard
Requires-Dist: torch
Requires-Dist: tqdm
Provides-Extra: dev
Requires-Dist: black; extra == "dev"
Requires-Dist: ipython; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: pylint; extra == "dev"
Requires-Dist: pyright; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"

# gpt2-prot
Train biological language models at single-nucleotide (NT) or single-amino-acid (AA) resolution.

## Todo

- [x] Add config recipes, e.g. for foundation model training, specific protein modelling, etc.
- [ ] Docstrings etc.
- [ ] Readme instructions
- [ ] AWS spot instances demo
- [ ] GitHub Actions for publishing the package to PyPI
- [ ] Add inference mode

## Installation

Installation from PyPI is on the way.

```bash
micromamba create -f environment.yml  # or conda etc.
micromamba activate gpt2-prot

pip install .  # Basic install
pip install -e ".[dev]"  # Install in editable mode with dev dependencies
pip install ".[test]"  # Install the package and all test dependencies
```

## Usage

### From the CLI

```bash
gpt2-prot -h

gpt2-prot fit --config recipes/cas9_analogues.yml  # Run the demo config for Cas9 protein language modelling
```

## Development

### Running pre-commit hooks

```bash
# Install the hooks:
pre-commit install

# Run all the hooks:
pre-commit run --all-files
```

### Running tests

Pytest collects every file named `test_*.py` or `*_test.py`. To run the tests, call `pytest` from the repo root.
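
As an illustration of the naming convention, here is a minimal sketch of a test file that pytest would pick up. The file name `tests/test_tokenize.py` and the `tokenize` helper are hypothetical and are not part of this package's API. They only show what a test at single amino acid resolution might look like.

```python
# tests/test_tokenize.py
# Hypothetical file name, chosen to match pytest's "test_*.py" pattern.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def tokenize(sequence: str) -> list[int]:
    """Hypothetical helper (not the package's tokenizer): one integer token per residue."""
    return [AMINO_ACIDS.index(residue) for residue in sequence.upper()]


def test_one_token_per_residue():
    # A sequence of three residues should yield three tokens.
    assert len(tokenize("MKV")) == 3


def test_known_indices():
    # "A", "C" and "D" are the first three letters of AMINO_ACIDS.
    assert tokenize("ACD") == [0, 1, 2]
```

Running `pytest` from the repo root would collect both functions, because the file name matches `test_*.py` and the function names start with `test_`.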
