Metadata-Version: 2.1
Name: tmtools
Version: 0.0.2
Summary: Python bindings around the TM-align code for structural alignment of proteins
Home-page: https://github.com/jvkersch/tmtools
Author: Joris Vankerschaver
Author-email: joris.vankerschaver@gmail.com
License: GPLv3
Platform: Linux
Platform: Mac OS-X
Platform: Unix
Platform: Windows
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: biopython
Requires-Dist: numpy

TM-Tools
========

Python bindings for the TM-align algorithm and code [developed by Zhang et
al.](https://zhanggroup.org/TM-align/) for protein structure comparison.


Installation
------------

From the console, run:
```console
pip install git+https://github.com/jvkersch/tmtools.git#egg=tmtools
```

The package supports Python 3.6 and up. You will need a recent version of pip,
as well as a C++ compiler that supports C++14.

This package supports Linux, macOS, and Windows.

Usage
-----

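The function `tmtools.tm_align` takes two NumPy arrays of residue coordinates
(each with shape `(N, 3)`, where `N` is the number of residues in that chain)
and the two corresponding sequences of peptide codes. It performs the alignment
and returns the optimal rotation matrix (`u`) and translation (`t`), along with
the TM-score normalized by the length of each chain (`tm_norm_chain1` and
`tm_norm_chain2`):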
```python
>>> import numpy as np
>>> from tmtools import tm_align
>>>
>>> coords1 = np.array(
...     [[1.2, 3.4, 1.5],
...      [4.0, 2.8, 3.7],
...      [1.2, 4.2, 4.3],
...      [0.0, 1.0, 2.0]])
>>> coords2 = np.array(
...     [[2.3, 7.4, 1.5],
...      [4.0, 2.9, -1.7],
...      [1.2, 4.2, 4.3]])
>>>
>>> seq1 = "AYLP"
>>> seq2 = "ARN"
>>>
>>> res = tm_align(coords1, coords2, seq1, seq2)
>>> res.t
array([ 2.94676159,  5.55265245, -1.75151383])
>>> res.u
array([[ 0.40393231,  0.04161396, -0.91384187],
       [-0.59535733,  0.77040999, -0.22807475],
       [ 0.69454181,  0.63618922,  0.33596866]])
>>> res.tm_norm_chain1
0.3105833326322145
>>> res.tm_norm_chain2
0.414111110176286
```
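
The rotation and translation can then be used to superimpose the first
structure onto the second. The snippet below is a minimal sketch, not part of
the package: `apply_transform` is a hypothetical helper, and it assumes that
`tmtools` keeps the convention of the upstream TM-align program, in which each
point `x` of the first structure is mapped to `u @ x + t`. The values of `u`
and `t` are copied from the output above so that the snippet runs with NumPy
alone.

```python
import numpy as np

# Rotation matrix and translation copied from the `tm_align` output above.
u = np.array([[ 0.40393231,  0.04161396, -0.91384187],
              [-0.59535733,  0.77040999, -0.22807475],
              [ 0.69454181,  0.63618922,  0.33596866]])
t = np.array([2.94676159, 5.55265245, -1.75151383])

coords1 = np.array(
    [[1.2, 3.4, 1.5],
     [4.0, 2.8, 3.7],
     [1.2, 4.2, 4.3],
     [0.0, 1.0, 2.0]])


def apply_transform(coords, u, t):
    """Hypothetical helper: map every row x of `coords` to u @ x + t.

    Assumes the TM-align convention (rotate first, then translate).
    """
    return coords @ u.T + t


# Coordinates of the first structure in the frame of the second one.
coords1_moved = apply_transform(coords1, u, t)
```

Because the transformation is rigid, it leaves the distances between residues
of the first structure unchanged; only their position and orientation relative
to the second structure change.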

If you already have some PDB files, you can use the functions from `tmtools.io`
to retrieve the coordinate and sequence data:
```python
>>> from tmtools.io import get_structure, get_residue_data
>>> from tmtools.testing import get_pdb_path
>>> s = get_structure(get_pdb_path("2gtl"))
>>> s
<Structure id=2gtl>
>>> chain = next(s.get_chains())
>>> coords, seq = get_residue_data(chain)
>>> seq
'DCCSYEDRREIRHIWDDVWSSSFTDRRVAIVRAVFDDLFKHYPTSKALFERVKIDEPESGEFKSHLVRVANGLKLLINLLDDTLVLQSHLGHLADQHIQRKGVTKEYFRGIGEAFARVLPQVLSCFNVDAWNRCFHRLVARIAKDLP'
>>> coords.shape
(147, 3)
```

These functions are lightweight wrappers around Biopython.

Credits
-------

This package arose out of a personal desire to better understand both the
TM-score algorithm and the
[pybind11](https://pybind11.readthedocs.io/en/stable/index.html) library for
interfacing with C++ code. It currently contains no original research code.

If you use the package for research, you should cite the [original TM-score
papers](https://zhanggroup.org/TM-score/):

- Y. Zhang, J. Skolnick, _Scoring function for automated assessment of protein
  structure template quality_, Proteins, 57: 702-710 (2004).
- J. Xu, Y. Zhang, _How significant is a protein structure similarity with
  TM-score=0.5?_, Bioinformatics, 26: 889-895 (2010).

License
-------

The original TM-align software (version 20210224, released under the MIT
license) is bundled with this repository (`src/extern/TMalign.cpp`). Some small
tweaks had to be made to compile the code on macOS and to embed it as a
library. These modifications are also released under the MIT license.

The rest of the codebase is released under the GPL v3 license.


