PB assignation¶
Table of Contents
Note
This page is initialy a jupyter notebook. You can see a notebook HTML render of it or download the notebook itself.
We hereby demonstrate how to use the API to assign PB sequences.
from __future__ import print_function, division
from pprint import pprint
import os
import pbxplore as pbx
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-2-853cbdb98b68> in <module>()
----> 1 import pbxplore as pbx
ImportError: No module named pbxplore
Use the built-in structure parser¶
Assign PB for a single structure¶
The pbxplore.chains_from_files() function is the prefered way to
read PDB and PDBx/mmCIF files using PBxplore. This function takes a list
of file path as argument, and yield each chain it can read from these
files. It provides a single interface to read PDB and PDBx/mmCIF files,
to read single model and multimodel files, and to read a single file of
a collection of files.
Here we want to read a single file with a single model and a single
chain. Therefore, we need the first and only record that is yield by
pbxplore.chains_from_files(). This record contains a name for
the chain, and the chain itself as a
pbxplore.structure.structure.Chain object. Note that, even if
we want to read a single file, we need to provide it as a list to
pbxplore.chains_from_files().
pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '1BTA.pdb')
structure_reader = pbx.chains_from_files([pdb_path])
chain_name, chain = next(structure_reader)
print(chain_name)
print(chain)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-3-8a9163a9b9bb> in <module>()
----> 1 pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '1BTA.pdb')
2 structure_reader = pbx.chains_from_files([pdb_path])
3 chain_name, chain = next(structure_reader)
4 print(chain_name)
5 print(chain)
NameError: name 'pbx' is not defined
Protein Blocks are assigned based on the dihedral angles of the
backbone. So we need to calculate them. The
pbxplore.structure.structure.Chain.get_phi_psi_angles() methods
calculate these angles and return them in a form that can be directly
provided to the assignement function.
The dihedral angles are returned as a dictionnary. Each key of this dictionary is a residue number, and each value is a dictionary with the phi and psi angles.
dihedrals = chain.get_phi_psi_angles()
pprint(dihedrals)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-4-9cf0a8bc4086> in <module>()
----> 1 dihedrals = chain.get_phi_psi_angles()
2 pprint(dihedrals)
NameError: name 'chain' is not defined
The dihedral angles can be provided to the pbxplore.assign()
function that assigns a Protein Block to each residue, and that returns
the PB sequence as a string. Note that the first and last two residues
are assigned to the Z jocker block as some dihedral angles cannot be
calculated.
pb_seq = pbx.assign(dihedrals)
print(pb_seq)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-5-71c7132535bf> in <module>()
----> 1 pb_seq = pbx.assign(dihedrals)
2 print(pb_seq)
NameError: name 'pbx' is not defined
Assign PB for several models of a single file¶
A single PDB file can contain several models. Then, we do not want to read only the first chain. Instead, we want to iterate over all the chains.
pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '2LFU.pdb')
for chain_name, chain in pbx.chains_from_files([pdb_path]):
dihedrals = chain.get_phi_psi_angles()
pb_seq = pbx.assign(dihedrals)
print('* {}'.format(chain_name))
print(' {}'.format(pb_seq))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-6-378e76790064> in <module>()
----> 1 pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '2LFU.pdb')
2 for chain_name, chain in pbx.chains_from_files([pdb_path]):
3 dihedrals = chain.get_phi_psi_angles()
4 pb_seq = pbx.assign(dihedrals)
5 print('* {}'.format(chain_name))
NameError: name 'pbx' is not defined
Assign PB for a set of structures¶
The pbxplore.chains_from_files() function can also handle
several chains from several files.
import glob
files = [os.path.join(pbx.DEMO_DATA_PATH, pdb_name)
for pdb_name in ('1BTA.pdb', '2LFU.pdb', '3ICH.pdb')]
print('The following files will be used:')
pprint(files)
for chain_name, chain in pbx.chains_from_files(files):
dihedrals = chain.get_phi_psi_angles()
pb_seq = pbx.assign(dihedrals)
print('* {}'.format(chain_name))
print(' {}'.format(pb_seq))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-7-73672ce266c9> in <module>()
1 import glob
2 files = [os.path.join(pbx.DEMO_DATA_PATH, pdb_name)
----> 3 for pdb_name in ('1BTA.pdb', '2LFU.pdb', '3ICH.pdb')]
4 print('The following files will be used:')
5 pprint(files)
NameError: name 'pbx' is not defined
Assign PB for frames in a trajectory¶
PB sequences can be assigned from a trajectory. To do so, we use the
pbxplore.chains_from_trajectory() function that takes the path
to a trajectory and the path to the corresponding topology as argument.
Any file formats readable by MDAnalysis can be used. Except for its
arguments, pbxplore.chains_from_trajectory() works the same as
pbxplore.chains_from_files().
** Note that MDAnalysis is required to use this feature. **
trajectory = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.xtc')
topology = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.gro')
for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
dihedrals = chain.get_phi_psi_angles()
pb_seq = pbx.assign(dihedrals)
print('* {}'.format(chain_name))
print(' {}'.format(pb_seq))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-8-5c7dfa551061> in <module>()
----> 1 trajectory = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.xtc')
2 topology = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.gro')
3 for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
4 dihedrals = chain.get_phi_psi_angles()
5 pb_seq = pbx.assign(dihedrals)
NameError: name 'pbx' is not defined
Use a different structure parser¶
Providing the dihedral angles can be formated as expected by
pbxplore.assign(), the source of these angles does not matter.
For instance, other PDB parser can be used with PBxplore.
BioPython¶
import Bio.PDB
import math
pdb_path = os.path.join(pbx.DEMO_DATA_PATH, "2LFU.pdb")
for model in Bio.PDB.PDBParser().get_structure("2LFU", pdb_path):
for chain in model:
polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
for poly_index, poly in enumerate(polypeptides):
dihedral_list = poly.get_phi_psi_list()
dihedrals = {}
for resid, (phi, psi) in enumerate(dihedral_list, start=1):
if not phi is None:
phi = 180 * phi / math.pi
if not psi is None:
psi = 180 * psi / math.pi
dihedrals[resid] = {'phi': phi, 'psi': psi}
print(model, chain)
pb_seq = pbx.assign(dihedrals)
print(pb_seq)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-9-7d6a63049325> in <module>()
2 import math
3
----> 4 pdb_path = os.path.join(pbx.DEMO_DATA_PATH, "2LFU.pdb")
5 for model in Bio.PDB.PDBParser().get_structure("2LFU", pdb_path):
6 for chain in model:
NameError: name 'pbx' is not defined