Metadata-Version: 2.1
Name: macromol_census
Version: 0.2.1
Summary: Tools for creating machine-learning datasets from macromolecular structure data
Author-email: Kale Kundert <kale@thekunderts.net>
Requires-Python: ~=3.10
Description-Content-Type: text/x-rst
Classifier: Programming Language :: Python :: 3
Requires-Dist: biopython
Requires-Dist: docopt
Requires-Dist: duckdb>=0.10.0
Requires-Dist: gemmi
Requires-Dist: more_itertools
Requires-Dist: networkx
Requires-Dist: numpy
Requires-Dist: polars
Requires-Dist: pyarrow
Requires-Dist: scipy
Requires-Dist: tidyexc
Requires-Dist: tqdm
Requires-Dist: sphinx ; extra == "doc"
Requires-Dist: sphinx_rtd_theme ; extra == "doc"
Requires-Dist: autoclasstoc ; extra == "doc"
Requires-Dist: pytest ; extra == "test"
Requires-Dist: parametrize_from_file ; extra == "test"
Requires-Dist: pytest_unordered ; extra == "test"
Project-URL: Bug Tracker, https://github.com/kalekundert/macromol_census/issues
Project-URL: Continuous Integration, https://github.com/kalekundert/macromol_census/actions
Project-URL: Documentation, https://macromol-census.readthedocs.io/en/latest/
Project-URL: Test Coverage, https://coveralls.io/github/kalekundert/macromol_census
Project-URL: Version Control, https://github.com/kalekundert/macromol_census
Provides-Extra: doc
Provides-Extra: test

********************
Macromolecule Census
********************

.. image:: https://img.shields.io/pypi/v/macromol_census.svg
   :alt: Last release
   :target: https://pypi.python.org/pypi/macromol_census

.. image:: https://img.shields.io/pypi/pyversions/macromol_census.svg
   :alt: Python version
   :target: https://pypi.python.org/pypi/macromol_census

.. image:: https://img.shields.io/readthedocs/macromol_census.svg
   :alt: Documentation
   :target: https://macromol-census.readthedocs.io/en/latest/?badge=latest

.. image:: https://img.shields.io/github/actions/workflow/status/kalekundert/macromol_census/test.yml?branch=master
   :alt: Test status
   :target: https://github.com/kalekundert/macromol_census/actions

.. image:: https://img.shields.io/coveralls/kalekundert/macromol_census.svg
   :alt: Test coverage
   :target: https://coveralls.io/github/kalekundert/macromol_census?branch=master

.. image:: https://img.shields.io/github/last-commit/kalekundert/macromol_census?logo=github
   :alt: Last commit
   :target: https://github.com/kalekundert/macromol_census

*Macromolecule Census* is a set of tools for creating machine-learning datasets
from macromolecular structure data, especially data made available by the
Protein Data Bank (PDB).  These tools are meant to:

- Filter for high-quality (e.g. high resolution, low R-factor), low-redundancy
  (i.e. below a sequence-identity cutoff) structures.

- Make robust training/validation/test splits by accounting for domain-level 
  structural similarities.

- Store atomic coordinates in a compact, portable, standard format (SQLite).

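As a rough illustration of the first and last goals above, the following sketch
filters a few made-up structures by quality and stores the surviving atomic
coordinates in an SQLite database.  It uses only the Python standard library
and is *not* the ``macromol_census`` API: the table layout, column names,
cutoff values, and the ``build_db`` function are all assumptions made for this
example.

.. code-block:: python

    import sqlite3

    # Hypothetical quality metadata: (structure id, resolution in angstroms,
    # R-free).  These are invented values, not real PDB entries.
    STRUCTURES = [
        ("aaaa", 1.8, 0.21),
        ("bbbb", 3.5, 0.31),
        ("cccc", 2.1, 0.24),
    ]

    # Hypothetical coordinates: (structure id, element, x, y, z).
    ATOMS = [
        ("aaaa", "N", 11.1, 2.3, -4.0),
        ("aaaa", "C", 12.4, 2.9, -3.6),
        ("bbbb", "N", 0.5, 7.7, 1.2),
        ("cccc", "O", -3.2, 4.4, 9.9),
    ]

    def build_db(max_resolution=2.5, max_r_free=0.25):
        """Keep only structures passing both (assumed) quality cutoffs, then
        store them and their atoms in an in-memory SQLite database."""
        db = sqlite3.connect(":memory:")
        db.execute(
            "CREATE TABLE structure ("
            "id TEXT PRIMARY KEY, resolution REAL, r_free REAL)"
        )
        db.execute(
            "CREATE TABLE atom ("
            "structure_id TEXT REFERENCES structure(id), "
            "element TEXT, x REAL, y REAL, z REAL)"
        )

        # Lower resolution values and lower R-free both indicate better data.
        kept = [
            s for s in STRUCTURES
            if s[1] <= max_resolution and s[2] <= max_r_free
        ]
        kept_ids = {s[0] for s in kept}

        db.executemany("INSERT INTO structure VALUES (?, ?, ?)", kept)
        db.executemany(
            "INSERT INTO atom VALUES (?, ?, ?, ?, ?)",
            [a for a in ATOMS if a[0] in kept_ids],
        )
        db.commit()
        return db

    db = build_db()
    for row in db.execute("SELECT id, resolution FROM structure ORDER BY id"):
        print(row)

Passing a file path instead of ``":memory:"`` to ``sqlite3.connect`` would
write the same tables to a single file on disk, which is what makes this
format easy to copy between machines.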
