Metadata-Version: 2.1
Name: pypgx
Version: 0.4.1
Summary: A Python package for pharmacogenomics research
Home-page: https://github.com/sbslee/pypgx
Author: Seung-been "Steven" Lee
Author-email: sbstevenlee@gmail.com
License: MIT
Description: ..
           This file was automatically generated by docs/create.py.
        
        README
        ******
        
        .. image:: https://badge.fury.io/py/pypgx.svg
            :target: https://badge.fury.io/py/pypgx
        
        .. image:: https://readthedocs.org/projects/pypgx/badge/?version=latest
            :target: https://pypgx.readthedocs.io/en/latest/?badge=latest
            :alt: Documentation Status
        
        Introduction
        ============
        
        The main purpose of the PyPGx package is to provide a unified platform for pharmacogenomics (PGx) research.
        
        The package is written in Python and supports both a command line interface (CLI) and an application programming interface (API), whose documentation is available at `Read the Docs <https://pypgx.readthedocs.io/en/latest/>`_.
        
        Your contributions (e.g. feature ideas, pull requests) are most welcome.
        
        | Author: Seung-been "Steven" Lee
        | Email: sbstevenlee@gmail.com
        | License: MIT License
        
        Installation
        ============
        
        The following packages are required to run PyPGx:
        
        .. parsed-literal::
        
           fuc
           scikit-learn
        
        There are various ways you can install PyPGx. The recommended way is via conda (`Anaconda <https://www.anaconda.com/>`__):
        
        .. code-block:: text
        
           $ conda install -c bioconda pypgx
        
        The above command will automatically download and install all the dependencies as well. Alternatively, you can use pip (`PyPI <https://pypi.org/>`__) to install PyPGx and all of its dependencies:
        
        .. code-block:: text
        
           $ pip install pypgx
        
        Finally, you can clone the GitHub repository and then install PyPGx locally:
        
        .. code-block:: text
        
           $ git clone https://github.com/sbslee/pypgx
           $ cd pypgx
           $ pip install .
        
        The advantage of this approach is that you will have access to development versions that are not available in Anaconda or PyPI. For example, you can access a development branch with the ``git checkout`` command. Note that a local install like this assumes your environment already has all the dependencies installed.
        
        Archive file, semantic type, and metadata
        =========================================
        
        In order to efficiently store and transfer data, PyPGx uses the ZIP archive file format (``.zip``) which supports lossless data compression. Each archive file created by PyPGx has a metadata file (``metadata.txt``) and a data file (e.g. ``data.tsv``, ``data.vcf``). A metadata file contains important information about the data file within the same archive, which is expressed as pairs of ``=``-separated keys and values (e.g. ``Assembly=GRCh37``):
        
        .. list-table::
            :widths: 20 40 40
            :header-rows: 1
        
            * - Metadata
              - Description
              - Examples
            * - ``Assembly``
              - Reference genome assembly.
              - ``GRCh37``, ``GRCh38``
            * - ``Control``
              - Control gene.
              - ``VDR``, ``chr1:10000-20000``
            * - ``Gene``
              - Target gene.
              - ``CYP2D6``, ``GSTT1``
            * - ``Platform``
              - NGS platform.
              - ``WGS``, ``Targeted``
            * - ``Program``
              - Name of the phasing program.
              - ``Beagle``
            * - ``Samples``
              - Samples used for inter-sample normalization.
              - ``NA07000,NA10854,NA11993``
            * - ``SemanticType``
              - Semantic type of the archive.
              - ``CovFrame[CopyNumber]``, ``Model[CNV]``
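        
        As a minimal sketch of how such a metadata file could be read (this is not PyPGx's own implementation), the snippet below builds a toy archive in memory and parses its ``metadata.txt`` with Python's standard ``zipfile`` module. The ``read_metadata`` helper and the toy file contents are hypothetical, and looking up the member by file name is an assumption because the folder layout inside a real archive is not described here:
        
        .. code:: python3
        
            import io
            import zipfile
        
            def read_metadata(archive, member="metadata.txt"):
                """Parse ``key=value`` lines from an archive's metadata file."""
                with zipfile.ZipFile(archive) as zf:
                    # Assumption: match by file name, whatever folder it sits in.
                    names = [n for n in zf.namelist() if n.split("/")[-1] == member]
                    if not names:
                        raise KeyError(f"{member} not found in archive")
                    text = zf.read(names[0]).decode("utf-8")
                metadata = {}
                for line in text.splitlines():
                    if "=" not in line:
                        continue  # skip blank or malformed lines
                    key, value = line.split("=", 1)
                    metadata[key.strip()] = value.strip()
                return metadata
        
            # Build a toy archive in memory (illustrative content only).
            buffer = io.BytesIO()
            with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
                zf.writestr(
                    "metadata.txt",
                    "Gene=CYP2D6\nAssembly=GRCh37\nSemanticType=SampleTable[Genotypes]\n",
                )
                zf.writestr("data.tsv", "")  # placeholder data file
            buffer.seek(0)
        
            example = read_metadata(buffer)
        
        Here ``example`` is a plain dictionary mapping each metadata key to its value, e.g. ``example['Assembly']`` is ``'GRCh37'``.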
        
        Notably, all archive files have defined semantic types, which allows us to ensure that the data that is passed to a PyPGx command (CLI) or method (API) is meaningful for the operation that will be performed. Below is a list of currently defined semantic types:
        
        - ``CovFrame[CopyNumber]``
            * CovFrame for storing the target gene's per-base copy number, which is computed from read depth using control statistics.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Platform``, ``Control``, ``Samples``.
        - ``CovFrame[ReadDepth]``
            * CovFrame for storing the target gene's per-base read depth, which is computed from BAM files.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Platform``.
        - ``Model[CNV]``
            * Model for calling CNV in the target gene.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Control``.
        - ``SampleTable[Alleles]``
            * TSV file for storing the target gene's candidate star alleles for each sample.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Program``.
        - ``SampleTable[CNVCalls]``
            * TSV file for storing the target gene's CNV call for each sample.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Control``.
        - ``SampleTable[Genotypes]``
            * TSV file for storing the target gene's genotype call for each sample.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``.
        - ``SampleTable[Results]``
            * TSV file for storing various results for each sample.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``.
        - ``SampleTable[Statistics]``
            * TSV file for storing various read depth statistics of the control gene for each sample. Used for converting the target gene's read depth to copy number.
            * Requires the following metadata: ``Control``, ``Assembly``, ``SemanticType``, ``Platform``.
        - ``VcfFrame[Consolidated]``
            * VcfFrame for storing the target gene's consolidated variant data.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Program``.
        - ``VcfFrame[Imported]``
            * VcfFrame for storing the target gene's raw variant data.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``.
        - ``VcfFrame[Phased]``
            * VcfFrame for storing the target gene's phased variant data.
            * Requires the following metadata: ``Gene``, ``Assembly``, ``SemanticType``, ``Program``.
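        
        To illustrate how the requirements listed above could be checked, here is a minimal sketch (not PyPGx's own code). The ``REQUIRED_METADATA`` table and the ``missing_metadata`` helper are hypothetical names, and only a few of the semantic types are included for brevity:
        
        .. code:: python3
        
            # Required metadata keys per semantic type, copied from the list above.
            REQUIRED_METADATA = {
                "CovFrame[ReadDepth]": {"Gene", "Assembly", "SemanticType", "Platform"},
                "Model[CNV]": {"Gene", "Assembly", "SemanticType", "Control"},
                "SampleTable[Genotypes]": {"Gene", "Assembly", "SemanticType"},
                "VcfFrame[Phased]": {"Gene", "Assembly", "SemanticType", "Program"},
            }
        
            def missing_metadata(metadata):
                """Return the sorted required keys that are absent from ``metadata``."""
                semantic_type = metadata.get("SemanticType")
                if semantic_type not in REQUIRED_METADATA:
                    raise ValueError(f"Unknown semantic type: {semantic_type!r}")
                return sorted(REQUIRED_METADATA[semantic_type] - set(metadata))
        
            complete = {"Gene": "CYP2D6", "Assembly": "GRCh37", "SemanticType": "SampleTable[Genotypes]"}
            incomplete = {"Gene": "CYP2D6", "Assembly": "GRCh37", "SemanticType": "Model[CNV]"}
        
            missing_complete = missing_metadata(complete)      # nothing is missing
            missing_incomplete = missing_metadata(incomplete)  # 'Control' is missing
        
        A check like this lets a command reject an archive early, before any computation, when its metadata does not match what the semantic type calls for.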
        
        Getting help
        ============
        
        For detailed documentation on the CLI and API, please refer to `Read the Docs <https://pypgx.readthedocs.io/en/latest/>`_.
        
        To get help on the CLI:
        
        .. code-block:: text
        
           $ pypgx -h
        
           usage: pypgx [-h] [-v] COMMAND ...
           
           positional arguments:
             COMMAND
               call-genotypes      Call genotypes for target gene.
               combine-results     Combine various results for the target gene.
               compute-control-statistics
                                   Compute various statistics for control gene with BAM data.
               compute-copy-number
                                   Compute copy number from read depth for target gene.
               compute-target-depth
                                   Compute read depth for target gene with BAM data.
               create-consolidated-vcf
                                   Create consolidated VCF.
               create-read-depth-tsv
                                   Compute read depth for target gene with BAM data.
               create-regions-bed  Create a BED file which contains all regions used by PyPGx.
               estimate-phase-beagle
                                   Estimate haplotype phase of observed variants with the Beagle program.
               filter-samples      Filter Archive file for specified samples.
               import-read-depth   Import read depth data for target gene.
               import-variants     Import variant data for target gene.
               plot-bam-copy-number
                                   Plot copy number profile with BAM data.
               plot-bam-read-depth
                                   Plot read depth profile with BAM data.
               plot-vcf-allele-fraction
                                   Plot allele fraction profile with VCF data.
               plot-vcf-read-depth
                                   Plot read depth profile with VCF data.
               predict-alleles     Predict candidate star alleles based on observed variants.
               predict-cnv         Predict CNV for target gene based on copy number data.
               print-metadata      Print the metadata of specified archive.
               run-ngs-pipeline    Run NGS pipeline for the target gene.
               test-cnv-caller     Test a CNV caller for the target gene.
               train-cnv-caller    Train a CNV caller for the target gene.
           
           optional arguments:
             -h, --help            Show this help message and exit.
             -v, --version         Show the version number and exit.
        
        To get help on a specific command (e.g. ``call-genotypes``):
        
        .. code-block:: text
        
           $ pypgx call-genotypes -h
        
        Below is the list of submodules available in the API:
        
        - **genotype** : The genotype submodule is a suite of tools for accurately predicting genotype calls.
        - **pipeline** : The pipeline submodule is used to provide convenient methods that combine multiple PyPGx actions and automatically handle semantic types.
        - **plot** : The plot submodule is used to plot various kinds of profiles such as read depth, copy number, and allele fraction.
        - **utils** : The utils submodule is the main suite of tools for PGx research.
        
        
        To get help on a specific submodule (e.g. ``utils``):
        
        .. code:: python3
        
           >>> from pypgx.api import utils
           >>> help(utils)
        
        CLI examples
        ============
        
        Run NGS pipeline for CYP2D6:
        
        .. code-block:: text
        
           $ pypgx run-ngs-pipeline \
           CYP2D6 \
           CYP2D6-pipeline \
           --vcf input.vcf \
           --panel ref.vcf \
           --tsv input.tsv \
           --control-statistics control-statistics-VDR.zip
        
        API examples
        ============
        
        Predict phenotype based on two haplotype calls:
        
        .. code:: python3
        
            >>> import pypgx
            >>> pypgx.predict_phenotype('CYP2D6', '*4', '*5')   # Both alleles have no function
            'Poor Metabolizer'
            >>> pypgx.predict_phenotype('CYP2D6', '*5', '*4')   # The order of alleles does not matter
            'Poor Metabolizer'
            >>> pypgx.predict_phenotype('CYP2D6', '*1', '*22')  # *22 has uncertain function
            'Indeterminate'
            >>> pypgx.predict_phenotype('CYP2D6', '*1', '*1x2') # Gene duplication
            'Ultrarapid Metabolizer'
            >>> pypgx.predict_phenotype('CYP2B6', '*1', '*4')   # *4 has increased function
            'Rapid Metabolizer'
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Description-Content-Type: text/x-rst
