Metadata-Version: 2.1
Name: picus
Version: 0.0.2
Summary: Pointed Interpretation of Clinical Variant Significance
Home-page: UNKNOWN
Author: bars
Author-email: barslmn@gmail.com
License: UNKNOWN
Keywords: genomics next generation sequencing variant classification human genetics
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Requires-Dist: pandas

# Picus
Pointed Interpretation of Clinical Variant Significance

## Quick Install
* Linux & Mac  

> sudo pip3 install picus

* Windows

> pip install picus

## Example Uses

* Picus examples  

> picus -i input.csv -o output.json


## Evidence Collection Process

### PVS1
* Null variant (nonsense, frameshift, canonical ±1 or 2 splice sites, initiation codon, single or multiexon deletion) in a gene where LOF is a known mechanism of disease.

#### Status
* Implemented

#### Resources
* LoF genes defined by Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
* Null variants defined as HIGH IMPACT by https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html

#### Conditions
* "gene_symbol" is in LoF gene list.
* "transcript_consequence_terms" is high impact.

#### Shortcomings
* LoF gene list is only predictive and may be missing some actual LoF genes.
* No checks for multiexon deletion.
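
The two PVS1 conditions above can be sketched roughly as follows. This is a minimal illustration, not the code picus ships: `is_pvs1` is a hypothetical helper, `LOF_GENES` holds placeholder symbols rather than the Lek et al. list, and `HIGH_IMPACT_TERMS` is only a subset of the consequence terms Ensembl rates as HIGH impact. Each variant is assumed to be a dict keyed by the column names used in the conditions.

```python
# Illustrative subset of VEP consequence terms that Ensembl rates as HIGH impact.
HIGH_IMPACT_TERMS = {
    "transcript_ablation",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "stop_gained",
    "frameshift_variant",
    "stop_lost",
    "start_lost",
}

# Placeholder gene symbols; a real list would be derived from Lek et al. (2016).
LOF_GENES = {"GENE_A", "GENE_B"}


def is_pvs1(variant, lof_genes=LOF_GENES, high_impact=HIGH_IMPACT_TERMS):
    """Return True if the gene is in the LoF list and any consequence term is high impact."""
    terms = variant.get("transcript_consequence_terms", [])
    if isinstance(terms, str):
        # Accept a single term given as a plain string.
        terms = [terms]
    in_lof_gene = variant.get("gene_symbol") in lof_genes
    has_high_impact = any(term in high_impact for term in terms)
    return in_lof_gene and has_high_impact
```

As in the shortcomings listed above, this sketch does nothing to detect multiexon deletions.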

#### Pathogenic Strong

### PS1
* Same amino acid change as a previously established pathogenic variant regardless of nucleotide change.

#### Status
* Implemented

#### Resources
* ClinVar XML (ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/)

#### Annotation Steps
1. ClinVar data is parsed using https://github.com/barslmn/clinvar.
2. Sample data and ClinVar data are merged on the columns "CHR" and "POS".
3. The ClinVar feature columns "ALT", "hgvsp", and "clinical_significance" are added to the original annotation.

#### Conditions
1. "clinical_significance" is pathogenic.
2. The sample "hgvsp" and the added ClinVar "hgvsp" describe the same amino acid change.
3. The sample "ALT" and the ClinVar "ALT" are different.
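
The annotation steps and conditions above could look roughly like this in pandas. It is a sketch under assumptions: both tables are toy data with made-up values, the `_clinvar` suffix for the added columns is a naming choice made here, and the comparison of "clinical_significance" is assumed to be a case-insensitive match on the word "pathogenic".

```python
import pandas as pd

# Toy sample annotation (hypothetical values).
sample = pd.DataFrame({
    "CHR": ["1", "2"],
    "POS": [100, 200],
    "ALT": ["T", "A"],
    "hgvsp": ["p.Leu10Phe", "p.Gly5Ser"],
})

# Toy ClinVar table, as it might look after parsing (hypothetical values).
clinvar = pd.DataFrame({
    "CHR": ["1", "2"],
    "POS": [100, 200],
    "ALT": ["C", "A"],
    "hgvsp": ["p.Leu10Phe", "p.Gly5Ser"],
    "clinical_significance": ["Pathogenic", "Pathogenic"],
})

# Steps 2 and 3: merge on CHR and POS; overlapping ClinVar columns get a suffix.
merged = sample.merge(clinvar, on=["CHR", "POS"], how="left", suffixes=("", "_clinvar"))

# Conditions 1 to 3: pathogenic in ClinVar, same protein change, different ALT allele.
merged["PS1"] = (
    merged["clinical_significance"].str.lower().eq("pathogenic")
    & (merged["hgvsp"] == merged["hgvsp_clinvar"])
    & (merged["ALT"] != merged["ALT_clinvar"])
)
```

In this toy data the first variant meets PS1 (same protein change, different nucleotide), while the second has the same ALT as the ClinVar record and so fails condition 3. Because the merge key is only "CHR" and "POS", a position with several ClinVar records yields several merged rows per sample variant.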

#### Shortcomings

### PS2
* De novo (both maternity and paternity confirmed) in a patient with the disease and no family history.

#### Status
* Not Checked

#### Resources

#### Conditions

#### Shortcomings

### PS3
* Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product

#### Status
* Not Checked

#### Resources

#### Conditions

#### Shortcomings

### PS4
* The prevalence of the variant in affected individuals is significantly increased compared with the prevalence in controls

#### Status
* Implemented

#### Resources
* https://github.com/WGLab/InterVar/blob/master/intervardb/PS4.variants.hg19

#### Conditions
1. "id" is in id list.

#### Shortcomings
1. It is unclear how the source list was compiled.

#### Pathogenic Moderate

### PM1
* Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of an enzyme) without benign variation

#### Status
* Planned.

#### Resources

#### Conditions

#### Shortcomings

### PM2
* Absent from controls (or at extremely low frequency if recessive) (Table 6) in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium

#### Status
* Implemented

#### Resources
* VEP

#### Conditions
* "gnomad" is less than 0.001.
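
A minimal sketch of this frequency check, assuming each variant is a dict with a "gnomad" key holding the allele frequency. `is_pm2` is a hypothetical helper, and treating a missing frequency as "absent from controls" is an assumption made here, not something stated above.

```python
PM2_THRESHOLD = 0.001  # threshold taken from the condition above


def is_pm2(variant, threshold=PM2_THRESHOLD):
    """Return True if the gnomAD allele frequency is below the threshold."""
    freq = variant.get("gnomad")
    if freq is None:
        # Assumption: no gnomAD frequency means the variant is absent from controls.
        return True
    return float(freq) < threshold
```

The comparison is strict, so a frequency of exactly 0.001 does not meet the condition.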

#### Shortcomings

### PM3
* For recessive disorders, detected in trans with a pathogenic variant

#### Status
* Not Checked

#### Resources

#### Conditions

#### Shortcomings

### PM4
* Protein length changes as a result of in-frame deletions/insertions in a nonrepeat region or stop-loss variants

#### Status
* Implemented

#### Resources
* VEP

#### Conditions
* "transcript_consequence_terms" is "inframe_insertion", "inframe_deletion", or "stop_lost".

#### Shortcomings
* No checks for repeat regions.
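
The PM4 condition above amounts to a membership test on the consequence terms. The sketch below assumes a variant is a dict whose "transcript_consequence_terms" value is a list of VEP terms (or a single string); `is_pm4` is a hypothetical helper name.

```python
# Consequence terms named in the PM4 condition.
PM4_TERMS = {"inframe_insertion", "inframe_deletion", "stop_lost"}


def is_pm4(variant, pm4_terms=PM4_TERMS):
    """Return True if any consequence term is an in-frame indel or a stop-loss."""
    terms = variant.get("transcript_consequence_terms", [])
    if isinstance(terms, str):
        # Accept a single term given as a plain string.
        terms = [terms]
    return any(term in pm4_terms for term in terms)
```

Like the check described above, this does not exclude variants that fall in repeat regions.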

### PM5
* Novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before

#### Status

#### Resources

#### Conditions

#### Shortcomings

### PM6
* Assumed de novo, but without confirmation of paternity and maternity
#### Status

#### Resources

#### Conditions

#### Shortcomings

#### Pathogenic Supporting

### PP1
* Cosegregation with disease in multiple affected family members in a gene definitively known to cause the disease
#### Status

#### Resources

#### Conditions

#### Shortcomings

### PP2
* Missense variant in a gene that has a low rate of benign missense variation and in which missense variants are a common mechanism of disease
#### Status

#### Resources

#### Conditions

#### Shortcomings

### PP3
* Multiple lines of computational evidence support a deleterious effect on the gene or gene product (conservation, evolutionary, splicing impact, etc.)
#### Status

#### Resources

#### Conditions

#### Shortcomings

### PP4
* Patient’s phenotype or family history is highly specific for a disease with a single genetic etiology
#### Status

#### Resources

#### Conditions

#### Shortcomings

### PP5
* Reputable source recently reports variant as pathogenic, but the evidence is not available to the laboratory to perform an independent evaluation
#### Status

#### Resources

#### Conditions

#### Shortcomings

### Benign

#### Benign Stand-alone

### BA1
* Allele frequency is >5% in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium
#### Status

#### Resources

#### Conditions

#### Shortcomings

#### Benign Strong

### BS1
* Allele frequency is greater than expected for disorder
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BS2
* Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BS3
* Well-established in vitro or in vivo functional studies show no damaging effect on protein function or splicing
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BS4
* Lack of segregation in affected members of a family
#### Status

#### Resources

#### Conditions

#### Shortcomings

#### Benign Supporting

### BP1
* Missense variant in a gene for which primarily truncating variants are known to cause disease
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BP2
* Observed in trans with a pathogenic variant for a fully penetrant dominant gene/disorder or observed in cis with a pathogenic variant in any inheritance pattern
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BP3
* In-frame deletions/insertions in a repetitive region without a known function
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BP4
* Multiple lines of computational evidence suggest no impact on gene or gene product (conservation, evolutionary, splicing impact, etc.)
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BP5
* Variant found in a case with an alternate molecular basis for disease
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BP6
* Reputable source recently reports variant as benign, but the evidence is not available to the laboratory to perform an independent evaluation
#### Status

#### Resources

#### Conditions

#### Shortcomings

### BP7
* A synonymous (silent) variant for which splicing prediction algorithms predict no impact to the splice consensus sequence nor the creation of a new splice site AND the nucleotide is not highly conserved
#### Status

#### Resources

#### Conditions

#### Shortcomings


