Metadata-Version: 2.1
Name: pairwiseANIviz
Version: 1.0
Summary: Pairwise ANI (Average Nucleotide Identity) visualization tool.
Home-page: https://github.com/RunJiaJi/pairwiseANIviz
Author: Runjia Ji
Author-email: jirunjia@gmail.com
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: scipy

# pairwiseANIviz

Pairwise ANI (Average Nucleotide Identity) visualization tool. This tool is suitable for visualizing the results of pairwise comparisons between multiple genomes.
pairwiseANIviz first uses [Scipy](https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage) to perform hierarchical/agglomerative clustering. A clustermap is then generated using [Seaborn](https://seaborn.pydata.org/generated/seaborn.clustermap.html), which supports different [Matplotlib](https://matplotlib.org/stable/users/explain/colors/colormaps.html) colormaps.
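
The two steps described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the genome names and ANI values are invented, the matrix is assumed to be already loaded as a square, symmetric table, and the parameter values simply mirror the defaults listed under Options.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns
from scipy.cluster.hierarchy import leaves_list, linkage

# Toy symmetric ANI matrix (percent identity); all values are invented.
genomes = ["genome_A", "genome_B", "genome_C", "genome_D"]
ani = pd.DataFrame(
    [
        [100.0, 98.5, 80.1, 79.8],
        [98.5, 100.0, 80.4, 80.0],
        [80.1, 80.4, 100.0, 97.2],
        [79.8, 80.0, 97.2, 100.0],
    ],
    index=genomes,
    columns=genomes,
)

# Step 1: hierarchical/agglomerative clustering with SciPy.
# Each row is treated as an observation vector; linkage() computes the
# pairwise distances with the given metric and then merges clusters.
Z = linkage(ani.values, method="average", metric="euclidean")
leaf_order = [genomes[i] for i in leaves_list(Z)]

# Step 2: draw the clustermap with Seaborn, reusing the same linkage
# for rows and columns because the matrix is symmetric.
grid = sns.clustermap(
    ani,
    row_linkage=Z,
    col_linkage=Z,
    cmap="Blues",
    linewidths=0.5,
    linecolor="grey",
    annot=True,
    fmt=".1f",
    figsize=(6, 6),
)
```

Calling `grid.savefig("ani_clustermap.svg")` afterwards would write the figure to disk; the file name here is only an example.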

#### Main features
* Supports different __matplotlib colormaps__
* Taxonomic classification results can be included to __illustrate different taxa__
* A specific __outrange ANI value__ can be set (e.g. 95% ANI) so that cells at or above it are colored red
* __Multi-format outputs (JPG, PNG, TIFF, SVG, PDF, EPS)__
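
As an illustration of the taxa feature, the snippet below shows one common way to turn taxon labels into evenly spaced colors with Seaborn's `hls` palette. It is a sketch under assumptions: the genome names and genus labels are hypothetical, and the mapping is not taken from the tool's source code.

```python
import pandas as pd
import seaborn as sns

# Hypothetical genus-level assignments for four genomes.
taxa = pd.Series(
    {
        "genome_A": "Escherichia",
        "genome_B": "Escherichia",
        "genome_C": "Salmonella",
        "genome_D": "Salmonella",
    }
)

# One evenly spaced hue per distinct taxon.
unique_taxa = sorted(taxa.unique())
palette = sns.color_palette("hls", len(unique_taxa))
taxon_to_color = dict(zip(unique_taxa, palette))

# One RGB tuple per genome, in the same order as the taxa Series.
row_colors = taxa.map(taxon_to_color)
```

A Series like `row_colors` can be passed to `seaborn.clustermap` through its `row_colors` or `col_colors` argument to draw a colored strip beside the heatmap.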


## Example
#### 1. Using different [matplotlib colormaps](https://matplotlib.org/stable/users/explain/colors/colormaps.html)
<img src="/static/example_with_diffferent_cmap.svg" alt="Figure"/>

#### 2. With taxonomy indicated by different palettes
<img src="/static/example_with_different_palette.svg" alt="Figure"/>

#### 3. With ANI values illustrated
<img src="/static/example_with_annotation.svg" alt="Figure"/>

#### 4. With outrange ANI values (95%) colored red
<img src="/static/example_with_outrangeValue.svg" alt="Figure"/>


## Installation
 

```shell
# Dependencies: Matplotlib, Seaborn, Scipy, Pandas
# Install pairwiseANIviz using pip
pip install pairwiseANIviz==1.0
```




## Usage


![overallUsage](./static/Overall_Usage.png) 

#### Options

```bash
usage: pairwiseANIviz [options] anifile

positional arguments:
  anifile               File containing pairwise ANI analysis result.

options:
  -h, --help            show this help message and exit
  -v, --version         Show pairwiseANIviz version number and exit.
  -o OUTDIR, --outdir OUTDIR
                        Directory to save the output figures (default 'pairwiseANIviz').
  --method {single,complete,average,weighted,centroid,median,ward}
                        Linkage method to use for calculating clusters (default 'average').
                         See https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage
  --metric {braycurtis,canberra,chebyshev,cityblock,correlation,cosine,dice,euclidean,hamming,jaccard,jensenshannon,kulczynski1,mahalanobis,matching,minkowski,rogerstanimoto,russellrao,seuclidean,sokalmichener,sokalsneath,sqeuclidean,yule}
                        The distance metric to use (default 'euclidean').
                         See https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html#scipy.spatial.distance.pdist
  -cmap COLORMAP, --colormap COLORMAP
                        Matplotlib colormap used when drawing the heatmap of ANI values (default 'Blues').
                         See https://matplotlib.org/stable/users/explain/colors/colormaps.html
  --figWidth FIGWIDTH   Figure width (default '15').
  --figHeight FIGHEIGHT
                        Figure height (default '15').
  --linewidth LINEWIDTH
                        Line width of the main heatmap (default 0.5)
  --linecolor LINECOLOR
                        Line color of the main heatmap (default 'grey').
  --rowCluster          Draw the row cluster.
  --colCluster          Draw the column cluster.
  --annotation          Show ANI values on the plot.
  --outrangeValue OUTRANGEVALUE
                        Cells have ANI values over specific threshold set to red (eg. cells have ANI value >=0.95 set to red) (default 100).
  -c CLASSIFICATIONFILE, --classificationFile CLASSIFICATIONFILE
                        File containing classification result generated by GTDBTk(https://github.com/Ecogenomics/GTDBTk).
  -t {domain,phylum,class,order,family,genus,species}, --taxaLevel {domain,phylum,class,order,family,genus,species}
                        Taxa level illustrated on the plot.
                         Choose from "domain, phylum, class, order, family, genus, species".
                         Note that this parameter only works if classification result was input.
  --colorPalette COLORPALETTE
                        Color palette used to return a specified number of evenly spaced hues which are then used to illustrate different taxa (default 'hls').
                         Note that this parameter only works if classification result was input.

General usage
----------------
1. ANI result visualization **without classification info**:
   $ pairwiseANIviz ani_result.txt

2. ANI result visualization **with classification info**:
   $ pairwiseANIviz ani_result.txt --classificationFile classification_result.tsv

Runjia Ji, 2023

```

## Contact
If you have any questions about using pairwiseANIviz, feel free to open an issue or contact me at jirunjia@gmail.com.
