Metadata-Version: 2.1
Name: revolutionhtl
Version: 1.0.13
Summary: REvolutionH-tl: Reconstruction of Evolutionary Histories tool
Author-email: José Antonio Ramírez-Rafael <jose.ramirezra@cinvestav.mx>
Project-URL: Homepage, https://gitlab.com/jarr.tecn/revolutionh-tl
Project-URL: Bug Tracker, https://gitlab.com/jarr.tecn/revolutionh-tl/issues
Keywords: Evolution reconstruction,Gene/species tree inference,Tree reconciliation,Orthology,Paralogy,Alignment hits,Best match graphs
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: networkx>=2.8
Requires-Dist: pandas>=1.4.2
Requires-Dist: numpy>=1.22.3
Requires-Dist: tqdm>=4.63.0
Requires-Dist: bmgedit

![REvolutionH-tl logo.](https://gitlab.com/jarr.tecn/revolutionh-tl/-/raw/master/docs/images/Logo_horizontal.png)

Bioinformatics tool for the reconstruction of evolutionary histories. Input: FASTA files or sequence alignment hits. Output: orthology, event-labeled gene trees, and reconciliations.

[Bioinformatics & complex networks lab](https://ira.cinvestav.mx/ingenieriagenetica/dra-maribel-hernandez-rosales/bioinformatica-y-redes-complejas/)

- José Antonio Ramírez-Rafael [jose.ramirezra@cinvestav.mx]
- Maribel Hernandez-Rosales [maribel.hr@cinvestav.mx]

# Install

```bash
pip install revolutionhtl
```

**Requirements**

[Python >=3.7](https://www.python.org/)

If you want to run sequence alignments using revolutionhtl, then install [Diamond](https://github.com/bbuchfink/diamond).

# Usage

> Go to the [wiki](https://gitlab.com/jarr.tecn/revolutionh-tl/-/blob/master/docs/wiki.md?ref_type=heads) for details and an [example](https://gitlab.com/jarr.tecn/revolutionh-tl/-/blob/master/docs/example.md?ref_type=heads).

```bash
python -m revolutionhtl <arguments>
```

The steps of the program and the arguments used to specify input files are described below.

## Steps

1. **Orthogroup & best hit selection.** Input: alignment hits (generate this using `revolutionhtl.diamond`).
2. **Orthology and gene tree reconstruction.** Input: best hits (generate this at step 1).
3. **Species tree reconstruction.** Input: gene trees (generate this at step 2).
4. **Tree reconciliation.** Input: gene and species trees (generate this at steps 2 and 3).

## Arguments


<details>
  <summary> <b>Input data</b> (Click to expand)  </summary> 
  <b>-h    --help </b> <br/> Show this help message and exit. <br/> <br/>
  <b>-steps [integers] </b> <br/> List of steps to run (default: 1 2 3 4).  <br/> <br/>
  <b>-alignment_h   --alignment_hits [string]</b> <br/> Directory containing alignment hits, the input of step 1. (default: ./). <br/> <br/>
  <b>-best_h      --best_hits [string]</b> <br/> .tsv file containing best hits, the input of step 2. (default: use output of step 1). <br/> <br/>
  <b>-T      --gene_trees [string]</b> <br/> .tsv file containing gene trees, the input of steps 3 and 4. (default: use output of step 2). <br/> <br/>
  <b>-S     --species_tree [string]</b> <br/> .nhx file containing a species tree, an input of step 4. (default: use output of step 3). <br/> <br/>
</details>
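
Because each step defaults to the output of the previous one, a run can be resumed from an intermediate step by passing the corresponding input file. The sketch below only assembles such a command line from the flags listed above. It assumes that `-steps` takes space-separated integers, and the helper `build_command` and the file name `my_project.best_hits.tsv` are hypothetical, not part of revolutionhtl.

```python
import shlex
import sys


def build_command(steps, output_prefix="tl_project", alignment_hits=None,
                  best_hits=None, gene_trees=None, species_tree=None):
    """Assemble an argument list for `python -m revolutionhtl`.

    Hypothetical helper: it only uses flags documented in this README.
    """
    cmd = [sys.executable, "-m", "revolutionhtl", "-steps"]
    cmd += [str(step) for step in steps]
    cmd += ["-o", output_prefix]
    optional = [
        ("-alignment_h", alignment_hits),  # input of step 1
        ("-best_h", best_hits),            # input of step 2
        ("-T", gene_trees),                # input of steps 3 and 4
        ("-S", species_tree),              # input of step 4
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, value]
    return cmd


# Resume from step 2 using an existing best-hits table (hypothetical file name).
cmd = build_command([2, 3, 4], output_prefix="my_project",
                    best_hits="my_project.best_hits.tsv")
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually launch the run, the list can be passed to `subprocess.run(cmd, check=True)`.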


<details>
  <summary> <b>File names</b> (Click to expand)  </summary> 
  <b>-o      --output_prefix [string] </b> <br/>
  Prefix used for output files (default "tl_project").<br/><br/>
  <b>-og      --orthogroup_column [string]</b> <br/>
  Column in the -best_h and -T inputs, and in the output files, that specifies orthogroups (default: OG).<br/><br/>
  <b>-Nm      --N_max [integer] </b> <br/>
  Maximum number of genes in an orthogroup; bigger orthogroups are split. If 0, no orthogroup is split (default: 2000).<br/><br/>
  <b>-k      --k_size_partition [integer]</b> <br/>
  Integer indicating how best hit graphs are processed in batches: first graphs with &lt;k genes, then &lt;2k, then &lt;3k, and so on (default: k=100).<br/><br/>
</details>
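
The batching controlled by `-k` can be pictured with a small sketch. This is an illustration of the description above, not the package's implementation; the function `batch_by_size` and the orthogroup names are hypothetical.

```python
from collections import defaultdict


def batch_by_size(graph_sizes, k=100):
    """Group graph ids into batches by number of genes.

    Batch 0 holds graphs with < k genes, batch 1 those with < 2k, and so on.
    Empty batches are skipped.
    """
    batches = defaultdict(list)
    for graph_id, n_genes in graph_sizes.items():
        batches[n_genes // k].append(graph_id)
    return [batches[index] for index in sorted(batches)]


# Hypothetical orthogroups and their gene counts.
sizes = {"OG1": 12, "OG2": 150, "OG3": 99, "OG4": 310}
print(batch_by_size(sizes, k=100))
# [['OG1', 'OG3'], ['OG2'], ['OG4']]
```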

<details>
  <summary> <b>Algorithm parameters</b> (Click to expand)  </summary> 
  <b>-bh_heuristic     --besthit_heuristic  [string] </b> <br/>
  Indicates how to normalize the bit-score in step 1 (default: normal). normal: no normalization; prt: use proteinortho auxiliary files; smallest: use the length of the smallest sequence; target: use the target sequence; query: use the query sequence; directed: use the x->y hit; bidirectional: use the x->y and y->x hits.<br/>
  Options: normal, prt, smallest_bidirectional, smallest_directed, query_directed, target_directed, alignment_directed, query_bidirectional, target_bidirectional, alignment_bidirectional<br/><br/>
  <b>-f      --f_value [float]</b> <br/>
  Real number between 0 and 1, a parameter of step 1. Defines the adaptive threshold as f\*max_bit_score (default: 0.95).<br/><br/>
  <b>-bmg_h     --bmg_heuristic [string] </b> <br/>
  Community detection method, a heuristic of step 2 (default: Louvain).<br/>
  Options: Mincut, BPMF, Karger, Greedy, Gradient_Walk, Louvain, Louvain_Obj<br/><br/>
  <b>-bmgh_nb      --bmgh_no_binary [bool]</b> <br/>
  Flag specifying whether to force a binary tree in step 2 (no flag: force binary; flag: do not force binary).<br/><br/>
  <b>-stree_h     --species_tree_heuristic [string]</b> <br/>
  Community detection method, a heuristic of step 3 (default: louvain_weight).<br/>
  Options: naive, louvain, mincut, louvain_weight<br/><br/>
  <b>-streeh_repeats     --stree_heuristic_repeats [integer]</b> <br/>
  Integer specifying how many times to run the heuristic of step 3 (default: 3).<br/><br/>
  <b>-streeh_b     --streeh_binary [bool]</b> <br/>
  Flag specifying whether to force a binary tree in step 3 (no flag: do not force binary; flag: force binary).<br/><br/>
  <b>-streeh_ndb     --streeh_no_doble_build [bool]</b> <br/>
  Flag specifying whether to run the build algorithm twice to obtain a less resolved tree in step 3 (no flag: double build; flag: single build).<br/><br/>
</details>
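
As a rough illustration of the `-f` parameter, the sketch below keeps every hit whose bit-score reaches f\*max_bit_score. It rests on assumptions that this README does not confirm: the maximum is taken per query gene and target species, and the comparison is inclusive. The function `select_best_hits` and the gene names are hypothetical.

```python
from collections import defaultdict


def select_best_hits(hits, f=0.95):
    """Keep hits scoring at least f * max_bit_score.

    hits: iterable of (query, target_species, target, bit_score).
    Assumption: the maximum is computed per (query, target_species) pair.
    """
    max_score = defaultdict(float)
    for query, species, _target, score in hits:
        key = (query, species)
        max_score[key] = max(max_score[key], score)
    return [
        (query, species, target, score)
        for query, species, target, score in hits
        if score >= f * max_score[(query, species)]
    ]


# Hypothetical alignment hits for one query gene against two species.
hits = [
    ("a1", "B", "b1", 200.0),
    ("a1", "B", "b2", 195.0),
    ("a1", "B", "b3", 120.0),
    ("a1", "C", "c1", 80.0),
]
for hit in select_best_hits(hits, f=0.95):
    print(hit)
```

With f=0.95 the threshold for species B is 190, so `b1` and `b2` are kept and `b3` is dropped; `c1` is the only hit in species C and is kept.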



<img src="https://gitlab.com/jarr.tecn/revolutionh-tl/-/raw/master/docs/images/revolution_diagram.png" alt="pipeline" style="zoom:25%;" />
