Metadata-Version: 2.4
Name: himalayas
Version: 0.0.5
Summary: Hierarchical Matrix Layout and Annotation Software
Author: Ira Horecka
License-Expression: BSD-3-Clause
Project-URL: Homepage, https://github.com/himalayas-base/himalayas
Project-URL: Repository, https://github.com/himalayas-base/himalayas
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: matplotlib
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: scipy
Requires-Dist: typing_extensions; python_version < "3.10"
Provides-Extra: text
Requires-Dist: nltk; extra == "text"
Dynamic: license-file

# HiMaLAYAS

![Python](https://img.shields.io/badge/python-3.8%2B-yellow)
[![PyPI](https://img.shields.io/pypi/v/himalayas.svg)](https://pypi.python.org/pypi/himalayas)
[![License](https://img.shields.io/badge/license-BSD%203--Clause-blue.svg)](LICENSE)

**Hierarchical Matrix Layout and Annotation Software** (**HiMaLAYAS**) is a
framework for post hoc enrichment-based annotation of hierarchically clustered
matrices. HiMaLAYAS treats dendrogram-defined clusters as statistical units,
evaluates annotation enrichment, and renders significant annotations alongside
their matrix regions. HiMaLAYAS supports both biological and non-biological domains.

For a full description of HiMaLAYAS and its applications, see:
<br>
**Horecka, I. and Rost, H. (unpublished)**.
_HiMaLAYAS: enrichment-based annotation of hierarchically clustered matrices_.
Manuscript in preparation.

## Documentation and Tutorial

Full documentation is available at:

- **Docs:** [himalayas-base.github.io/himalayas-docs](https://himalayas-base.github.io/himalayas-docs)
- **Tutorial Jupyter Notebook Repository:** [github.com/himalayas-base/himalayas-docs](https://github.com/himalayas-base/himalayas-docs)

## Key Features of HiMaLAYAS

- **Real-Valued Matrix Input**: Operates on real-valued matrices encoding
  relationships among observations.
- **Depth-Aware Cluster Definition**: Cuts the dendrogram at a user-defined
  depth to define the clusters used in downstream analysis.
- **Overrepresentation Testing**: Uses a one-sided hypergeometric test to
  evaluate term enrichment in each cluster against the matrix background.
- **Multiple-Testing Control**: Supports Benjamini-Hochberg false discovery
  rate (FDR) correction for cluster-term tests.
- **Annotation Mapping and Rendering**: Maps significant annotations onto the
  clustered matrix and supports publication-ready matrix visualizations.
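
The statistical steps listed above can be illustrated with a minimal,
standard-library-only sketch. It is not the HiMaLAYAS API: the function names
(`hypergeom_sf`, `benjamini_hochberg`, `enrich`), the input layout, and the toy
data are all hypothetical, and cluster membership is assumed to be already
available from a dendrogram cut. The sketch computes a one-sided
hypergeometric p-value for each cluster-term pair against the matrix
background, then applies Benjamini-Hochberg adjustment across all pairs.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """One-sided p-value P(X >= k): n rows drawn without replacement
    from N background rows, K of which carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values (q-values) in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def enrich(clusters, annotations):
    """clusters: {cluster_id: set of row labels} (assumed to come from a
    dendrogram cut); annotations: {term: set of row labels}."""
    background = set().union(*clusters.values())  # all rows in the matrix
    N = len(background)
    tests = []
    for cid, members in clusters.items():
        for term, annotated in annotations.items():
            in_bg = annotated & background  # ignore labels absent from the matrix
            k = len(in_bg & members)
            p = hypergeom_sf(k, N, len(in_bg), len(members))
            tests.append((cid, term, k, p))
    qvals = benjamini_hochberg([t[3] for t in tests])
    return [(cid, term, k, p, q) for (cid, term, k, p), q in zip(tests, qvals)]


# Hypothetical toy input: two clusters of four rows each.
clusters = {1: {"a", "b", "c", "d"}, 2: {"e", "f", "g", "h"}}
annotations = {"termX": {"a", "b", "c"}, "termY": {"e", "f", "g", "h"}}

results = enrich(clusters, annotations)
for cid, term, k, p, q in results:
    print(f"cluster={cid} term={term} overlap={k} p={p:.4f} q={q:.4f}")
```

Here every cluster-term pair is tested, including pairs with no overlap, so
the correction is applied over the full set of tests. Whether HiMaLAYAS
filters pairs before correction is not stated in this README; consult the
documentation for the actual behavior.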

## Installation

HiMaLAYAS is compatible with Python 3.8 or later and runs on major operating
systems. To install the latest version, run:

```bash
pip install himalayas --upgrade
```

## Example Usage

We applied HiMaLAYAS to a hierarchically clustered
_Saccharomyces cerevisiae_ genetic interaction profile similarity matrix
(Costanzo _et al_., 2016), focusing on genes with high profile variance.
Dendrogram-defined clusters were tested for Gene Ontology Biological Process
(GO BP; Ashburner _et al_., 2000) enrichment, revealing hierarchical
organization of biological processes.

[![HiMaLAYAS workflow overview](https://i.imgur.com/mninW8a.jpeg)](https://i.imgur.com/mninW8a.jpeg)
**HiMaLAYAS workflow and application to a hierarchically clustered yeast
genetic interaction profile similarity matrix (Costanzo _et al_., 2016).**
A real-valued matrix and categorical annotations serve as inputs. The
dendrogram of the clustered matrix is cut at a user-defined depth, and each
resulting cluster is evaluated for GO BP enrichment.

## Citation

### Primary citation

**Horecka, I. and Rost, H. (unpublished)**.
_HiMaLAYAS: enrichment-based annotation of hierarchically clustered matrices_.
Manuscript in preparation.

### Software archive

No Zenodo archive is available yet.

## Contributing

We welcome contributions from the community:

- [Issues Tracker](https://github.com/himalayas-base/himalayas/issues)
- [Source Code](https://github.com/himalayas-base/himalayas/tree/main/src/himalayas)

## Support

If you encounter issues or have suggestions for new features, please use the
[Issues Tracker](https://github.com/himalayas-base/himalayas/issues) on GitHub.

## License

This project is distributed under the [BSD 3-Clause License](LICENSE).
