Metadata-Version: 2.4
Name: TreeOrdination
Version: 1.3.5
Summary: Projection of High-Dimensional Data Using Multivariate Decision Trees and UMAP
Project-URL: Homepage, https://github.com/jrudar/TreeOrdination
Project-URL: Repository, https://github.com/jrudar/TreeOrdination.git
Project-URL: Bug Tracker, https://github.com/jrudar/TreeOrdination/issues
Author: G. Brian Golding, Stefan C. Kremer
Author-email: Josip Rudar <joe.rudar@inspection.gc.ca>, Mehrdad Hajibabaei <mhajibab@uoguelph.ca>
License: MIT License
        
        Copyright (c) 2023 Josip Rudar, G.Brian Golding, Stefan C. Kremer, Mehrdad Hajibabaei
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: dimensionality reduction,ecology,multivariate statistics
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.11
Requires-Dist: landmarkclassifier>=2.1.0
Requires-Dist: numpy==2.1.3
Requires-Dist: scikit-learn>=1.6.1
Requires-Dist: seaborn
Requires-Dist: shap>=0.47.1
Requires-Dist: umap-learn>=0.5.7
Provides-Extra: dev
Requires-Dist: black; extra == 'dev'
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Requires-Dist: twine; extra == 'dev'
Provides-Extra: test
Requires-Dist: pytest; extra == 'test'
Requires-Dist: pytest-cov; extra == 'test'
Description-Content-Type: text/markdown

### TreeOrdination
[![CI](https://github.com/jrudar/TreeOrdination/actions/workflows/ci.yml/badge.svg)](https://github.com/jrudar/TreeOrdination/actions/workflows/ci.yml)

`TreeOrdination` is a wrapper that creates unsupervised projections of high-dimensional data using LANDMark and UMAP.
    
### Install
From PyPI:

```bash
pip install TreeOrdination
```

From source:

```bash
git clone https://github.com/jrudar/TreeOrdination.git
cd TreeOrdination
# Install into the current environment
pip install .
# Or install into a fresh virtual environment instead
python -m venv venv
source venv/bin/activate
pip install .
```
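
After installing, it can be useful to confirm that the package is visible to the interpreter. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper written for this illustration and is not part of `TreeOrdination`. The Python version check reflects the `Requires-Python: >=3.11` entry in the package metadata.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of `dist_name`, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# The package metadata declares Requires-Python >=3.11.
python_ok = sys.version_info >= (3, 11)
tree_ord_version = installed_version("TreeOrdination")

print(f"Python >= 3.11: {python_ok}")
print(f"TreeOrdination version: {tree_ord_version or 'not installed'}")
```

If the second line reports `not installed`, the most likely cause is that `pip` installed into a different environment than the one running the script.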
            
### Example Usage
        from TreeOrdination import TreeOrdination
        from sklearn.datasets import make_classification
        
        # Create the dataset. n_redundant=0 is needed here: with the default
        # n_features=20, scikit-learn rejects n_informative + n_redundant > n_features.
        X, y = make_classification(n_samples=200, n_informative=20, n_redundant=0)
        
        # Name the features: one name per column of X
        f_names = ["Feature %d" % i for i in range(X.shape[1])]
        
        tree_ord = TreeOrdination(feature_names=f_names).fit(X, y)

        # LANDMark embedding of the dataset. This embedding is used to train
        # the supervised model (the 'supervised_clf' parameter).
        landmark_embedding = tree_ord.LM_emb
        
        # UMAP projection of the LANDMark embedding
        umap_projection = tree_ord.UMAP_emb
        
        # PCA projection of the UMAP embedding
        pca_projection = tree_ord.PCA_emb
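
A projection is usually inspected as a scatter plot coloured by class. The sketch below shows one way to do that with matplotlib. It assumes the embedding is an array with one row per sample and at least two columns, which is not confirmed here for `PCA_emb` or `UMAP_emb`. `plot_projection` is a hypothetical helper, not part of the `TreeOrdination` API, and the random stand-in data only exists so the sketch runs without a fitted model.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_projection(emb, labels, title="Projection"):
    """Scatter the first two columns of `emb`, one colour per class label."""
    emb = np.asarray(emb)
    labels = np.asarray(labels)
    if emb.ndim != 2 or emb.shape[1] < 2:
        raise ValueError("expected a 2-D array with at least two columns")
    if emb.shape[0] != labels.shape[0]:
        raise ValueError("emb and labels must have the same number of rows")

    fig, ax = plt.subplots(figsize=(5, 4))
    for cls in np.unique(labels):
        mask = labels == cls
        ax.scatter(emb[mask, 0], emb[mask, 1], s=15, label=str(cls))
    ax.set_xlabel("Component 1")
    ax.set_ylabel("Component 2")
    ax.set_title(title)
    ax.legend(title="Class")
    fig.tight_layout()
    return fig, ax


# Stand-in data (assumption): replace `demo_emb` with tree_ord.PCA_emb or
# tree_ord.UMAP_emb, and `demo_y` with y, once a model has been fitted.
rng = np.random.default_rng(0)
demo_emb = rng.normal(size=(200, 2))
demo_y = rng.integers(0, 2, size=200)
fig, ax = plot_projection(demo_emb, demo_y, title="Stand-in projection")
fig.savefig("projection.png", dpi=150)
```

Drawing one `scatter` call per class keeps the legend simple and avoids depending on any particular colour map.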

### Notebooks and Other Examples
Coming soon.
When available, examples of how to use `TreeOrdination` will be found [here](notebooks/README.md).

### Interface
An overview of the API can be found [here](docs/API.md).

### Contributing
To contribute to the development of `TreeOrdination`, please read our [contributing guide](docs/CONTRIBUTING.md).

### References

Rudar, J., Porter, T.M., Wright, M., Golding, G.B., Hajibabaei, M. LANDMark: an ensemble 
approach to the supervised selection of biomarkers in high-throughput sequencing data. 
BMC Bioinformatics 23, 110 (2022). https://doi.org/10.1186/s12859-022-04631-z

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: 
Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–30. 
   
Geurts P, Ernst D, Wehenkel L. Extremely Randomized Trees. Machine Learning. 2006;63(1):3–42.
    
Rudar, J., Golding, G.B., Kremer, S.C., Hajibabaei, M. (2023). Decision Tree Ensembles Utilizing 
Multivariate Splits Are Effective at Investigating Beta Diversity in Medically Relevant 16S Amplicon 
Sequencing Data. Microbiology Spectrum e02065-22.

Jai Ram Rideout, Greg Caporaso, Evan Bolyen, Daniel McDonald, Yoshiki Vázquez Baeza, Jorge Cañardo
Alastuey, Anders Pitman, Jamie Morton, Qiyun Zhu, Jose Navas, Kestrel Gorlick, Justine Debelius, 
Zech Xu, Matt Aton, llcooljohn, Joshua Shorenstein, Laurent Luce, Will Van Treuren, John Chase, 
… Dr. K. D. Murray. (2025). scikit-bio/scikit-bio: scikit-bio 0.6.3 (0.6.3). 
Zenodo. https://doi.org/10.5281/zenodo.14640761

