Metadata-Version: 2.4
Name: treeml
Version: 0.1.0
Summary: Phylogenetic machine learning: scikit-learn estimators that account for evolutionary non-independence
Home-page: https://github.com/jlsteenwyk/treeml
Author: Jacob L. Steenwyk
Author-email: jlsteenwyk@gmail.com
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: phykit>=1.11.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: scipy>=1.11.3
Requires-Dist: scikit-learn>=1.4.2
Requires-Dist: pandas>=2.0.0
Requires-Dist: shap>=0.42.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# treeml

Phylogenetic machine learning: scikit-learn estimators that account for evolutionary non-independence among species.

## Installation

```shell
pip install treeml
```

## Quick Start

```python
from treeml import PhyloRandomForestRegressor, PhyloDistanceCV
from sklearn.model_selection import cross_val_score
from Bio import Phylo  # Bio.Phylo is provided by Biopython

tree = Phylo.read("species.nwk", "newick")
# Supply your own data before fitting:
# X = feature matrix (n_species x p_features)
# y = target vector (n_species)
# names = species names, one per row of X

model = PhyloRandomForestRegressor(n_estimators=100)
model.fit(X, y, tree=tree, species_names=names)

cv = PhyloDistanceCV(tree=tree, species_names=names, n_splits=5)
scores = cross_val_score(model, X, y, cv=cv)
```
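
To illustrate why a phylogeny-aware splitter matters, here is a minimal sketch of distance-blocked cross-validation built only from SciPy and scikit-learn. It is not treeml's implementation of `PhyloDistanceCV`; the function name `distance_blocked_folds`, its `n_clusters` parameter, and the toy distance matrix are all assumptions made for this example. The idea is to cluster species on a pairwise distance matrix (for instance, patristic distances from a tree) and keep each cluster entirely in either the training or the test set, so that close relatives never straddle a split.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.model_selection import GroupKFold


def distance_blocked_folds(dist, n_clusters=2):
    """Yield (train, test) index arrays that keep close relatives together.

    dist is a symmetric (n_species x n_species) matrix of pairwise
    distances with a zero diagonal. Hypothetical helper, not part of treeml.
    """
    # Average-linkage clustering on the condensed form of the matrix.
    Z = linkage(squareform(dist), method="average")
    # Cut the dendrogram into at most n_clusters groups.
    groups = fcluster(Z, t=n_clusters, criterion="maxclust")
    # One fold per group; GroupKFold never puts a group on both sides.
    splitter = GroupKFold(n_splits=len(np.unique(groups)))
    placeholder = np.zeros((len(groups), 1))  # only the row count matters
    yield from splitter.split(placeholder, groups=groups)


# Toy distances: species 0-2 form one clade, species 3-5 another.
dist = np.full((6, 6), 10.0)
dist[:3, :3] = 1.0
dist[3:, 3:] = 1.0
np.fill_diagonal(dist, 0.0)

folds = list(distance_blocked_folds(dist, n_clusters=2))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

With a random split, a test species would often have a near-identical relative in the training set, which tends to inflate cross-validation scores. Blocking by clade gives a more conservative estimate of how well a model generalizes to unseen lineages.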
