Metadata-Version: 2.4
Name: cv_compare
Version: 0.1.0
Summary: Tool to quickly compare the performance of a set of baseline ML methodologies. 
Project-URL: bugs, https://github.com/hprich80/cv_compare/issues
Project-URL: changelog, https://github.com/hprich80/cv_compare/releases
Project-URL: documentation, https://hprich80.github.io/cv_compare/
Project-URL: homepage, https://github.com/hprich80/cv_compare
Author-email: Harry Hesketh-Prichard <hheskethprich@gmail.com>
Maintainer-email: Harry Hesketh-Prichard <hheskethprich@gmail.com>
License: MIT
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: matplotlib
Requires-Dist: pandas
Requires-Dist: scikit-learn
Description-Content-Type: text/markdown

# cv_compare

![PyPI version](https://img.shields.io/pypi/v/cv_compare.svg)

Tool to quickly compare the performance of a set of baseline ML methodologies using cross-validation.

* [GitHub](https://github.com/hprich80/cv_compare/) | [PyPI](https://pypi.org/project/cv_compare/)
* Created by [Harry Hesketh-Prichard](https://github.com/hprich80) | GitHub [@hprich80](https://github.com/hprich80) | PyPI [@hprich80](https://pypi.org/user/hprich80/)
* MIT License

## Installation
```bash
pip install cv_compare
```

## Usage
```python
from cv_compare import cv_compare
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, random_state=1)

results = cv_compare(X, y, model_task='classification')

results.summary_table   # ranked table of mean scores
results.plot()          # boxplot of CV score distributions
```

## Arguments

| Argument | Type | Default | Description |
|---|---|---|---|
| X | array-like | required | Feature matrix |
| y | array-like | required | Target vector |
| model_task | str | `'classification'` | `'classification'` or `'regression'` |
| models | list | `None` | Replace default models with a custom list |
| add_models | list | `None` | Append models to the default list |
| random_state | int | `1` | Random seed |
| scaler | bool | `True` | Apply StandardScaler to each pipeline |
| cv_scoring | str | `None` | scikit-learn scoring metric. If `None`, defaults to `'accuracy'` for classification and `'neg_root_mean_squared_error'` for regression (negated RMSE, so higher is better) |
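
To make the `scaler` and `cv_scoring` arguments more concrete, here is a minimal sketch of the same idea in plain scikit-learn. It is not cv_compare's implementation: the two candidate models, the `cv=5` fold count, and the names `candidates`, `mean_scores` and `ranking` are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=1)

# Hypothetical stand-ins for two of the baseline classifiers.
candidates = {
    "KNN": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

mean_scores = {}
for name, estimator in candidates.items():
    # Analogue of scaler=True: StandardScaler sits in front of the estimator,
    # so the scaling is refit on the training part of every fold.
    pipeline = make_pipeline(StandardScaler(), estimator)
    # Analogue of cv_scoring: any scikit-learn scoring string works here.
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    mean_scores[name] = scores.mean()

# Rank models by mean cross-validated score, best first.
ranking = sorted(mean_scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Placing the scaler inside the pipeline, rather than scaling `X` once up front, keeps statistics from the validation fold out of the training step.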

## Models included

**Classification:** KNN, Logistic Regression, Decision Tree, Random Forest, Bagging (DT), Bagging (RF), AdaBoost, Gradient Boosting, Voting Classifier

**Regression:** Linear Regression, Decision Tree, Random Forest, Bagging (DT), Bagging (RF), AdaBoost, Gradient Boosting, Voting Regressor
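
As a rough guide to what labels such as "Bagging (DT)", "Bagging (RF)" and "Voting Classifier" could correspond to in scikit-learn, the sketch below builds one plausible version of each. The hyperparameters, the members of the voting ensemble, and the variable names are assumptions, not the package's actual defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# "Bagging (DT)": bagging with a decision tree as the base estimator.
bagging_dt = BaggingClassifier(
    DecisionTreeClassifier(random_state=1), n_estimators=10, random_state=1
)

# "Bagging (RF)": bagging with a small random forest as the base estimator.
bagging_rf = BaggingClassifier(
    RandomForestClassifier(n_estimators=10, random_state=1),
    n_estimators=5,
    random_state=1,
)

# "Voting Classifier": majority vote over several different model families
# (the members chosen here are hypothetical).
voting = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=1)),
    ],
    voting="hard",
)

X, y = make_classification(n_samples=200, random_state=1)
fitted = {}
for name, model in [
    ("Bagging (DT)", bagging_dt),
    ("Bagging (RF)", bagging_rf),
    ("Voting Classifier", voting),
]:
    fitted[name] = model.fit(X, y)
    # Training accuracy only; use cross-validation for a fair comparison.
    print(name, round(model.score(X, y), 3))
```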

## Development
```bash
git clone git@github.com:hprich80/cv_compare.git
cd cv_compare
pip install -e .
```

Run tests:
```bash
pytest
```

## Author

cv_compare was created in 2026 by Harry Hesketh-Prichard.

Built with [Cookiecutter](https://github.com/cookiecutter/cookiecutter) and the [audreyfeldroy/cookiecutter-pypackage](https://github.com/audreyfeldroy/cookiecutter-pypackage) project template.