Metadata-Version: 2.4
Name: pasi
Version: 0.3.0
Summary: Predictive Accuracy Subgroup Identification (PASI) trees and ensemble methods
Project-URL: Homepage, https://github.com/ruotaozhang/pasi
Author: Ruotao Zhang
License: MIT
License-File: LICENSE
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Requires-Dist: joblib
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: progressbar2
Requires-Dist: scikit-learn
Requires-Dist: scipy
Provides-Extra: numba
Requires-Dist: numba; extra == 'numba'
Description-Content-Type: text/markdown

# pasi

**Predictive Accuracy Subgroup Identification (PASI)**: decision trees and ensemble methods for identifying subgroups in which a predictive model's accuracy differs.

## Installation

```bash
pip install pasi
```

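To enable optional Numba-accelerated computation, install the `numba` extra. The quotes keep shells such as zsh from interpreting the square brackets:

```bash
pip install "pasi[numba]"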
```

## Quick Start

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from pasi import pasiTree

# Load data (replace the path with your own CSV; this example assumes a binary "target" column)
data = pd.read_csv("thyroid_dataset.csv")
X = data.drop(columns=["target"])
y = data["target"].values

# Fit a predictive model and score the same data (in-sample predicted probabilities)
model = LogisticRegression(max_iter=1000)
model.fit(X, y)
y_pred = model.predict_proba(X)[:, 1]

# Fit a PASI tree (AUC-based)
pasi_model = pasiTree(measure='auc', min_samples_leaf=100, max_depth=3)
pasi_model.fit(X, y=y, y_pred=y_pred)

# Export the fitted tree in Graphviz DOT format for visualization
dot_string = pasi_model.tree.export_graphviz(
    feature_names=list(X.columns),
    measure_name='auc'
)
```
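
The Quick Start ends with `dot_string` and does not render it. The sketch below shows one way to turn DOT source into an image, assuming `export_graphviz` returns the DOT source as a plain string (the variable name suggests this, but it is not confirmed here). The helper `render_dot`, its parameters, and the stand-in `example_dot` are hypothetical and not part of `pasi`; rendering relies on the Graphviz `dot` command-line tool and falls back to the `.dot` file when that tool is unavailable.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def render_dot(dot_string, out_dir, stem="pasi_tree", fmt="png"):
    """Write DOT source to <out_dir>/<stem>.dot and try to render it with Graphviz.

    Returns the rendered image path, or the .dot path if rendering is not possible.
    """
    dot_path = Path(out_dir) / f"{stem}.dot"
    dot_path.write_text(dot_string, encoding="utf-8")

    # The Graphviz CLI is an external dependency; skip rendering if it is missing.
    if shutil.which("dot") is None:
        return dot_path

    out_path = dot_path.with_suffix(f".{fmt}")
    try:
        subprocess.run(
            ["dot", f"-T{fmt}", str(dot_path), "-o", str(out_path)],
            check=True,
        )
    except subprocess.CalledProcessError:
        return dot_path  # rendering failed; keep the .dot file for manual use
    return out_path


# Stand-in DOT source; in the Quick Start this would be `dot_string`.
example_dot = "digraph Tree { root -> left; root -> right; }"
out_dir = tempfile.mkdtemp()
result = render_dot(example_dot, out_dir, stem="example_tree")
print(result)
```

The written `.dot` file can also be opened in any Graphviz-compatible viewer.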

## Available Classes

| Class | Description |
|-------|-------------|
| `pasiTree` | PASI decision tree with support for `indiv`, `auc`, and `auprc` measures |
| `pasiRF` | Random forest ensemble of PASI trees |
| `pasiGB` | Gradient boosting with PASI tree weak learners |
| `mvModelComb` | Majority-vote model combiner |
| `emModelComb` | EM-based model combiner |

## Accuracy Measures

- **`indiv`** — Individual-level accuracy using pseudo-response `mu`
- **`auc`** — ROC AUC-based splitting via DeLong's method
- **`auprc`** — Area Under Precision-Recall Curve (bootstrap-based)
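
To make the `auc` and `auprc` measures concrete, here is a minimal, dependency-free sketch of the quantities involved. It is not `pasi`'s implementation and does not perform any tree splitting. The empirical ROC AUC equals the Mann-Whitney statistic (the share of positive/negative pairs that the scores rank correctly), which is the estimate whose variance DeLong's method describes; the area under the precision-recall curve is computed here as step-wise average precision, with a generic bootstrap for its standard error. All function names, the toy subgroups A and B, and the bootstrap settings are assumptions made for illustration.

```python
import random


def auc_mann_whitney(y_true, y_score):
    """Empirical ROC AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve; tied scores keep input order."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


def bootstrap_se(metric, y_true, y_score, n_boot=200, seed=0):
    """Bootstrap standard error of `metric`, resampling observations with replacement."""
    n = len(y_true)
    if not 0 < sum(y_true) < n:
        raise ValueError("bootstrap needs both classes in the data")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(metric(ys, [y_score[i] for i in idx]))
    mean = sum(stats) / n_boot
    return (sum((s - mean) ** 2 for s in stats) / (n_boot - 1)) ** 0.5


# Toy data: the same model's scores in two hypothetical subgroups, A and B.
y_a, s_a = [0, 0, 0, 1, 1, 1], [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y_b, s_b = [0, 1, 0, 1, 0, 1], [0.6, 0.4, 0.5, 0.5, 0.3, 0.7]

auc_a = auc_mann_whitney(y_a, s_a)  # scores separate the classes perfectly
auc_b = auc_mann_whitney(y_b, s_b)  # 5.5 of 9 pairs ranked correctly
ap_a = average_precision(y_a, s_a)
se_ap_b = bootstrap_se(average_precision, y_b, s_b)
print(auc_a, auc_b, ap_a, se_ap_b)
```

A gap in accuracy between subgroups like the one between A and B here is the kind of heterogeneity the package description refers to.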

## License

MIT
