Metadata-Version: 2.4
Name: shaprpy
Version: 0.4.3
Summary: Python wrapper for the R package shapr (via rpy2)
Author: Martin Jullum, Lars Henry Berge Olsen, Didrik Nielsen
License: MIT License
        
        Copyright (c) 2019 Norsk Regnesentral
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/NorskRegnesentral/shapr
Project-URL: Documentation, https://norskregnesentral.github.io/shapr/shaprpy.html
Project-URL: Issues, https://github.com/NorskRegnesentral/shapr/issues
Project-URL: Changelog, https://github.com/NorskRegnesentral/shapr/blob/main/python/CHANGELOG.md
Keywords: explainable-ai,shapley-values,machine-learning,model-interpretability
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: LICENSE.md
Requires-Dist: rpy2>=3.5.1
Requires-Dist: numpy>=1.22.3
Requires-Dist: pandas>=1.4.2
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: tabulate>=0.8.10
Requires-Dist: shap>=0.40.0
Requires-Dist: matplotlib>=3.5.0
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: syrupy>=4.0.0; extra == "test"
Requires-Dist: xgboost>=1.5.0; extra == "test"
Provides-Extra: dev
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: syrupy>=4.0.0; extra == "dev"
Requires-Dist: xgboost>=1.5.0; extra == "dev"
Dynamic: license-file

# shaprpy

`shaprpy` is a Python wrapper for the R package [shapr](https://github.com/NorskRegnesentral/shapr),
using the [`rpy2`](https://rpy2.github.io/) Python library to access R from within Python.

> **Note:** This wrapper is **not** as comprehensively tested as the R package.
> `rpy2` has limited support on Windows, and the same therefore applies to `shaprpy`.
> `shaprpy` has only been tested on Linux (including WSL, the Windows Subsystem for Linux), and the instructions below assume a Linux environment.
>
> **Requirement:** Python 3.10 or later is required to use `shaprpy`.

## Changelog

For a list of changes and updates to the `shaprpy` package, see the [shaprpy CHANGELOG](https://norskregnesentral.github.io/shapr/py_changelog.html).

---

## Installation

These instructions assume you already have **pip** and **R** installed and available to the Python environment in which you want to run `shaprpy`.

- Official instructions for installing `pip` can be found [here](https://pip.pypa.io/en/stable/installation/).
- Official instructions for installing R can be found [here](https://cran.r-project.org/).

On Debian/Ubuntu-based systems, R can also be installed via:
```bash
sudo apt update
sudo apt install r-base r-base-dev -y
```

### 1. Install the R package `shapr`

`shaprpy` requires the R package `shapr` (version 1.0.5 or newer).
Install the latest version from CRAN by running the following in a shell:

```bash
Rscript -e 'install.packages("shapr", repos="https://cran.rstudio.com")'
```
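
To confirm that the installed `shapr` meets the 1.0.5 minimum, something like the following sketch may help. It assumes `Rscript` is on your `PATH` and queries R's `packageVersion()`; the helper names `parse_r_version` and `installed_shapr_version` are made up for this illustration and are not part of `shaprpy`.

```python
import re
import shutil
import subprocess

MIN_SHAPR = (1, 0, 5)  # minimum shapr version stated in this README


def parse_r_version(text):
    """Turn an R version string such as '1.0.5' or '1.0-5' into a tuple of ints."""
    return tuple(int(part) for part in re.split(r"[.-]", text.strip()))


def installed_shapr_version():
    """Return the installed shapr version as a tuple, or None if it cannot be determined."""
    rscript = shutil.which("Rscript")
    if rscript is None:
        return None  # Rscript is not on PATH
    try:
        result = subprocess.run(
            [rscript, "-e", 'cat(as.character(packageVersion("shapr")))'],
            capture_output=True,
            text=True,
            timeout=60,
        )
        if result.returncode != 0:
            return None  # e.g. shapr is not installed
        return parse_r_version(result.stdout)
    except (subprocess.TimeoutExpired, ValueError, OSError):
        return None


version = installed_shapr_version()
if version is None:
    print("Could not determine the installed shapr version.")
elif version < MIN_SHAPR:
    print("shapr is older than 1.0.5; please upgrade it from CRAN.")
else:
    print("shapr version is sufficient:", ".".join(map(str, version)))
```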

### 2. Ensure R is discoverable (R_HOME and PATH)

Sometimes `rpy2` (which `shaprpy` relies on) cannot automatically locate your R installation. To ensure proper detection, verify that:

- R is available in your system `PATH`, **or**
- The `R_HOME` environment variable is set to your R installation directory.

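Example (this uses `R RHOME` to look up the installation directory, so `R` must be runnable from the current shell):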
```bash
export R_HOME=$(R RHOME)
export PATH=$PATH:$(R RHOME)/bin
```
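
The two conditions above can also be checked from Python before importing `shaprpy`. This is a minimal sketch using only the standard library; the function name `locate_r` is hypothetical, and the check is a heuristic that mirrors the list above rather than `rpy2`'s actual lookup logic.

```python
import os
import shutil


def locate_r():
    """Report whether R is on PATH and whether R_HOME points at a directory."""
    r_home = os.environ.get("R_HOME")
    return {
        "r_on_path": shutil.which("R"),  # full path to the R executable, or None
        "r_home": r_home,                # value of R_HOME, or None if unset
        "r_home_exists": bool(r_home) and os.path.isdir(r_home),
    }


info = locate_r()
if info["r_on_path"] or info["r_home_exists"]:
    print("R looks discoverable:", info)
else:
    print("R was not found; set R_HOME or add R to PATH before importing shaprpy.")
```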

### 3. Install the Python wrapper

Install directly from PyPI with:

```bash
pip install shaprpy
```

#### Local development install (for contributors)
If you have cloned the repository and want to install in development mode for local changes, navigate to the `./python` directory and run:
```bash
pip install -e .
```
The `-e` flag installs in editable mode, allowing local code changes to be reflected immediately.

---

## Quick Demo

```python
from sklearn.ensemble import RandomForestRegressor
from shaprpy import explain
from shaprpy.datasets import load_california_housing

# Load example data
dfx_train, dfx_test, dfy_train, dfy_test = load_california_housing()

# Fit a model
model = RandomForestRegressor()
model.fit(dfx_train, dfy_train.values.flatten())

# Explain predictions
explanation = explain(
    model=model,
    x_train=dfx_train,
    x_explain=dfx_test,
    approach="empirical",
    phi0=dfy_train.mean().item(),
    seed=1
)

explanation.print() # Print the Shapley values

# Get a summary object with computation details
summary = explanation.summary()
print(summary)  # Displays a formatted summary (also available directly via explanation.summary())

# Access specific summary attributes (available with tab-completion in Jupyter)
summary['approach']     # Approach used
summary['timing_summary']['total_time_secs']  # Total computation time

# Extract one or more specific result objects directly
explanation.get_results("proglang") # Programming language used (Python/R)
explanation.get_results("approach") # Approach used
explanation.get_results().keys()  # All available result objects

# Plotting (requires the 'shap' library)
# Convert to a SHAP Explanation object
shap_exp = explanation.to_shap()

import shap
shap.plots.waterfall(shap_exp[0]) # Plot the first observation

```

---

## Supported Models

`shaprpy` can explain predictions from models built with:

- [`scikit-learn`](https://scikit-learn.org/)
- [`keras`](https://keras.io/) (Sequential API)
- [`xgboost`](https://xgboost.readthedocs.io/)

For other model types, pass the following to `shaprpy.explain`:

- A custom `predict_model` function
- (Optionally) a custom `get_model_specs` function
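
As a rough illustration, a custom `predict_model` function might look like the sketch below. It assumes the function receives the model object and a pandas `DataFrame` of features and returns a one-dimensional NumPy array of predictions; that signature is an assumption here, so check the `shaprpy.explain` documentation and the examples folder for the exact contract. `LinearModel` is a hypothetical stand-in for a model type that is not natively supported.

```python
import numpy as np
import pandas as pd


class LinearModel:
    """Hypothetical model class standing in for an unsupported model type."""

    def __init__(self, weights, bias):
        self.weights = np.asarray(weights, dtype=float)
        self.bias = float(bias)

    def forward(self, features):
        # features: 2-D array of shape (n_rows, n_features)
        return features @ self.weights + self.bias


def predict_model(m, x):
    # Assumed signature: model first, then a pandas DataFrame of features.
    # Returns one prediction per row as a flat NumPy array.
    return np.asarray(m.forward(x.to_numpy(dtype=float))).reshape(-1)


model = LinearModel([0.5, -2.0], bias=1.0)
x_demo = pd.DataFrame({"a": [1.0, 2.0], "b": [0.0, 1.0]})
preds = predict_model(model, x_demo)
print(preds)

# With R and shapr installed, the function would then be passed along, e.g.:
# explanation = explain(
#     model=model,
#     x_train=dfx_train,
#     x_explain=dfx_test,
#     approach="empirical",
#     predict_model=predict_model,
#     phi0=dfy_train.mean().item(),
# )
```

Testing the function on a small `DataFrame` first, as done here, makes it easier to catch shape or dtype problems before they surface inside the R call.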

---

## Examples

See the [examples folder](https://github.com/NorskRegnesentral/shapr/tree/master/python/examples) on GitHub for runnable examples, including:

- Basic usage with `scikit-learn` models
- Usage with `xgboost` models
- Usage with `keras` models
- A custom PyTorch model
- Usage of the `Shapr` class and the associated `ShaprSummary` class to explore and extract explanation results
- Plotting functionality for the Shapley values through the `shap` package
- The **regression paradigm** described in [Olsen et al. (2024)](https://link.springer.com/article/10.1007/s10618-024-01016-z),
  which shows:
  - How to specify the regression model
  - How to enable automatic cross-validation of hyperparameters
  - How to apply pre-processing steps before fitting regression models
