Metadata-Version: 2.1
Name: smace
Version: 0.0
Summary: Semi-Model-Agnostic Contextual Explainer library
Home-page: https://github.com/gianluigilopardo/smace
Author: Gianluigi Lopardo
Author-email: Gianluigi.Lopardo@inria.fr
License: UNKNOWN
Project-URL: Homepage, http://github.com/gianluigilopardo/smace
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: numpy (==1.20.3)
Requires-Dist: pandas (==1.2.5)
Requires-Dist: cvxpy (==1.1.15)
Requires-Dist: shap (==0.39.0)
Requires-Dist: lime (==0.2.0.1)
Requires-Dist: matplotlib (==3.4.3)
Requires-Dist: xgboost (==1.4.2)
Requires-Dist: scikit-learn (==0.24.2)
Requires-Dist: scipy (==1.7.1)


# SMACE --- Semi-Model-Agnostic Contextual Explainer

Python code for [*SMACE: A New Method for the Interpretability of Composite Decision Systems*](https://arxiv.org/abs/2111.08749).

The code is organized in two main folders: `smace` and `evaluation`. The first contains the code behind the method (see **Usage** below).
The `evaluation` folder contains a `notebooks` subfolder, where some simple examples are given as [Jupyter notebooks](https://jupyter.org/).
Aggregated experiments are in the `experiments` subfolder, and their results are saved in its `results` subfolder.

## Evaluation
The experiments in Section **5.1 Simple cases** of the paper are available as notebooks (in `evaluation/notebooks`):
* `rule_only.ipynb` refers to **Rules only**
* `hybrid_paper.ipynb` refers to **Simple hybrid system**

The folder also contains additional experiments.

The experiment in Section **5.2 Realistic use case** of the paper is generated by `telco.py` in `evaluation/experiments`. The folder contains additional experiments with different decision-making systems on synthetic data.
These experiments should be run individually; once an experiment finishes, its results are available in the directory `evaluation/experiments/results`.


## Usage
First, define the decision-making system, _i.e._, a `DM` object. To do so, you need a set of rules in JSON format, a list of models, and a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).

### 1. Define your set of rules
The rules must be defined in a JSON object,  resulting in Python lists/dictionaries.  Each  `rule`  is a dictionary with two fields:  `conditions`  and  `decision`.  The latter is the output of the decision process,  if the rule is satisfied.  A condition is defined by the triple  `(name, operator, value)`:
-   `name`  is the variable referred to;
-   `operator`  can be  `geq`  ($\geq$),  `gt`  ($>$),  `leq`  ($\leq$),  `lt`  ($<$);
-   `value`  is the cutoff.

As an example, let us say our set of variables includes four features, $x_1$, $x_2$, $x_3$, $x_4$, and two models, `model_1` and `model_2`. The JSON with two rules `rule1` and `rule2` can be as follows:

```
{"rule1": {"conditions": [{"name": "x2",
                           "operator": "geq",
                           "value": 0.6},
                          {"name": "x3",
                           "operator": "geq",
                           "value": 0.25},
                          {"name": "model_1",
                           "operator": "geq",
                           "value": 1},
                          {"name": "model_2",
                           "operator": "leq",
                           "value": 50}],
           "decision": "decision1"},
"rule2": {"conditions": [{"name": "x4",
                           "operator": "geq",
                           "value": 0.1},
                          {"name": "model_1",
                           "operator": "geq",
                           "value": 0.2},
                          {"name": "x1",
                           "operator": "geq",
                           "value": 0.1},
                          {"name": "x4",
                           "operator": "leq",
                           "value": 0.9}],
           "decision": "decision2"}
}

```
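
To make the meaning of a condition concrete, here is a minimal sketch of how a triple `(name, operator, value)` could be checked against known variable values. It is an illustration only, not SMACE's internal code: the helpers `condition_holds` and `rule_holds` are hypothetical, and treating a rule as satisfied only when all of its conditions hold is an assumption.

```python
import operator

# Map the operator names used in the rules to Python comparison functions.
OPERATORS = {
    "geq": operator.ge,
    "gt": operator.gt,
    "leq": operator.le,
    "lt": operator.lt,
}


def condition_holds(condition, values):
    """Return True if values[name] satisfies the (name, operator, value) triple."""
    compare = OPERATORS[condition["operator"]]
    return compare(values[condition["name"]], condition["value"])


def rule_holds(rule, values):
    """Assumption: a rule is satisfied only when all of its conditions hold."""
    return all(condition_holds(c, values) for c in rule["conditions"])


# rule2 from the example above, written as a Python dictionary.
rule2 = {
    "conditions": [
        {"name": "x4", "operator": "geq", "value": 0.1},
        {"name": "model_1", "operator": "geq", "value": 0.2},
        {"name": "x1", "operator": "geq", "value": 0.1},
        {"name": "x4", "operator": "leq", "value": 0.9},
    ],
    "decision": "decision2",
}

# Made-up values for the variables that rule2 refers to.
values = {"x1": 0.5, "x4": 0.95, "model_1": 0.3}
print(rule_holds(rule2, values))  # False, because x4 <= 0.9 fails

values["x4"] = 0.5
print(rule_holds(rule2, values))  # True, all four conditions hold
```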

Once the rules are saved to a JSON file, you can load them with Python's built-in `json` module ([docs here](https://docs.python.org/3/library/json.html)):

```
import json
with open('rules.json', 'r') as fp:
    rules_json = json.load(fp)

```
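
Before passing the rules on, it can help to check that they follow the format described above. The following is a minimal sketch: `check_rules` is a hypothetical helper, not part of `smace`, and the rules are read from a string here only to keep the example self-contained.

```python
import json

ALLOWED_OPERATORS = {"geq", "gt", "leq", "lt"}


def check_rules(rules):
    """Raise ValueError on the first malformed rule; otherwise return the rule names."""
    for rule_name, rule in rules.items():
        # Each rule needs both fields described above.
        missing = {"conditions", "decision"} - set(rule)
        if missing:
            raise ValueError(f"{rule_name}: missing field(s) {sorted(missing)}")
        for condition in rule["conditions"]:
            # Each condition is a (name, operator, value) triple.
            absent = {"name", "operator", "value"} - set(condition)
            if absent:
                raise ValueError(f"{rule_name}: condition lacks {sorted(absent)}")
            if condition["operator"] not in ALLOWED_OPERATORS:
                raise ValueError(
                    f"{rule_name}: unknown operator {condition['operator']!r}"
                )
    return list(rules)


rules_text = """
{"rule1": {"conditions": [{"name": "x2", "operator": "geq", "value": 0.6}],
           "decision": "decision1"}}
"""
rules = json.loads(rules_text)
print(check_rules(rules))  # ['rule1']
```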

### 2. Define your list of models

A model can be any function that works on a subset of the original data,  with a numerical output.  `DM`  needs a  `Model`  object initialized as  `Model(predictive_function, model_name, data)`,  where

-   `predictive_function`  is the function that produces the output. In the case of a  `sklearn`  model  `m`  for regression (resp., for classification), for instance, it corresponds to  `m.predict`  (resp.,  `m.predict_proba`);
-   `model_name`  is the name used in the rules to refer to the output of the model;
-   `data`  is the  `pandas.DataFrame`  to which the model is applied.

For example,  assuming we have a dataset  `X`  and two targets  `y1`  and  `y2`,  we can proceed as follows:

```
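from sklearn import linear_model
import xgboost

from smace.models import Model

# Fit a regressor on target y1
lm = linear_model.LinearRegression()
lm.fit(X, y1)

# Fit a classifier on target y2
xgb = xgboost.XGBClassifier()
xgb.fit(X, y2)

# Wrap each predictive function with the name used in the rules
# and the DataFrame df the model is applied to
model_1 = Model(lm.predict, 'model_1', df)
model_2 = Model(xgb.predict_proba, 'model_2', df)

models_list = [model_1, model_2]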
```
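
A predictive function does not have to come from a library. As a sketch, the hypothetical `risk_score` below is a hand-written function that returns one number per row. The weights and the choice of columns are made up for illustration, and converting the input with `np.asarray` is an assumption about how the data is passed in.

```python
import numpy as np


def risk_score(X):
    """Hypothetical hand-written model: a weighted sum of the first two columns."""
    X = np.asarray(X, dtype=float)  # accept lists, arrays or DataFrames
    return 0.7 * X[:, 0] + 0.3 * X[:, 1]


# Two rows with two columns each; the result has one score per row.
scores = risk_score([[1.0, 0.0], [0.0, 1.0]])
print(scores.shape)  # (2,)

# Following the signature described above, such a function could then be
# wrapped as Model(risk_score, 'risk_score', df), where df is your DataFrame.
```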

### 3. Define the  `DM`  object

Having the rules  `rules_json`,  the list of models  `models_list`  and the input dataset  `df`,  you can construct the  `DM`  object as

```
from smace.decisions import DM
dm = DM(rules_json, models_list, df)

```

To get the decision explicitly for an example,  we use the  `make_decision()`  function:

```
import numpy as np

# Random example with one value per input feature
example = np.random.rand(4)
decision = dm.make_decision(example, verbose=True)

```

```
Output:
    Rule(s) ['rule1'] triggered.
    Decision(s) ['decision1'] made.

```

### Apply SMACE

Once the configuration is complete,  you can use SMACE to explain the decisions of the defined system.

Suppose we want to explain why `rule2` was not triggered for the example above:

```
from smace.explainer import Smace
explainer = Smace(dm)

explanation = explainer.explain(example, 'rule2')

```

`explanation`  contains all the information computed by SMACE.  The following methods can be applied:

-   `explanation.table()`  and  `explanation.bar()`  to obtain the overall contributions of the input features as tables or bars, respectively;
-   `explanation.rule_table()`  and  `explanation.rule_bar()`  to get the contributions of all variables in the rule as tables or bars, respectively;
-   `explanation.model_table('mod')`  and  `explanation.model_bar('mod')`  to get the importance of input features to the model named  `'mod'`.

It is possible to specify the maximum number of variables to display through the `num_features` parameter.

