
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/evaluation/plot_metrics.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_evaluation_plot_metrics.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_evaluation_plot_metrics.py:


=======================================
Metrics specific to imbalanced learning
=======================================

Specific metrics have been developed to evaluate classifier which
has been trained using imbalanced data. :mod:`imblearn` provides mainly
two additional metrics which are not implemented in :mod:`sklearn`: (i)
geometric mean and (ii) index balanced accuracy.

.. GENERATED FROM PYTHON SOURCE LINES 11-15

.. code-block:: default


    # Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    # License: MIT








.. GENERATED FROM PYTHON SOURCE LINES 16-20

.. code-block:: default

    print(__doc__)

    RANDOM_STATE = 42








.. GENERATED FROM PYTHON SOURCE LINES 21-22

First, we will generate some imbalanced dataset.

.. GENERATED FROM PYTHON SOURCE LINES 24-39

.. code-block:: default

    from sklearn.datasets import make_classification

    X, y = make_classification(
        n_classes=3,
        class_sep=2,
        weights=[0.1, 0.9],
        n_informative=10,
        n_redundant=1,
        flip_y=0,
        n_features=20,
        n_clusters_per_class=4,
        n_samples=5000,
        random_state=RANDOM_STATE,
    )








.. GENERATED FROM PYTHON SOURCE LINES 40-41

We will split the data into a training and testing set.

.. GENERATED FROM PYTHON SOURCE LINES 43-49

.. code-block:: default

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, random_state=RANDOM_STATE
    )








.. GENERATED FROM PYTHON SOURCE LINES 50-52

We will create a pipeline made of a :class:`~imblearn.over_sampling.SMOTE`
over-sampler followed by a :class:`~sklearn.svm.LinearSVC` classifier.

.. GENERATED FROM PYTHON SOURCE LINES 54-62

.. code-block:: default

    from imblearn.pipeline import make_pipeline
    from imblearn.over_sampling import SMOTE
    from sklearn.svm import LinearSVC

    model = make_pipeline(
        SMOTE(random_state=RANDOM_STATE), LinearSVC(random_state=RANDOM_STATE)
    )








.. GENERATED FROM PYTHON SOURCE LINES 63-67

Now, we will train the model on the training set and get the prediction
associated with the testing set. Be aware that the resampling will happen
only when calling `fit`: the number of samples in `y_pred` is the same than
in `y_test`.

.. GENERATED FROM PYTHON SOURCE LINES 69-72

.. code-block:: default

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /Users/glemaitre/Documents/packages/scikit-learn/sklearn/svm/_base.py:1199: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
      warnings.warn(




.. GENERATED FROM PYTHON SOURCE LINES 73-76

The geometric mean corresponds to the square root of the product of the
sensitivity and specificity. Combining the two metrics should account for
the balancing of the dataset.

.. GENERATED FROM PYTHON SOURCE LINES 78-82

.. code-block:: default

    from imblearn.metrics import geometric_mean_score

    print(f"The geometric mean is {geometric_mean_score(y_test, y_pred):.3f}")





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    The geometric mean is 0.938




.. GENERATED FROM PYTHON SOURCE LINES 83-85

The index balanced accuracy can transform any metric to be used in
imbalanced learning problems.

.. GENERATED FROM PYTHON SOURCE LINES 87-97

.. code-block:: default

    from imblearn.metrics import make_index_balanced_accuracy

    alpha = 0.1
    geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(geometric_mean_score)

    print(
        f"The IBA using alpha={alpha} and the geometric mean: "
        f"{geo_mean(y_test, y_pred):.3f}"
    )





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    The IBA using alpha=0.1 and the geometric mean: 0.880




.. GENERATED FROM PYTHON SOURCE LINES 98-105

.. code-block:: default

    alpha = 0.5
    geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(geometric_mean_score)

    print(
        f"The IBA using alpha={alpha} and the geometric mean: "
        f"{geo_mean(y_test, y_pred):.3f}"
    )




.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    The IBA using alpha=0.5 and the geometric mean: 0.880





.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.161 seconds)


.. _sphx_glr_download_auto_examples_evaluation_plot_metrics.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example



  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_metrics.py <plot_metrics.py>`



  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_metrics.ipynb <plot_metrics.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
