
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/over-sampling/plot_shrinkage_effect.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_over-sampling_plot_shrinkage_effect.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_over-sampling_plot_shrinkage_effect.py:


======================================================
Effect of the shrinkage factor in random over-sampling
======================================================

This example shows how the shrinkage factor affects the smoothed bootstrap
generated by :class:`~imblearn.over_sampling.RandomOverSampler`.

.. GENERATED FROM PYTHON SOURCE LINES 10-14

.. code-block:: default


    # Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    # License: MIT








.. GENERATED FROM PYTHON SOURCE LINES 15-21

.. code-block:: default

    print(__doc__)

    import seaborn as sns

    sns.set_context("poster")









.. GENERATED FROM PYTHON SOURCE LINES 22-24

First, we generate a toy classification dataset with only a few samples and
an imbalanced class ratio.

.. GENERATED FROM PYTHON SOURCE LINES 24-37

.. code-block:: default

    from collections import Counter
    from sklearn.datasets import make_classification

    X, y = make_classification(
        n_samples=100,
        n_features=2,
        n_redundant=0,
        weights=[0.1, 0.9],
        random_state=0,
    )
    Counter(y)






.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    Counter({1: 90, 0: 10})



.. GENERATED FROM PYTHON SOURCE LINES 38-48

.. code-block:: default

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(7, 7))
    scatter = plt.scatter(X[:, 0], X[:, 1], c=y, alpha=0.4)
    class_legend = ax.legend(*scatter.legend_elements(), loc="lower left", title="Classes")
    ax.add_artist(class_legend)
    ax.set_xlabel("Feature #1")
    _ = ax.set_ylabel("Feature #2")
    plt.tight_layout()




.. image:: /auto_examples/over-sampling/images/sphx_glr_plot_shrinkage_effect_001.png
    :alt: plot shrinkage effect
    :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 49-52

Now, we will use a :class:`~imblearn.over_sampling.RandomOverSampler` to
generate a bootstrap for the minority class with as many samples as in the
majority class.
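
Conceptually, a bootstrap of this kind draws existing minority samples with
replacement until both classes have the same size. The snippet below is a
minimal sketch of that idea using only the standard library. It is not the
implementation used by imbalanced-learn, and the toy points and the names
``minority``, ``n_to_generate`` and ``resampled_minority`` are hypothetical.

```python
import random
from collections import Counter

# Hypothetical toy data: 10 minority points (class 0); the majority class
# (class 1) is only represented by its count of 90 samples.
rng = random.Random(0)
minority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(10)]
counts = Counter({0: len(minority), 1: 90})

# Number of extra minority samples needed to match the majority class.
n_to_generate = max(counts.values()) - counts[0]

# Plain bootstrap: draw existing points with replacement.
bootstrap = rng.choices(minority, k=n_to_generate)
resampled_minority = minority + bootstrap

print(len(resampled_minority))       # size of the balanced minority class
print(len(set(resampled_minority)))  # number of distinct points
```

Although the minority class grows to the size of the majority class, the
number of distinct points does not change: every new sample is an exact copy
of an existing one.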

.. GENERATED FROM PYTHON SOURCE LINES 52-58

.. code-block:: default

    from imblearn.over_sampling import RandomOverSampler

    sampler = RandomOverSampler(random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)
    Counter(y_res)





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    Counter({1: 90, 0: 90})



.. GENERATED FROM PYTHON SOURCE LINES 59-66

.. code-block:: default

    fig, ax = plt.subplots(figsize=(7, 7))
    scatter = plt.scatter(X_res[:, 0], X_res[:, 1], c=y_res, alpha=0.4)
    class_legend = ax.legend(*scatter.legend_elements(), loc="lower left", title="Classes")
    ax.add_artist(class_legend)
    ax.set_xlabel("Feature #1")
    _ = ax.set_ylabel("Feature #2")
    plt.tight_layout()



.. image:: /auto_examples/over-sampling/images/sphx_glr_plot_shrinkage_effect_002.png
    :alt: plot shrinkage effect
    :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 67-73

The minority samples appear less transparent than the samples from the
majority class. This is because the bootstrap repeats the same minority
samples, so their markers are drawn on top of each other.

Setting `shrinkage` to a floating-point value adds a small perturbation to
the generated samples and thereby creates a smoothed bootstrap.
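
One way to picture a smoothed bootstrap is as a plain bootstrap draw followed
by Gaussian noise whose scale grows with the shrinkage factor. The sketch
below illustrates this with the standard library only. The helper
``smoothed_bootstrap`` is hypothetical, and the noise scale (shrinkage times
a rule-of-thumb bandwidth constant times the per-feature standard deviation)
is an assumption made for illustration, not a statement of how
imbalanced-learn computes it.

```python
import random
import statistics


def smoothed_bootstrap(points, n_new, shrinkage, seed=0):
    """Draw points with replacement and add Gaussian noise to each draw.

    Assumption (illustrative only): the noise scale per feature is
    shrinkage * rule-of-thumb constant * feature standard deviation.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(points), len(points[0])
    # Rule-of-thumb bandwidth constant (an illustrative choice).
    constant = (4 / ((n_features + 2) * n_samples)) ** (1 / (n_features + 4))
    # Per-feature standard deviation of the class being over-sampled.
    scales = [statistics.pstdev(column) for column in zip(*points)]
    new_points = []
    for _ in range(n_new):
        base = rng.choice(points)  # plain bootstrap draw
        new_points.append(
            tuple(
                value + rng.gauss(0.0, 1.0) * shrinkage * constant * scale
                for value, scale in zip(base, scales)
            )
        )
    return new_points


# Hypothetical minority class with 10 two-dimensional points.
data_rng = random.Random(42)
minority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(10)]

new_points = smoothed_bootstrap(minority, n_new=80, shrinkage=1.0)
print(len(set(new_points)))  # number of distinct generated points
```

With a positive shrinkage, the generated points are no longer exact copies of
the original minority samples.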

.. GENERATED FROM PYTHON SOURCE LINES 73-77

.. code-block:: default

    sampler = RandomOverSampler(shrinkage=1, random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)
    Counter(y_res)





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    Counter({1: 90, 0: 90})



.. GENERATED FROM PYTHON SOURCE LINES 78-86

.. code-block:: default

    fig, ax = plt.subplots(figsize=(7, 7))
    scatter = plt.scatter(X_res[:, 0], X_res[:, 1], c=y_res, alpha=0.4)
    class_legend = ax.legend(*scatter.legend_elements(), loc="lower left", title="Classes")
    ax.add_artist(class_legend)
    ax.set_xlabel("Feature #1")
    _ = ax.set_ylabel("Feature #2")
    plt.tight_layout()




.. image:: /auto_examples/over-sampling/images/sphx_glr_plot_shrinkage_effect_003.png
    :alt: plot shrinkage effect
    :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 87-92

In this case, the samples in the minority class no longer overlap because of
the added noise.

The `shrinkage` parameter controls the amount of perturbation. Let's add
more perturbation when generating the smoothed bootstrap.

.. GENERATED FROM PYTHON SOURCE LINES 92-96

.. code-block:: default

    sampler = RandomOverSampler(shrinkage=3, random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)
    Counter(y_res)





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    Counter({1: 90, 0: 90})



.. GENERATED FROM PYTHON SOURCE LINES 97-105

.. code-block:: default

    fig, ax = plt.subplots(figsize=(7, 7))
    scatter = plt.scatter(X_res[:, 0], X_res[:, 1], c=y_res, alpha=0.4)
    class_legend = ax.legend(*scatter.legend_elements(), loc="lower left", title="Classes")
    ax.add_artist(class_legend)
    ax.set_xlabel("Feature #1")
    _ = ax.set_ylabel("Feature #2")
    plt.tight_layout()




.. image:: /auto_examples/over-sampling/images/sphx_glr_plot_shrinkage_effect_004.png
    :alt: plot shrinkage effect
    :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 106-108

Increasing the value of `shrinkage` disperses the new samples further.
Setting `shrinkage` to 0 is equivalent to generating a plain bootstrap.
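
The relation between shrinkage and dispersion can be checked numerically on a
one-dimensional toy example. The sketch below measures how far, on average,
each new sample lands from the point it was drawn from. The helper
``mean_displacement`` and the sample values are hypothetical, and the noise
scale (shrinkage times the standard deviation of the values) is a simplifying
assumption rather than the formula used by the library.

```python
import random
import statistics


def mean_displacement(values, shrinkage, n_new=1000, seed=0):
    """Average distance between a new sample and the point it was drawn from."""
    rng = random.Random(seed)
    # Simplifying assumption: noise scale = shrinkage * standard deviation.
    scale = statistics.pstdev(values)
    total = 0.0
    for _ in range(n_new):
        base = rng.choice(values)  # plain bootstrap draw
        new = base + rng.gauss(0.0, 1.0) * shrinkage * scale
        total += abs(new - base)
    return total / n_new


values = [0.2, 0.5, 0.9, 1.4, 2.0]  # hypothetical 1-D minority samples
for shrinkage in (0, 1, 3):
    print(shrinkage, mean_displacement(values, shrinkage))
```

A shrinkage of 0 leaves every draw exactly on an original point, while larger
values move the new samples progressively further away.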

.. GENERATED FROM PYTHON SOURCE LINES 108-112

.. code-block:: default

    sampler = RandomOverSampler(shrinkage=0, random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)
    Counter(y_res)





.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    Counter({1: 90, 0: 90})



.. GENERATED FROM PYTHON SOURCE LINES 113-121

.. code-block:: default

    fig, ax = plt.subplots(figsize=(7, 7))
    scatter = plt.scatter(X_res[:, 0], X_res[:, 1], c=y_res, alpha=0.4)
    class_legend = ax.legend(*scatter.legend_elements(), loc="lower left", title="Classes")
    ax.add_artist(class_legend)
    ax.set_xlabel("Feature #1")
    _ = ax.set_ylabel("Feature #2")
    plt.tight_layout()




.. image:: /auto_examples/over-sampling/images/sphx_glr_plot_shrinkage_effect_005.png
    :alt: plot shrinkage effect
    :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 122-124

The `shrinkage` parameter is therefore a handy way to manually tune the
dispersion of the new samples.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.291 seconds)


.. _sphx_glr_download_auto_examples_over-sampling_plot_shrinkage_effect.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example



  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_shrinkage_effect.py <plot_shrinkage_effect.py>`



  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_shrinkage_effect.ipynb <plot_shrinkage_effect.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
