

.. _sphx_glr_auto_examples_plot_ovk_regression_multiple_outputs.py:


==============================================
Multi-output Operator-valued kernel Regression
==============================================

An example to illustrate multi-output regression with operator-valued kernels.

We compare Operator-valued kernel (OVK)
- without prior (ID),
- with covariance constraint prior (covariance),
- with multioutput decision tree.

Setting the operator A as a good prior (here a covariance constraint) helps to
achieve a better score on a noisy test dataset. OVK methods can generalise
better than decision trees but are slower to train.



.. image:: /auto_examples/images/sphx_glr_plot_ovk_regression_multiple_outputs_001.png
    :align: center


.. rst-class:: sphx-glr-script-out

 Out::

      seed 123
    Creating dataset...
    Fitting...
    Leaning time DPeriodic ID:  0.26281785965
    Leaning time DPeriodic covariance:  0.9061460495
    Leaning time Trees:  0.00524687767029
    Predicting...
    Test time DPeriodic ID:  0.103090047836
    Test time DPeriodic covariance:  0.0985400676727
    Test time Trees:  0.000425815582275
    plotting
    Done




|


.. code-block:: python


    # Author: Romain Brault <romain.brault@telecom-paristech.fr> with help from
    #         the scikit-learn community.
    # License: MIT

    import operalib as ovk
    from sklearn.tree import DecisionTreeRegressor

    import numpy as np
    import matplotlib.pyplot as plt
    import time

    print(__doc__)

    seed = 123
    np.random.seed(seed)
    print("seed", seed)

    # Create a random dataset
    print("Creating dataset...")
    X = 200 * np.random.rand(1000, 1) - 100
    y = np.array([np.pi * np.sin(X).ravel(), np.pi * np.cos(X).ravel()]).T
    Tr = 2 * np.random.rand(2, 2) - 1
    Tr = np.dot(Tr, Tr.T)
    Tr = Tr / np.linalg.norm(Tr, 2)
    U = np.linalg.cholesky(Tr)
    y = np.dot(y, U)

    # Add some noise
    Sigma = 2 * np.random.rand(2, 2) - 1
    Sigma = np.dot(Sigma, Sigma.T)
    Sigma = 1. * Sigma / np.linalg.norm(Sigma, 2)
    Cov = np.linalg.cholesky(Sigma)
    y += np.dot(np.random.randn(y.shape[0], y.shape[1]), Cov)

    # Fit
    # real period is 2 * pi \approx 6.28, but we set the period to 6 for
    # demonstration purpose
    print("Fitting...")
    start = time.time()
    A = np.eye(2)
    regr_1 = ovk.Ridge('DPeriodic', lbda=0.01, period=6, theta=.995, A=A)
    regr_1.fit(X, y)
    print("Leaning time DPeriodic ID: ", time.time() - start)

    start = time.time()
    A = np.cov(y.T)
    regr_2 = ovk.Ridge('DPeriodic', lbda=0.01, period=6, theta=.995, A=A)
    regr_2.fit(X, y)
    print("Leaning time DPeriodic covariance: ", time.time() - start)

    start = time.time()
    regr_3 = DecisionTreeRegressor(max_depth=100)
    regr_3.fit(X, y)
    print("Leaning time Trees: ", time.time() - start)


    # Predict
    print("Predicting...")
    X_test = np.arange(-100.0, 100.0, .5)[:, np.newaxis]
    start = time.time()
    y_1 = regr_1.predict(X_test)
    print("Test time DPeriodic ID: ", time.time() - start)
    start = time.time()
    # regr_2.linop_.A = np.eye(2)
    y_2 = regr_2.predict(X_test)
    print("Test time DPeriodic covariance: ", time.time() - start)
    start = time.time()
    y_3 = regr_3.predict(X_test)
    print("Test time Trees: ", time.time() - start)

    # Ground truth
    y_t = np.dot(np.array([np.pi * np.sin(X_test).ravel(),
                           np.pi * np.cos(X_test).ravel()]).T, U)

    # Plot
    print("plotting")
    plt.figure()
    plt.scatter(y_1[::1, 0], y_1[::1, 1], c="g", label="OVK Id", s=5., lw=0)
    plt.scatter(y_2[::1, 0], y_2[::1, 1], c="b", label="OVK Cov", s=5., lw=0)
    plt.scatter(y_3[::1, 0], y_3[::1, 1], c="r", label="D TREE", s=5., lw=0)
    # plt.scatter(y[::1, 0], y[::1, 1], c="k", label="Data", s=5., lw = 0)
    plt.scatter(y_t[::1, 0], y_t[::1, 1], c="k", label="True", s=5., lw=0)
    plt.xlim([-4, 4])
    plt.ylim([-4, 4])
    plt.xlabel("data")
    plt.ylabel("target")
    plt.title("Multi-output OVK Regression")
    plt.legend()
    plt.show()

    print("Done")

**Total running time of the script:**
(0 minutes 1.942 seconds)



.. container:: sphx-glr-download

    **Download Python source code:** :download:`plot_ovk_regression_multiple_outputs.py <plot_ovk_regression_multiple_outputs.py>`


.. container:: sphx-glr-download

    **Download IPython notebook:** :download:`plot_ovk_regression_multiple_outputs.ipynb <plot_ovk_regression_multiple_outputs.ipynb>`
