GroupLasso for linear regression

A sample script that fits a group lasso regression to synthetic data and compares the estimated coefficients with the true ones.

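  • Figure 1 (../_images/sphx_glr_example_group_lasso_001.png): true and estimated weights
  • Figure 2 (../_images/sphx_glr_example_group_lasso_002.png): learned coefficients plotted against true coefficients
  • Figure 3 (../_images/sphx_glr_example_group_lasso_003.png): losses logged during the fit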

Out:

Generating data
Generating coefficients
Generating targets
Starting fit
/home/yngvem/Programming/morro/group-lasso/src/group_lasso/_group_lasso.py:383: UserWarning:
The behaviour has changed since v1.1.1, before then, a bug in the optimisation
algorithm made it so the regularisation parameter was scaled by the largest
eigenvalue of the covariance matrix.

To use the old behaviour, initialise the class with the keyword argument
`old_regularisation=True`.

To supress this warning, initialise the class with the keyword argument
`supress_warning=True`

  warnings.warn(_OLD_REG_WARNING)
X shape: (10000, 1049)
True intercept: 2
Estimated intercept: [1.87411639]
/home/yngvem/Programming/morro/group-lasso/examples/example_group_lasso.py:73: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
  plt.show()
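
The warning above says that, before v1.1.1, the regularisation parameter was scaled by the largest eigenvalue of the covariance matrix. To get a feel for the size of that factor on a given design matrix, the eigenvalue can be computed directly with NumPy. This is only a sketch: the helper name `largest_covariance_eigenvalue` is hypothetical, and the centring and `n - 1` normalisation used here are assumptions, since the warning does not say exactly how the library formed the matrix or in which direction the scaling applied.

```python
import numpy as np


def largest_covariance_eigenvalue(X):
    """Largest eigenvalue of the sample covariance matrix of X (hypothetical helper)."""
    # Assumption: columns are centred and the matrix is normalised by n - 1.
    X_centred = X - X.mean(axis=0)
    covariance = X_centred.T @ X_centred / (X.shape[0] - 1)
    # eigvalsh is meant for symmetric matrices and returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(covariance)[-1]


rng = np.random.default_rng(0)
X_demo = rng.standard_normal((2000, 20))  # small stand-in for the design matrix
scale = largest_covariance_eigenvalue(X_demo)
print(f"Largest covariance eigenvalue: {scale:.3f}")
```

For standard-normal features such as those in this example, the value is close to 1, so the old and new behaviour would be expected to differ only modestly. With strongly correlated or unscaled features the factor can be much larger.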

from group_lasso import GroupLasso
from utils import (
    get_groups_from_group_sizes,
    generate_group_lasso_coefficients,
)
import numpy as np
import matplotlib.pyplot as plt


# Log the loss during fitting so that gl.losses_ can be plotted afterwards
GroupLasso.LOG_LOSSES = True


if __name__ == "__main__":
    np.random.seed(0)

    # 50 groups, each with between 15 and 29 coefficients
    group_sizes = [np.random.randint(15, 30) for _ in range(50)]
    groups = get_groups_from_group_sizes(group_sizes)
    num_coeffs = sum(group_sizes)
    num_datapoints = 10000
    noise_level = 0.5
    coeff_noise_level = 0.05

    print("Generating data")
    X = np.random.standard_normal((num_datapoints, num_coeffs))
    intercept = 2

    print("Generating coefficients")
    w = generate_group_lasso_coefficients(group_sizes)
    w += np.random.randn(*w.shape) * coeff_noise_level

    print("Generating targets")
    y = X @ w
    # Multiplicative noise: each target is perturbed in proportion to its own value
    y += np.random.randn(*y.shape) * noise_level * y
    y += intercept

    gl = GroupLasso(
        groups=groups,
        n_iter=100,
        tol=1e-8,
        l1_reg=0.05,
        group_reg=0.18,
        frobenius_lipschitz=False,
        subsampling_scheme=None,
        fit_intercept=True,
    )
    print("Starting fit")
    gl.fit(X, y)

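    # One figure per output column, comparing true and estimated weights
    for i in range(w.shape[1]):
        plt.figure()
        plt.plot(w[:, i], ".", label="True weights")
        plt.plot(gl.coef_[:, i], ".", label="Estimated weights")
        plt.legend()  # display the labels set above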

    plt.figure()
    plt.plot([w.min(), w.max()], [gl.coef_.min(), gl.coef_.max()], "gray")
    plt.scatter(w, gl.coef_, s=10)
    plt.ylabel("Learned coefficients")
    plt.xlabel("True coefficients")

    # Losses logged during the fit (enabled by GroupLasso.LOG_LOSSES)
    plt.figure()
    plt.plot(gl.losses_)
    plt.ylabel("Loss")

    print("X shape: {X.shape}".format(X=X))
    print("True intercept: {intercept}".format(intercept=intercept))
    print("Estimated intercept: {intercept}".format(intercept=gl.intercept_))
    plt.show()
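
The plots above compare individual coefficients, but group lasso is meant to switch whole groups on or off, so it can also help to compare the fit group by group. The sketch below computes the Euclidean norm of each group and checks which groups are active. It assumes that coefficients are laid out as consecutive blocks in the order given by the group sizes; `group_norms` is a hypothetical helper, not part of `group_lasso` or `utils`, and the small arrays stand in for `w` and `gl.coef_`.

```python
import numpy as np


def group_norms(coef, group_sizes):
    """Euclidean norm of each consecutive block of coefficients (hypothetical helper)."""
    coef = np.asarray(coef).ravel()
    # Split points between groups; assumes groups are stored back to back.
    boundaries = np.cumsum(group_sizes)[:-1]
    return np.array([np.linalg.norm(block) for block in np.split(coef, boundaries)])


# Toy stand-ins for the true and estimated coefficients
group_sizes_demo = [2, 3, 2]
true_coef = np.array([1.0, -2.0, 0.0, 0.0, 0.0, 0.5, 0.5])
estimated_coef = np.array([0.9, -1.8, 0.0, 0.0, 0.0, 0.4, 0.6])

# A group counts as active when its norm is above a small threshold
true_active = group_norms(true_coef, group_sizes_demo) > 1e-8
estimated_active = group_norms(estimated_coef, group_sizes_demo) > 1e-8
n_agree = int(np.sum(true_active == estimated_active))

print("True active groups:     ", true_active)
print("Estimated active groups:", estimated_active)
print("Groups in agreement:", n_agree, "of", len(group_sizes_demo))
```

In the script above, the same function could be called as `group_norms(w, group_sizes)` and `group_norms(gl.coef_, group_sizes)`, provided the layout assumption holds. Because `w` has noise added to every coefficient, a threshold well above `1e-8` would be needed to separate the originally inactive groups.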

Total running time of the script: ( 0 minutes 5.298 seconds)
