Start tutorial

You can copy this tutorial into a Jupyter notebook or run it directly on Colab.

1. Install packages

The DIRESA package depends on the tensorflow and tensorflow_probability packages. This tutorial also uses numpy and matplotlib.

# Install needed packages
!pip install numpy
!pip install matplotlib
!pip install tensorflow
!pip install tensorflow_probability
!pip install diresa

2. Load the dataset

In this tutorial, we are going to compress the 3D Lorenz '63 butterfly into a 2D latent space. The lorenz.csv file contains a list of butterfly points, with three columns for the X, Y and Z coordinates. The DIRESA model has 2 inputs: the original dataset and a shuffled version of this dataset for the twin encoder.

!wget https://gitlab.com/etrovub/ai4wcm/public/diresa/-/raw/master/docs/lorenz.csv
import numpy as np
data_file = "lorenz.csv"
data = np.loadtxt(data_file, delimiter=",")
print("Shape", data_file, ":", data.shape)
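# Split into training and validation sets (first 30000 points for training)
train = data[:30000]
val = data[30000:]
# Random permutations of the row indices, used to build the shuffled twin inputs
id_train = np.random.permutation(train.shape[0])
id_val = np.random.permutation(val.shape[0])
train_twin = train[id_train]
val_twin = val[id_val]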
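
The shuffled twin should contain exactly the same points as the original set, only in a different order. The following minimal sketch checks this on a small synthetic array, so it runs without downloading lorenz.csv. The names demo, perm and demo_twin and the fixed seed are assumptions made for this illustration; they are not part of the DIRESA package.

```python
import numpy as np

# Small synthetic stand-in for the Lorenz data: 10 points with X, Y, Z columns
rng = np.random.default_rng(seed=0)
demo = rng.normal(size=(10, 3))

# Shuffle the rows with a random permutation of the row indices
perm = rng.permutation(demo.shape[0])
demo_twin = demo[perm]

# Same shape as the original
print("Twin shape:", demo_twin.shape)

# Each column holds the same values, only in a different order
same_points = np.array_equal(np.sort(demo, axis=0), np.sort(demo_twin, axis=0))
print("Same points:", same_points)

# Applying the inverse permutation restores the original order
restored = demo_twin[np.argsort(perm)]
print("Order restored:", np.array_equal(restored, demo))
```

The same checks can be applied to train and train_twin (or val and val_twin) from the cell above, provided the index array used for shuffling is kept.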