Metadata-Version: 2.1
Name: snore-embedding
Version: 0.2.1
Summary: SNoRe: Scalable Unsupervised Learning of Symbolic Node Representations
Home-page: https://github.com/smeznar/SNoRe
Author: Sebastian Mežnar and Blaž Škrlj
Author-email: smeznar@gmail.com
License: GNU General Public License v3.0
Keywords: graph,representation learning,symbolic,snore,unsupervised learning
Platform: UNKNOWN
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: networkx
Requires-Dist: numba (>=0.51.2)
Requires-Dist: scikit-learn

# SNoRe: Scalable Unsupervised Learning of Symbolic Node Representations
This repository contains the implementation of the SNoRe algorithm described in the SNoRe paper,
which can be found here:

```
TBA
```

An overview of the algorithm is presented in the image below:

![algorithm overview](/images/algorithm_overview.png)

# Installing SNoRe
```
python setup.py install
```

or

```
pip install snore-embedding
```

# Using SNoRe
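A simple use case is shown below.
First, we import the necessary libraries and load the dataset and its labels.
Note that the example also uses `catboost` and `pandas`, which are not among the package's listed requirements and have to be installed separately.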

```
from snore import SNoRe
from scipy.io import loadmat
from sklearn.utils import shuffle
from catboost import CatBoost
import pandas as pd
from sklearn.metrics import f1_score
import numpy as np

# Load adjacency matrix and labels
dataset = loadmat("../data/cora.mat")
network_adj = dataset["network"]
labels = dataset["group"]
```
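
If your data is not yet stored as a `.mat` file, the sketch below shows one way such a file could be produced and read back. It assumes the layout implied by the snippet above: a sparse adjacency matrix under the key `"network"` and a one-hot label matrix under the key `"group"`. The toy graph, the labels and the file name `toy.mat` are made up for illustration.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sp
from scipy.io import loadmat, savemat

# Toy undirected path graph with 4 nodes: 0-1, 1-2, 2-3 (illustrative data).
rows = np.array([0, 1, 1, 2, 2, 3])
cols = np.array([1, 0, 2, 1, 3, 2])
toy_adj = sp.csc_matrix((np.ones(6), (rows, cols)), shape=(4, 4))

# One-hot labels: one row per node, one column per class (assumed layout).
toy_labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy.mat")  # hypothetical file name
    savemat(path, {"network": toy_adj, "group": toy_labels})
    loaded = loadmat(path)

# Same access pattern as in the example above.
network_adj = loaded["network"]
labels = loaded["group"]
print(network_adj.shape, labels.shape)
```

Whether SNoRe places further requirements on the input (for example on edge weights or symmetry) is not covered by this sketch.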

We then create the SNoRe model and embed the network.
The parameter values shown in the code are the defaults.

```
# Create the model
model = SNoRe(dimension=256, num_walks=1024, max_walk_length=5,
              inclusion=0.005, fixed_dimension=False, metric="cosine",
              num_bins=256)

# Embed the network
embedding = model.embed(network_adj)
```
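
The later call to `pd.DataFrame.sparse.from_spmatrix(embedding)` suggests that the embedding is a SciPy sparse matrix. Under that assumption, a minimal sketch of checking its shape and density could look as follows. The random matrix is only a stand-in and not actual SNoRe output, and `density` is a hypothetical helper.

```python
import scipy.sparse as sp

# Stand-in for the result of model.embed(...); NOT a real SNoRe embedding.
embedding = sp.random(100, 32, density=0.05, format="csr", random_state=0)


def density(matrix):
    """Return the fraction of stored entries in a sparse matrix."""
    num_rows, num_cols = matrix.shape
    return matrix.nnz / (num_rows * num_cols)


print("shape:", embedding.shape)
print("density:", density(embedding))
```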

Finally, we train the classifier on a random 80% of the nodes and test it on the remaining 20%.

```
# Split the nodes into 80% train and 20% test
num_nodes = network_adj.shape[0]
nodes = shuffle(np.arange(num_nodes))
split = int(num_nodes * 0.8)
train_mask = nodes[:split]
test_mask = nodes[split:]

# Train the classifier; MultiRMSE fits one regression target per label column
classifier = CatBoost(params={'loss_function': 'MultiRMSE', 'iterations': 500})
df = pd.DataFrame.sparse.from_spmatrix(embedding)
classifier.fit(df.iloc[train_mask], labels[train_mask])

# Test prediction
predictions = classifier.predict(df.iloc[test_mask])
print("Micro score:",
      f1_score(np.argmax(labels[test_mask], axis=1),
               np.argmax(predictions, axis=1),
               average='micro'))
```
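
To make the evaluation step easier to follow, the sketch below reproduces the split and the scoring on made-up data using only NumPy. It assumes one-hot labels with exactly one class per node. In that setting every misclassification counts as one false positive and one false negative, so the micro-averaged F1 score equals plain accuracy. The labels and scores are random and are not results of SNoRe.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, num_classes = 10, 3

# Illustrative one-hot labels and random prediction scores (not real results).
true_classes = rng.integers(0, num_classes, size=num_nodes)
labels = np.eye(num_classes, dtype=int)[true_classes]
scores = rng.random((num_nodes, num_classes))

# Shuffle the node indices and cut at 80%, as in the snippet above.
nodes = rng.permutation(num_nodes)
split = int(num_nodes * 0.8)
train_mask, test_mask = nodes[:split], nodes[split:]

# Turn one-hot rows and score rows into class indices.
y_true = np.argmax(labels[test_mask], axis=1)
y_pred = np.argmax(scores[test_mask], axis=1)

# Micro-averaging pools true positives, false positives and false negatives
# over all classes before computing F1.
tp = np.sum(y_true == y_pred)
fp = fn = np.sum(y_true != y_pred)
micro_f1 = 2 * tp / (2 * tp + fp + fn)
accuracy = np.mean(y_true == y_pred)
print(micro_f1, accuracy)
```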

Further examples of evaluation and embedding explainability can be found in the example folder.

