Metadata-Version: 1.0
Name: ugtm
Version: 1.1.8
Summary: Generative Topographic Mapping (GTM) for python, GTM classification and GTM regression
Home-page: http://github.com/hagax8/ugtm
Author: Helena A. Gaspar
Author-email: hagax8@gmail.com
License: MIT
Description: 
        ===========
        ugtm
        ===========
        
        ugtm provides an implementation of GTM (Generative Topographic Mapping), kGTM (kernel Generative Topographic Mapping), GTM classification models (kNN, Bayes) and GTM regression models. ugtm also implements cross-validation options which can be used to compare GTM classification models to SVM classification models, and GTM regression models to SVM regression models. Typical usage::
        
            #!/usr/bin/env python
        
            import ugtm 
            import numpy as np
            
            #generate sample data and labels: replace this with your own data
            data=np.random.randn(100,50)
            labels=np.random.choice([1,2],size=100)
        
            #build GTM map
            gtm=ugtm.runGTM(data=data,verbose=True)
        
            #plot GTM map (html)
            gtm.plot_html(output="out")
        
        For installation instructions, cf. https://github.com/hagax8/ugtm
        
        1. Import package
        =================
        
        Import ugtm package, allowing to construct GTM and kernel GTM (kGTM) maps, GTM classification models, GTM regression models::
        
            import ugtm
        
        
        2. Construct and plot GTM maps (or kGTM maps)
        =============================================
        
        
        A gtm object can be created by running the function runGTM on a dataset. Parameters for runGTM are: k = sqrt(number of nodes), m = sqrt(number of rbf centres), s = RBF width factor, l = regularization coefficient. The number of iteration for the expectation-maximization algorithm is set to 200 by default. This is an example with random data::
        
        
            #import numpy to generate random data
            import numpy as np
        
            #generate random data (independent variables x), 
            #discrete labels (dependent variable y),
            #and continuous labels (dependent variable y), 
            #to experiment with categorical or continuous outcomes
            
            train = np.random.randn(20,10)
            test = np.random.randn(20,10)
            labels=np.random.choice(["class1","class2"],size=20)
            activity=np.random.randn(20,1)
        
            #create a gtm object and write model
            gtm = ugtm.runGTM(train)
            gtm.write("testout1")
        
            #run verbose
            gtm = ugtm.runGTM(train, verbose=True)
        
            #to run a kernel GTM model instead, run following:
            gtm = ugtm.runkGTM(train, doKernel=True, kernel="linear")
        
            #access coordinates (means or modes), and responsibilities of gtm object
            gtm_coordinates = gtm.matMeans
            gtm_modes = gtm.matModes
            gtm_responsibilities = gtm.matR
        
        
        3. Plot html maps
        =========================================
        
        Call the plot_html() function on the gtm object::
        
            #run model on train
            gtm = ugtm.runGTM(train)
        
            # ex. plot gtm object with landscape, html: labels are continuous
            gtm.plot_html(output="testout10",labels=activity,discrete=False,pointsize=20)
        
            # ex. plot gtm object with landscape, html: labels are discrete
            gtm.plot_html(output="testout11",labels=labels,discrete=True,pointsize=20)
        
            # ex. plot gtm object with landscape, html: labels are continuous
            # no interpolation between nodes
            gtm.plot_html(output="testout12",labels=activity,discrete=False,pointsize=20, \
                          do_interpolate=False,ids=labels)
        
            # ex. plot gtm object with landscape, html: labels are discrete, 
            # no interpolation between nodes
            gtm.plot_html(output="testout13",labels=labels,discrete=True,pointsize=20, \
                          do_interpolate=False)
        
        
        
        4. Plot pdf maps
        =========================================
        
        Call the plot() function on the gtm object::
        
            #run model on train
            gtm = ugtm.runGTM(train)
        
            # ex. plot gtm object, pdf: no labels
            gtm.plot(output="testout6",pointsize=20)
        
            # ex. plot gtm object with landscape, pdf: labels are discrete
            gtm.plot(output="testout7",labels=labels,discrete=True,pointsize=20)
        
            # ex. plot gtm object with landscape, pdf: labels are continuous
            gtm.plot(output="testout8",labels=activity,discrete=False,pointsize=20)
        
        
        
        5. Plot multipanel views (only if labels or activities are provided)
        ======================================================================
        
        Call the plot_multipanel() function on the gtm object.
        This plots a general model view, showing means, modes, landscape with or without points.
        The plot_multipanel function only works if you have defined labels::
        
            #run model on train
            gtm = ugtm.runGTM(train)
        
            # ex. with discrete labels and inter-node interpolation
            gtm.plot_multipanel(output="testout2",labels=labels,discrete=True,pointsize=20)
        
            # ex. with continuous labels and inter-node interpolation
            gtm.plot_multipanel(output="testout3",labels=activity,discrete=False,pointsize=20)
        
            # ex. with discrete labels and no inter-node interpolation
            gtm.plot_multipanel(output="testout4",labels=labels,discrete=True,pointsize=20, \
                                do_interpolate=False)
        
            # ex. with continuous labels and no inter-node interpolation
            gtm.plot_multipanel(output="testout5",labels=activity,discrete=False,pointsize=20, \
                                do_interpolate=False)
        
        
        6. Project new data onto existing GTM map
        ===================================================================
        
        New data can be projected on the GTM map by using the transform() function, which takes as input the gtm model, a training and test set. The train set is then only used to perform data preprocessing on the test set based on the train (for example: apply the same PCA transformation to the train and test sets before running the algorithm)::
        
            #run model on train
            gtm = ugtm.runGTM(train,doPCA=True)
        
            #test new data (test)
            transformed=ugtm.transform(optimizedModel=gtm,train=train,test=test,doPCA=True)
        
            #plot transformed test (html)
            transformed.plot_html(output="testout14",pointsize=20)
        
            #plot transformed test (pdf)
            transformed.plot(output="testout15",pointsize=20)
        
            #plot transformed data on existing classification model, 
            #using training set labels
            gtm.plot_html_projection(output="testout16",projections=transformed,\
                                     labels=labels, \
                                     discrete=True,pointsize=20)
        
        
        7. Output predictions for a test set: GTM regression (GTR) and classification (GTC)
        ====================================================================================
        
        The GTR() function implements the GTM regression model (cf. references) and GTC() function a GTM classification model (cf. references)::
        
            #continuous labels (prediction by GTM regression model)
            predicted=ugtm.GTR(train=train,test=test,labels=activity)
        
            #discrete labels (prediction by GTM classification model)
            predicted=ugtm.GTC(train=train,test=test,labels=labels)
        
        
        8. Advanced GTM predictions with per-class probabilities
        =========================================================
        
        Per-class probabilities for a test set can be given by the advancedGTC() function (you can set the m, k, l, s parameters just as with runGTM)::
        
            #get whole output model and label predictions for test set
            predicted_model=ugtm.advancedGTC(train=train,test=test,labels=labels)
        
            #write whole predicted model with per-class probabilities
            ugtm.printClassPredictions(predicted_model,"testout17")
        
        
        
        9. Crossvalidation experiments
        ==============================
        
        Different crossvalidation experiments were implemented to compare GTC and GTR models to classical machine learning methods::
        
            #crossvalidation experiment: GTM classification model implemented in ugtm, 
            #here: set hyperparameters s=1 and l=1 (set to -1 to optimize)
            ugtm.crossvalidateGTC(data=train,labels=labels,s=1,l=1,n_repetitions=10,n_folds=5)
        
            #crossvalidation experiment: GTM regression model
            ugtm.crossvalidateGTR(data=train,labels=activity,s=1,l=1)
        
            #you can also run the following functions to compare
            #with other classification/regression algorithms:
        
            #crossvalidation experiment, k-nearest neighbours classification
            #on 2D PCA map with 7 neighbors (set to -1 to optimize number of neighbours)
            ugtm.crossvalidatePCAC(data=train,labels=labels,n_neighbors=7)
        
            #crossvalidation experiment, SVC rbf classification model (sklearn implementation):
            ugtm.crossvalidateSVCrbf(data=train,labels=labels,C=1,gamma=1)
        
            #crossvalidation experiment, linear SVC classification model (sklearn implementation):
            ugtm.crossvalidateSVC(data=train,labels=labels,C=1)
        
            #crossvalidation experiment, linear SVC regression model (sklearn implementation):
            ugtm.crossvalidateSVR(data=train,labels=activity,C=1,epsilon=1)
        
            #crossvalidation experiment, k-nearest neighbours regression on 2D PCA map with 7 neighbors:
            ugtm.crossvalidatePCAR(data=train,labels=activity,n_neighbors=7)
        
        
        
        10. Links & references
        =======================
        
        1. GTM algorithm by Bishop et al: https://www.microsoft.com/en-us/research/wp-content/uploads/1998/01/bishop-gtm-ncomp-98.pdf
        
        2. kernel GTM: https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2010-44.pdf
        
        3. GTM classification models: https://www.ncbi.nlm.nih.gov/pubmed/24320683
        
        4. GTM regression models: https://www.ncbi.nlm.nih.gov/pubmed/27490381
        
        5. github: https://github.com/hagax8/ugtm
        
Platform: UNKNOWN
