Metadata-Version: 1.1
Name: powergrasp
Version: 0.8.10
Summary: compress graphs with answer-set-programming
Home-page: https://github.com/aluriak/powergrasp
Author: Lucas Bourneuf
Author-email: lucas.bourneuf@inria.fr
License: GPL
Description: # PowerGrASP
        Graph compression.
        
        Note that this is a full reimplementation of PowerGrASP,
        taking advantage of ASP and Python lifting and simplifications.
        For the version published in 2017, see [this repository](https://github.com/aluriak/PowerGrASP-1).
        
        
        ## CLI
        
            python -m powergrasp mygraph.gml -o compressed.bbl
        
        ### help !
        
            python -m powergrasp --help
        
        ## API
        
        ```python
        import powergrasp
        with open('compressed.bbl', 'w') as fd:
            for line in powergrasp.compress_by_cc('mygraph.gml'):
                fd.write(line + '\n')
        ```
        
        ### help !
        Sorry, no technical doc for the moment.
        
        
        ## Configuration
        PowerGrASP has some [configuration values](powergrasp/constants.py),
        that can be overwritten by a `powergrasp.cfg` file placed in the working directory.
        
        The configuration may be printed in stdout using:
        
            python -m powergrasp --config
        
        The config file may be either in ini or json format, as shown in [`test/powergrasp.oneshot.cfg`](test/powergrasp.oneshot.cfg) and [`test/powergrasp.manyoptions.cfg`](test/powergrasp.manyoptions.cfg).
        
        Configuration allow user to define how much text will be outputed by powergrasp,
        and to tune the core compression and related optimizations.
        
        See complete list in the [options section](#Options).
        
        
        ## installation
        
            pip install powergrasp
        
        On random error, try adding `--no-cache-dir` at the end of the command.
        
        
        ## TODO
        - [x] unit tests
        - [x] CLI
        - [x] run on big graph
        - [x] search multiple motif in the same run (speed up observed on bio graph)
        - [x] timers for solving, full run, output writing
        - [ ] timers for extraction
        - [ ] technical documentation
        - [ ] search and specific compression of trees
        - [ ] search and specific compression of triangle-free graphs
        - [x] search multiple motif in the same run (speed up observed on bio graph)
        
        
        ## Changelog
        
        - 8.10
            - option [parallel cc compression](#parallel-cc-compression) to enable parallel compression of connected components
            - option [bubble embeds cc](#bubble-embeds-cc), to put each cc in a dedicated powernode
            - option [bubble with simple edges](#bubble-with-simple-edges), to keep or discard the simple edges from output
            - statistics on time and compression metrics are computed by cc and for all graph. See new options.
            - option [cc statistic file](#cc-statistic-file), to retrieve stats and metrics for connected components.
            - option [global statistics](#global-statistics), to compute statistics/metrics over all connected components.
            - option [bubble with statistics](#bubble-with-statistics), to include stats and metrics in output bubble file
            - [motif type order](#motif-type-order) option, allowing user to tune the order in which motifs are searched.
            - [parallel motif search](#parallel-motif-search) option, making motifs search in different threads, and the overhaul compression much slower.
            - CLI: --config flag to print the full configuration in stdout
            - handling spaces instead of _ in fields names for INI format
            - do not filter out nodes alone in their connected component (cf [keep-single-nodes option](#keep-single-nodes))
            - perf gain: the first motif to reach the best score is compressed, instead of the last
            - improve handling of statistic files, that now contains cc idx and motif bounds
            - expose `__version__` attribute in the package
            - add a list of available options in the README
            - graph filtering: optimization leading to general performance gain when enabled
            - improvements on {low,upp}erbounds computations
        - 8.9
            - many bugfixes
            - CLINGO_OPTIONS can be a dict, mapping motif name to arbitrary parameters
        - 8.8
            - bugfix: lowerbound can't go under 2 (3 for cliques)
            - clique search: lowerbound is initially computed as the maximal degree of node of clustering coefficient equals to 1
            - use new versions of clyngor and bubbletools
        - 8.7
            - raise error when an invalid key is found in config file
            - StarSearch is dedicated to find stars, simplifying BicliqueSearch's job (enabled by default, user may use Biclique instead of both Star and NonStarBiclique)
            - specify a prefix for all (power) nodes with OUTPUT_NODE_PREFIX config field
            - arbitrary options for clingo using CLINGO_OPTIONS config field
            - bugfix on string options given in ini file, when unecessary quotes are kept
        - 8.6
            - bugfixes for INI format
        - 8.5
            - config may now be given in INI format
            - improve logging
            - config allow optimization target between memory or CPU (currently define a constraint implementation in ASP)
            - bugfix about edges yielded by ASP and multithreading parameter
            - no statistics file to write by default
            - timer for output writing
            - config allow user to define the number of CPU for clingo to use
            - simplify ASP code by removing useless arg in block/4 atoms
        - 8.4
            - replace ASP code constraint to achieve much more efficient grounding
            - configuration allow for improved lowerbound computation of bicliques
            - generated bubble allow client to give heading comments
            - handle keyboard interruption during search with grace
            - implement timers and statistics recording
            - enable multishot motif search by default
            - various bugfixes
        
        ## Options
        Description of all powergrasp options.
        All values example are given for INI format.
        
        ### test integrity
        Run some integrity tests during runtime to ensure that compression is working well.
        May slow the compression a lot, and is mainly a debugging tool.
        
        Default value:
        
            test_integrity = true
        
        ### show story
        Print in stdout the main steps of compression.
        Default value:
        
            show_story = true
        
        ### show motif handling
        Print in stdout the motif transformation.
        Default value:
        
            show_motif_handling = false
        
        ### timers
        Maintain and print in stdout some timers.
        Default value:
        
            timers = true
        
        ### statistics file
        A file in which some statistics will be written in CSV format, giving among others size of compressed motifs, and their compression times (if *timers* option is enabled).
        Default value:
        
            statistics_file = None
        
        Example value:
        
            statistics_file = statistics.csv
        
        ### cc statistic file
        Filename in which the statistics about each connected component will be written.
        Data will contain one row for each connected component, with the following data:
        
        - connected component index
        - the convertion rate
        - the edge reduction
        - the compression rate
        - time needed to compress the connected component (if TIMERS is enabled)
        
        Default value stands for *no file*:
        
            cc_statistic_file = None
        
        Example value:
        
            cc_statistic_file = statistics-cc.csv
        
        ### global statistics
        When enabled, statistics and metrics computation on the overall graph will be performed at the end of compression.
        Default value:
        
            global_statistics = True
        
        ### bubble with statistics
        When enabled, output bubble will contain statistics about each connected component
        and the overall graph (if global_statistics is enabled).
        Default value:
        
            bubble_with_statistics = True
        
        ### bubble for each step
        Generate and save a bubble representation of the graph at each step.
        Mainly used for debugging.
        Default value:
        
            bubble_for_each_step = false
        
        ### output node prefix
        Prefix to add to all (power)nodes names in output.
        Default value:
        
            output_node_prefix = ''
        
        ### show debug
        Show full trace of the compression. Useful for debugging.
        Default value:
        
            show_debug = False
        
        ### covered edges from ASP
        Recover covered edges from ASP. If falsy, will ask motif searcher to compute the edges, which may be quicker.
        Default value:
        
            covered_edges_from_asp = False
        
        ### bubble with nodes
        Nodes and sets are optional in the output bubble.
        Default value:
        
            bubble_with_nodes = True
            bubble_with_sets = True
        
        ### bubble poweredge factor
        Edges in bubble are associated to a factor.
        Default value:
        
            bubble_poweredge_factor = '1.0'
            bubble_edge_factor = '1.0'
        
        ### bubble embeds cc
        If enabled, put each connected component in a dedicated powernode.
        Default value:
        
            bubble_embeds_cc = no
        
        
        ### bubble simplify quotes
        When possible, delete the quotes around identifiers in the bubble. May lead to node name collision.
        Default value:
        
            bubble_simplify_quotes = True
        
        ### bubble with simple edges
        If disabled, will discard simple (i.e. non-power) edges of output.
        Default value:
            bubble_with_simple_edges = True
        
        ### config file
        Load options found in given config file, if it exists.
        Default value:
        
            config_file = 'powergrasp.cfg'
        
        ### multishot motif search
        Search for multiple motif in a single search.
        Accelerate the solving for graph with lots of equivalent motifs. It is generally a good option to enable.
        Default value:
        
            multishot_motif_search = True
        
        ### biclique lowerbound maxnei
        Optimization on biclique lowerbound computation. Can be costly. Deactivate with 2. With value at n, up to n neighbors are considered.
        Default value:
        
            biclique_lowerbound_maxnei = 2
        
        ### clingo options
        Arbitrary parameters to give to clingo (note that some, like multithreading or optmode, may already be set by other options).
        Default value:
        
            clingo_options = {}
        
        Give a particular configuration for bicliques search:
        
            clingo_options = {'no-star-biclique': '--configuration=handy'}
        
        ### clingo multithreading
        Number of CPU available to clingo, or 0 for autodetect number of CPU.
        Default value:
        
            clingo_multithreading = 1
        
        Use as many thread there are available CPU:
        
            clingo_multithreading = 0
        
        Specify a competing search between 4 threads:
        
            clingo_multithreading = '4,compete'
        
        ### parallel cc compression
        To use to compress connected components in different processes.
        Default value (optimized in memory):
        
            parallel_cc_compression = 1
        
        Compress with 4 processes :
        
            parallel_cc_compression = 4
        
        Compress with one process for each connected component :
        
            parallel_cc_compression = 0
        
        ### use star motif
        Two different motifs for stars and bicliques, so the search space for bicliques is smaller. Yields good performance improvements on big graphs.
        Default value:
        
            use_star_motif = True
        
        ### optimize for memory
        When possible, prefer memory over CPU.
        Default value:
        
            optimize_for_memory = False
        
        ### graph filtering
        Ignore edges dynamically determined as impossible to compress.
        Default value:
        
            graph_filtering = True
        
        ### keep single nodes
        If a node is found to be alone in its connected component, it will be kept nonetheless.
        Default value:
        
            keep_single_nodes = True
        
        ### parallel motif search
        Use multithreading to search for all motifs in the same time, instead of sequentially.
        Interestingly, this yields very poor results, and it's even worse with multiprocessing.
        
            parallel_motif_search = False
        
        ### motif type order
        Define in which order the motifs are searched, e.g. cliques, then bicliques then stars.
        
            motif_type_order = star,clique,non-star-biclique,biclique
        
        Instead of expliciting the exact motif names and order, it is also possible to specify a bound-based order:
        
            motif_type_order = greatest-upperbound-first
        
        Or any variation with `worst`, `lowerbound` and `last`.
        
Keywords: graph,Answer Set Programming
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: ASP
Classifier: Topic :: Software Development :: Libraries :: Python Modules
