Metadata-Version: 1.1
Name: colmet
Version: 0.4.0
Summary: A utility to monitor job resources in an HPC environment, especially OAR
Home-page: http://oar.imag.fr/
Author: Salem Harrache
Author-email: salem.harrache@inria.fr
License: GNU GPL
Description: Colmet - Collecting metrics about jobs running in a distributed environment
        
        Introduction:
        -------------
        
        Colmet is a monitoring tool that collects metrics about jobs running in a
        distributed environment, especially on clusters and grids. It currently
        provides several backends:
            - taskstats: fetch task metrics from the Linux kernel
            - stdout: display the metrics on the terminal
            - zeromq: transport the metrics across the network
            - hdf5: store the metrics on the filesystem
        
        Installation:
        -------------
        
        For detailed instructions on how to install Colmet on your platform, please
        refer to the INSTALL document in the same directory as this document. Please
        carefully read the REQUIREMENTS section of the INSTALL instructions.
        
        Usage:
        ------
        
        For the nodes:
        
            sudo colmet-node -vvv --zeromq-uri tcp://127.0.0.1:5556
        
        For the collector:
        
            colmet-collector -vvv --zeromq-bind-uri tcp://127.0.0.1:5556 --hdf5-filepath /data/colmet.hdf5 --hdf5-complevel 9
        
        You will see the number of counters retrieved in the debug log.
        
        
        For more information, please refer to the help of these scripts (--help).
        
        Licensing:
        ----------
        
        This product is distributed under the GNU General Public License Version 2.
        Please read through the file LICENSE for more information about our license.
        
        
        
        
        Colmet CHANGELOG
        ================
        
        version 0.4.0
        -------------
        
        - Saved metrics in a new HDF5 file when colmet is reloaded, to avoid HDF5 data corruption
        - Handled HUP signal to reload ``colmet-collector``
        - Removed ``hiwater_rss`` and ``hiwater_vm`` collected metrics.
        
        
        version 0.3.1
        -------------
        
        - New metrics ``hiwater_rss`` and ``hiwater_vm`` for taskstats
        - Made colmet work with pyinotify 0.8
        - Added ``--disable-procstats`` option to disable procstats backend.
        
        
        version 0.3.0
        -------------
        
        - Divided the colmet package into three parts
        
          - colmet-node : Retrieves data from taskstats and procstats and sends it
            to collectors with ZeroMQ
          - colmet-collector : A collector that stores data received over ZeroMQ in
            an HDF5 file
          - colmet-common : Common colmet part.
        
        - Added parameters to the ZeroMQ backend to prevent memory overflow
        - Simplified the command line interface
        - Dropped the rrd backend because it does not work yet
        - Added ``--buffer-size`` option for collector to define the maximum number of
          counters that colmet should queue in memory before pushing them to the
          output backend
        - Handled SIGTERM and SIGINT to terminate colmet properly
        
        version 0.2.0
        -------------
        
        - Added options to enable hdf5 compression
        - Added support for multiple jobs through cgroup path scanning
        - Used inotify events to update the job list
        - Stopped filtering packets when no job_id range is specified, especially
          with the zeromq backend
        - Waited for the cgroup_path folder to be created before scanning the list
          of jobs
        - Added procstat for node monitoring through a fictitious job with 0 as its
          identifier
        - Used absolute time to schedule measurements instead of the delay between
          measurements, to avoid drift in measurement times
        - Added a workaround for when a new cgroup is created without any process in
          it (monitoring is suspended until a process is launched)
        
        
        version 0.0.1
        -------------
        
        - Initial design
        
Keywords: monitoring,taskstat,oar,hpc,sciences
Platform: Linux
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Monitoring
Classifier: Topic :: System :: Clustering
Classifier: Programming Language :: Python :: 2.5
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
