Metadata-Version: 1.0
Name: psweep
Version: 0.2.1
Summary: loop like a pro, make parameter studies fun
Home-page: https://github.com/elcorto/psweep
Author: Steve Schmerler
Author-email: git@elcorto.com
License: BSD 3-Clause
Description: psweep -- loop like a pro, make parameter studies fun
        =====================================================
        
        About
        -----
        
        This is a package with simple helpers to set up and run parameter studies.
        
        Getting started
        ---------------
        
        Loop over two parameters 'a' and 'b':
        
        .. code-block:: python
        
            #!/usr/bin/env python3
        
            import random
            from itertools import product
            from psweep import psweep as ps
        
        
            def func(pset):
                return {'result': random.random() * pset['a'] * pset['b']}
        
        
            if __name__ == '__main__':
                a = ps.seq2dicts('a', [1,2,3,4])
                b = ps.seq2dicts('b', [8,9])
                params = ps.loops2params(product(a,b))
                df = ps.run(func, params)
                print(df)
        
        This produces a list of parameter sets to loop over (``params``)::
        
            [{'a': 1, 'b': 8},
             {'a': 1, 'b': 9},
             {'a': 2, 'b': 8},
             {'a': 2, 'b': 9},
             {'a': 3, 'b': 8},
             {'a': 3, 'b': 9},
             {'a': 4, 'b': 8},
             {'a': 4, 'b': 9}]
        
        
        and a database of results (pandas DataFrame ``df``, pickled file ``calc/results.pk``
        by default)::
        
                                       _calc_dir                              _pset_id  \
            2018-07-22 20:06:07.401398      calc  99a0f636-10b3-438c-ab43-c583fda806e8
            2018-07-22 20:06:07.406902      calc  6ec59d2b-7562-4262-b8d6-8f898a95f521
            2018-07-22 20:06:07.410227      calc  d3c22d7d-bc6d-4297-afc3-285482e624b5
            2018-07-22 20:06:07.412210      calc  f2b2269b-86e3-4b15-aeb7-92848ae25f7b
            2018-07-22 20:06:07.414637      calc  8e1db575-1be2-4561-a835-c88739dc0440
            2018-07-22 20:06:07.416465      calc  674f8a2c-bc21-40f4-b01f-3702e0338ae8
            2018-07-22 20:06:07.418866      calc  b4d3d11b-0f22-4c73-a895-7363c635c0c6
            2018-07-22 20:06:07.420706      calc  a265ca2f-3a9f-4323-b494-4b6763c46929
        
                                                                     _run_id  \
            2018-07-22 20:06:07.401398  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.406902  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.410227  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.412210  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.414637  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.416465  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.418866  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
            2018-07-22 20:06:07.420706  3e09daf8-c3a7-49cb-8aa3-f2c040c70e8f
        
                                                        _time_utc  a  b     result
            2018-07-22 20:06:07.401398 2018-07-22 20:06:07.401398  1  8   2.288036
            2018-07-22 20:06:07.406902 2018-07-22 20:06:07.406902  1  9   7.944922
            2018-07-22 20:06:07.410227 2018-07-22 20:06:07.410227  2  8  14.480190
            2018-07-22 20:06:07.412210 2018-07-22 20:06:07.412210  2  9   3.532110
            2018-07-22 20:06:07.414637 2018-07-22 20:06:07.414637  3  8   9.019944
            2018-07-22 20:06:07.416465 2018-07-22 20:06:07.416465  3  9   4.382123
            2018-07-22 20:06:07.418866 2018-07-22 20:06:07.418866  4  8   2.713900
            2018-07-22 20:06:07.420706 2018-07-22 20:06:07.420706  4  9  27.358240
        
        You see a number of reserved fields for book-keeping such as
        
        ::
        
            _run_id
            _pset_id
            _calc_dir
            _time_utc
        
        and a timestamped index. See the ``examples`` dir for more.
        
        Tests
        -----
        
        ::
        
            # apt-get install python3-nose
            $ nosetests3
        
        Concepts
        --------
        
        The basic data structure for a param study is a list ``params`` of dicts
        (called "parameter sets" or short `pset`).
        
        .. code-block:: python
        
            params = [{'a': 1, 'b': 'lala'},  # pset 1
                      {'a': 2, 'b': 'zzz'},   # pset 2
                      ...                     # ...
                     ]
        
        Each `pset` contains values of parameters ('a' and 'b') which are varied
        during the parameter study.
        
        You need to define a callback function ``func``, which takes exactly one `pset`
        such as::
        
            {'a': 1, 'b': 'lala'}
        
        and runs the workload for that `pset`. ``func`` must return a dict, for example::
        
            {'result': 1.234}
        
        or an updated `pset`::
        
            {'a': 1, 'b': 'lala', 'result': 1.234}
        
        We always merge (``dict.update``) the result of ``func`` with the `pset`,
        which gives you flexibility in what to return from ``func``.
        
        The `psets` form the rows of a pandas ``DataFrame``, which we use to store
        the `pset` and the result from each run.
        
        The idea is now to run ``func`` in a loop over all `psets` in ``params``. You
        can do this using the ``ps.run`` helper function. The function adds some
        special columns such as ``_run_id`` (once per ``ps.run`` call) or ``_pset_id``
        (once per `pset`). Using ``ps.run(... poolsize=...)`` runs ``func`` in parallel
        on ``params`` using ``multiprocessing.Pool``.
        
        This package offers some very simple helper functions which assist in creating
        ``params``. Basically, we define the to-be-varied parameters ('a' and 'b')
        and then use something like ``itertools.product`` to loop over them to create
        ``params``, which is passed to ``ps.run`` to actually perform the loop over all
        `psets`.
        
        .. code-block:: python
        
            >>> from itertools import product
            >>> from psweep import psweep as ps
            >>> x=ps.seq2dicts('x', [1,2,3])
            >>> y=ps.seq2dicts('y', ['xx','yy','zz'])
            >>> x
            [{'x': 1}, {'x': 2}, {'x': 3}]
            >>> y
            [{'y': 'xx'}, {'y': 'yy'}, {'y': 'zz'}]
            >>> ps.loops2params(product(x,y))
            [{'x': 1, 'y': 'xx'},
             {'x': 1, 'y': 'yy'},
             {'x': 1, 'y': 'zz'},
             {'x': 2, 'y': 'xx'},
             {'x': 2, 'y': 'yy'},
             {'x': 2, 'y': 'zz'},
             {'x': 3, 'y': 'xx'},
             {'x': 3, 'y': 'yy'},
             {'x': 3, 'y': 'zz'}]
        
        The logic of the param study is entirely contained in the creation of ``params``.
        E.g., if parameters shall be varied together (say x and y), then instead of
        
        .. code-block:: python
        
            >>> product(x,y,z)
        
        use
        
        .. code-block:: python
        
            >>> product(zip(x,y), z)
        
        The nestings from ``zip()`` are flattened in ``loops2params()``.
        
        .. code-block:: python
        
            >>> z=ps.seq2dicts('z', [None, 1.2, 'X'])
            >>> ps.loops2params(product(zip(x,y),z))
            [{'x': 1, 'y': 'xx', 'z': None},
             {'x': 1, 'y': 'xx', 'z': 1.2},
             {'x': 1, 'y': 'xx', 'z': 'X'},
             {'x': 2, 'y': 'yy', 'z': None},
             {'x': 2, 'y': 'yy', 'z': 1.2},
             {'x': 2, 'y': 'yy', 'z': 'X'},
             {'x': 3, 'y': 'zz', 'z': None},
             {'x': 3, 'y': 'zz', 'z': 1.2},
             {'x': 3, 'y': 'zz', 'z': 'X'}]
        
        If you want a parameter which is constant, use a list of length one:
        
        .. code-block:: python
        
            >>> c=ps.seq2dicts('c', ['const'])
            >>> ps.loops2params(product(zip(x,y),z,c))
            [{'a': 1, 'c': 'const', 'y': 'xx', 'z': None},
             {'a': 1, 'c': 'const', 'y': 'xx', 'z': 1.2},
             {'a': 1, 'c': 'const', 'y': 'xx', 'z': 'X'},
             {'a': 2, 'c': 'const', 'y': 'yy', 'z': None},
             {'a': 2, 'c': 'const', 'y': 'yy', 'z': 1.2},
             {'a': 2, 'c': 'const', 'y': 'yy', 'z': 'X'},
             {'a': 3, 'c': 'const', 'y': 'zz', 'z': None},
             {'a': 3, 'c': 'const', 'y': 'zz', 'z': 1.2},
             {'a': 3, 'c': 'const', 'y': 'zz', 'z': 'X'}]
        
        So, as you can see, the general idea is that we do all the loops *before*
        running any workload, i.e. we assemble the parameter grid to be sampled before
        the actual calculations. This has proven to be very practical as it helps
        detecting errors early.
        
        We are aware of the fact that the data structures and functions used here are
        so simple that it is almost not worth a package at all, but it is helpful to
        have the ideas and the workflow packaged up in a central place.
        
        Install
        -------
        
        ::
        
            $ pip3 install psweep
        
        
        Dev install of this repo::
        
            $ pip3 install -e .
        
        See also https://github.com/elcorto/samplepkg.
        
Keywords: parameter study sweep loop
Platform: UNKNOWN
