Metadata-Version: 2.1
Name: humanleague
Version: 2.1.1
Summary: Microsynthesis using quasirandom sampling and/or IPF
Home-page: http://github.com/virgesmith/humanleague
Author: Andrew P Smith
Author-email: a.p.smith@leeds.ac.uk
License: UNKNOWN
Description: # humanleague
        
        [![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/humanleague)](https://CRAN.R-project.org/package=humanleague)
        [![CRAN Downloads](https://cranlogs.r-pkg.org/badges/grand-total/humanleague?color=black)](https://cran.r-project.org/package=humanleague)
        [![PyPI version](https://badge.fury.io/py/humanleague.svg)](https://badge.fury.io/py/humanleague)
        [![Travis Build Status](https://travis-ci.org/virgesmith/humanleague.png?branch=master)](https://travis-ci.org/virgesmith/humanleague)
        [![Appveyor Build status](https://ci.appveyor.com/api/projects/status/x9oypgryt21ndc3p?svg=true)](https://ci.appveyor.com/project/virgesmith/humanleague)
        [![codecov](https://codecov.io/gh/virgesmith/humanleague/branch/master/graph/badge.svg)](https://codecov.io/gh/virgesmith/humanleague)
        [![License](https://img.shields.io/github/license/mashape/apistatus.svg)](https://opensource.org/licenses/MIT)
        [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1116318.svg)](https://doi.org/10.5281/zenodo.1116318)
        [![status](https://joss.theoj.org/papers/d5aaf6e1c2efed431c506762622473b4/status.svg)](https://joss.theoj.org/papers/d5aaf6e1c2efed431c506762622473b4)
        
        > ## Latest news: 2.1 release
        > - adds new functionality for multidimensional integerisation. 
        > - deletes previously deprecated functionality `synthPop` and `synthPopG`.
        > ### Multidimensional integerisation
        > Building on the `prob2IntFreq` function - which takes a discrete probability distribution and a count, and returns the closest integer population to the distribution that sums to the count - a multidimensional equivalent `integerise` is introduced.
        > 
        > In one dimension, for example:
        > ```python
        > >>> import numpy as np
        > >>> import humanleague
        > >>> p=np.array([0.1, 0.2, 0.3, 0.4])
        > >>> humanleague.prob2IntFreq(p, 11)
        > {'freq': array([1, 2, 3, 5]), 'rmse': 0.3535533905932736}
        > ```
        > produces the optimal (i.e. closest possible) integer population to the discrete distribution.
        >  
        > The `integerise` function generalises this problem to higher dimensions: given an n-dimensional array of real numbers whose 1-d marginal sums are integral in every dimension (and whose total population is therefore integral too), it attempts to find an integral array that satisfies the same constraints.
        >
        > The QISI algorithm is repurposed to this end. Because it is a sampling algorithm, it cannot guarantee that a solution will be found, nor that a solution it does find is optimal. A failure does not prove that no solution exists for the given input.
        >
        > ```python
        > >>> a = np.array([[ 0.3,  1.2,  2. ,  1.5],
        > ...               [ 0.6,  2.4,  4. ,  3. ],
        > ...               [ 1.5,  6. , 10. ,  7.5],
        > ...               [ 0.6,  2.4,  4. ,  3. ]])
        > # marginal sums
        > >>> sum(a)
        > array([ 3., 12., 20., 15.])
        > >>> sum(a.T)
        > array([ 5., 10., 25., 10.])
        > # perform integerisation
        > >>> r = humanleague.integerise(a)
        > >>> r["conv"]
        > True
        > >>> r["result"]
        > array([[ 0,  2,  2,  1],
        >        [ 0,  3,  4,  3],
        >        [ 2,  6, 10,  7],
        >        [ 1,  1,  4,  4]])
        > >>> r["rmse"]
        > 0.5766281297335398
        > # check marginals are preserved
        > >>> sum(r["result"]) == sum(a)
        > array([ True,  True,  True,  True])
        > >>> sum(r["result"].T) == sum(a.T)
        > array([ True,  True,  True,  True])
        > ```
        >
        > ### Removed functions
        > The removed functions `synthPop` and `synthPopG` implemented restricted versions of algorithms that are available in other functions.
        >
        > Use `qis` in place of `synthPop`, and `qisi` in place of `synthPopG`.
        
        ### Introduction
        
        *humanleague* is a Python and R package for microsynthesising populations from marginal and (optionally) seed data. The package is implemented in C++ for performance.
        
        The package contains algorithms that use a number of different microsynthesis techniques:
        - [Iterative Proportional Fitting (IPF)](https://en.wikipedia.org/wiki/Iterative_proportional_fitting)
        - [Quasirandom Integer Sampling (QIS)](http://jasss.soc.surrey.ac.uk/20/4/14.html) (no seed population)
        - Quasirandom Integer Sampling of IPF (QISI): A combination of the two techniques whereby the integral population is sampled (without replacement) from a distribution constructed from a dynamic IPF solution.
        
        QISI provides a bridge between deterministic reweighting and combinatorial optimisation, offering advantages of both techniques:
        - generates high-entropy integral populations 
        - can be used to generate multiple populations for sensitivity analysis
        - goes some way to address the 'empty cells' issues that can occur in straight IPF
        - relatively fast computation time
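        
        The IPF step that QISI builds on can be illustrated in two dimensions. The following is a minimal NumPy sketch, not the package's C++ implementation: the function name `ipf_2d`, its parameters and its convergence test are illustrative assumptions, and it presumes a strictly positive seed.
        
        ```python
        import numpy as np
        
        def ipf_2d(seed, row_sums, col_sums, tol=1e-10, max_iter=1000):
            """Rescale `seed` until its row and column sums match the targets.
        
            Hypothetical helper for illustration only; assumes every seed
            entry is strictly positive and both marginals have the same total.
            """
            x = np.asarray(seed, dtype=float).copy()
            row_sums = np.asarray(row_sums, dtype=float)
            col_sums = np.asarray(col_sums, dtype=float)
            for _ in range(max_iter):
                # Fit rows: scale each row to its target sum.
                x *= (row_sums / x.sum(axis=1))[:, np.newaxis]
                # Fit columns: scale each column to its target sum.
                x *= (col_sums / x.sum(axis=0))[np.newaxis, :]
                # Columns are now exact, so only the rows need checking.
                if np.max(np.abs(x.sum(axis=1) - row_sums)) < tol:
                    return x, True
            return x, False
        
        rows = np.array([5.0, 10.0, 25.0, 10.0])
        cols = np.array([3.0, 12.0, 20.0, 15.0])
        fitted, converged = ipf_2d(np.ones((4, 4)), rows, cols)
        ```
        
        With a uniform seed the fit reduces to the outer product of the two marginals divided by the total population, which is the non-integral array used as input in the `integerise` example above.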
        
        The algorithms: 
        - support arbitrary dimensionality for both the marginals and the seed.
        - produce statistical data to ascertain the likelihood/degeneracy of the population (where appropriate).
        
        The package also contains the following utility functions:
        - a Sobol sequence generator.
        - a function that constructs the closest integer population to a discrete univariate probability distribution.
        - an algorithm for sampling an integer population from a discrete multivariate probability distribution, constrained to the marginal sums in every dimension.
        - a function that 'flattens' a multidimensional population into a table: this converts a multidimensional array containing the population count for each state into a table listing individuals and their characteristics.
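        
        Two of these utilities can be approximated in plain NumPy to show the idea. This is a sketch under stated assumptions, not the package's code: `closest_integer_population` uses the largest-remainder method as a stand-in, `flatten_population` returns one row per individual (the package's own output layout may differ), and both function names are hypothetical.
        
        ```python
        import numpy as np
        
        def closest_integer_population(probs, count):
            """Largest-remainder rounding of `probs * count` (illustrative stand-in)."""
            expected = np.asarray(probs, dtype=float) * count
            freq = np.floor(expected).astype(int)
            shortfall = count - freq.sum()
            # Give the remaining individuals to the states with the largest
            # fractional parts.
            order = np.argsort(expected - freq)[::-1]
            freq[order[:shortfall]] += 1
            rmse = np.sqrt(np.mean((freq - expected) ** 2))
            return freq, rmse
        
        def flatten_population(pop):
            """List one row of state indices per individual in the count array `pop`."""
            pop = np.asarray(pop)
            occupied = pop > 0
            states = np.argwhere(occupied)   # one row of indices per occupied state
            counts = pop[occupied]           # matching counts, in the same order
            return np.repeat(states, counts, axis=0)
        
        freq, rmse = closest_integer_population([0.1, 0.2, 0.3, 0.4], 11)
        table = flatten_population(np.array([[1, 0], [2, 1]]))
        ```
        
        For the distribution and count used in the `prob2IntFreq` example above, this sketch reproduces the same frequencies, `[1, 2, 3, 5]`.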
        
        Version 1.0.1 reflects the work described in the [Quasirandom Integer Sampling (QIS)](http://jasss.soc.surrey.ac.uk/20/4/14.html) paper.
        
        ## R installation
        Official release:
        ```
        > install.packages("humanleague")
        ```
        For the development version:
        ```
        > devtools::install_github("virgesmith/humanleague")
        ```
        Or, for the legacy version:
        ```
        > devtools::install_github("virgesmith/humanleague@1.0.1")
        ```
        ## python installation
        
        Requires Python 3 and numpy. PyPI package:
        ```bash
        python3 -m pip install humanleague --user
        ```
        [Conda-forge package is being worked on]
        
        ### Build, install and test (from local cloned repo)
        ```bash
        $ ./setup.py install --user
        ```
        ```bash
        $ ./setup.py test
        ```
        ### Examples
        
        Consult the package documentation, e.g.
        ```
        > library(humanleague)
        > ?humanleague
        ```
        in R, or for Python:
        ```
        >>> import humanleague as hl
        >>> help(hl)
        ```
        
Platform: UNKNOWN
Description-Content-Type: text/markdown
