Metadata-Version: 2.1
Name: zfit
Version: 0.3.7
Summary: scalable pythonic model fitting for high energy physics
Home-page: https://github.com/zfit/zfit
Author: Jonas Eschle
Maintainer: zfit
Maintainer-email: zfit@physik.uzh.ch
License: BSD 3-Clause
Keywords: TensorFlow,model,fitting,scalable,HEP
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
Classifier: Operating System :: MacOS
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: >=3.6
Requires-Dist: tensorflow (<2,>=1.14.0)
Requires-Dist: tensorflow-probability (<0.8,>=0.6.0)
Requires-Dist: scipy (>=1.2)
Requires-Dist: uproot
Requires-Dist: pandas
Requires-Dist: numpy
Requires-Dist: iminuit
Requires-Dist: typing
Requires-Dist: colorlog
Requires-Dist: texttable
Requires-Dist: ordered-set

*******************************
zfit: scalable pythonic fitting
*******************************


.. image:: https://zenodo.org/badge/126311570.svg
   :target: https://zenodo.org/badge/latestdoi/126311570

.. image:: https://img.shields.io/pypi/v/zfit.svg
   :target: https://pypi.python.org/pypi/zfit

.. image:: https://img.shields.io/travis/zfit/zfit.svg
   :target: https://travis-ci.org/zfit/zfit

.. image:: https://coveralls.io/repos/github/zfit/zfit/badge.svg?branch=meta_changes
   :target: https://coveralls.io/github/zfit/zfit?branch=meta_changes

.. image:: https://www.codefactor.io/repository/github/zfit/zfit/badge
   :target: https://www.codefactor.io/repository/github/zfit/zfit
   :alt: CodeFactor


|


zfit is a highly scalable and customizable model manipulation and fitting library. It uses
`TensorFlow <https://www.tensorflow.org/>`_ as its computational backend
and is optimised for simple and direct manipulation of probability density functions.

- **Tutorials**: `Interactive IPython Tutorials <https://github.com/zfit/zfit-tutorials>`_
- **Quick start**: `Example scripts <examples>`_
- **Documentation**: Full documentation_ and API_
- **Questions**: see the `FAQ <https://github.com/zfit/zfit/wiki/FAQ>`_,
  `ask on StackOverflow <https://stackoverflow.com/questions/ask>`_ with the **zfit** tag or `contact`_ us directly.
- **Physics/HEP**: `zfit-physics <https://github.com/zfit/zfit-physics>`_ is the place to contribute/find more HEP
  related content


If you use zfit in research, please consider `citing <https://zenodo.org/badge/latestdoi/126311570>`_.

*N.B.*: zfit is currently in *beta stage*, so while most core parts are established, some may still be missing and bugs may be encountered.
It is, however, mostly ready for production, and is being used in analysis projects.
If you want to use it for your project and are not sure whether all the needed functionality is there, feel free to `contact`_ us.


Why?
====

The basic idea behind zfit is to offer a Python-oriented alternative to the very successful RooFit library from the `ROOT <https://root.cern.ch/>`_ data analysis package, one that can integrate with the other packages that are part of the scientific Python ecosystem.
Contrary to the monolithic approach of ROOT/RooFit, the aim of zfit is to be light and flexible enough to integrate with any state-of-the-art tools and to scale to larger datasets.

These core ideas are supported by two basic pillars:

- The skeleton and extension of the code are minimalist, simple and finite:
  the zfit library is exclusively designed for the purpose of model fitting and sampling with no attempt to extend its functionalities to features such as statistical methods or plotting.

- zfit is designed for optimal parallelisation and scalability by making use of TensorFlow as its backend.
  The use of TensorFlow provides crucial features in the context of model fitting like taking care of the parallelisation and analytic derivatives.



How to use
==========

While the zfit library provides a model fitting and sampling framework for a broad list of applications,
we illustrate its main features with a simple example: fitting a Gaussian distribution with an unbinned
likelihood fit and estimating the parameter uncertainties.


Example in short
----------------
.. code-block:: python

    import numpy as np
    import zfit

    obs = zfit.Space('x', limits=(-10, 10))

    # create the model
    mu    = zfit.Parameter("mu"   , 2.4, -1, 5)
    sigma = zfit.Parameter("sigma", 1.3,  0, 5)
    gauss = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)

    # load the data
    data_np = np.random.normal(size=10000)
    data = zfit.Data.from_numpy(obs=obs, array=data_np)

    # build the loss
    nll = zfit.loss.UnbinnedNLL(model=gauss, data=data)

    # minimize
    minimizer = zfit.minimize.Minuit()
    result = minimizer.minimize(nll)

    # calculate errors
    param_errors = result.error()

This follows the zfit workflow:

.. image:: docs/images/zfit_workflow_v1.png
    :alt: zfit workflow




Full explanation
----------------

The default space (e.g. normalization range) of a PDF is defined by an *observable space*, which is created using the ``zfit.Space`` class:


.. code-block:: python

    obs = zfit.Space('x', limits=(-10, 10))
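
To make the role of the normalization range concrete, the following library-independent sketch (plain Python, not zfit code) normalises a Gaussian shape over a chosen range and checks numerically that it integrates to one there. The helper names ``gauss_pdf`` and ``integrate_midpoint`` are hypothetical and only serve this illustration.

.. code-block:: python

    import math


    def gauss_unnormalised(x, mu, sigma):
        """Gaussian shape without any normalisation constant."""
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


    def gauss_integral(lower, upper, mu, sigma):
        """Analytic integral of the unnormalised shape over [lower, upper]."""
        scale = sigma * math.sqrt(math.pi / 2.0)
        z_low = (lower - mu) / (sigma * math.sqrt(2.0))
        z_up = (upper - mu) / (sigma * math.sqrt(2.0))
        return scale * (math.erf(z_up) - math.erf(z_low))


    def gauss_pdf(x, mu, sigma, limits):
        """Shape divided by its integral over the normalization range."""
        lower, upper = limits
        if not lower <= x <= upper:
            return 0.0
        return gauss_unnormalised(x, mu, sigma) / gauss_integral(lower, upper, mu, sigma)


    def integrate_midpoint(func, lower, upper, n_steps=20000):
        """Simple midpoint-rule integration, used only as a cross-check."""
        width = (upper - lower) / n_steps
        return sum(func(lower + (i + 0.5) * width) for i in range(n_steps)) * width


    narrow = (-1.0, 1.0)
    wide = (-10.0, 10.0)
    total = integrate_midpoint(lambda x: gauss_pdf(x, 0.0, 1.0, narrow), *narrow)
    print("integral over the narrow range:", total)
    print("density at 0 (narrow, wide):",
          gauss_pdf(0.0, 0.0, 1.0, narrow), gauss_pdf(0.0, 0.0, 1.0, wide))

Narrowing the range from ``(-10, 10)`` to ``(-1, 1)`` raises the density at ``x = 0``, because the same shape is divided by a smaller integral.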


To create a simple Gaussian PDF, we define its parameters and their limits using the ``zfit.Parameter`` class.

.. code-block:: python

    # syntax: zfit.Parameter("any_name", value, lower, upper)
    mu    = zfit.Parameter("mu"   , 2.4, -1, 5)
    sigma = zfit.Parameter("sigma", 1.3,  0, 5)
    gauss = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)

For simplicity, we create the dataset to be fitted starting from a numpy array, but zfit allows for the use of other sources such as ROOT files:

.. code-block:: python

    mu_true = 0
    sigma_true = 1
    data_np = np.random.normal(mu_true, sigma_true, size=10000)
    data = zfit.Data.from_numpy(obs=obs, array=data_np)

Fits are performed in three steps:

1. Creation of a loss function, in our case a negative log-likelihood.
2. Instantiation of our minimiser of choice, in this example ``Minuit``.
3. Minimisation of the loss function.

.. code-block:: python

    # Stage 1: create an unbinned likelihood with the given PDF and dataset
    nll = zfit.loss.UnbinnedNLL(model=gauss, data=data)

    # Stage 2: instantiate a minimiser (in this case a basic minuit)
    minimizer = zfit.minimize.Minuit()

    # Stage 3: minimise the given negative log-likelihood
    result = minimizer.minimize(nll)
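
As a rough illustration of what the loss in stage 1 represents, the sketch below evaluates an unbinned Gaussian negative log-likelihood in plain Python. It is not zfit code: ``gauss_nll`` is a hypothetical helper, and the normalization range is assumed to be the whole real line for simplicity.

.. code-block:: python

    import math
    import random


    def gauss_nll(mu, sigma, sample):
        """Unbinned NLL: minus the sum of the log-densities of all events."""
        norm = math.log(sigma * math.sqrt(2.0 * math.pi))
        return sum(0.5 * ((x - mu) / sigma) ** 2 + norm for x in sample)


    rng = random.Random(42)
    sample = [rng.gauss(0.0, 1.0) for _ in range(10000)]

    # For a Gaussian on the full real line the minimum is known in closed form:
    # the sample mean and the (biased) sample standard deviation.
    mu_hat = sum(sample) / len(sample)
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / len(sample))

    nll_start = gauss_nll(2.4, 1.3, sample)  # starting values from the example
    nll_best = gauss_nll(mu_hat, sigma_hat, sample)
    print("NLL at the starting values:", nll_start)
    print("NLL at the closed-form minimum:", nll_best)

A minimiser such as Minuit searches for this minimum numerically, which is what makes the same workflow applicable to models without a closed-form solution.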

Errors are calculated with a further function call to avoid running potentially expensive operations if not needed:

.. code-block:: python

    param_errors = result.error()
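
One common convention for a likelihood-based uncertainty is the parameter shift that raises the NLL by 0.5 above its minimum. The plain-Python sketch below applies that convention to ``mu`` of a Gaussian while holding ``sigma`` fixed, which is a simplification (a full profile would re-minimise the other parameters). It is an illustration of the idea under these assumptions, not a description of what ``result.error()`` does internally; ``gauss_nll`` and ``upper_error`` are hypothetical helpers.

.. code-block:: python

    import math
    import random


    def gauss_nll(mu, sigma, sample):
        """Unbinned Gaussian NLL on the full real line."""
        norm = math.log(sigma * math.sqrt(2.0 * math.pi))
        return sum(0.5 * ((x - mu) / sigma) ** 2 + norm for x in sample)


    def upper_error(sample, mu_hat, sigma_hat, delta=0.5, tol=1e-9):
        """Bisect for the positive shift in mu that raises the NLL by `delta`."""
        nll_min = gauss_nll(mu_hat, sigma_hat, sample)
        low, high = 0.0, sigma_hat  # the crossing lies between these shifts
        while high - low > tol:
            mid = 0.5 * (low + high)
            if gauss_nll(mu_hat + mid, sigma_hat, sample) - nll_min < delta:
                low = mid
            else:
                high = mid
        return 0.5 * (low + high)


    rng = random.Random(7)
    sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    mu_hat = sum(sample) / len(sample)
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / len(sample))

    err = upper_error(sample, mu_hat, sigma_hat)
    expected = sigma_hat / math.sqrt(len(sample))
    print("error from the NLL scan:", err)
    print("sigma / sqrt(N):", expected)

With ``sigma`` fixed, the scan reproduces the familiar ``sigma / sqrt(N)`` uncertainty on the mean; each bisection step re-evaluates the full NLL, which hints at why such calculations are kept out of the minimisation itself.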

Once we've performed the fit and obtained the corresponding uncertainties, we can examine the fit results:

.. code-block:: python

    print("Function minimum:", result.fmin)
    print("Converged:", result.converged)
    print("Full minimizer information:", result.info)

    # Information on all the parameters in the fit
    params = result.params
    print(params)

    # Printing information on specific parameters, e.g. mu
    print("mu={}".format(params[mu]['value']))

And that's it!
For more details and information on what you can do with zfit, check out the documentation_.

Prerequisites
=============

``zfit`` works with Python versions 3.6 and 3.7.
The following packages (amongst others) are required:

- `tensorflow <https://www.tensorflow.org/>`_ >= 1.14.0, < 2
- `tensorflow_probability <https://www.tensorflow.org/probability>`_ >= 0.6.0, < 0.8
- `scipy <https://www.scipy.org/>`_ >= 1.2
- `uproot <https://github.com/scikit-hep/uproot>`_
- `iminuit <https://github.com/scikit-hep/iminuit>`_

... and some minor packages. For a full list, check the `requirements <requirements.txt>`_.

Installing
==========

zfit is available on conda-forge and PyPI. If possible, install it inside a conda or virtual environment.

For conda:

.. code-block:: console

    $ conda install zfit -c conda-forge

For pip (if you don't use conda):

.. code-block:: console

    $ pip install zfit


For the newest development version, you can install the version from git with

.. code-block:: console

   $ pip install git+https://github.com/zfit/zfit


Contributing
============

Do you have an idea of how to improve the library, or are you interested in writing some code?
Contributions are always welcome; please have a look at the `Contributing guide`_.

.. _Contributing guide: CONTRIBUTING.rst


Contact
=======

You can contact us directly:

- via e-mail: zfit@physik.uzh.ch
- join our `Gitter channel <https://gitter.im/zfit/zfit>`_


Main Authors
============

| Jonas Eschle <jonas.eschle@cern.ch>
| Albert Puig <albert.puig@cern.ch>
| Rafael Silva Coutinho <rsilvaco@cern.ch>



Acknowledgements
================

zfit has been developed with support from the University of Zürich and the Swiss National Science Foundation (SNSF) under contracts 168169 and 174182.

The idea of zfit is inspired by the `TensorFlowAnalysis <https://gitlab.cern.ch/poluekt/TensorFlowAnalysis>`_ framework developed by Anton Poluektov using the TensorFlow open source library.

.. _documentation: https://zfit.readthedocs.io/en/latest/
.. _API: https://zfit.readthedocs.io/en/latest/API.html


*********
Changelog
*********

Develop
=======


Major Features and Improvements
-------------------------------

Behavioral changes
------------------

Bug fixes and small changes
---------------------------

Requirement changes
-------------------


Thanks
------

0.3.7 (6.12.19)
================

This is a legacy release that adds some fixes. The next release will support TF 2 eager mode only.


Major Features and Improvements
-------------------------------
 - mostly TF 2.0 compatibility in graph mode, tests against 1.x and 2.x

Behavioral changes
------------------

Bug fixes and small changes
---------------------------
 - `get_dependents` now returns an OrderedSet
 - errordef is now a (hidden) attribute and can be changed
 - fix bug in polynomials


Requirement changes
-------------------
 - added ordered-set

0.3.6 (12.10.19)
================

**Special release for conda deployment and version fix (TF 2.0 is out)**

**This is the last release before breaking changes occur**


Major Features and Improvements
-------------------------------
 - added ConstantParameter and `zfit.param` namespace
 - Available on conda-forge

Behavioral changes
------------------
 - an implicitly created parameter with a Python numerical (e.g. when instantiating a model)
   will be converted to a ConstantParameter instead of a fixed Parameter and therefore
   cannot be set to floating later on.

Bug fixes and small changes
---------------------------
 - added native support for TFP distributions for analytic sampling
 - fix Gaussian (TFP Distribution) Constraint with mixed up order of parameters

 - `from_numpy` automatically converts to the default float regardless of the original numpy dtype,
   `dtype` has to be used as an explicit argument


Requirement changes
-------------------
 - TensorFlow >= 1.14 is required


Thanks
------
 - Chris Burr for the conda-forge deployment


0.3.4 (30-07-19)
================

**This is the last release before breaking changes occur**

Major Features and Improvements
-------------------------------
- create `Constraint` class which allows for more fine grained control and information on the applied constraints.
- Added Polynomial models
- Improved and fixed sampling (can still be slightly biased)

Behavioral changes
------------------
None

Bug fixes and small changes
---------------------------

- fixed various small bugs

Thanks
------
- Matthieu Marinangeli <matthieu.marinangeli@cern.ch> for the contribution of the Constraints



0.3.3 (15-05-19)
================

Fixed Partial numeric integration

Bugfixes mostly, a few major fixes. Partial numeric integration works now.

Bugfixes
 - data_range cuts are now applied correctly, also in several dimensions when a subset is selected
   (which happens internally in some Functors, e.g. ProductPDF). Before, only the selected obs was respected for cuts.
 - partial integration had a wrong take on checking limits (now uses supports).


0.3.2 (01-05-19)
================

With 0.3.2, bugfixes and three changes in the API/behavior

Breaking changes
----------------
 - tfp distributions wrapping is now different with dist_kwargs allowing for non-Parameter arguments (like other dists)
 - sampling now allows for importance sampling (sampler in Model specified differently)
 - `model.sample` now also returns a tensor, being consistent with `pdf` and `integrate`

Bugfixes
--------
 - shape handling of tfp dists was "wrong" (though not producing wrong results!), fixed. TFP distributions now get a tensor with shape (nevents, nobs) instead of a list of tensors with (nevents,)

Improvements
------------
 - refactor the sampling for more flexibility and performance (less graph constructed)
 - allow to use more sophisticated importance sampling (e.g. phasespace)
 - on-the-fly normalization (experimentally) implemented with correct gradient



0.3.1 (30-04-19)
================


Minor improvements and bugfixes including:

- improved importance sampling, allowing objects to be preinstantiated before it is called inside the while loop
- fixing a problem with `ztf.sqrt`



0.3.0 (2019-03-20)
==================


Beta stage and first pip release


0.0.1 (2018-03-22)
==================


* First creation of the package.


