Metadata-Version: 2.3
Name: dirichlet
Version: 1.0.0
Summary: Calculates Dirichlet test and plots 2-simplex Dirichlets
Author: Eric Suh
Author-email: Eric Suh <contact@ericsuh.com>
License: MIT
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Dist: scipy>=1.4.1
Requires-Dist: numpy>=1.18.1
Requires-Dist: matplotlib>=3.2.0 ; extra == 'simplex'
Maintainer: Eric Suh
Maintainer-email: Eric Suh <contact@ericsuh.com>
Requires-Python: >=3.10
Project-URL: Bug Tracker, https://github.com/ericsuh/dirichlet/issues
Project-URL: Homepage, http://github.com/ericsuh/dirichlet
Project-URL: Repository, https://github.com/ericsuh/dirichlet
Provides-Extra: simplex
Description-Content-Type: text/markdown

Dirichlet
=========

A Python package to estimate the parameters of a Dirichlet distribution by
maximum likelihood, and to test whether Dirichlet-distributed data is
independent of a variable by fitting nested Dirichlet distribution hypotheses.

Most of this package is a port of Thomas P. Minka's wonderful
[Fastfit][fastfit] MATLAB code. Many thanks to him for that and for his clear
paper ["Estimating a Dirichlet distribution"][estimating].

[estimating]: http://research.microsoft.com/en-us/um/people/minka/papers/dirichlet/
[fastfit]: http://research.microsoft.com/en-us/um/people/minka/software/fastfit/

Dirichlet Test
--------------

This likelihood ratio test for independence assesses whether two
Dirichlet-distributed data sets are more likely to come from the same
distribution or from two different ones, much like a chi-square or G-test for
independence, but with Dirichlet models.
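
As a rough illustration of the idea, the sketch below fits one Dirichlet to the
pooled data (the null hypothesis) and one to each data set separately (the
alternative), then compares the log-likelihoods. It deliberately does not use
this package's API: the function names are hypothetical, the fit uses a generic
SciPy optimizer rather than Minka's fixed-point methods, and the chi-square
reference distribution with K degrees of freedom is the usual large-sample
assumption for nested models.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln
from scipy.stats import chi2


def dirichlet_loglik(data, alpha):
    """Total Dirichlet log-likelihood of the rows of `data` given `alpha`."""
    n_obs = data.shape[0]
    log_norm = gammaln(alpha.sum()) - gammaln(alpha).sum()
    return n_obs * log_norm + ((alpha - 1.0) * np.log(data)).sum()


def fit_alpha(data):
    """Maximum-likelihood alpha, found with a generic optimizer (an assumption).

    Optimizing over log(alpha) keeps every alpha positive; the bounds only
    guard against overflow in exp().
    """
    k = data.shape[1]
    result = minimize(
        lambda log_alpha: -dirichlet_loglik(data, np.exp(log_alpha)),
        x0=np.zeros(k),
        method="L-BFGS-B",
        bounds=[(-10.0, 10.0)] * k,
    )
    return np.exp(result.x)


def likelihood_ratio_test(d1, d2):
    """Return (statistic, p_value) for H0: d1 and d2 share one Dirichlet."""
    pooled = np.vstack([d1, d2])
    ll_null = dirichlet_loglik(pooled, fit_alpha(pooled))
    ll_alt = dirichlet_loglik(d1, fit_alpha(d1)) + dirichlet_loglik(d2, fit_alpha(d2))
    statistic = 2.0 * (ll_alt - ll_null)
    # The alternative has K more free parameters than the null.
    dof = d1.shape[1]
    return statistic, chi2.sf(statistic, dof)


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
same_a = rng.dirichlet([2.0, 5.0, 3.0], size=200)
same_b = rng.dirichlet([2.0, 5.0, 3.0], size=200)
different = rng.dirichlet([6.0, 2.0, 2.0], size=200)

print(likelihood_ratio_test(same_a, same_b))     # drawn from one distribution
print(likelihood_ratio_test(same_a, different))  # drawn from two distributions
```

A small p-value is evidence against the shared-distribution hypothesis; a large
one means the data do not distinguish the two sets.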

Simplex Plots
-------------

The `dirichlet.simplex` module creates scatter, contour, and filled contour
2-simplex plots. To use it, install the `simplex` extra (e.g.
`pip install "dirichlet[simplex]"`).
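
For readers unfamiliar with 2-simplex plots, the sketch below shows the usual
projection behind them: each three-component point that sums to 1 is mapped
into an equilateral triangle. This is a generic illustration, not the module's
own code; `CORNERS` and `barycentric_to_cartesian` are hypothetical names.

```python
import numpy as np

# Corners of an equilateral triangle with unit side length (an arbitrary choice).
CORNERS = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0]])


def barycentric_to_cartesian(points):
    """Map rows of (p1, p2, p3) with p1 + p2 + p3 == 1 to 2-D (x, y) coordinates."""
    points = np.asarray(points, dtype=float)
    # Each point is the weighted average of the triangle's corners.
    return points @ CORNERS


xy = barycentric_to_cartesian([[1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]])
print(xy)
```

A point with all its mass on one component lands on a corner, and the uniform
point lands at the triangle's centroid.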

Caveats
-------

This package does not currently support sparse data vectors (that is, vectors
containing zero entries), because the numerical fitting algorithm relies on the
gamma function. Some form of
[additive smoothing](https://en.wikipedia.org/wiki/Additive_smoothing) may
make the package usable in your context, but whether that is appropriate
depends on your application.
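
A minimal sketch of additive smoothing, assuming each row of the data is a
probability vector: add a small pseudo-value to every entry and renormalize so
that rows still sum to 1. The function name `smooth_rows` and the default
`pseudo=1e-6` are illustrative assumptions, not part of this package, and the
right amount of smoothing (if any) depends on your data.

```python
import numpy as np


def smooth_rows(data, pseudo=1e-6):
    """Add `pseudo` to every entry, then renormalize each row to sum to 1."""
    data = np.asarray(data, dtype=float)
    smoothed = data + pseudo
    return smoothed / smoothed.sum(axis=1, keepdims=True)


sparse = np.array([[0.0, 0.4, 0.6], [0.5, 0.5, 0.0]])
dense = smooth_rows(sparse)
print(dense)
```

Note that smoothing changes the data, so it also shifts the fitted parameters;
larger pseudo-values shift them more.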

Installation
------------

    pip install dirichlet
    # or
    uv add dirichlet

This has only been tested with Python 3.10+. Other versions may work, but they
haven't been tested.

Development
-----------

To install dev tooling, run:

    uv sync --locked --all-groups --all-extras

    # To format
    uv run ruff format

    # To test
    uv run pytest

    # To lint
    uv run ruff check

A GitHub workflow runs the tests against several Python versions.
