Metadata-Version: 2.4
Name: one-pass-stats
Version: 1.0.0
Summary: One-pass, parallel (mergeable) descriptive statistics using Welford's algorithm
Project-URL: Repository, https://github.com/petergaultney/One-Pass-Parallel-Descriptive-Statistics
Author: Peter Gaultney
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Description-Content-Type: text/markdown

Contained are a tested/working Python and C++ implementation of the parallel version of Welford's algorithm for calculating simple descriptive statistics in a single pass over the data. https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm

The advantage of this approach is twofold: Not only can you calculate these statistics as your data points arrive, for, e.g., a live dashboard, but you can actually distribute your calculation across multiple 'groups', whether phyiscal hardware (servers) or time spans (e.g. in 5 minute 'windows'), and merge those groups at arbitrary times to get the overall statistics for your full population. 

In the long run, this provides the advantage of allowing reduced data storage, as you can aggregate and store the data in whatever granularity you need, allowing you to trade off storage, compute, and precision within a large parameter space.
