Metadata-Version: 2.1
Name: data-plumber
Version: 1.4.0
Summary: lightweight but versatile python-framework for multi-stage information processing
Home-page: https://pypi.org/project/data-plumber/
Author: Steffen Richters-Finger
Author-email: srichters@uni-muenster.de
License: MIT
Project-URL: Source, https://github.com/RichtersFinger/data-plumber
Platform: UNKNOWN
Requires-Python: >=3.10, <4
Description-Content-Type: text/markdown
License-File: LICENSE

![Tests](https://github.com/RichtersFinger/data-plumber/actions/workflows/tests.yml/badge.svg?branch=main)

# data-plumber
`data-plumber` is a lightweight but versatile Python framework for multi-stage
information processing. It allows processing pipelines to be constructed both
from atomic building blocks and by recombining existing pipelines. Forks
enable more complex (i.e. non-linear) orders of execution. Pipelines can also
be collected into arrays that can be executed at once with the same input
data.

## Minimal usage example
Consider a scenario where the contents of a dictionary have to be validated
and a suitable error message has to be generated. Specifically, a valid input
dictionary is expected to have a key "data" whose value is a list of
integers. A suitable pipeline might look like this:
```
>>> from data_plumber import Stage, Pipeline, Previous
>>> pipeline = Pipeline(
...     Stage(
...         primer=lambda **kwargs: "data" in kwargs,
...         status=lambda primer, **kwargs: 0 if primer else 1,
...         message=lambda primer, **kwargs: "" if primer else "missing key"
...     ),
...     Stage(
...         requires={Previous: 0},
...         primer=lambda data, **kwargs: isinstance(data, list),
...         status=lambda primer, **kwargs: 0 if primer else 1,
...         message=lambda primer, **kwargs: "" if primer else "bad type"
...     ),
...     Stage(
...         requires={Previous: 0},
...         primer=lambda data, **kwargs: all(isinstance(i, int) for i in data),
...         status=lambda primer, **kwargs: 0 if primer else 1,
...         message=lambda primer, **kwargs: "validation success" if primer else "bad type in data"
...     ),
...     exit_on_status=1
... )
>>> pipeline.run(**{}).stages
[('missing key', 1)]
>>> pipeline.run(**{"data": 1}).stages
[('', 0), ('bad type', 1)]
>>> pipeline.run(**{"data": [1, "2", 3]}).stages
[('', 0), ('', 0), ('bad type in data', 1)]
>>> pipeline.run(**{"data": [1, 2, 3]}).stages
[('', 0), ('', 0), ('validation success', 0)]
```
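
For comparison, the control flow that the outputs above suggest (run the
stages in order, record a `(message, status)` pair for each, stop at the first
status `1`) can be sketched in plain Python without the library. This is only
an illustration under that assumption, not how `data-plumber` is implemented;
the names `run_checks`, `CHECKS`, and the three check functions are
hypothetical and not part of the package.

```python
from typing import Any, Callable

# Hypothetical helper type: a check takes the input dictionary and returns
# a (message, status) pair, where status 0 means "passed".
Check = Callable[[dict[str, Any]], tuple[str, int]]


def has_data_key(kwargs: dict[str, Any]) -> tuple[str, int]:
    ok = "data" in kwargs
    return ("" if ok else "missing key", 0 if ok else 1)


def data_is_list(kwargs: dict[str, Any]) -> tuple[str, int]:
    ok = isinstance(kwargs["data"], list)
    return ("" if ok else "bad type", 0 if ok else 1)


def data_items_are_int(kwargs: dict[str, Any]) -> tuple[str, int]:
    ok = all(isinstance(i, int) for i in kwargs["data"])
    return ("validation success" if ok else "bad type in data", 0 if ok else 1)


def run_checks(
    checks: list[Check], exit_on_status: int = 1, **kwargs: Any
) -> list[tuple[str, int]]:
    """Run the checks in order and stop after the first failing one."""
    results = []
    for check in checks:
        message, status = check(kwargs)
        results.append((message, status))
        if status == exit_on_status:
            break  # later checks rely on earlier ones having passed
    return results


CHECKS: list[Check] = [has_data_key, data_is_list, data_items_are_int]
```

Calling `run_checks(CHECKS, data=[1, "2", 3])` yields one pair per executed
check, ending with the failing one. Compared with this hand-written loop, the
`Pipeline` version above declares the dependency between stages explicitly via
`requires` instead of relying on an early `break`.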


