Metadata-Version: 2.1
Name: pipedown
Version: 0.0.1
Summary: A data science pipelining framework for Python
Home-page: https://github.com/brendanhasz/pipedown
Author: Brendan Hasz
Author-email: winsto99@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Description-Content-Type: text/x-rst
Requires-Dist: matplotlib (>=3.1.0)
Requires-Dist: numpy (>=1.17.0)
Requires-Dist: pandas (>=1.0.0)
Requires-Dist: cloudpickle (>=1.3)
Provides-Extra: dev
Requires-Dist: autoflake (>=1.4) ; extra == 'dev'
Requires-Dist: black (>=19.10b0) ; extra == 'dev'
Requires-Dist: bumpversion (>=0.6.0) ; extra == 'dev'
Requires-Dist: flake8 (>=3.8.3) ; extra == 'dev'
Requires-Dist: isort (>=5.1.2) ; extra == 'dev'
Requires-Dist: pytest (>=6.0.0rc1) ; extra == 'dev'
Requires-Dist: pytest-cov (>=2.7.1) ; extra == 'dev'
Requires-Dist: sphinx (>=3.1.2) ; extra == 'dev'
Requires-Dist: sphinx-rtd-theme (>=0.5.0) ; extra == 'dev'
Requires-Dist: setuptools (>=49.1.0) ; extra == 'dev'
Requires-Dist: twine (>=3.2.0) ; extra == 'dev'
Requires-Dist: wheel (>=0.34.2) ; extra == 'dev'

# Pipedown :shushing_face:

A data science pipelining framework for Python.

Still in the super early stages - don't use this yet!

Roadmap:

* Actually getting the main package working.
* Ensembling? Could just have a node with multiple inputs from multiple models, which combines their predictions via stacking, averaging, another model, or whatever you like.
* Design for integration with experiment tracking packages (MLFlow, Optuna, HyperparameterHunter, etc.), especially in terms of hyperparameter optimization and feature selection:
    * Hyperparameter optimization? Node objects could just have a `hyperparameters` method which defines default values and/or ranges, plus a `get_hyperparameter('name')` method used in `fit` and `run`. And Pipeline could have an `optimize_hyperparameters` function which optimizes the hyperparameters of all its nodes jointly. Should think about integration with other hyperparameter optimization packages though (hyperopt, scikit-optimize, optuna, ray.tune).
    * Feature selection? Obviously one can just have a single node which does, say, PCA, or which includes a model, but it would be nice to have it take the output from a model for e.g. permutation importance-based selection or sequential feature selection.
* Reports? Generate plots or something similar? Of course, one can already just have a node which saves files, etc.
* Feature importance?
* Partial dependence + SHAP values?
* How to work probabilistic predictions into the framework? I.e., what about Bayesian models? The current Model specification (returning just y_pred) won't work.
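
A rough sketch of how the hyperparameter idea above might look. Nothing here is implemented in pipedown yet: the `Node` and `Pipeline` classes, the `(default, (low, high))` format, the `set_hyperparameter` method, the toy `Scale` and `Offset` nodes, and the random-search strategy are all assumptions made for illustration.

```python
import random


class Node:
    """Hypothetical base class for a pipeline step (not pipedown's real API)."""

    def __init__(self):
        # Start every hyperparameter at its declared default.
        self._values = {
            name: default for name, (default, _) in self.hyperparameters().items()
        }

    def hyperparameters(self):
        """Return {name: (default, (low, high))}; override in subclasses."""
        return {}

    def get_hyperparameter(self, name):
        return self._values[name]

    def set_hyperparameter(self, name, value):
        if name not in self._values:
            raise KeyError(name)
        self._values[name] = value

    def fit(self, x, y):
        pass  # The toy nodes below have nothing to learn.

    def run(self, x):
        return x


class Scale(Node):
    def hyperparameters(self):
        return {"factor": (1.0, (0.0, 4.0))}

    def run(self, x):
        factor = self.get_hyperparameter("factor")
        return [factor * v for v in x]


class Offset(Node):
    def hyperparameters(self):
        return {"bias": (0.0, (-2.0, 2.0))}

    def run(self, x):
        bias = self.get_hyperparameter("bias")
        return [v + bias for v in x]


class Pipeline:
    def __init__(self, nodes):
        self.nodes = nodes

    def fit(self, x, y):
        for node in self.nodes:
            node.fit(x, y)
            x = node.run(x)

    def run(self, x):
        for node in self.nodes:
            x = node.run(x)
        return x

    def _loss(self, x, y):
        # Mean squared error on the data passed in (a simplification).
        self.fit(x, y)
        pred = self.run(x)
        return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

    def _snapshot(self):
        return [dict(node._values) for node in self.nodes]

    def optimize_hyperparameters(self, x, y, n_trials=200, seed=0):
        """Joint random search over the hyperparameters of all nodes."""
        rng = random.Random(seed)
        best_loss, best = self._loss(x, y), self._snapshot()
        for _ in range(n_trials):
            # Sample every hyperparameter of every node within its range.
            for node in self.nodes:
                for name, (_, (low, high)) in node.hyperparameters().items():
                    node.set_hyperparameter(name, rng.uniform(low, high))
            loss = self._loss(x, y)
            if loss < best_loss:
                best_loss, best = loss, self._snapshot()
        # Restore the best configuration found.
        for node, values in zip(self.nodes, best):
            node._values = dict(values)
        return best_loss


pipeline = Pipeline([Scale(), Offset()])
x_train, y_train = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]  # y = 2x + 1
print(pipeline.optimize_hyperparameters(x_train, y_train))
print(pipeline.nodes[0].get_hyperparameter("factor"))
```

A real version would probably score on held-out data and hand the search off to one of the packages listed above (hyperopt, scikit-optimize, optuna, ray.tune) instead of using plain random search.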



