Metadata-Version: 2.1
Name: ploomber
Version: 0.2
Summary: A Python library for developing great data pipelines
Home-page: https://github.com/ploomber/ploomber
Author: 
Author-email: 
License: A license
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: Unix
Classifier: Operating System :: POSIX
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Utilities
Requires-Dist: pyyaml
Requires-Dist: networkx
Requires-Dist: jinja2
Requires-Dist: tabulate
Requires-Dist: humanize
Requires-Dist: tqdm
Requires-Dist: numpydoc
Requires-Dist: sqlparse
Requires-Dist: autopep8
Requires-Dist: parso
Requires-Dist: mistune
Requires-Dist: sqlalchemy
Requires-Dist: importlib-resources ; python_version < "3.7"
Provides-Extra: all
Requires-Dist: pandas ; extra == 'all'
Requires-Dist: pyarrow ; extra == 'all'
Requires-Dist: paramiko ; extra == 'all'
Requires-Dist: matplotlib ; extra == 'all'
Requires-Dist: pygraphviz ; extra == 'all'
Requires-Dist: papermill ; extra == 'all'
Requires-Dist: jupytext ; extra == 'all'
Requires-Dist: jupyter ; extra == 'all'
Provides-Extra: nb
Requires-Dist: papermill ; extra == 'nb'
Requires-Dist: jupytext ; extra == 'nb'
Requires-Dist: jupyter ; extra == 'nb'
Provides-Extra: plot
Requires-Dist: matplotlib ; extra == 'plot'
Requires-Dist: pygraphviz ; extra == 'plot'

ploomber
========

.. image:: https://travis-ci.org/ploomber/ploomber.svg?branch=master
    :target: https://travis-ci.org/ploomber/ploomber

.. image:: https://readthedocs.org/projects/ploomber/badge/?version=latest
    :target: https://ploomber.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status


`Click here for documentation <https://ploomber.readthedocs.io/>`_

ploomber is a workflow management tool that accelerates experimentation and
facilitates building production systems. It does so by providing incremental
builds, interactive execution, and tools to inspect pipelines, and by
facilitating testing and reducing boilerplate code.

Install
-------

If you want to try out everything ploomber has to offer:

.. code-block:: shell

    pip install ploomber[all]

Note that installing everything will attempt to install pygraphviz, which
depends on graphviz, so you have to install graphviz first:

.. code-block:: shell

    # if you are using conda (recommended)
    conda install graphviz
    # if you are using Homebrew
    brew install graphviz
    # for other systems, see: https://www.graphviz.org/download/

If you want to start with a minimal set of dependencies:

.. code-block:: shell

    pip install ploomber


Example
-------

.. code-block:: python

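    import tempfile
    from pathlib import Path

    import pandas as pd

    from ploomber import DAG
    from ploomber.products import File
    from ploomber.tasks import PythonCallable, SQLDump
    from ploomber.clients import SQLAlchemyClient

    # placeholder values (assumptions for this example): tmp_dir is where
    # outputs are saved, uri must point to a database with an "example" table
    tmp_dir = Path(tempfile.mkdtemp())
    uri = 'sqlite:///example.db'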

    dag = DAG()

    # the first task dumps data from the db to the local filesystem
    task_dump = SQLDump('SELECT * FROM example',
                        File(tmp_dir / 'example.csv'),
                        dag,
                        name='dump',
                        client=SQLAlchemyClient(uri),
                        chunksize=None)

    def _add_one(upstream, product):
        """Add one to column a
        """
        df = pd.read_csv(str(upstream['dump']))
        df['a'] = df['a'] + 1
        df.to_csv(str(product), index=False)

    # we convert the Python function to a Task
    task_add_one = PythonCallable(_add_one,
                                  File(tmp_dir / 'add_one.csv'),
                                  dag,
                                  name='add_one')

    # declare how tasks relate to each other
    task_dump >> task_add_one

    # run the pipeline - incremental builds: ploomber will keep track of each
    # task's source code and will only execute outdated tasks in the next run
    dag.build()

    # a DAG also serves as a tool to interact with your pipeline, for example,
    # status will return a summary table
    dag.status()
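
The idea behind incremental builds can be pictured with the library-independent
sketch below. It is not ploomber's implementation: the function names
(``fingerprint``, ``outdated_tasks``, ``mark_built``), the JSON metadata file
and the use of plain strings as task sources are all assumptions made for
illustration. The sketch stores a hash of each task's source and reports a
task as outdated when the hash no longer matches.

.. code-block:: python

    import hashlib
    import json
    import tempfile
    from pathlib import Path


    def fingerprint(source):
        """Return a SHA-256 digest of a task's source code (a string)."""
        return hashlib.sha256(source.encode('utf-8')).hexdigest()


    def _load(metadata_path):
        """Load stored fingerprints, or an empty dict if nothing was saved."""
        if metadata_path.exists():
            return json.loads(metadata_path.read_text())
        return {}


    def outdated_tasks(sources, metadata_path):
        """Return names of tasks whose source differs from the stored one."""
        stored = _load(metadata_path)
        return [name for name, source in sources.items()
                if stored.get(name) != fingerprint(source)]


    def mark_built(sources, names, metadata_path):
        """Record the current fingerprint of the tasks that were just run."""
        stored = _load(metadata_path)
        for name in names:
            stored[name] = fingerprint(sources[name])
        metadata_path.write_text(json.dumps(stored))


    # hypothetical task sources, loosely mirroring the example above
    metadata = Path(tempfile.mkdtemp()) / 'metadata.json'
    sources = {'dump': 'SELECT * FROM example',
               'add_one': "df['a'] = df['a'] + 1"}

    first = outdated_tasks(sources, metadata)   # nothing stored: all outdated
    mark_built(sources, first, metadata)
    second = outdated_tasks(sources, metadata)  # sources unchanged

    sources['add_one'] = "df['a'] = df['a'] + 2"
    third = outdated_tasks(sources, metadata)   # only the edited task

This sketch ignores dependencies between tasks. A real tool would likely also
treat a task as outdated when any of its upstream tasks is re-run.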

CHANGELOG
=========

0.2 (2020-02-13)
-----------------

* Simplifies installation
* Deletes BashCommand; use ShellScript instead
* More examples added
* Refactored env module
* Renames SQLStore to SourceLoader
* Improvements to SQLStore
* Improved documentation
* Renamed PostgresCopy to PostgresCopyFrom
* SQLUpload and PostgresCopy now have the same API
* A few fixes to PostgresCopy (#1, #2)

0.1
---

* First release

