Metadata-Version: 2.0
Name: mettle
Version: 0.7.13
Summary: A microservice framework for data pipelines, providing scheduling, retrying, and error reporting.
Home-page: https://github.com/yougov/mettle
Author: YouGov, Plc.
Author-email: opensource@yougov.com
License: UNKNOWN
Description-Content-Type: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2.7
Requires-Python: >=2.7,<3.0
Requires-Dist: Beaker (==1.6.4)
Requires-Dist: croniter (==0.3.5)
Requires-Dist: functools32 (==3.2.3-1)
Requires-Dist: gevent (==1.0.1)
Requires-Dist: gunicorn (==19.1.1)
Requires-Dist: iso8601 (>=0.1.10)
Requires-Dist: pgpubsub (>=0.0.4)
Requires-Dist: psycogreen (==1.0)
Requires-Dist: psycopg2 (==2.5.4)
Requires-Dist: PyYAML (==3.11)
Requires-Dist: spa (==0.0.7)
Requires-Dist: sqlalchemy (==0.9.8)
Requires-Dist: Werkzeug (==0.10.1)
Requires-Dist: mettle-protocol (>=1.0.1)
Requires-Dist: pika (<0.10.0,>=0.9.14)
Requires-Dist: utc
Provides-Extra: docs
Requires-Dist: sphinx; extra == 'docs'
Requires-Dist: jaraco.packaging (>=3.2); extra == 'docs'
Requires-Dist: rst.linker (>=1.9); extra == 'docs'
Provides-Extra: testing
Requires-Dist: pytest (>=2.8); extra == 'testing'
Requires-Dist: pytest-sugar; extra == 'testing'
Requires-Dist: collective.checkdocs; extra == 'testing'

.. image:: https://img.shields.io/pypi/v/mettle.svg
   :target: https://pypi.org/project/mettle

.. image:: https://img.shields.io/pypi/pyversions/mettle.svg

.. image:: https://img.shields.io/pypi/dm/mettle.svg

.. image:: https://img.shields.io/travis/yougov/mettle/master.svg
   :target: http://travis-ci.org/yougov/mettle

Mettle is a framework for managing extract/transform/load (ETL) jobs.

License
=======

License is indicated in the project metadata (typically one or more
of the Trove classifiers). For more details, see `this explanation
<https://github.com/jaraco/skeleton/issues/1>`_.

Description
===========

ETL processes present a number of problems that Mettle is designed to solve:

- Jobs need to be run at specific times.  Sometimes they need to be triggered by
  the completion of other jobs.  Mettle supports scheduling both time-based
  and trigger-based jobs.
- Various people in an organization need to be able to see job schedules and
  the state of recent runs.  Naive scripts running on cron jobs, scattered
  amongst a large number of servers, create a serious problem with visibility.
  Mettle solves this by centralizing the job scheduling, state reporting, and
  log viewing.
- Sometimes jobs fail because of temporary problems somewhere (a flaky network,
  a too-full disk).  Mettle will automatically retry jobs to deal with this.
- Sometimes jobs fail and will not be able to succeed until the job has been
  reconfigured (a changed password on a database, for example).  Mettle makes it
  easy to manually re-launch a job after such issues have been resolved.
- If you try to solve the above problems by centralizing all your ETL execution,
  you quickly run into a problem of proliferating dependencies.  A centralized
  ETL service can become hard to develop and hard to deploy because all those
  dependencies (libraries, external APIs, external databases) introduce more
  instability.  Mettle is designed to isolate those dependencies into separate
  ETL services, so instability in one ETL doesn't impact any others.
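
The distinction drawn above between temporary failures (retried
automatically) and failures that need reconfiguration (re-launched by hand)
can be illustrated with a minimal sketch.  This is not Mettle's
implementation; ``TransientError``, ``run_with_retries``, and their parameters
are hypothetical names chosen only for this example.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (flaky network, full disk)."""


def run_with_retries(job, max_attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Call job() until it succeeds or max_attempts is exhausted.

    Returns (result, attempts_used).  Re-raises the last TransientError when
    every attempt fails.  Any other exception propagates immediately, on the
    assumption that it needs manual intervention rather than a retry.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return job(), attempt
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(delay_seconds)


# Usage: a job that fails twice with a temporary problem, then succeeds.
calls = {"count": 0}


def flaky_job():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary network problem")
    return "loaded"


result, attempts = run_with_retries(flaky_job, max_attempts=5)
```

A real system would likely add a growing delay between attempts and persist
the attempt count, so that a restart does not reset it.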

We picked the name "Mettle" because:

- It's got the letters E, T, and L in it.
- It means "ability to continue despite difficulties".
- It sounds like "metal", which is solid.

Mettle is composed of several components:

- Web UI.  Features:

  - Configure schedules for pipelines.
  - Display past jobs, both successful and failed.
  - Display currently-executing jobs, with live status updates and streaming
    logs.
  - Manually launch jobs.

- Timer: Reads pipeline schedules from the database and sends out RabbitMQ messages
  when pipelines need to be kicked off.
- Dispatcher: Records which jobs are being executed by which workers, and their
  eventual success or failure.
- Logger: Receives log messages sent from ETL Services over RabbitMQ, and saves
  them to Postgres.
- ETL Services: Implement the actual business logic and systems integration to
  move data between systems.

Mettle uses Postgres to store state, and RabbitMQ for inter-process
communication.
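
The division of labour between the Timer and the Dispatcher can be sketched
roughly as follows.  This is an illustration under stated assumptions, not
Mettle's code: a ``deque`` stands in for RabbitMQ, plain dicts stand in for
Postgres tables, and the message fields (``type``, ``pipeline``,
``target_time``, ``ok``) and function names are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta

# Stand-ins for the real infrastructure (assumptions for this sketch):
# a deque instead of RabbitMQ, dicts instead of Postgres tables.
message_queue = deque()
job_state = {}

# Hypothetical schedule table: pipeline name -> next due time and interval.
schedules = {
    "nightly_export": {"next_due": datetime(2015, 1, 1, 2, 0),
                       "every": timedelta(days=1)},
    "hourly_sync": {"next_due": datetime(2015, 1, 1, 3, 0),
                    "every": timedelta(hours=1)},
}


def timer_tick(now):
    """Timer role: announce each pipeline that is due, then reschedule it."""
    for name in sorted(schedules):
        entry = schedules[name]
        if entry["next_due"] <= now:
            message_queue.append({"type": "announce",
                                  "pipeline": name,
                                  "target_time": entry["next_due"]})
            entry["next_due"] = entry["next_due"] + entry["every"]


def dispatcher_drain():
    """Dispatcher role: record which jobs are running and how they ended."""
    while message_queue:
        msg = message_queue.popleft()
        key = (msg["pipeline"], msg["target_time"])
        if msg["type"] == "announce":
            job_state[key] = "running"
        elif msg["type"] == "finished":
            job_state[key] = "succeeded" if msg["ok"] else "failed"


# Usage: at 02:30 only the nightly export is due.
timer_tick(datetime(2015, 1, 1, 2, 30))
dispatcher_drain()

# An ETL service would report completion with a message like this one.
message_queue.append({"type": "finished",
                      "pipeline": "nightly_export",
                      "target_time": datetime(2015, 1, 1, 2, 0),
                      "ok": True})
dispatcher_drain()
```

Keeping the components decoupled through messages is what lets each ETL
service carry its own dependencies without affecting the others.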


