Metadata-Version: 2.0
Name: jsontableschema
Version: 0.8.2
Summary: A utility library for working with JSON Table Schema in Python
Home-page: https://github.com/frictionlessdata/jsontableschema-py
Author: Open Knowledge Foundation
Author-email: info@okfn.org
License: MIT
Keywords: frictionless data,open data,json schema,json table schema,data package,tabular data package
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Dist: click (>=3.3,<7.0a)
Requires-Dist: future (>=0.15,<1.0a)
Requires-Dist: jsonschema (>=2.5,<3.0a)
Requires-Dist: python-dateutil (>=2.4,<3.0a)
Requires-Dist: requests (>=2.5,<3.0a)
Requires-Dist: rfc3986 (>=0.4,<1.0a)
Requires-Dist: tabulator (>=0.7,<1.0a)
Requires-Dist: unicodecsv (>=0.14,<1.0a)
Provides-Extra: develop
Requires-Dist: pylama; extra == 'develop'
Requires-Dist: tox; extra == 'develop'

JSON Table Schema
=================

| |Travis|
| |Coveralls|
| |PyPi|
| |SemVer|
| |Gitter|

A utility library for working with `JSON Table
Schema <http://dataprotocols.org/json-table-schema/>`__ in Python.

    Version 0.7 introduced a renewed API in a backward-compatible
    manner. Documentation for the deprecated API can be found
    `here <https://github.com/frictionlessdata/jsontableschema-py/tree/0.6.5#json-table-schema>`__.
    The deprecated API will be removed in the v1 release.

Features
--------

-  ``Table`` to work with data tables described by JSON Table Schema
-  ``Schema`` representing JSON Table Schema
-  ``Field`` representing JSON Table Schema field
-  ``validate`` to validate JSON Table Schema
-  ``infer`` to infer JSON Table Schema from data
-  built-in command-line interface to validate and infer schemas
-  storage/plugins system to connect tables to different storage
   backends like SQL Database

Getting Started
---------------

Installation
~~~~~~~~~~~~

.. code:: bash

    pip install jsontableschema

Example
~~~~~~~

.. code:: python

    from jsontableschema import Table

    # Create table
    table = Table('path.csv', schema='schema.json')

    # Print schema descriptor
    print(table.schema.descriptor)

    # Print cast rows in a dict form
    for keyed_row in table.iter(keyed=True):
        print(keyed_row)

Table
~~~~~

Table represents data described by JSON Table Schema:

.. code:: python

    # pip install sqlalchemy jsontableschema-sql
    import sqlalchemy as sa
    from pprint import pprint
    from jsontableschema import Table

    # Data source
    SOURCE = 'https://raw.githubusercontent.com/okfn/jsontableschema-py/master/data/data_infer.csv'

    # Create SQL database
    db = sa.create_engine('sqlite://')

    # Data processor
    def skip_under_30(erows):
        for number, headers, row in erows:
            krow = dict(zip(headers, row))
            if krow['age'] >= 30:
                yield (number, headers, row)

    # Work with table
    table = Table(SOURCE, post_cast=[skip_under_30])
    table.schema.save('tmp/persons.json') # Save INFERRED schema
    table.save('persons', backend='sql', engine=db) # Save data to SQL
    table.save('tmp/persons.csv')  # Save data to DRIVE

    # Check the result
    pprint(Table('persons', backend='sql', engine=db).read(keyed=True))
    pprint(Table('tmp/persons.csv').read(keyed=True))
    # Will print (twice)
    # [{'age': 39, 'id': 1, 'name': 'Paul'},
    #  {'age': 36, 'id': 3, 'name': 'Jane'}]
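
The ``post_cast`` processor above is a plain generator over extended
rows. As a minimal sketch that needs neither the library nor a network
connection, the same filter can be tried on in-memory data. The row
numbers, the sample rows and the ``apply_processors`` helper are
assumptions made for illustration; they are not part of the library API.

.. code:: python

    # Illustrative extended rows: (row number, headers, cast row values).
    HEADERS = ['id', 'age', 'name']
    EXTENDED_ROWS = [
        (1, HEADERS, [1, 39, 'Paul']),
        (2, HEADERS, [2, 23, 'Jimmy']),
        (3, HEADERS, [3, 36, 'Jane']),
        (4, HEADERS, [4, 28, 'Judy']),
    ]

    def skip_under_30(erows):
        """Yield only the extended rows whose ``age`` is 30 or more."""
        for number, headers, row in erows:
            krow = dict(zip(headers, row))
            if krow['age'] >= 30:
                yield (number, headers, row)

    def apply_processors(erows, processors):
        """Hypothetical helper: chain processors in the given order."""
        for processor in processors:
            erows = processor(erows)
        return erows

    kept = [row for _, _, row in
            apply_processors(iter(EXTENDED_ROWS), [skip_under_30])]
    print(kept)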

Schema
~~~~~~

A model of a schema with helpful methods for working with the schema and
supported data. Schema instances can be initialized with a schema source
as a filepath or url to a JSON file, or a Python dict. The schema is
initially validated (see `validate <#validate>`__ below), and will raise
an exception if not a valid JSON Table Schema.

.. code:: python

    from jsontableschema import Schema

    # Init schema
    schema = Schema('path.json')

    # Cast a row
    schema.cast_row(['12345', 'a string', 'another field'])

Methods available to ``Schema`` instances:

-  ``descriptor`` - return schema descriptor
-  ``fields`` - an array of the schema's Field instances
-  ``headers`` - an array of the schema headers
-  ``primary_key`` - the primary key field for the schema as an array
-  ``foreign_keys`` - the foreign keys property for the schema as an array
-  ``get_field(name)`` - return the field object for given name
-  ``has_field(name)`` - return a bool if the field exists in the schema
-  ``cast_row(row, no_fail_fast=False)`` - return row cast against
   schema
-  ``save(target)`` - save schema to filesystem

When the ``no_fail_fast`` option is set, ``cast_row`` collects all the
errors it encounters and raises a single ``exceptions.MultipleInvalid``
(if there are any errors).
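
The difference between the two modes can be sketched in plain Python.
This illustrates the collect-then-raise pattern only; it is not the
library's implementation. The ``MultipleInvalid`` class here is a local
stand-in, and the ``casters`` argument is hypothetical.

.. code:: python

    class MultipleInvalid(Exception):
        """Local stand-in that carries every collected error."""

        def __init__(self, errors):
            super(MultipleInvalid, self).__init__(
                '%d error(s)' % len(errors))
            self.errors = errors

    def cast_row(row, casters, no_fail_fast=False):
        """Cast each value; stop at the first error or collect them all."""
        result, errors = [], []
        for value, caster in zip(row, casters):
            try:
                result.append(caster(value))
            except ValueError as error:
                if not no_fail_fast:
                    raise  # fail fast: the first error propagates
                errors.append(error)
        if errors:
            raise MultipleInvalid(errors)
        return result

    try:
        cast_row(['x', 'a string', 'y'], [int, str, int],
                 no_fail_fast=True)
    except MultipleInvalid as exception:
        collected = exception.errors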

Field
~~~~~

.. code:: python

    from jsontableschema import Field

    # Init field
    field = Field({'type': 'number'})

    # Cast a value
    field.cast_value('12345') # -> 12345

Data values can be cast to native Python objects with a Field instance.
Field instances can be initialized with `field
descriptors <http://dataprotocols.org/json-table-schema/#field-descriptors>`__.
This allows formats and constraints to be defined.

Casting a value checks that the value is of the expected type, is in the
correct format, and complies with any constraints imposed by the schema.
For example, a date value (in ISO 8601 format) can be cast with a field
of type ``date``. Values that can't be cast will raise an
``InvalidCastError`` exception.

Casting a value that doesn't meet the constraints will raise a
``ConstraintError`` exception.
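
The two failure modes can be sketched without the library. The exception
classes below are local stand-ins that reuse the names mentioned above,
and ``cast_integer`` is a hypothetical helper that checks only the
``minimum`` constraint; the real ``Field`` supports more types and
constraints.

.. code:: python

    class InvalidCastError(Exception):
        """Stand-in: the value cannot be converted to the field type."""

    class ConstraintError(Exception):
        """Stand-in: the value converts but violates a constraint."""

    def cast_integer(value, constraints=None):
        """Cast ``value`` to int, then check an optional ``minimum``."""
        constraints = constraints or {}
        try:
            result = int(value)
        except (TypeError, ValueError):
            raise InvalidCastError('%r is not an integer' % (value,))
        minimum = constraints.get('minimum')
        if minimum is not None and result < minimum:
            raise ConstraintError(
                '%d is less than minimum %d' % (result, minimum))
        return result

    age = cast_integer('39', {'minimum': 18})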

validate
~~~~~~~~

Given a schema as a JSON file, a URL to a JSON file, or a Python dict,
``validate`` returns ``True`` for a valid JSON Table Schema, or raises
a ``SchemaValidationError`` exception. It validates only the **schema**,
not data against the schema.

.. code:: python

    import io
    import json

    import jsontableschema

    with io.open('schema_to_validate.json') as stream:
        descriptor = json.load(stream)

    try:
        jsontableschema.validate(descriptor)
    except jsontableschema.exceptions.SchemaValidationError as exception:
        # Handle the error, e.g. report it
        print(exception)

It may be useful to report multiple errors when validating a schema.
This can be done by setting the ``no_fail_fast`` flag to ``True``.

.. code:: python

    try:
        jsontableschema.validate(descriptor, no_fail_fast=True)
    except jsontableschema.exceptions.MultipleInvalid as exception:
        for error in exception.errors:
            # Handle each error, e.g. report it
            print(error)

infer
~~~~~

Given headers and data, ``infer`` will return a JSON Table Schema as a
Python dict based on the data values. Given the data file,
data\_to\_infer.csv:

::

    id,age,name
    1,39,Paul
    2,23,Jimmy
    3,36,Jane
    4,28,Judy

Call ``infer`` with headers and values from the datafile:

.. code:: python

    import io
    import csv

    from jsontableschema import infer

    filepath = 'data_to_infer.csv'
    with io.open(filepath) as stream:
        headers = stream.readline().rstrip('\n').split(',')
        # Read the rows while the file is still open
        values = list(csv.reader(stream))

    schema = infer(headers, values)

``schema`` is now a schema dict:

.. code:: python

    {u'fields': [
        {
            u'description': u'',
            u'format': u'default',
            u'name': u'id',
            u'title': u'',
            u'type': u'integer'
        },
        {
            u'description': u'',
            u'format': u'default',
            u'name': u'age',
            u'title': u'',
            u'type': u'integer'
        },
        {
            u'description': u'',
            u'format': u'default',
            u'name': u'name',
            u'title': u'',
            u'type': u'string'
        }]
    }

The number of rows used by ``infer`` can be limited with the
``row_limit`` argument.
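
The general idea of inferring types from a limited number of rows can be
sketched as follows. This is a simplified illustration under stated
assumptions, not the library's algorithm: ``guess_type`` and
``infer_sketch`` are hypothetical names, and only the ``integer``,
``number`` and ``string`` types are considered.

.. code:: python

    import itertools

    def guess_type(value):
        """Return the narrowest type name that can represent ``value``."""
        for name, cast in (('integer', int), ('number', float)):
            try:
                cast(value)
            except ValueError:
                continue
            return name
        return 'string'

    def infer_sketch(headers, values, row_limit=None):
        """Build a descriptor from at most ``row_limit`` rows."""
        rows = list(itertools.islice(values, row_limit))
        fields = []
        for index, header in enumerate(headers):
            guesses = set(guess_type(row[index]) for row in rows)
            if guesses == {'integer'}:
                field_type = 'integer'
            elif guesses and guesses <= {'integer', 'number'}:
                field_type = 'number'
            else:
                field_type = 'string'
            fields.append({'name': header, 'type': field_type})
        return {'fields': fields}

    ROWS = [['1', '39', 'Paul'], ['2', '23', 'Jimmy'],
            ['3', '36', 'Jane'], ['4', '28', 'Judy']]
    descriptor = infer_sketch(['id', 'age', 'name'], iter(ROWS),
                              row_limit=2)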

CLI
~~~

    The CLI is a provisional API excluded from SemVer. If you use it as
    part of another program, please pin a concrete ``jsontableschema``
    version in your requirements file.

JSON Table Schema features a CLI called ``jsontableschema``. This CLI
exposes the ``infer`` and ``validate`` functions for command line use.

Example of ``validate`` usage:

::

    $ jsontableschema validate path/to/schema.json

Example of ``infer`` usage:

::

    $ jsontableschema infer path/to/data.csv

The response is a schema as JSON. The optional argument ``--encoding``
allows a character encoding to be specified for the data file. The
default is utf-8.

Storage
~~~~~~~

The library includes an interface declaration for implementing tabular
``Storage``:

|Storage|

An implementor should follow the ``jsontableschema.Storage`` interface
to write their own storage backend. This backend can then be used with
the ``Table`` class. See the ``plugins`` system below to learn how to
integrate a custom storage plugin.
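
As a rough sketch of what such a backend might look like, the class
below keeps everything in memory and mirrors the method names listed
for ``Storage`` in the API snapshot. It deliberately does not subclass
the real interface, and the exact semantics of ``force`` and ``ignore``
are assumptions made for illustration.

.. code:: python

    class MemoryStorage(object):
        """Illustrative in-memory backend (not the library's class)."""

        def __init__(self, **options):
            self._descriptors = {}
            self._rows = {}

        @property
        def buckets(self):
            return sorted(self._descriptors)

        def create(self, bucket, descriptor, force=False):
            # Assumption: ``force`` allows replacing an existing bucket.
            if bucket in self._descriptors and not force:
                raise RuntimeError('Bucket "%s" already exists' % bucket)
            self._descriptors[bucket] = descriptor
            self._rows[bucket] = []

        def delete(self, bucket=None, ignore=False):
            # Assumption: no bucket means "delete all"; ``ignore``
            # silences the error for a missing bucket.
            names = self.buckets if bucket is None else [bucket]
            for name in names:
                if name not in self._descriptors:
                    if ignore:
                        continue
                    raise RuntimeError(
                        'Bucket "%s" does not exist' % name)
                del self._descriptors[name]
                del self._rows[name]

        def describe(self, bucket, descriptor=None):
            if descriptor is not None:
                self._descriptors[bucket] = descriptor
            return self._descriptors[bucket]

        def iter(self, bucket):
            for row in self._rows[bucket]:
                yield row

        def read(self, bucket):
            return list(self.iter(bucket))

        def write(self, bucket, rows):
            self._rows[bucket].extend(rows)

    storage = MemoryStorage()
    storage.create('persons', {'fields': [{'name': 'id'}]})
    storage.write('persons', [[1], [3]])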

plugins
~~~~~~~

JSON Table Schema has a plugin system. Any package named
``jsontableschema_<name>`` can be imported as:

.. code:: python

    from jsontableschema.plugins import <name>

If a plugin is not installed, ``ImportError`` will be raised with a
message describing how to install the plugin.
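
One way such a lookup could work is sketched below with ``importlib``.
The ``load_plugin`` function is hypothetical and only illustrates the
naming convention and the helpful ``ImportError``; the library's own
import machinery may differ. The hyphenated distribution name follows
the ``pip install ... jsontableschema-sql`` comment in the ``Table``
example.

.. code:: python

    import importlib

    def load_plugin(name):
        """Import ``jsontableschema_<name>`` or explain how to get it."""
        module_name = 'jsontableschema_' + name
        try:
            return importlib.import_module(module_name)
        except ImportError:
            raise ImportError(
                'Plugin "%s" is not installed. Try "pip install %s".'
                % (name, module_name.replace('_', '-')))

    try:
        load_plugin('no_such_backend_xyz')
    except ImportError as exception:
        message = str(exception)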

A list of officially supported plugins:

-  BigQuery Storage -
   https://github.com/frictionlessdata/jsontableschema-bigquery-py
-  Pandas Storage -
   https://github.com/frictionlessdata/jsontableschema-pandas-py
-  SQL Storage -
   https://github.com/frictionlessdata/jsontableschema-sql-py

API Reference
-------------

Snapshot
~~~~~~~~

::

    Table(source, schema=None, post_cast=None, backend=None, **options)
        stream -> tabulator.Stream
        schema -> Schema
        name -> str
        iter(keyed/extended=False) -> (generator) (keyed/extended)row[]
        read(keyed/extended=False, limit=None) -> (keyed/extended)row[]
        save(target, backend=None, **options)
    Schema(descriptor)
        descriptor -> dict
        fields -> Field[]
        headers -> str[]
        primary_key -> str[]
        foreign_keys -> str[]
        get_field(name) -> Field
        has_field(name) -> bool
        cast_row(row, no_fail_fast=False) -> row
        save(target)
    Field(descriptor)
        descriptor -> dict
        name -> str
        type -> str
        format -> str
        constraints -> dict
        cast_value(value, skip_constraints=False) -> value
        test_value(value, skip_constraints=False, constraint=None) -> bool
    validate(descriptor, no_fail_fast=False) -> bool
    infer(headers, values) -> descriptor
    exceptions
    ~cli
    ---
    Storage(**options)
        buckets -> str[]
        create(bucket, descriptor, force=False)
        delete(bucket=None, ignore=False)
        describe(bucket, descriptor=None) -> descriptor
        iter(bucket) -> (generator) row[]
        read(bucket) -> row[]
        write(bucket, rows)
    plugins

Detailed
~~~~~~~~

-  `Docstrings <https://github.com/frictionlessdata/jsontableschema-py/tree/master/jsontableschema>`__
-  `Changelog <https://github.com/frictionlessdata/jsontableschema-py/commits/master>`__

Contributing
------------

Please read the contribution guideline:

`How to Contribute <CONTRIBUTING.md>`__

Thanks!

.. |Travis| image:: https://travis-ci.org/frictionlessdata/jsontableschema-py.svg?branch=master
   :target: https://travis-ci.org/frictionlessdata/jsontableschema-py
.. |Coveralls| image:: http://img.shields.io/coveralls/frictionlessdata/jsontableschema-py.svg?branch=master
   :target: https://coveralls.io/r/frictionlessdata/jsontableschema-py?branch=master
.. |PyPi| image:: https://img.shields.io/pypi/v/jsontableschema.svg
   :target: https://pypi.python.org/pypi/jsontableschema
.. |SemVer| image:: https://img.shields.io/badge/versions-SemVer-brightgreen.svg
   :target: http://semver.org/
.. |Gitter| image:: https://img.shields.io/gitter/room/frictionlessdata/chat.svg
   :target: https://gitter.im/frictionlessdata/chat
.. |Storage| image:: files/storage.png

