Metadata-Version: 2.1
Name: pg_jsonb_flattener
Version: 0.2.0
Summary: JSONB data flattener for Postgresl
Home-page: https://gitlab.com/askorski/pg_jsonb_flattener
Author: Anthony SKORSKI
Author-email: skorski.anthony+pg_jsonb_flattener@gmail.com
License: MIT
Description: .. |master_pipeline| image:: https://gitlab.skor.ski/askorski/pg_jsonb_flattener/badges/master/pipeline.svg
        .. |master_coverage| image:: https://gitlab.skor.ski/askorski/pg_jsonb_flattener/badges/master/coverage.svg
        .. |develop_pipeline| image:: https://gitlab.skor.ski/askorski/pg_jsonb_flattener/badges/develop/pipeline.svg
        .. |develop_coverage| image:: https://gitlab.skor.ski/askorski/pg_jsonb_flattener/badges/develop/coverage.svg
        
        
        pg_jsonb_flattener
        ##################
        
        
        The `pg_jsonb_flattener` is a tool to ease selection of Postgresql table which
        contains JSONB typed fields.
        
        Provided with a description of the data (a sort of jsonschema) contained in a
        JSONB field, it is able to build a selectable sqlalchemy expression which can
        be used as a plain Postgresql table for SELECT queries.
        
        Status
        ******
        
        +---------+---------------------+---------------------+
        | branch  | CI                  | Coverage            |
        +=========+=====================+=====================+
        | master  | |master_pipeline|   | |master_coverage|   |
        +---------+---------------------+---------------------+
        | develop | |develop_pipeline|  | |develop_coverage|  |
        +---------+---------------------+---------------------+
        
        Usage
        *****
        
        The tool is used by instantiating a `JsonbTable` providing it:
        
        * a sqlaclehmy table description as the only positionnal argument
        * a *JSON data description* for each *JSONB field* to flatten as named arguments
          where the argument name is the *JSONB field* name and the argument value is
          the *JSON data description*.
        
        The code below illustrates how the tool can be used to obtain a
        `flat selectable` from a data description written in a JSON file.
        
        .. code-block:: python
        
           from sqlalchemy import create_engine, MetaData, Table
           from pg_jsonb_flattener import JsonbTable
        
           ENGINE = create_engine('some database URL')
           METADATA = MetaData()
           ONE_JSON_FIELD_TABLE = Table(
               'one_json_field_table', METADATA,
               Column('field_one', JSONB)
           )
           DATA_DESCRIPTION = json.load('path/to/data_description.json')
        
           jsonb_table = JsonbTable(ONE_JSON_FIELD_TABLE, field_one=DATA_DESCRIPTION)
        
           flat_selectable = jsonb_table.get_query()
        
           with engine.connect() as conn:
               result = conn.execute(flat_selectable).fetchall()
        
        Then, the ``flat_selectable`` variable contains a selectable expression which acts
        as a table where the JSONB `field_one` field has been replaced by several
        scalar typed fields whose names are build from the original JSONB field name
        and the path of the data in the data structure, with a double undercore as
        separator.
        
        You can execute directly the ``flat_selectable`` to inspect what it contains
        (this is what is done in the example to get the ``result`` variable) or use it as
        a plain table to build more complex queries.
        
        See below examples for more details.
        
        
        Simple types
        ============
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "integer": {"type": "integer"},
               "number": {"type": "number"},
               "string": {"type": "string"}
               "boolean": {"type": "boolean"}
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__integer``
        * ``field_one__number``
        * ``field_one__string``
        * ``field_one__boolean``
        
        Given the following JSON data in one record::
        
           {
             "integer": 42,
             "number": 3.141592,
             "string": "pouet",
             "boolean": false
           }
        
        The ``result`` variable would contain this:
        
        +-----------------------+----------------------+----------------------+----------------------+
        | field_one__integer    | field_one__number    | field_one__string    | field_one__boolean   |
        +=======================+======================+======================+======================+
        | 42                    | 3.141592             | pouet                | false                |
        +-----------------------+----------------------+----------------------+----------------------+
        
        
        Simple objects
        ==============
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "object": {
                 "type": "object",
                 "properties": {
                   "obj_int": {"type": "integer"},
                   "obj_num": {"type": "number"},
                   "obj_str": {"type": "string"}
                 }
               }
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__object__obj_int``
        * ``field_one__object__obj_num``
        * ``field_one__object__obj_str``
        
        Given the following JSON data in one record::
        
           {
             "object": {
               "obj_int": 42,
               "obj_num": 3.141592,
               "obj_str": "pouet"
             }
           }
        
        The ``result`` variable would contain this:
        
        +----------------------------------+----------------------------------+----------------------------------+
        | field_one__object__obj_int       | field_one__object__obj_num       | field_one__object__obj_str       |
        +==================================+==================================+==================================+
        | 42                               | 3.141592                         | pouet                            |
        +----------------------------------+----------------------------------+----------------------------------+
        
        
        Simple arrays
        =============
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "int_arr": {
                 "type": "array",
                 "items": {"type": "integer"}
               },
               "num_arr": {
                 "type": "array",
                 "items": {"type": "number"}
               },
               "str_arr": {
                 "type": "array",
                 "items": {"type": "string"}
               }
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__int_arr__item``
        * ``field_one__int_arr__item__index``
        * ``field_one__num_arr__item``
        * ``field_one__num_arr__item__index``
        * ``field_one__str_arr__item``
        * ``field_one__str_arr__item__index``
        
        .. note::
        
           As you can see each array field is suffixed by an `item` part and each array
           data is identified by an index which allows to retrieve the order of the
           data in the JSONB array.
        
        Given the following JSON data in one record::
        
           {
             "int_arr": [42, 77, 1664],
             "num_arr": [3.141592, 1.618034, 37.2],
             "str_arr": ["pouet", "foo", "bar"]
           }
        
        The ``result`` variable would contain this:
        
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | field_one__int_arr__item | field_one__int_arr__item__index | field_one__num_arr__item | field_one__num_arr__item__index | field_one__str_arr__item | field_one__str_arr__item__index |
        +==========================+=================================+==========================+=================================+==========================+=================================+
        | 42                       | 1                               | NULL                     | NULL                            | NULL                     | NULL                            |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | 77                       | 2                               | NULL                     | NULL                            | NULL                     | NULL                            |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | 1664                     | 3                               | NULL                     | NULL                            | NULL                     | NULL                            |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | NULL                     | NULL                            | 3.141592                 | 1                               | NULL                     | NULL                            |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | NULL                     | NULL                            | 1.618034                 | 2                               | NULL                     | NULL                            |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | NULL                     | NULL                            | 37.2                     | 3                               | NULL                     | NULL                            |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | NULL                     | NULL                            | NULL                     | NULL                            | pouet                    | 1                               |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | NULL                     | NULL                            | NULL                     | NULL                            | foo                      | 2                               |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        | NULL                     | NULL                            | NULL                     | NULL                            | bar                      | 3                               |
        +--------------------------+---------------------------------+--------------------------+---------------------------------+--------------------------+---------------------------------+
        
        
        Array of objects
        ================
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "array": {
                 "type": "array",
                 "items": {
                   "type": "object",
                   "properties": {
                     "obj_int": {"type": "integer"},
                     "obj_num": {"type": "number"},
                     "obj_str": {"type": "string"}
                   }
                 }
               }
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__array__item__index``
        * ``field_one__array__item__obj_int``
        * ``field_one__array__item__obj_num``
        * ``field_one__array__item__obj_str``
        
        Given the following JSON data in one record::
        
           {
             "array": [
               {
                 "obj_int": 42,
                 "obj_num": 3.141592,
                 "obj_str": "pouet"
               },
               {
                 "obj_int": 77,
                 "obj_num": 1.618034,
                 "obj_str": "toto"
               }
             ]
           }
        
        The ``result`` variable would contain this:
        
        +-------------------------------+---------------------------------+---------------------------------+---------------------------------+
        | field_one__array__item__index | field_one__array__item__obj_int | field_one__array__item__obj_num | field_one__array__item__obj_str |
        +===============================+=================================+=================================+=================================+
        | 1                             | 42                              | 3.141592                        | pouet                           |
        +-------------------------------+---------------------------------+---------------------------------+---------------------------------+
        | 2                             | 77                              | 1.618034                        | toto                            |
        +-------------------------------+---------------------------------+---------------------------------+---------------------------------+
        
        
        Arrays in object
        ================
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "object": {
                 "type": "object",
                 "properties": {
                   "int_arr": {
                     "type": "array",
                     "items": {"type": "integer"}
                   },
                   "num_arr": {
                     "type": "array",
                     "items": {"type": "number"}
                   },
                   "str_arr": {
                     "type": "array",
                     "items": {"type": "string"}
                   }
                 }
               }
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__object__int_arr__item``
        * ``field_one__object__int_arr__item__index``
        * ``field_one__object__num_arr__item``
        * ``field_one__object__num_arr__item__index``
        * ``field_one__object__str_arr__item``
        * ``field_one__object__str_arr__item__index``
        
        Given the following JSON data in one record::
        
           {
             "object": {
               "int_arr": [42, 77, 1664],
               "num_arr": [3.141592, 1.618034, 37.2],
               "str_arr": ["pouet", "foo", "bar"]
             }
           }
        
        The ``result`` variable would contain this:
        
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | field_one__object__int_arr__item | field_one__object__int_arr__item__index | field_one__object__num_arr__item | field_one__object__num_arr__item__index | field_one__object__str_arr__item | field_one__object__str_arr__item__index |
        +==================================+=========================================+==================================+=========================================+==================================+=========================================+
        | 42                               | 1                                       | NULL                             | NULL                                    | NULL                             | NULL                                    |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | 77                               | 2                                       | NULL                             | NULL                                    | NULL                             | NULL                                    |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | 1664                             | 3                                       | NULL                             | NULL                                    | NULL                             | NULL                                    |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | NULL                             | NULL                                    | 3.141592                         | 1                                       | NULL                             | NULL                                    |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | NULL                             | NULL                                    | 1.618034                         | 2                                       | NULL                             | NULL                                    |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | NULL                             | NULL                                    | 37.2                             | 3                                       | NULL                             | NULL                                    |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | NULL                             | NULL                                    | NULL                             | NULL                                    | pouet                            | 1                                       |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | NULL                             | NULL                                    | NULL                             | NULL                                    | foo                              | 2                                       |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        | NULL                             | NULL                                    | NULL                             | NULL                                    | bar                              | 3                                       |
        +----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+----------------------------------+-----------------------------------------+
        
        
        Nested objects
        ==============
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "object": {
                 "type": "object",
                 "properties": {
                   "nested": {
                     "type": "object",
                     "properties": {
                       "obj_int": {"type": "integer"},
                       "obj_num": {"type": "number"},
                       "obj_str": {"type": "string"}
                     }
                   }
                 }
               }
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__object__nested__obj_int``
        * ``field_one__object__nested__obj_num``
        * ``field_one__object__nested__obj_str``
        
        Given the following JSON data in one record::
        
           {
             "object": {
               "nested": {
                 "obj_int": 42,
                 "obj_num": 3.141592,
                 "obj_str": "pouet"
               }
             }
           }
        
        The ``result`` variable would contain this:
        
        +------------------------------------+------------------------------------+------------------------------------+
        | field_one__object__nested__obj_int | field_one__object__nested__obj_num | field_one__object__nested__obj_str |
        +====================================+====================================+====================================+
        | 42                                 | 3.141592                           | pouet                              |
        +------------------------------------+------------------------------------+------------------------------------+
        
        
        Nested arrays
        =============
        
        Given the following data description::
        
           {
             "type": "object",
             "properties": {
               "nested_arrays": {
                 "type": "array",
                 "items": {
                   "type": "array",
                   "items": {"type": "string"}
                 }
               }
             }
           }
        
        ``flat_selectable`` would provide the following fields:
        
        * ``field_one__nested_arrays__item__index``
        * ``field_one__nested_arrays__item__item__index``
        * ``field_one__nested_arrays__item__item``
        
        Given the following JSON data in one record::
        
           {
             "nested_arrays": [
               ["abc", "def"],
               ["tuw", "xyz"]
             ]
           }
        
        The ``result`` variable would contain this:
        
        +---------------------------------------+---------------------------------------------+--------------------------------------+
        | field_one__nested_arrays__item__index | field_one__nested_arrays__item__item__index | field_one__nested_arrays__item__item |
        +=======================================+=============================================+======================================+
        | 1                                     | 1                                           | abc                                  |
        +---------------------------------------+---------------------------------------------+--------------------------------------+
        | 1                                     | 2                                           | def                                  |
        +---------------------------------------+---------------------------------------------+--------------------------------------+
        | 2                                     | 1                                           | tuw                                  |
        +---------------------------------------+---------------------------------------------+--------------------------------------+
        | 2                                     | 2                                           | xyz                                  |
        +---------------------------------------+---------------------------------------------+--------------------------------------+
        
        
        Data "preservation"
        ===================
        
        the `pg_jsonb_flattener` tool allows to specify chunks of JSON data not to be
        flattened by providing a full name of data to preserve in a list passed at
        `JsonbTable` instantiation via the `preserved` optional argument. See examples
        below with nested arrays:
        
        .. code-block:: python
        
           jsonb_table = JsonbTable(ONE_JSON_FIELD_TABLE, field_one=DATA_DESCRIPTION,
                                    preserved=['field_one__nested_arrays__item'])
        
           flat_selectable = jsonb_table.get_query()
        
           with engine.connect() as conn:
               result = conn.execute(flat_selectable).fetchall()
        
        In this case, the `flat_selectable` would provide the following fields:
        
        * ``field_one__nested_arrays__item__index``
        * ``field_one__nested_arrays__item``
        
        The ``result`` variable would contain this:
        
        +---------------------------------------+--------------------------------+
        | field_one__nested_arrays__item__index | field_one__nested_arrays__item |
        +=======================================+================================+
        | 1                                     | ["abc", "def"]                 |
        +---------------------------------------+--------------------------------+
        | 2                                     | ["tuw", "xyz"]                 |
        +---------------------------------------+--------------------------------+
        
        
        This also works for the nesting array and preserving the
        `field_one__nested_arrays` would result in the  following fields and result:
        
        * ``field_one__nested_arrays``
        
        +----------------------------------+
        | field_one__nested_arrays         |
        +==================================+
        | [["abc", "def"], ["tuw", "xyz"]] |
        +----------------------------------+
        
        
        Objects can also be preserved, see examples with nested objects.
        
        Preserving `field_one__object__nested` gives the following fields and result:
        
        * ``field_one__object__nested``
        
        +----------------------------------------------------------+
        | field_one__object__nested                                |
        +==========================================================+
        | {'obj_int': 42, 'obj_num': 3.141592, 'obj_str': 'pouet'} |
        +----------------------------------------------------------+
        
        Preserving `field_one__object` gives the following fields and result:
        
        * ``field_one__object``
        
        +----------------------------------------------------------------------+
        | field_one__object__nested                                            |
        +======================================================================+
        | {'nested': {'obj_int': 42, 'obj_num': 3.141592, 'obj_str': 'pouet'}} |
        +----------------------------------------------------------------------+
        
        .. note::
        
           Of course, it is possible to preserve objects in arrays and arrays in objects
        
Keywords: postgresql,jsonb,sqlalchemy
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 2
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Database
Provides-Extra: dev
Provides-Extra: test
