Metadata-Version: 2.1
Name: nob
Version: 0.1.0
Summary: Nested OBject manipulations
Home-page: https://gitlab.com/cerfacs/nob
Author-email: lapeyre@cerfacs.fr
License: UNKNOWN
Keywords: JSON,YAML,Nested Object
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown

# nob: the Nested OBject manipulator

JSON is a very popular format for nested data exchange. Object Relational
Mapping (ORM) is a popular method to help developers make sense of large JSON
objects, by mapping objects to the data. In some cases however, the nesting
can be very deep, and difficult to map with objects. This is where nob can be
useful: it offers a simple set of tools to explore and edit any nested data
(Python native dicts and lists).

For more, check out the source page at gitlab.com/cerfacs/nob.

## Usage

### Instantiation

`nob.Tree` objects can be instantiated directly from a Python dictionary:

    from nob import Tree

    t = Tree({
        'key1': 'val1',
        'key2': {
            'key3': 4,
            'key4': {'key5': 'val2'},
            'key5': [3, 4, 5]
            },
        'key5': 'val3'
        })

To create a `Tree` from a JSON (or YAML) file, simply read it in:


    import json
    with open('file.json') as fh:
        t2 = Tree(json.load(fh))

    import yaml
    with open('file.yml') as fh:
        t3 = Tree(yaml.safe_load(fh))

### Basic manipulation

The variable `t` now holds a tree, *i.e.* the reference to the actual data. However,
in many practical cases it is useful to work with a subtree. `nob` offers the
`TreeView` class to this end. For the most part it behaves like the main tree,
but changes performed on a `TreeView` affect the main `Tree` instance that it is linked
to. In practice, any access to a key of `t` yields a `TreeView` instance, *e.g.*:

    tv1 = t['/key1']         # TreeView(/key1)
    tv2 = t['key1']          # TreeView(/key1)
    tv3 = t.key1             # TreeView(/key1)
    tv1 == tv2 == tv3        # True

Note that both a *global path* `'/key1'` and a simple key `'key1'` are valid
identifiers. Simple keys can also be accessed as attributes, using `t.key1`.

To access the actual value that is stored in the nested object, simply use the `.val`
attribute:

    tv1.val                  >>> 'val1'
    t.key1.val               >>> 'val1'

To assign a new value to this node, you can do it directly on the TreeView instance:

    t.key1 = 'new'
    print(tv1.val)           >>> 'new'
    print(t.val['key1'])     >>> 'new'

Of course, because of how Python variables work, you cannot simply assign the value to
`tv1`, as this would just rebind the name `tv1` and leave the tree untouched:

    tv1 = 'new'
    print(tv1)               >>> 'new'
    print(t.val['key1'])     >>> 'val1'

If you find yourself with a `TreeView` object that you would like to edit directly,
you can use the `.set` method:

    tv2.set('new')
    print(t.val['key1'])     >>> 'new'
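
The difference between rebinding a name and writing through a view can be
illustrated without `nob` at all. The snippet below is a minimal sketch in plain
Python, not `nob`'s implementation: `NodeView` is a hypothetical stand-in that
keeps a reference to the shared data plus a list of keys, so `.set` mutates the
data in place.

```python
class NodeView:
    """Hypothetical stand-in for a view: shared root data plus a list of keys."""

    def __init__(self, root, keys):
        self._root = root          # reference to the full nested data, not a copy
        self._keys = list(keys)

    def _parent(self):
        # Walk down to the container that holds the last key.
        node = self._root
        for key in self._keys[:-1]:
            node = node[key]
        return node

    @property
    def val(self):
        return self._parent()[self._keys[-1]]

    def set(self, value):
        # Mutate the shared data in place instead of rebinding a name.
        self._parent()[self._keys[-1]] = value


data = {'key1': 'val1', 'key2': {'key3': 4}}

view = NodeView(data, ['key1'])
view.set('new')            # writes through to `data`

other = NodeView(data, ['key2', 'key3'])
other = 'ignored'          # only rebinds the name `other`; `data` is untouched
```

After running this, `data['key1']` holds `'new'` while `data['key2']['key3']`
is still `4`.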

Because nested objects can contain both dicts and lists, integers are sometimes needed
as keys:

    t['/key2/key5/0']        >>> TreeView(/key2/key5/0)
    t.key2.key5[0]           >>> TreeView(/key2/key5/0)
    t.key2.key5['0']         >>> TreeView(/key2/key5/0)

However, since Python does not support attributes starting with an integer, there is
no attribute support for lists. Only key access (both global and local) is supported.
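
To see why both `0` and `'0'` can address a list element, here is a small
sketch of how a path string could be resolved against plain nested data. The
`resolve` function is a hypothetical helper written for this illustration, and
it is only assumed to resemble what `nob` does internally.

```python
def resolve(data, path):
    """Follow a '/'-separated path through nested dicts and lists."""
    node = data
    for key in path.strip('/').split('/'):
        if key == '':
            continue               # the root path '/' contains no keys
        if isinstance(node, list):
            key = int(key)         # list positions arrive as strings in a path
        node = node[key]
    return node


data = {
    'key1': 'val1',
    'key2': {
        'key3': 4,
        'key4': {'key5': 'val2'},
        'key5': [3, 4, 5],
    },
    'key5': 'val3',
}

first = resolve(data, '/key2/key5/0')
nested = resolve(data, '/key2/key4/key5')
```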

### Smart key access

In a simple nested dictionary, the access to `'key1'` would be simply done with:

    nested_dict['key1']

If you are looking for *e.g.* `key3`, you would need to write:

    nested_dict['key2']['key3']

For deep nested objects however, this can be a chore, and become very difficult to
read. `nob` helps you here by supplying a smart method for finding unique keys:

    t['key3']                >>> TreeView(/key2/key3)
    t.key3                   >>> TreeView(/key2/key3)

Note that attribute access `t.key3` behaves like simple key access `t['key3']`. This
has some implications when the key is not unique in the tree. Let's say *e.g.* we wish
to access `key5`. Let's try using attribute access:

    t.key5                   >>> KeyError: Address key5 yielded 3 results instead of 1

Oops! Because `key5` is not unique (it appears 3 times in the tree), `t.key5` is not
specific, and `nob` wouldn't know which one to return. In this instance, we have
several possibilities, depending on which `key5` we are looking for:

    t.key4.key5              >>> TreeView(/key2/key4/key5)
    t.key2.key5              >>> TreeView(/key2/key5)
    t['/key5']               >>> TreeView(/key5)

There is a bit to unpack here:

  - The first `key5` is unique in the `TreeView` `t.key4` (and `key4` is itself
    unique), so `t.key4.key5` finds it correctly.
  - The second is similar, except all keys in the path end up being needed. An
    equivalent call could have been with global calls: `t['/key2/key5']`.
  - The last cannot be resolved using keys in its path, because there are none. The 
    only solution is to use a global path.
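
The uniqueness rule described above can be sketched in plain Python. This is
an illustration under assumptions, not `nob`'s code: `find_key` and
`unique_path` are hypothetical helpers, and the sketch does not try to
reproduce every resolution rule of the library. It only shows the idea of
searching a subtree for a key and refusing to guess when several matches exist.

```python
def find_key(node, key, prefix=()):
    """Collect every key tuple below `node` whose last element is `key`."""
    hits = []
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return hits                # leaf value: nothing below it
    for name, child in children:
        here = prefix + (name,)
        if name == key:
            hits.append(here)
        hits.extend(find_key(child, key, here))
    return hits


def unique_path(node, key):
    """Return the single match for `key`, or raise if it is ambiguous."""
    hits = find_key(node, key)
    if len(hits) != 1:
        raise KeyError(f"Address {key} yielded {len(hits)} results instead of 1")
    return hits[0]


data = {
    'key1': 'val1',
    'key2': {
        'key3': 4,
        'key4': {'key5': 'val2'},
        'key5': [3, 4, 5],
    },
    'key5': 'val3',
}

path_to_key3 = unique_path(data, 'key3')                    # unique in the tree
path_in_key4 = unique_path(data['key2']['key4'], 'key5')    # unique in the subtree
```

Calling `unique_path(data, 'key5')` raises a `KeyError`, since the key occurs
three times. Narrowing the search to a subtree first, as in the last line, is
what makes the lookup unambiguous.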

## Other tree tools

Any `Tree` (or `TreeView`) object can introspect itself to find all its valid paths
leading to actual data:

    t.paths                  >>> [Path('/'),
                                  Path('/key1'),
                                  Path('/key2'),
                                  Path('/key2/key3'),
                                  Path('/key2/key4'),
                                  Path('/key2/key4/key5'),
                                  Path('/key2/key5'),
                                  Path('/key2/key5/0'),
                                  Path('/key2/key5/1'),
                                  Path('/key2/key5/2'),
                                  Path('/key5')]

In order to easily search in this path list, the `.find` method is available:

    t.find('key5')           >>> [Path('/key2/key4/key5'),
                                  Path('/key2/key5'),
                                  Path('/key5')]

The elements of these lists are not strings, but `Path` objects, as described below.
If you wish to loop over all children of a given node, another option is to
do so directly:

    [tv for tv in t.key2]    >>> [TreeView(/key2/key3),
                                  TreeView(/key2/key4),
                                  TreeView(/key2/key5)]
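
For readers curious about how such a path list can be built, the following is
a minimal sketch over plain dicts and lists. `iter_paths` and `find` are
hypothetical helpers that work with path strings rather than `Path` objects;
they are only assumed to mirror the behaviour shown above.

```python
def iter_paths(node, prefix=''):
    """Yield a '/'-separated path string for every node, depth first."""
    yield prefix or '/'            # the empty prefix stands for the root
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return                     # leaf value: no children to visit
    for key, child in children:
        yield from iter_paths(child, f'{prefix}/{key}')


def find(node, key):
    """Return all paths whose final component equals `key`."""
    return [p for p in iter_paths(node) if p.rsplit('/', 1)[-1] == key]


data = {
    'key1': 'val1',
    'key2': {
        'key3': 4,
        'key4': {'key5': 'val2'},
        'key5': [3, 4, 5],
    },
    'key5': 'val3',
}

all_paths = list(iter_paths(data))
key5_paths = find(data, 'key5')
```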

## Path

All paths are stored internally using the `nob.Path` class. Paths are global
(w.r.t. their `Tree` or `TreeView`), and are in essence a list of the keys
constituting the nested address. They can however be viewed equivalently as
a unix-type path string with `/` separators. Here are some examples:

    p1 = Path(['key1'])
    p1                       >>> Path(/key1)
    p2 = Path('/key1/key2')
    p2                       >>> Path(/key1/key2)
    p1 / 'key3'              >>> Path(/key1/key3)
    p2.parent                >>> Path(/key1)
    p2.parent == p1          >>> True
    'key2' in p2             >>> True
    [k for k in p2]          >>> ['key1', 'key2']
    p2[-1]                   >>> 'key2'
    len(p2)                  >>> 2

These can be helpful to manipulate paths yourself, as any global access with
a string to a `Tree` or `TreeView` object also accepts a `Path` object.
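
If you already know `pathlib` from the standard library, the operations above
will look familiar. The sketch below uses `pathlib.PurePosixPath` purely as an
analogy. It is not `nob.Path`, and the two classes are not interchangeable, but
joining with `/`, taking the parent and listing the keys work in a similar way.

```python
from pathlib import PurePosixPath

p1 = PurePosixPath('/key1')
p2 = PurePosixPath('/key1/key2')

child = p1 / 'key3'        # join a key with the / operator
parent = p2.parent         # drop the last key
keys = p2.parts[1:]        # parts[0] is the root '/', the rest are the keys
last = p2.name             # final key of the path
```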

