Metadata-Version: 2.1
Name: chromadol
Version: 0.1.2
Summary: Data Object Layer for ChromaDB
Author: OtoSense
License: mit
Description-Content-Type: text/markdown
License-File: LICENSE

# chromadol
Data Object Layer for ChromaDB


To install: ```pip install chromadol```


# Example usage

To make a `ChromaClient` DOL, pass either a `chromadb` client instance
(`Client`, `PersistentClient`, etc.) or a string. A string is interpreted as the path
of a directory in which a `PersistentClient` instance will save its data.

    >>> from chromadol import ChromaClient
    >>> import tempfile
    >>> # mkdtemp creates a directory that persists until it is removed explicitly
    >>> tempdir = tempfile.mkdtemp(prefix="chromadol_test_")
    >>> client = ChromaClient(tempdir)

Remove all contents of the client so the test runs on a clean slate:

    >>> for k in list(client):  # copy the keys first, since we delete while iterating
    ...     del client[k]


There's nothing yet:

    >>> list(client)
    []

Now let's "get" a collection. 

    >>> collection = client['chromadol_test']

Note that just accessing the collection creates it (by default):

    >>> list(client)
    ['chromadol_test']
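
The create-on-access behavior shown above resembles a mapping that fills in missing keys.
The following is only a toy sketch of that idea using the standard library;
`ToyClient` is a hypothetical name and this is not how `chromadol` is implemented.

```python
class ToyClient(dict):
    """Hypothetical stand-in: a dict whose missing keys are created on access."""

    def __missing__(self, name):
        # dict.__getitem__ calls this when `name` is absent:
        # create an empty "collection", store it, and return it.
        collection = {}
        self[name] = collection
        return collection


toy_client = ToyClient()
keys_before = list(toy_client)                  # no collections yet
toy_collection = toy_client['chromadol_test']   # access alone creates the entry
keys_after = list(toy_client)
```

After the single lookup, `keys_after` contains `'chromadol_test'` even though nothing was ever assigned explicitly.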

There's nothing in the collection yet:

    >>> list(collection)
    []

So let's write something.
Note that `chromadb` is designed to operate on multiple documents at once,
so the "chromadb-natural" way of specifying its keys and contents (and any extras)
would be like this:

    >>> collection[['piece', 'of']] = {
    ...     'documents': ['contents for piece', 'contents for of'],
    ...     'metadatas': [{'author': 'me'}, {'author': 'you'}],
    ... }

Now we have two documents in the collection:

    >>> len(collection)
    2

Note, though, that the order of the documents is not guaranteed.

    >>> sorted(collection)
    ['of', 'piece']

    >>> assert collection['piece'] == {
    ...     'ids': ['piece'],
    ...     'embeddings': None,
    ...     'metadatas': [{'author': 'me'}],
    ...     'documents': ['contents for piece'],
    ...     'uris': None,
    ...     'data': None
    ... }

    >>> assert collection['of'] == {
    ...     'ids': ['of'],
    ...     'embeddings': None,
    ...     'metadatas': [{'author': 'you'}],
    ...     'documents': ['contents for of'],
    ...     'uris': None,
    ...     'data': None
    ... }

You can also read multiple documents at once.
But note that the order of the documents is not guaranteed.

    >>> collection[['piece', 'of']] == collection[['of', 'piece']]
    True
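
To make the column-oriented layout of multi-key reads and writes concrete, here is a
minimal in-memory sketch. `ToyCollection` is a hypothetical stand-in, not `chromadol`'s
or `chromadb`'s code. It returns rows in sorted-id order purely so that the result does
not depend on the order in which keys were requested; that ordering rule is an
assumption of the sketch.

```python
class ToyCollection:
    """Hypothetical in-memory collection keyed by lists of ids."""

    def __init__(self):
        self._rows = {}  # id -> {'document': ..., 'metadata': ...}

    def __setitem__(self, ids, value):
        # Column-oriented write: the i-th entry of each field belongs to ids[i].
        metadatas = value.get('metadatas') or [None] * len(ids)
        for id_, document, metadata in zip(ids, value['documents'], metadatas):
            self._rows[id_] = {'document': document, 'metadata': metadata}

    def __getitem__(self, ids):
        # Column-oriented read, in sorted-id order (an assumption of this sketch).
        found = sorted(id_ for id_ in ids if id_ in self._rows)
        return {
            'ids': found,
            'documents': [self._rows[i]['document'] for i in found],
            'metadatas': [self._rows[i]['metadata'] for i in found],
        }

    def __iter__(self):
        return iter(self._rows)

    def __len__(self):
        return len(self._rows)


toy = ToyCollection()
toy[['piece', 'of']] = {
    'documents': ['contents for piece', 'contents for of'],
    'metadatas': [{'author': 'me'}, {'author': 'you'}],
}
same_result = toy[['piece', 'of']] == toy[['of', 'piece']]
```

Each field is a list aligned with the list of ids, which is why a two-key write carries two documents and two metadata dictionaries.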

You can read or write one document at a time too.

    >>> collection['cake'] = {
    ...     "documents": "contents for cake",
    ... }
    >>> sorted(collection)  # sorting because order is not guaranteed
    ['cake', 'of', 'piece']
    >>> assert collection['cake'] == {
    ...     'ids': ['cake'],
    ...     'embeddings': None,
    ...     'metadatas': [None],
    ...     'documents': ['contents for cake'],
    ...     'uris': None,
    ...     'data': None,
    ... }

In fact, if you only want to specify the "documents" part of the information,
you can just write a string instead of a dictionary:

    >>> collection['cake'] = 'a different cake'
    >>> assert collection['cake'] == {
    ...     'ids': ['cake'],
    ...     'embeddings': None,
    ...     'metadatas': [None],
    ...     'documents': ['a different cake'],
    ...     'uris': None,
    ...     'data': None,
    ... }
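
One way to picture the single-key and string shortcuts is as a normalization step that
coerces them into the multi-document form before writing. The helper below is a sketch
under that assumption; `to_batch` is a hypothetical name and is not part of `chromadol`.

```python
def to_batch(key, value):
    """Hypothetical helper: coerce a single key/value pair into the batch form."""
    if isinstance(value, str):
        # A bare string is taken to be the "documents" part.
        value = {'documents': value}
    if isinstance(key, str):
        # A single key becomes a one-element batch; wrap every field to match.
        key = [key]
        value = {field: [content] for field, content in value.items()}
    return key, value


single = to_batch('cake', 'a different cake')
batch = to_batch(
    ['piece', 'of'],
    {'documents': ['contents for piece', 'contents for of']},
)
```

Inputs that are already in batch form pass through unchanged, so the same write path can serve both styles.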
