Metadata-Version: 1.1
Name: scidb-strm
Version: 16.9.1
Summary: Python library for SciDB streaming
Home-page: UNKNOWN
Author: Rares Vernica
Author-email: rvernica@gmail.com
License: AGPL-3.0
Download-URL: http://github.com/Paradigm4/Stream
Description-Content-Type: UNKNOWN
Description: SciDB-Strm: Python Library for SciDB Streaming
        ==============================================
        
        Requirements
        ------------
        
        SciDB ``16.9``
        
        Apache Arrow ``0.6.0`` or newer.
        
        Python ``2.7.x``, ``3.4.x``, ``3.5.x``, ``3.6.x`` or newer.
        
        Required Python packages::
        
          dill
          feather-format
          pandas
        
        Note
        ^^^^
        
        Apache Arrow versions older than ``0.8.0`` contain a bug that might
        affect Stream users. The bug manifests on chunks of more than ``128``
        records with nullable values. For more details, see the full bug
        description `here
        <https://issues.apache.org/jira/browse/ARROW-1676>`_. This bug has
        been `fixed <https://github.com/apache/arrow/pull/1204>`_ in Apache
        Arrow version ``0.8.0``.
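        
        As a rough guard against this bug, a script can compare the
        installed Arrow version against ``0.8.0`` before streaming. The
        sketch below is an illustration only: ``has_null_fix`` is a
        hypothetical helper, not part of *SciDB-Strm*, and it assumes
        dotted version strings that start with numeric components, such
        as ``0.7.1``.
        
        ```python
        import re
        
        
        def has_null_fix(version, fixed_in=(0, 8, 0)):
            """Return True if `version` is at or above the release with the fix."""
            # Keep only the leading numeric components, e.g. "0.8.0.dev1" -> [0, 8, 0]
            parts = []
            for piece in version.split("."):
                match = re.match(r"\d+", piece)
                if match is None:
                    break
                parts.append(int(match.group()))
            # Pad short versions such as "0.8" so the tuple comparison is fair
            parts += [0] * (len(fixed_in) - len(parts))
            return tuple(parts) >= fixed_in
        
        
        print(has_null_fix("0.6.0"))  # False
        print(has_null_fix("0.8.0"))  # True
        ```
        
        In practice the version string would typically come from
        ``pyarrow.__version__``.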
        
        
        Installation
        ------------
        
        Install the latest release::
        
          pip install scidb-strm
        
        Install the development version from GitHub::
        
          pip install git+http://github.com/paradigm4/stream.git#subdirectory=py_pkg
        
        The Python library needs to be installed on the SciDB server. It
        also needs to be installed on the client if Python code is to be
        sent from the client to the server.
        
        
        SciDB-Strm Python API and Examples
        ----------------------------------
        
        Once installed, the *SciDB-Strm* Python library can be imported with
        ``import scidbstrm``. The library provides both high-level and
        low-level access to the SciDB ``stream`` operator, as well as the
        ability to send Python code to the SciDB server.
        
        High-level access is provided by the function ``map``:
        
        ``map(map_fun, finalize_fun=None)``
          Read SciDB chunks. For each chunk, call ``map_fun`` and stream its
          result back to SciDB. If ``finalize_fun`` is provided, call it after
          all the chunks have been processed.
        
        See `1-map-finalize.py <examples/1-map-finalize.py>`_ for an example
        using the ``map`` function. The Python script has to be copied onto
        the SciDB instance.
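        
        The behaviour described for ``map`` can be pictured as a loop over
        the low-level ``read`` and ``write`` functions documented below. The
        sketch that follows is an assumption drawn from the descriptions in
        this README, not the library's actual source: ``map_sketch`` is a
        hypothetical name, ``read`` and ``write`` are passed in as arguments
        so that the loop can be exercised without a SciDB server, and
        sending the result of ``finalize_fun`` back through ``write`` is
        assumed.
        
        ```python
        def map_sketch(read, write, map_fun, finalize_fun=None):
            """Apply map_fun to every chunk until read() returns None."""
            while True:
                chunk = read()
                if chunk is None:
                    # No more chunks; optionally emit one final result (assumed)
                    if finalize_fun is not None:
                        write(finalize_fun())
                    break
                write(map_fun(chunk))
        
        
        # Stand-ins for the SciDB side: plain lists instead of DataFrames
        incoming = iter([[1, 2], [3], None])
        outgoing = []
        
        map_sketch(
            read=lambda: next(incoming),
            write=outgoing.append,
            map_fun=lambda chunk: [value * 10 for value in chunk],
            finalize_fun=lambda: ["done"],
        )
        print(outgoing)  # [[10, 20], [30], ['done']]
        ```
        
        With the real library, the equivalent call would be
        ``scidbstrm.map(map_fun, finalize_fun)``, with chunks arriving as
        Pandas DataFrames.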
        
        Python code can be sent to the SciDB server for execution using
        the ``pack_func`` and ``read_func`` functions:
        
        ``pack_func(func)``
          Serialize a Python function for use as ``upload_data`` in the
          ``input`` or ``load`` operators.
        
        ``read_func()``
          Read and deserialize a function from SciDB.
        
        See `2-pack-func.py <examples/2-pack-func.py>`_ for an example of
        using the ``pack_func`` and ``read_func`` functions.
        
        Low-level access is provided by the ``read`` and ``write`` functions:
        
        ``read()``
          Read a data chunk from SciDB. Returns a Pandas DataFrame or None.
        
        ``write(df=None)``
          Write a data chunk to SciDB.
        
        See `3-read-write.py <examples/3-read-write.py>`_ for an example using
        the ``read`` and ``write`` functions. The Python script has to be
        copied onto the SciDB instance.
        
        A convenience invocation of the Python interpreter is provided in
        the ``python_map`` variable, which is set to::
        
          python -uc "import scidbstrm; scidbstrm.map(scidbstrm.read_func())"
        
        Finally, see `4-machine-learning.py <examples/4-machine-learning.py>`_
        for a more complex example that goes through the steps of using
        machine learning (preprocessing, training, and prediction).
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU Affero General Public License v3
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Topic :: Database :: Front-Ends
Classifier: Topic :: Scientific/Engineering
