==============
Code in DistKV
==============

DistKV can store Python code and modules, either for direct use by your
client or for a runner daemon.

TODO: There is no dependency resolution. Thus, while you can store Python
modules in DistKV, there's no guarantee yet that they'll actually be present
when your code loads.


++++
Code
++++

Python code stored in DistKV is wrapped with a procedure context, mainly to
make returning a result more straightforward. This is done by indenting the
code before compiling it: don't depend on multi-line strings to be flush
left.

Storage
=======

The location for executable scripts is configurable and defaults to
".distkv code proc". Scripts are stored as a dict with these attributes:

* ``code``: the actual text

* ``is_async``: a flag whether the procedure is synchronous (``None``),
  sync but should run in a worker thread (``False``), or async (``True``).

* ``vars``: Required input variables of your procedure. Parameters not
  mentioned here are still available as globals.

* ``requires``: modules which this code needs.  XXX TODO

There's no way for code written for a specific async library to run under
another, with the possible exception of "asyncio on top of Trio" (via
``trio-asyncio``). DistKV itself uses ``anyio`` in order to avoid the
problem. The author strongly recommends to follow this practice, if at all
possible.

The required modules must be stored in DistKV. Accessing modules from your
Python installation or the virtualenv you've set up for DistKV is of course
possible, but DistKV does not try to keep them up-to-date for you.

If you want to run user code in your DistKV module, call
``cr = await CodeRoot.as_handler(client)``. Then, run some code by
simply naming it: ``cr("forty.two")`` or ``cr(("forty","two"))`` will run
the code stored at ``.distkv code proc forty two``. All arguments will be
passed to the stored code.


+++++++
Modules
+++++++

Python modules are stored to DistKV as plain code.

Recursive dependencies are not allowed.

Storage
=======

The location for Python modules is configurable and defaults to
".distkv code module". Modules are stored as a dict with these attributes:

* ``code``: the actual program text

* ``requires``: other modules which this module needs to be loaded.

Usage
=====

Call ``await ModuleRoot.as_handler(client)``. All modules in your DistKV
store are loaded into the Python interpreter; use normal import statements
to access them.

TODO: Modules are not yet loaded incrementally.


+++++++
Runners
+++++++

The distributed nature of DistKV lends itself to running arbitrary code on
any node that can accomodate it. 

============
Runner types
============

DistKV has three built-in types of code runners. All are organized by a "group"
tag. The "distkv client run all" command starts all jobs of a type, in a
specific group.

``distkv client run`` accepts a ``-g ‹group›`` option that tells the
system which group to use. If you don't use this option, the default group
is named ``default``.

All groups and all runners are distinct. Which nodes actually execute the
code you enter into DistKV is determined solely by running ``distkv client
run all`` on them, with the appropriate options.

++++++++++++++++++
Single-node runner
++++++++++++++++++

This runner executes code on a specific node. This is useful e.g. if you
need to access non-redundant hardware, e.g. a 1wire bus connected to a
specific computer.

On the command line you access this runner with ``distkv client run -n
NAME``.


+++++++++++++++
Any-node runner
+++++++++++++++

This runner executes code on one of a group of nodes. Which node executes
the code is largely determined by chance, startup order, or phase of the
moon. This is useful when accessing redundant hardware, e.g. a radio
interface.

TODO: Load balancing is not yet implemented.

On the command line you access this runner with ``distkv client run``, i.e.
without using the ``-n ‹node›`` option.

+++++++++++++++
All-node runner
+++++++++++++++

This runner executes code on all members of a group of nodes. You access it
with ``distkv client run -n -``.

====================
Runner configuration
====================

Runner entries don't hold code; they merely point to it. The advantage is
that you can execute the same code with different parameters.

See :class:`distkv.runner.RunnerEntry` for details.

The actual runtime information is stored in a separate "state" node.
This avoids race conditions.
See :class:`distkv.runner.StateEntry` for details.

+++++++++
Variables
+++++++++

The runners pass a couple of variables to their code.

* _client

  The DistKV client instance. You can use it to access arbitraty DistKV
  data.

* _cfg

  The current configuration.

* _cls

  A dict (actually, `distkv.util.attrdict`) with various runner-related
  message classes. Convenient if you want to avoid a cumbersome ``import``
  statement in your code, since these are not part of DistKV's public API.

* _digits

  A reference to `distkv.util.digits`.

* _info (async only)

  A queue for events. This queue receives various messages. See below.

* _log

  A standard ``Logger`` object.

* _P

  `distkv.util.P`, to decode a Path string to a Path object.

* _Path

  `distkv.util.Path`, to convert a list of path elements to a Path object.

* _self (async only)

  The controller. See `distkv.runner.CallAdmin`, below.

These variables, as well as the contents of the data associated with the
runner, are available as global variables.

Node Groups
===========

All runners are part of a group of nodes. The Any-Node runners use the
group to synchronize job startup.

Runners also forward the group's membership information to your code as it
changes. You can use this information to implement "emergency operation
when disconnected" or similar fallback strategies.

=========
CallAdmin
=========

Your code has access to a ``_self`` variable which contains a `CallAdmin` object.
The typical usage pattern is to start monitoring some DistKV entries with
`CallAdmin.watch`, then iterate ``_info`` for the values of those entries.
When you get a `ReadyMsg` event, all values have been transmitted; you can
then set up some timeouts, set other values, access external services, and
do whatever else your code needs to do.

DistKV client code requires an async context manager for most scoped
operations. Since a `CallAdmin` is scoped by definition, it can manage
these scopes for you. Thus, instead of writing boilerplate code like
this::

   import anyio
   inport distkv.runner
   """
   Assume we want to process changes from these two subtrees
   for 100 seconds
   """
   async with _client.watch(_P("some.special.path")) as w1:
      async with _client.watch(P("some.other.path")) as w2:
         q = anyio.create_queue()  # q_s,q_r = anyio.create_memory_object_stream()
         async def _watch(w):
            async for msg in w:
               await q.put(msg)  # q_s.send(msg)
         async def _timeout(t):
            await anyio.sleep(t)
            await process_timeout()
         await _self.spawn(_watch, w1)
         await _self.spawn(_watch, w2)
         await _self.spawn(_timeout, 100)
         async for msg in q:  # q_r
            await process_data(msg)

you can simplify this to::

   await _self.watch(_P("some.special.path"))
   await _self.watch(_P("some.other.path"))
   await _self.timer(100)
   async for msg in _info:
      if msg is None:
         return  # system was stalled
      elif isinstance(msg, _cls.TimerMsg):
         await process_timeout()
      elif isinstance(msg, _cls.ChangeMsg):
         await process_data(msg.msg)

Distinguishing messages from different sources can be further simplified by
using distinct ``cls=`` parameters (subclasses of ``ChangeMsg`` and
``TimerMsg``) in your ``watch`` and ``timer`` calls, respectively.

By default, ``watch`` retrieves the current value on startup. Set
``fetch=False`` if you don't want that.

By default, ``watch`` only retrieves the named entry. Set ``max_depth=-1``
if you want all sub-entries. There's also ``min_depth`` if you should need
it.

If you use ``max_depth``, entries are returned in mostly-depth-first order.
It's "mostly" because updates may arrive at any time. A ``ReadyMsg``
message is sent when the subtree is complete.

The `CallAdmin.spawn` method starts a subtask.

`watch`, `timer`, and `spawn` each return an object which you can call
``await res.cancel()`` on, which causes the watcher, timer or task in
question to be terminated.

++++++++
Messages
++++++++

The messages in ``_info`` can be used to implement a state machine. If your
code is long-running and async, you should iterate them; if the queue is
full, your code may be halted. Alternately you'll get a `None` message.
That message indicates that the queue has stalled: you should exit.

The following message types are defined. You're free to ignore any you
don't recognize.

* CompleteState

  There are at least N runners in the group. (N is specified as an argument
  to ``run all``; making this configurable via DistKV is TODO.)

* PartialState

  There are some runners available, but more than one and fewer than N.

* DetachedState

  There is no other runner available.

* BrokenState

  Something else is wrong.

* ChangeMsg

  An entry you're watching has changed. The message's ``value`` and
  ``path`` attributes contain relevant details. ``value`` doesn't exist if
  the node has been deleted.

  You can use the watcher's ``cls`` argument to subclass this message, to
  simplify dispatching.

* TimerMsg

  A timer has triggered. The message's ``msg`` attribute is the timer, i.e.
  the value you got back from ``_self.timer``. You can use `Timer.run(delay)`
  to restart the timer.

  You can use the timer's ``cls`` argument to subclass this message, to
  simplify dispatching.

* ReadyMsg

  Startup is complete. This message is generated after all watchers have
  started and sent their initial data. The ``msg`` attribute contains the
  number of watchers.

  This message may be generated multiple times because of race conditions;
  you should check that the count is correct.


The ``…State`` messages can be useful to determine what level of redundancy
you currently have in the system. One application would be to send a
warning to the operator that some nodes might be down.

