Metadata-Version: 2.1
Name: smcore
Version: 0.0.41
Summary: Package providing base sm agent classes and clients
Author-email: "J. Hoffman, J. Genender" <johnmarianhoffman@gmail.com>
Project-URL: Homepage, https://gitlab.com/sm-ai-team/sm-core-go
Project-URL: Bug Tracker, https://gitlab.com/sm-ai-team/sm-core-go/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests ==2.31.0
Requires-Dist: protobuf ==4.21.12

# smcore

smcore provides base agent classes and blackboard clients for use with [SimpleMind's Go implementation](https://gitlab.com/sm-ai-team/sm-core-go).

**This is still a work in progress and not yet ready for "primetime."  Functionality and API are likely to change until this message is removed and we start taking version numbers seriously.**

## Getting started

1. Install `smcore` from PyPI (this is currently the *only* supported means to install and use this library)

```
pip install smcore
```

2. Generate an empty python agent using `sm-core-go`. Note: you will need to have the `sm-core-go` tool installed on your system.  Please see the root README (or ask John) for more instructions.

```
./sm-core-go generate python-agent --name hello-agent
```

You should use "kebab" case (all lower case, words separated with dashes) when naming agents.  For instance, the following **are** valid agent names:

- `my-new-agent`
- `kidney-segmentation-agent`
- `grab-daily-news-worker`
- `my-new-agent-2`

The following are **NOT** valid agent names:

- `My New Agent`
- `KidneySegmentationAgent`
- `grabDailyNewsWorker`
- `2-my-new-agent`

Although this may not be perfectly enforced at this point (eventually we'll validate it for the user), this is the naming scheme we intend to support.  This is a very deliberate choice, as certain cloud infrastructure expects names to be in this format.  Note that numbers are fine; just don't start an agent name with a number (see the last invalid example above).
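Until the tooling validates names for you, the convention can be checked with a small regular expression. This is only a sketch of the rule as described above, not the validation `sm-core-go` will eventually perform; `is_valid_agent_name` is a hypothetical helper and the pattern is inferred from the examples.

```python
import re

# Lowercase words separated by single dashes; the name must start with a letter.
# ASSUMPTION: this pattern is inferred from the examples above, not taken from sm-core-go.
_AGENT_NAME = re.compile(r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*")


def is_valid_agent_name(name: str) -> bool:
    """Return True if `name` follows the kebab-case convention described above."""
    return _AGENT_NAME.fullmatch(name) is not None


for candidate in ["my-new-agent", "my-new-agent-2", "KidneySegmentationAgent", "2-my-new-agent"]:
    print(candidate, is_valid_agent_name(candidate))
```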

3. Start a local SM blackboard in server mode

```
./sm-core-go server
```

4. Launch your agent!

```
python hello_agent.py --blackboard localhost:8080 --spec '{"name":"HelloAgent"}'
```

You should see two messages appear on the blackboard: a "Hello" message and a "Goodbye" message.

**Congrats! You've just run your first SM agent.**

5. Extra credit: Customize the commented out Log message to see one mechanism with which agents can communicate with the blackboard.

## The Agent API

Under `sm-core-go`, an Agent is anything that communicates with the blackboard.  As you can see from the generated example above, we provide a base class that handles all of the networking and communication, and exposes a minimal, (hopefully) easy-to-use set of functions for sending different message types.  We'll work on better documentation over time, but these are the essentials.

In the below sections, "self" always refers to the agent object/class.

### Log Messages

Think of this as your "stdout" to the blackboard.  Useful to track updates for human readers.  Generally agents should not/will not parse log messages.

```
self.log(severity, message)

  'severity' is an enumerated type, one of:

    Log.DEBUG		- Messages that should generally be suppressed once the agent is running correctly
    Log.INFO		- Messages that should appear in the blackboard; generally for human readers
    Log.WARNING		- Agent has encountered an unexpected state, but can continue processing unimpeded
    Log.ERROR		- Agent has encountered an unexpected state where it cannot continue processing normally, however may be able to continue running (i.e. recover, wait for new data, etc.)
    Log.CRITICAL	- Something has gone horribly wrong and the agent is shutting down b/c of the situation

  'message' is a string you'd like to appear on the blackboard
```
To use the enumerated Log types (recommended), you will need to import Log from smcore:

```
from smcore import Agent, Log
```

Some examples of logging:

```
MyAgent.log(Log.INFO,"Just checking in")
MyAgent.log(Log.ERROR,"DB connection dropped. Retrying...")
MyAgent.log(Log.CRITICAL,"Computer out of memory")
```
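To illustrate how the severity levels might be used, the sketch below wraps an agent's `log` call so that `DEBUG` messages are dropped once the agent is stable. `Log` and `StubAgent` here are stand-ins defined only so the example runs without a blackboard; the real `smcore.Log` may not be an ordered enum, and `make_filtered_log` is a hypothetical helper rather than part of smcore.

```python
from enum import IntEnum


class Log(IntEnum):
    """Stand-in for smcore's Log enum. ASSUMPTION: the numeric ordering is ours."""
    DEBUG = 0
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


class StubAgent:
    """Stand-in agent that records log calls instead of posting to a blackboard."""

    def __init__(self):
        self.posted = []

    def log(self, severity, message):
        self.posted.append((severity, message))


def make_filtered_log(agent, minimum=Log.INFO):
    """Return a log function that forwards only messages at or above `minimum`."""
    def filtered_log(severity, message):
        if severity >= minimum:
            agent.log(severity, message)
    return filtered_log


agent = StubAgent()
log = make_filtered_log(agent, minimum=Log.INFO)
log(Log.DEBUG, "loop iteration 42")                    # dropped
log(Log.INFO, "Just checking in")                      # forwarded
log(Log.ERROR, "DB connection dropped. Retrying...")   # forwarded
print(agent.posted)
```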

### Status Updates

Status updates indicate a change in the state of an agent.

Generally used for dependency resolution and by the controller for monitoring.

For server-style agent/blackboard interaction, these messages are not required, however using them will enable better support for monitoring via connected dashboards, metrics gathering, etc.

```
self.post_status_update(state, message):

  'state' is an enumerated type, one of:

    AgentState.UNKNOWN			- Essentially the "null" state
    AgentState.OK				- Agent has said hello
    AgentState.REQUESTED		- Agent requested, but has not said hello
    AgentState.AWAITING_PROMISE - Agent is awaiting promised results
    AgentState.WORKING			- All promises fulfilled, currently processing
    AgentState.MISSING			- No pings detected (set by the controller/monitor)
    AgentState.ERROR			- Agent has encountered an error and cannot proceed normally
    AgentState.COMPLETED		- All agent tasks completed (should precede the agent goodbye/shutdown)

  'message' is a string with information about the current state
```

To use the enumerated AgentState types (recommended), you will need to import AgentState from smcore:

```
from smcore import Agent, AgentState
```

Some examples of status updates:

```
MyAgent.post_status_update(AgentState.OK,"")
MyAgent.post_status_update(AgentState.AWAITING_PROMISE,"listening for lung segmentations...")
MyAgent.post_status_update(AgentState.AWAITING_PROMISE,"listening for nodule ROIs...")
MyAgent.post_status_update(AgentState.ERROR,"Expected an array of 512x512, but got 0x0")
MyAgent.post_status_update(AgentState.ERROR, traceback.format_exc())
```
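A typical sequence of status updates around a unit of work might look like the sketch below. It runs without a blackboard: `AgentState` and `StubAgent` are stand-ins written for this example (only the states it uses are listed), and `run_with_status` is a hypothetical helper, not an smcore function.

```python
import traceback
from enum import Enum, auto


class AgentState(Enum):
    """Stand-in for smcore's AgentState; only the members used below are listed."""
    WORKING = auto()
    ERROR = auto()
    COMPLETED = auto()


class StubAgent:
    """Stand-in agent that records status updates instead of posting them."""

    def __init__(self):
        self.updates = []

    def post_status_update(self, state, message):
        self.updates.append((state, message))


def run_with_status(agent, task):
    """Run `task()` and report WORKING, followed by COMPLETED or ERROR."""
    agent.post_status_update(AgentState.WORKING, "processing")
    try:
        result = task()
    except Exception:
        # Attach the full traceback so a human reader can diagnose the failure.
        agent.post_status_update(AgentState.ERROR, traceback.format_exc())
        return None
    agent.post_status_update(AgentState.COMPLETED, "")
    return result


agent = StubAgent()
run_with_status(agent, lambda: 1 / 0)
print([state.name for state, _ in agent.updates])
```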

### Posting Results

"Results" are your *key* mechanism for passing data between agents.  It is intended to be extremely permissive in that you can (and should!) just stuff data into it as if it were an envelope, and post it to the BB.  All of the data management and sharing is handled* for you.

(* There are some caveats to this statement in the early stages of this, but that's the direction that we're headed and really want to get to.  Currently, it's handled, but not in a truly scalable way.)

```
MyAgent.post_result(res_name: str, res_type: str, data: dict):

  'res_name' is a string identifying the result. Required and should (must?) be non-empty.

  'res_type' is a string identifying the result type. Required and should (must?) be non-empty.

  'data' is a json-friendly struct containing any data that comprises the result.

'res_name' and 'res_type' are user/agent configurable and left entirely up to user choice.  The complexity/simplicity of naming schemes will likely be problem specific, so we can't offer much guidance other than that it could be worth giving this some thought before you ever start coding.

We will eventually add (fix) wildcard '*' support so that promises can be resolved for any result, type, or agent.

'data' currently seems to work pretty well, however json serialization can sometimes be a bit... unpredictable?  Most of our experience is from Go, so python might do things we're not aware of. If you run into issues where serialization doesn't work correctly, or does something unexpected, please file an issue in the sm-core-go repository.
```
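One way to catch serialization surprises early is to round-trip your `data` through Python's standard `json` module before posting it. The sketch below assumes that something encodable by `json.dumps` is a reasonable proxy for "json-friendly"; how smcore actually serializes results is not documented here, so treat this as a local sanity check only. `to_json_friendly`, `check_serializable`, and `FakeArray` are hypothetical names introduced for the example.

```python
import json


def to_json_friendly(value):
    """Recursively convert `value` into types the standard json module accepts."""
    if isinstance(value, dict):
        return {str(key): to_json_friendly(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_friendly(item) for item in value]
    if hasattr(value, "tolist"):
        # Array-like objects (NumPy arrays, for example) expose tolist().
        return to_json_friendly(value.tolist())
    return value


def check_serializable(data: dict) -> str:
    """Return the JSON text for `data`; raises TypeError if it cannot be encoded."""
    return json.dumps(to_json_friendly(data))


class FakeArray:
    """Stand-in for an array type so the example runs without NumPy installed."""

    def tolist(self):
        return [[0, 1], [1, 0]]


print(check_serializable({"erosion_radius": 3, "shape": (2, 2), "mask": FakeArray()}))
```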

It's important to not limit yourself to just passing one output either.

The data argument is a python dictionary in the truest sense that you can stuff multiple (potentially unrelated?) chunks of data into it and save them to the blackboard.  Use this to pass data to make downstream processing of the results easier.

A somewhat trivial example might be attaching the input parameters used to generate the result:

```
data = {
    "erosion_element": "disk",
    "erosion_radius": 3,
    "input_data": "Promise(LeftLungMask as LungSegmentation from LeftLungSegmentationAgent)",
    "message": "unexpected or anomalous region detected (topology, too many holes). maybe emphysema or dld? could produce unexpected erosion results.",
    "eroded_mask": np.array(...),
}
```

Play with it and favor slightly more data than less!*

(*Again, in the early stages, we may get bitten passing excessive amounts of data, but in the long run we should be favoring passing additional/extra data that downstream agents can utilize in ways we may not expect or plan for in our agents.)

Some examples of posting results:

```
MyAgent.post_result(
    "LeftLung",
    "LungSegmentation",
    data={
        "algorithm_version": "1.0.1",
        "mask": np.array(...),
    },
)

MyAgent.post_result(
    "NoduleDetectionSummary",
    "FinalMetric",
    data={
        "input_cases": ["case1", "case2"],
        "sensitivity": 0.4,
        "specificity": 1.0,
        "AUC": 0.75,
        "N": 2,
        "message": "N is very low and these figures may not be accurate to the group from which data was sampled.",
    },
)
```

#### Partial results and no results

"Partial result" and "no result" have the exact same signature as a regular result, however they indicate different things to downstream agents about the data provided.

```
MyAgent.post_partial_result(res_name, res_type, data)
```

"Partial result" indicates that "I've finished processing my input and produced results, but either there is additional processing I will continue doing, or I will wait for a new input."  You could also utilize a partial result as a multi-part solution, or incrementally improved results over time.


```
MyAgent.post_no_result(res_name: str, res_type: str, data: dict)
```

"No result" indicates that the agent was not able to find a result for a given input, despite *everything functioning correctly*.  Please do **not** use "no results" for error messages, although if you wish to indicate reasons why a result was not able to be found, that is appropriate.

#### Passing data with no results

Perhaps somewhat paradoxically, you can attach data to the "no results" message type.  This is by design and can be used to pass problem context even in the event of the agent failing to find a solution.

- **Example 1**: A lung segmentation agent was trained on a specific input domain of images from Siemens scanners, and should not be applied to any GE data.  The agent could parse the header for manufacturer information and reject cases from GE scanners.  The way that it would communicate this to any downstream agents waiting for lung segmentations would be to post a "no result" and perhaps the data `{"message":"not trained to segment from GE scanners"}`.

- **Example 2**: A more interesting example would be passing meaningful data.  Again, let's say that our lung segmentation agent knows that it can only produce an accurate segmentation if the noise in the trachea/aorta is below a certain threshold. The current case is well above that threshold, however maybe we still want to provide an attempt that we will analyze later, just one that definitely cannot be trusted if a downstream agent depends on a good lung segmentation.  Here you would post the attempted lung segmentation as the *data* on the "no results" message.

- **Example 3**: You could also use no result data to pass hints and data as to what went wrong.  Perhaps the input data for our lung segmentation agent passes all of our sanity checks, however when the agent goes to process the case it notices that there is a topological inconsistency in the segmentation data (for instance, an unexpected hole, or a missing lung).

See the `post_result` examples above for the calling pattern of `post_partial_result` and `post_no_result` (you'll just need to change the function name; the arguments are the same).
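Example 1 can be sketched as follows. The code runs without a blackboard: `StubAgent` is a stand-in that records posts, and `segment_or_decline`, `SUPPORTED_MANUFACTURERS`, and the `"manufacturer"` header key are hypothetical names chosen for illustration. A real agent would call the same `post_result` / `post_no_result` methods on itself.

```python
class StubAgent:
    """Stand-in that records posts; a real agent would use smcore's base class."""

    def __init__(self):
        self.posts = []

    def post_result(self, res_name, res_type, data):
        self.posts.append(("result", res_name, res_type, data))

    def post_no_result(self, res_name, res_type, data):
        self.posts.append(("no_result", res_name, res_type, data))


SUPPORTED_MANUFACTURERS = {"SIEMENS"}  # hypothetical training domain


def segment_or_decline(agent, header: dict, segment):
    """Post a segmentation, or a "no result" explaining why none was produced."""
    manufacturer = str(header.get("manufacturer", "")).upper()
    if manufacturer not in SUPPORTED_MANUFACTURERS:
        # Everything worked as designed; the input is simply outside our domain.
        message = f"not trained to segment from {manufacturer or 'unknown'} scanners"
        agent.post_no_result("LeftLung", "LungSegmentation", {"message": message})
        return
    agent.post_result("LeftLung", "LungSegmentation", {"mask": segment(header)})


agent = StubAgent()
segment_or_decline(agent, {"manufacturer": "GE"}, segment=lambda header: [[1]])
print(agent.posts)
```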

### Accessing attributes

```
MyAgent.attribute(name)
```

"Attributes" are the mechanism to pass data to your agent either through the knowledge base (config file) or via the command line.  It may feel a little cumbersome, or at least unnecessary, at first, but due to the distributed nature of this communication backbone, it's essential that you develop your agents using attributes rather than writing your own command line parsing. If you're interested in the details of why this is, let us know via an issue and we'll add a section to this document explaining it in more detail.

Attributes are passed via the agent specification or "spec".

For example, to start the python ping agent included in the [sm-core-go](https://gitlab.com/sm-ai-team/sm-core-go) repository (found at `python/ping_agent.py`), the minimum acceptable command line call would be:

```
python ping_agent.py --blackboard localhost:8080 --spec '{"attributes":{"dwell":3}}'
```

As you can see, we've passed one attribute called "dwell" with a value of 3 to our agent.
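Hand-writing JSON inside shell quotes is easy to get wrong, so it can help to build the spec as a Python dictionary and let `json.dumps` produce the string. The sketch below only assembles the command shown above; `build_agent_command` is a hypothetical helper, and the flags are the ones already used in this document.

```python
import json
import shlex


def build_agent_command(script: str, blackboard: str, spec: dict) -> list[str]:
    """Assemble the argument list for launching an agent with a JSON spec."""
    return ["python", script, "--blackboard", blackboard, "--spec", json.dumps(spec)]


command = build_agent_command(
    "ping_agent.py", "localhost:8080", {"attributes": {"dwell": 3}}
)

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

The list form can also be handed directly to `subprocess.run(command)`, which sidesteps shell quoting altogether.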

To access this attribute inside of our agent, we simply have to use the `attribute` method provided by the agent API:

```
import time

dwell_time = ping_agent.attribute("dwell")
jitter_sec = 0.3
time.sleep(dwell_time + jitter_sec)
```

#### A word of caution

Because this system is SO permissive and python generally lacks strong typing, we can't (won't) offer strong guarantees that attributes will be typed correctly.  *It is up to the agent developers* to ensure that data is being written and read as expected between agents.

As you find cases where there are failures or unexpected behaviors, please file [issues](https://gitlab.com/sm-ai-team/sm-core-go/-/issues) to help us understand where the current system is not working and can be improved.
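One defensive pattern for the typing caveat above is to convert each attribute explicitly at the point of use, so a mistyped value fails loudly and early. In this sketch `typed_attribute` is a hypothetical helper and `StubAgent` is a stand-in; how the real `attribute` method behaves for missing names is not specified here, so the stub's behavior is an assumption.

```python
class StubAgent:
    """Stand-in exposing only `attribute(name)`, backed by a plain dictionary."""

    def __init__(self, attributes):
        self._attributes = attributes

    def attribute(self, name):
        # ASSUMPTION: the stub raises KeyError for unknown names; smcore may differ.
        return self._attributes[name]


def typed_attribute(agent, name, cast):
    """Read attribute `name` and convert it with `cast`, e.g. float or int."""
    raw = agent.attribute(name)
    try:
        return cast(raw)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"attribute {name!r} has unusable value {raw!r}") from exc


# A spec might deliver the number as a string; the cast normalizes it.
ping_agent = StubAgent({"dwell": "3"})
dwell_time = typed_attribute(ping_agent, "dwell", float)
print(dwell_time)
```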

### Awaiting data

Having your agent sit idle until it receives required input data (via promises) is likely to be a common action among many/most agents, so we've provided a convenience method for it:

```
self.await_data()
```

Due to the asynchronous nature of our computing model, you have to call it using python's `await` keyword:

```
await self.await_data()
```

Once the agent moves on from this hold, all of the desired/promised data can be retrieved using the attributes system described above!  This is just one of the reasons why it is important to only use the attributes system to pass data between the user, blackboard, and agents.

(NOTE: This promise system may not be fully implemented in Python just yet, but this description is the intended functionality.  As soon as you run into issues with this, tell John so we can prioritize getting it working correctly.  Also, it's safe to expect that it's not working correctly if it's not working as described. You're probably not misunderstanding anything!:))
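The intended flow (wait, read the promised data as attributes, post a result) can be sketched with `asyncio`. Everything here is a stand-in: `StubAgent` fakes the blackboard delivery with a short sleep, and the attribute name `"numbers"`, the result names, and `run_once` are hypothetical. It shows the shape of the coroutine, not smcore's actual implementation.

```python
import asyncio


class StubAgent:
    """Stand-in agent; a real one would rely on smcore's base class instead."""

    def __init__(self):
        self._attributes = {}
        self.posts = []

    async def await_data(self):
        # Pretend the blackboard fulfils our promise after a short delay.
        await asyncio.sleep(0.01)
        self._attributes["numbers"] = [1, 2, 3]

    def attribute(self, name):
        return self._attributes[name]

    def post_result(self, res_name, res_type, data):
        self.posts.append((res_name, res_type, data))


async def run_once(agent):
    await agent.await_data()               # hold until the promised inputs arrive
    numbers = agent.attribute("numbers")   # promised data is read like any attribute
    agent.post_result("Total", "Sum", {"total": sum(numbers)})


agent = StubAgent()
asyncio.run(run_once(agent))
print(agent.posts)
```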

### That's it!

To recap, there are really only five things you need to worry about with our Agent API:
 - Logging: `agent.log(severity, message)`
 - Status updates: `agent.post_status_update(state, message)`
 - Results/partial results/no results: 
   - `agent.post_result(res_name, res_type, data)`
   - `agent.post_partial_result(res_name, res_type, data)`
   - `agent.post_no_result(res_name, res_type, data)`
 - Attributes:
   - `attribute_value = agent.attribute(attribute_name)`
 - Awaiting data: `await agent.await_data()`

Hopefully this feels pretty barebones.  If so, that's very intentional!  If not, we want to hear how you would change things in our [issues](https://gitlab.com/sm-ai-team/sm-core-go/-/issues).

## A little bit about the "core" philosophy

This is a big shift from the previous SimpleMind implementation, and conceptually we want to make a few things clear.

1. `sm-core-go` is intended to be a scalable, high-throughput communication layer for blackboard-based, cognitive computing.  It does not inherently do any computing of its own, beyond some basic agents that manage other agents (the controller and runners).

1. No domain-specific code (i.e. medical imaging, neural networks, etc.) should ever make it into `sm-core-go` or its python API.  All of that should be left to agent design and development.  We will close merge requests that include (even accidentally) any domain-specific code.

1. `sm-core-go` and its APIs should be thought of essentially as a separate, third-party project/library from the rest of SimpleMind.

That final point is very important.  We don't mean to be obtuse or difficult; the goal is to ensure that we all have a highly stable base to build from, and a shared language for communicating between agents.  That base, by design, should seldom change, and when it does change, it should not be without extensive discussion, forethought, intention, and backwards-compatibility prioritization.

It also adds an interesting "pressure" to the overall SM project where the sm-core-go devs (John and Josh) will work to optimize the communication layer, while other teams and projects work to optimize the learning and knowledge layers.  I suspect those competing yet mutually beneficial goals will also allow for a cleaner, better, more performant implementation of each in the long run, however that may mean you don't always get exactly what you want as fast as you want it (or at all).

We definitely want to be responsive to the needs of the users though, but just know that requesting or proposing changes to the core should be a "last-ditch" option.  You should always try to solve your problems in the agent layer *first*, and you may need to get creative.

**Early CVIB users:** While all of the above is true and holds, we're still in the early days of this and we recognize that there will be bugs or choices that we've made early on that need to be revised.  As we're all getting up and running, please don't let the above statements discourage you from asking us questions or proposing alternatives to the initial implementations.  It's probably going to take a few versions before we really get this nailed down!
