Metadata-Version: 2.1
Name: ldp
Version: 0.2.0
Summary: Agent framework for constructing language model agents and training on constructive tasks.
Author-email: FutureHouse technical staff <hello@futurehouse.org>
Project-URL: issues, https://github.com/Future-House/ldp/issues
Project-URL: repository, https://github.com/Future-House/ldp
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiofiles
Requires-Dist: dm-tree
Requires-Dist: fhaviary
Requires-Dist: httpx
Requires-Dist: litellm
Requires-Dist: networkx[default] ~=3.0
Requires-Dist: numpy
Requires-Dist: openai >=1
Requires-Dist: pydantic ~=2.0
Requires-Dist: tenacity
Requires-Dist: tiktoken
Requires-Dist: tqdm
Requires-Dist: usearch >=2.13
Requires-Dist: typing-extensions ; python_version <= "3.11"
Provides-Extra: monitor
Requires-Dist: wandb ; extra == 'monitor'
Provides-Extra: nn
Requires-Dist: torch >=2.2 ; extra == 'nn'
Provides-Extra: server
Requires-Dist: fastapi >=0.109 ; extra == 'server'
Provides-Extra: typing
Requires-Dist: types-aiofiles ; extra == 'typing'
Requires-Dist: types-tqdm ; extra == 'typing'
Provides-Extra: visualization
Requires-Dist: pydot ~=2.0 ; extra == 'visualization'

# ldp

Agent framework for constructing language model agents and training on constructive tasks.

This repo models agent-environment interactions using a
[Partially Observable Markov Decision Process][pomdp] (POMDP).
Inspired by POMDP, this repo's name `ldp` stands for Language Decision Processes.

[pomdp]: https://en.wikipedia.org/wiki/Partially_observable_Markov_decision_process

## Installation

To install `ldp` in editable mode from a local clone of this repository:

```bash
pip install -e .
```

If you plan to export Graphviz visualizations,
also install the `graphviz` system package:

- Linux: `apt install graphviz`
- macOS: `brew install graphviz`

## Agent/Policy

An agent should have two functions, `init_state` and `get_asv`:

```py
agent_state = await agent.init_state(tools=tools)
new_action, new_agent_state, value = await agent.get_asv(
    agent_state, obs
)
```

The function `get_asv(agent_state, obs)`
chooses an action (`a`) from the observation messages,
and returns it along with the next agent state (`s`) and a value estimate (`v`).
The first argument, `agent_state`, is state specific to the agent
that can be used for training from episodes.
It could contain things like agent memory.
You can make it `None` if you aren't using it.

The `obs` argument is not the complete history of observations, only the most recent list returned by `env.step`.
If the agent needs earlier observations, it should keep track of them in its state.

The value is the agent's estimate of the future rewards given its state and observations.
It is used for training, and it can simply be `0`.
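
As a rough illustration of the contract described above, here is a minimal sketch of an agent that keeps its own observation history in its state. `MemoryState` and `EchoAgent` are hypothetical names invented for this example, observations and actions are plain strings for simplicity, and the class deliberately does not subclass `ldp`'s `Agent`, so treat it as a stand-in rather than the library's API.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class MemoryState:
    """Hypothetical agent state that accumulates every observation seen."""

    history: list[str] = field(default_factory=list)


class EchoAgent:
    """Hypothetical stand-in agent; it does not subclass ldp's Agent."""

    async def init_state(self, tools: list[str]) -> MemoryState:
        # Start each episode with an empty memory.
        return MemoryState()

    async def get_asv(
        self, agent_state: MemoryState, obs: list[str]
    ) -> tuple[str, MemoryState, float]:
        # `obs` holds only the latest observations, so build a new state
        # that appends them to the remembered history.
        next_state = MemoryState(history=[*agent_state.history, *obs])
        action = f"echo: {obs[-1]}" if obs else "noop"
        # There is no value model here, so report 0.0 as the value estimate.
        return action, next_state, 0.0


async def demo() -> tuple[str, MemoryState, float]:
    agent = EchoAgent()
    state = await agent.init_state(tools=[])
    _, state, _ = await agent.get_asv(state, ["first"])
    return await agent.get_asv(state, ["second"])


action, state, value = asyncio.run(demo())
print(action, state.history, value)
```

Returning a fresh state object instead of mutating the old one keeps each step's state intact, which may be convenient if episodes are later replayed for training.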

### Generic Support

The `Agent` class (as well as the classes in `agent.ops`)
is [generic](https://en.wikipedia.org/wiki/Generic_programming),
which means:

- `Agent` is designed to support arbitrary types
- Subclasses can exactly specify state types, making the code more readable

If you are new to Python generics (`typing.Generic`),
please read about them in [Python typing](https://docs.python.org/3/library/typing.html#generics).

Below is how to specify an agent with a custom state type.

```py
from dataclasses import dataclass, field
from datetime import datetime

from ldp.agents import Agent


@dataclass
class MyComplexState:
    vector: list[float]
    timestamp: datetime = field(default_factory=datetime.now)


class MyAgent(Agent[MyComplexState]):
    """An agent that is now type checked against the custom state."""
```
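
For readers who want to see the underlying mechanism without importing `ldp`, the following self-contained sketch shows how a base class parameterized by a state type works with `typing.Generic`. `BaseAgent`, `Counter`, and `CountingAgent` are hypothetical names made up for this illustration, and `step` is not a method of `ldp`'s `Agent`.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

TState = TypeVar("TState")


class BaseAgent(Generic[TState]):
    """Hypothetical generic base class, parameterized by its state type."""

    def step(self, state: TState) -> TState:
        raise NotImplementedError


@dataclass
class Counter:
    count: int = 0


class CountingAgent(BaseAgent[Counter]):
    """Binding TState to Counter lets a type checker verify `step`."""

    def step(self, state: Counter) -> Counter:
        # Return a new state rather than mutating the input.
        return Counter(count=state.count + 1)


new_state = CountingAgent().step(Counter())
print(new_state.count)
```

A static type checker such as mypy should flag a call like `CountingAgent().step("oops")`, while at runtime the type parameter is not enforced.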

## Complete Example

```py
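import asyncio

from aviary.env import DummyEnv
from ldp.agents import SimpleAgent


async def main() -> None:
    env = DummyEnv()
    agent = SimpleAgent()

    # Reset the environment to get the first observations and the available tools.
    obs, tools = await env.reset()
    agent_state = await agent.init_state(tools=tools)

    done = False
    while not done:
        # Choose an action, then advance the environment with it.
        action, agent_state, _ = await agent.get_asv(agent_state, obs)
        obs, reward, done, truncated = await env.step(action.value)


asyncio.run(main())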
```
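
To experiment with the rollout loop without an LLM or the `aviary` dependency, the sketch below swaps in toy components that mimic the same call shapes. `CountdownEnv` and `ConstantAgent` are hypothetical classes written only for this example, with strings standing in for messages and actions. The loop also accumulates the reward and stops on `truncated`, which is an assumption about how a caller might want to use those return values.

```python
import asyncio


class CountdownEnv:
    """Hypothetical toy environment; not part of ldp or aviary."""

    def __init__(self, steps: int = 3) -> None:
        self.remaining = steps

    async def reset(self) -> tuple[list[str], list[str]]:
        # Return the first observations and an empty tool list.
        return [f"{self.remaining} steps left"], []

    async def step(self, action: str) -> tuple[list[str], float, bool, bool]:
        self.remaining -= 1
        done = self.remaining == 0
        # Reward 1.0 per step; this toy environment never truncates.
        return [f"{self.remaining} steps left"], 1.0, done, False


class ConstantAgent:
    """Hypothetical agent that keeps no state and always picks one action."""

    async def init_state(self, tools: list[str]) -> None:
        return None

    async def get_asv(
        self, agent_state: None, obs: list[str]
    ) -> tuple[str, None, float]:
        return "tick", agent_state, 0.0


async def rollout() -> tuple[float, int]:
    env, agent = CountdownEnv(), ConstantAgent()
    obs, tools = await env.reset()
    agent_state = await agent.init_state(tools=tools)
    total_reward, num_steps = 0.0, 0
    done = truncated = False
    while not (done or truncated):
        action, agent_state, _ = await agent.get_asv(agent_state, obs)
        obs, reward, done, truncated = await env.step(action)
        total_reward += reward
        num_steps += 1
    return total_reward, num_steps


total_reward, num_steps = asyncio.run(rollout())
print(total_reward, num_steps)
```

Here the agent state is `None` throughout, which matches the case where an agent does not need to remember anything between steps.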
