Metadata-Version: 2.4
Name: llama_index_supervisor
Version: 0.1.9
Summary: This project is a Python-based multi-agent supervisor inspired by LangGraph, adapted for llama_index. It delegates tasks to specialized agents and tools, processes function-calling LLM responses, and manages conversation history. The structured workflow handles tool calls, processes agent handoffs with error handling, and provides flexible message management for hierarchical multi-agent systems.
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: llama-index-core>=0.12.27

# 🤖 llama-index-supervisor

A Python library for creating hierarchical multi-agent systems using [LlamaIndex](https://github.com/run-llama/llama_index). Inspired by LangGraph's supervisor, this package provides a framework for orchestrating specialized agents within the LlamaIndex ecosystem.

Hierarchical systems are a type of multi-agent architecture where specialized agents are coordinated by a central **supervisor** agent. The supervisor controls task delegation, making decisions about which agent or tool to invoke based on the current conversation context and task requirements.

## Features

-   🤖 **Supervisor Workflow**: Create a supervisor to orchestrate multiple specialized agents or tools using LlamaIndex's workflow engine.
-   🛠️ **Agent Handoff**: Built-in mechanism for the supervisor to hand off tasks to appropriate agents.
-   📝 **Flexible Message History**: Control how agent interactions are reflected in the overall chat history (`full_history` or `last_message`).
-   🧩 **Custom Agent Integration**: Use custom LlamaIndex `Workflow` classes as agents within the supervisor framework.
-   🌳 **Hierarchical Structures**: Build multi-level agent hierarchies by using supervisors as agents within other supervisors. Supports adding structure to the context (`add_tree_structure`).
-   🏷️ **Agent Name Attribution**: Automatically adds the agent's name to its messages for clarity in the history (`name_addition`).

## Installation

```bash
pip install llama-index-supervisor
```

Here's a simple example of a supervisor managing two specialized agents (a math expert and a research expert) using LlamaIndex components. The diagram below illustrates the basic architecture:

![Supervisor Architecture and Handoff](assets/supervisor.png)


```python
# Ensure you have OPENAI_API_KEY set in your environment
# import os
# os.environ["OPENAI_API_KEY"] = "sk-..."

from llama_index.llms.openai import OpenAI
from llama_index_supervisor import Supervisor
from llama_index.core.agent.function_calling import FunctionAgent
from llama_index.core.tools import FunctionTool
from llama_index.core.workflow import Context

# Initialize the LLM
llm = OpenAI(model="gpt-4o", temperature=0)

# Define tools
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def web_search(query: str) -> str:
    """Search the web for information."""
    # Replace with a real web search implementation if needed
    print(f"--- Searching web for: {query} ---")
    return (
        "Here are the headcounts for each of the FAANG companies in 2024:\n"
        "1. **Facebook (Meta)**: 67,317 employees.\n"
        "2. **Apple**: 164,000 employees.\n"
        "3. **Amazon**: 1,551,000 employees.\n"
        "4. **Netflix**: 14,000 employees.\n"
        "5. **Google (Alphabet)**: 181,269 employees."
    )

# Create function tools
add_tool = FunctionTool.from_defaults(fn=add)
multiply_tool = FunctionTool.from_defaults(fn=multiply)
search_tool = FunctionTool.from_defaults(fn=web_search)

# Create specialized agents using LlamaIndex FunctionAgent
math_agent = FunctionAgent.from_tools(
    tools=[add_tool, multiply_tool],
    llm=llm,
    name="math_expert",
    system_prompt="You are a math expert. Always use one tool at a time.",
    description="Specialized in performing mathematical calculations like addition and multiplication." # Added description
)

research_agent = FunctionAgent.from_tools(
    tools=[search_tool],
    llm=llm,
    name="research_expert",
    system_prompt="You are a world class researcher with access to web search. Do not do any math.",
    description="Specialized in searching the web for information." # Added description
)

# Create supervisor workflow
# The supervisor automatically gets descriptions from the agents
supervisor = Supervisor(
    llm=llm,
    agents=[math_agent, research_agent],
    # Optional: Add a tree structure representation to the system prompt
    add_tree_structure=True,
    # Optional: Set a timeout for the workflow
    timeout=60
)

# Run the workflow
# Context manages state, including memory, across workflow steps
ctx = Context(supervisor)
response = await supervisor.run(
    input="what's the combined headcount of the FAANG companies in 2024?",
    ctx=ctx # Pass the context
)

# Print the response directly
print(response)
```

## Message History Management

The `output_mode` parameter in the `Supervisor` constructor controls how messages from delegated agents are added back to the supervisor's main chat history:

-   `output_mode="full_history"` (Default): Includes all messages (intermediate steps, tool calls, final response) from the delegated agent's run.

    ![Full History Output](assets/full_history.png)

-   `output_mode="last_message"`: Includes only the final message generated by the delegated agent.

    ![Last Message Output](assets/last_message.png)

```python
# Example: Only include the final response from agents
supervisor_last_only = Supervisor(
    llm=llm,
    agents=[math_agent, research_agent],
    output_mode="last_message"
)
```

## Hierarchical Supervisors

You can create multi-level hierarchies by using `Supervisor` instances as agents within another `Supervisor`. Ensure each agent (including supervisors acting as agents) has a unique `name` and a `description`.

```python
# Assume math_agent, research_agent, writing_agent, publishing_agent are defined
# (writing_agent and publishing_agent would need tools and descriptions)

# Create mid-level supervisors
research_supervisor = Supervisor(
    llm=llm,
    agents=[research_agent, math_agent],
    name="research_supervisor",
    description="Manages a research team with experts in web research and math.",
    system_prompt="You manage a research team. Delegate tasks appropriately.",
    timeout=60
)

writing_supervisor = Supervisor(
    llm=llm,
    # agents=[writing_agent, publishing_agent], # Add writing/publishing agents here
    name="writing_supervisor",
    description="Manages a content team with experts in writing and publishing.",
    system_prompt="You manage a content team. Delegate tasks appropriately.",
    timeout=60
)

# Create top-level supervisor
top_level_supervisor = Supervisor(
    llm=llm,
    agents=[research_supervisor, writing_supervisor],
    name="top_level_supervisor",
    description="Executive supervisor coordinating research and writing teams.",
    system_prompt="You are the executive supervisor. For research/math, use research_supervisor. For content/publishing, use writing_supervisor.",
    timeout=120, # Increase timeout for multi-level
    add_tree_structure=True # Useful for complex hierarchies
)

# Run the top-level supervisor
ctx_top = Context(top_level_supervisor)
query = "Research the FAANG headcounts for 2024, calculate the total, and then write a brief summary."
response_top = await top_level_supervisor.run(
    input=query,
    ctx=ctx_top,
)

# Print the final response
print(response)
```

## Using Custom Agents (Workflows)

You can define your own agent logic by creating a class that inherits from `llama_index.core.workflow.Workflow`. This custom workflow can then be passed as an agent to the `Supervisor`.

See the `CoolAgent` example in `CoolAgent.ipynb` ([Colab Link](https://colab.research.google.com/drive/1l3hDjXbJn5VrT6jFtZzReUibsN-AEUJ-?usp=sharing)) for a demonstration. The key is that your custom workflow class needs `name` and `description`.

```python
# simplified snippet from CoolAgent.ipynb
from llama_index.core.workflow import Workflow, Context, StartEvent, StopEvent, step
from llama_index.core.llms import LLM, ChatMessage
from llama_index.core.memory import ChatMemoryBuffer

class CoolAgent(Workflow):
    def __init__(self, name: str, description: str, llm: LLM, system_prompt: str = "You are a cool agent. Be cool."):
        super().__init__()
        self.name = name
        self.description = description
        self.llm = llm
        self.system_message = ChatMessage(role="system", content=system_prompt)

    @step
    async def start_flow(self, ctx: Context, ev: StartEvent) -> StopEvent:
        memory: ChatMemoryBuffer = await ctx.get("memory", default=ChatMemoryBuffer.from_defaults(llm=self.llm))
        # Agent logic here... interacts with memory and LLM
        # Example: Ask how it is doing based on supervisor's prompt
        user_input = await ctx.get("input", "") # Get input if passed directly
        await memory.aput(ChatMessage(role="user", content="How are you?")) # Example interaction
        response = await self.llm.achat([self.system_message] + memory.get())
        await memory.aput(response.message)
        await ctx.set("memory", memory) # Update context memory
        return StopEvent(result=response.message) # Return result

# Create the custom agent
cool_agent = CoolAgent(
    name="cool_agent",
    description="A cool agent that does cool things.",
    llm=llm,
    system_prompt="You are a cool agent. Be super super cool."
)

# Use it in the supervisor
supervisor_with_cool = Supervisor(
    llm=llm,
    agents=[cool_agent] # Pass the custom agent instance
)

# Run it
result = await supervisor_with_cool.run(
    input="Ask the cool agent how it is doing."
)
print(result)
```

## Agent Handoff Mechanism

-   The `Supervisor` automatically creates internal "handoff" tools (`transfer_to_<agent_name>`) for each agent provided.
-   When the supervisor's LLM decides to delegate a task, it calls the corresponding handoff tool.
-   The `Supervisor` intercepts this tool call, prepares the context (including chat history), and runs the designated agent's workflow (`agent.run(ctx=...)`).
-   If `add_handoff_back_messages=True` (default), special messages are added to the history when control returns to the supervisor, indicating the handoff completion.

## Adding Memory / Context

Memory in `llama-index-supervisor` is managed through the `llama_index.core.workflow.Context` object and uses `llama_index.core.memory.ChatMemoryBuffer`.

-   The `Context` is passed to the `supervisor.run()` method.
-   The supervisor automatically retrieves or initializes a `ChatMemoryBuffer` within the context under the key `"memory"`.
-   You can pre-populate the memory before running the supervisor:

```python
# filepath: examples.ipynb (snippet)
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.llms import ChatMessage
from llama_index.core.workflow import Context

# Assume supervisor, math_agent, research_agent are defined

ctx = Context(supervisor) # Create context linked to the supervisor workflow

# Pre-populate memory
memory=ChatMemoryBuffer.from_defaults(
    chat_history=[
        ChatMessage(role="user", content="What is 2+2?"),
        ChatMessage(role="assistant", content="2 + 2 = 4"), # Example previous turn
    ]
)
await ctx.set("memory", memory) # Set the pre-populated memory in the context

# Run the supervisor with the context containing pre-populated memory
response = await supervisor.run(
    input="Now multiply that by 3.", # Follow-up question
    ctx=ctx,
)
```

## Contributing

Contributions are welcome! Please feel free to open issues or submit pull requests for any enhancements, bug fixes, or new features.
