v1.11.0 — Python 3.10+

The sleekest way
to build AI agents.

Tools, workflows, memory, caching, sessions, evaluation, and observability. Two operators. Zero boilerplate. Works with any OpenAI-compatible model.

from agentu import Agent # Define a tool — any Python function def search_products(query: str) -> list: return db.products.search(query) # Create agent, attach tools agent = Agent("sales").with_tools([search_products]) # Natural language — LLM picks the tool + params result = await agent.infer("Find laptops under $1500")
Install from PyPI View Source
Overview

Everything you need

Twelve capabilities in one package. No orchestration framework. No dependency maze. Just pip install agentu.

01

Tool Calling

Any Python function becomes a tool. Direct execution with call() or natural language with infer(). Multi-turn reasoning built in.

02

Workflows

>> chains steps sequentially. & runs them in parallel. Combine both. That's the entire API for multi-agent orchestration.

03

Memory

Persistent SQLite memory with importance scoring. remember() stores, recall() searches. Survives restarts.

04

Caching

Transparent LLM response caching with TTL. Same prompt = instant cache hit. SQLite-backed with hit rate stats.

05

Sessions

Stateful conversations that remember context. Multi-user isolation. Auto-managed history with SQLite persistence.

06

Evaluation

Test agents with exact match, substring, LLM-as-judge, or custom validators. JSON export for CI/CD pipelines.

07

Ralph Mode

Autonomous loops. Agent works toward a goal with checkpoints in a PROMPT.md file. Loops until done or limits hit.

08

Observability

Auto-instrumented metrics. Console, JSON, or silent output. Built-in real-time dashboard at /dashboard.

09

Skills

On-demand domain expertise. Progressive loading: metadata (100 chars) then instructions then resources. 96% less context.

10

GitHub Skills

Import skills from any GitHub repo. with_skills(["user/repo/skill"]). Cached locally at ~/.agentu/skills/.

11

Tool Search

Scale to hundreds of tools. Deferred tools are discovered on-demand via semantic search. No context bloat.

12

MCP + REST

Connect to MCP servers with with_mcp(). Serve agents as REST APIs with auto-generated Swagger docs.

Core

Workflows v1.0

Two operators replace entire orchestration frameworks. >> for sequential. & for parallel. Compose them freely.

Sequential
# researcher → analyst → writer workflow = ( researcher("Find AI trends") >> analyst("Analyze findings") >> writer("Summarize report") ) result = await workflow.run()
Parallel
# Run 3 searches at once workflow = ( search("AI") & search("Machine Learning") & search("Crypto") ) results = await workflow.run()
Combined — Parallel then Merge
# Fan-out research, fan-in analysis, then write workflow = ( (search("AI") & search("ML") & search("Crypto")) >> analyst("Compare all three") >> writer(lambda prev: f"Write report about: {prev['result']}") ) result = await workflow.run(checkpoint="./checkpoints") # Resume on crash

What's happening

State

Memory v1.0

Persistent SQLite memory with importance scoring. Agents remember across sessions without external databases.

Store + Recall
# Remember something important agent.remember( "Customer prefers email over phone", importance=0.9 ) # Later, recall relevant memories memories = agent.recall(query="communication") # → ["Customer prefers email over phone"]
Caching
# Enable transparent LLM response caching agent = Agent("assistant", cache=True, cache_ttl=3600) await agent.infer("What is Python?") # API call await agent.infer("What is Python?") # Cache hit! stats = agent.cache.get_stats() # {"hits": 10, "misses": 5, "hit_rate": 0.667}

How it works

Extension

Skills v1.2

On-demand domain expertise. Skills auto-activate on matching prompts and load progressively to minimize context usage.

Local Skills
from agentu import Agent, Skill pdf_skill = Skill( name="pdf-processing", description="Extract text and tables from PDFs", instructions="skills/pdf/SKILL.md", resources={"forms": "skills/pdf/FORMS.md"} ) agent = Agent("assistant").with_skills([pdf_skill]) # Skills auto-activate on matching prompts await agent.infer("Extract tables from report.pdf")
GitHub Skills
# Import from any GitHub repo agent = Agent("assistant").with_skills([ "hemanth/agentu-skills/pdf-processor", "openai/skills/code-review@v1.0", ]) # Skill directory structure: # my-skill/ # ├── skill.json → name + description # ├── SKILL.md → loaded when triggered # └── resources/ → loaded on-demand # Cached at ~/.agentu/skills/

Progressive loading

Production

Sessions + Evaluation

Stateful multi-turn conversations and a testing framework that plugs into CI/CD.

Sessions
from agentu import SessionManager manager = SessionManager() session = manager.create_session(agent) # First turn await session.send("What's the weather in SF?") # Second turn — context remembered await session.send("What about tomorrow?") # Agent knows "tomorrow" = SF weather # History management session.get_history(limit=10) session.clear_history()
Evaluation
from agentu import evaluate test_cases = [ {"ask": "What's 5 + 3?", "expect": 8}, {"ask": "Capital of France?", "expect": "Paris"}, {"ask": "Weather in SF?", "expect": "sunny"}, ] results = await evaluate(agent, test_cases) print(f"Accuracy: {results.accuracy}%") print(results.to_json()) # CI/CD export

Details

Monitoring

Observability v1.5

Auto-instrumented monitoring. Every tool call, LLM request, and error is tracked without adding a single line of code.

Observe
from agentu import Agent, observe # Configure output format observe.configure(output="console") # json | console | silent agent = Agent("assistant").with_tools([...]) await agent.infer("Find me laptops") # View metrics metrics = agent.observer.get_metrics() print(f"Tool calls: {metrics['tool_calls']}") print(f"Duration: {metrics['total_duration_ms']}ms") print(f"Errors: {metrics['errors']}")
Dashboard + REST
from agentu import serve # Serve agent as API + dashboard serve(agent, port=8000) # Endpoints: # http://localhost:8000/dashboard → live metrics # http://localhost:8000/docs → Swagger API # http://localhost:8000/execute → tool execution # http://localhost:8000/process → inference # http://localhost:8000/memory/* → memory API

Event types

Autonomous

Ralph Mode v1.6

Continuous autonomous loops. Define a goal with checkpoints in a PROMPT.md file. The agent works until all checkpoints pass or limits are reached.

Ralph Loop
# Agent loops toward a goal result = await agent.ralph( prompt_file="PROMPT.md", max_iterations=50, timeout_minutes=30, on_iteration=lambda i, data: print(f"[{i}] {data['result'][:50]}...") ) print(f"Done in {result['iterations']} iterations")
PROMPT.md
# Goal Build a REST API for user authentication. ## Checkpoints - [ ] Create user model with email + password - [ ] Implement /login endpoint with JWT - [ ] Implement /register endpoint - [ ] Add input validation - [ ] Write tests for all endpoints
Example

Automated code review

A complete example: review a PR, run tests, synthesize findings, and post a comment to GitHub — all automated.

Full Example
import asyncio from agentu import Agent def get_pr_diff(pr_number: int) -> str: """Fetch PR changes from GitHub.""" return github_api.get_diff(pr_number) def run_tests(branch: str) -> dict: """Run test suite.""" return {"passed": 47, "failed": 2, "coverage": 94.2} def post_comment(pr_number: int, comment: str) -> bool: """Post review comment to GitHub.""" return True async def main(): reviewer = Agent("reviewer", model="gpt-4").with_tools([get_pr_diff]) tester = Agent("tester").with_tools([run_tests]) commenter = Agent("commenter").with_tools([post_comment]) # Parallel: review code + run tests simultaneously workflow = reviewer("Review PR #247") & tester("Run tests on PR #247") code_review, test_results = await workflow.run() # Natural language: synthesize findings summary = await commenter.infer( f"Create a review for PR #247. " f"Code review: {code_review}. Tests: {test_results}. " f"Be constructive and specific." ) # Post to GitHub await commenter.call("post_comment", {"pr_number": 247, "comment": summary}) asyncio.run(main())

What's happening

Reference

API

MethodDescription
Agent(name, model?)Create agent. Auto-detects Ollama model if none specified.
.with_tools([fns])Attach active tools (Python functions).
.with_tools(defer=[fns])Attach searchable tools, discovered on-demand.
.with_skills([skills])Attach local or GitHub skill objects.
.with_mcp([urls])Connect to MCP servers.
await .call(name, params)Direct tool execution by name.
await .infer(prompt)Natural language — LLM picks the right tool.
.remember(text, importance)Store to persistent SQLite memory.
.recall(query)Search stored memories.
await .ralph(prompt_file)Autonomous loop with PROMPT.md checkpoints.
step >> stepSequential workflow composition.
step & stepParallel workflow composition.
await workflow.run()Execute composed workflow.
serve(agent, port)REST API with Swagger docs + dashboard.
evaluate(agent, cases)Run test cases, get accuracy + JSON export.
Models

Works with everything

Any OpenAI-compatible API. Auto-detects Ollama models.

LLM Configuration
# Auto-detect (uses first available Ollama model) Agent("assistant") # Explicit local model Agent("assistant", model="qwen3") # OpenAI Agent("assistant", model="gpt-4", api_key="sk-...") # vLLM, LM Studio, etc. Agent("assistant", model="mistral", api_base="http://localhost:8000/v1") # MCP servers agent.with_mcp(["http://localhost:3000"]) agent.with_mcp([{"url": "https://api.com/mcp", "headers": {"Auth": "token"}}])