Metadata-Version: 2.4
Name: troy
Version: 0.1.0
Summary: Audit agent execution traces against configurable policies — CLI, real-time guard SDK, and framework adapters
Project-URL: Homepage, https://github.com/sentosa-ai/troy
Project-URL: Repository, https://github.com/sentosa-ai/troy
Project-URL: Issues, https://github.com/sentosa-ai/troy/issues
License: MIT
License-File: LICENSE
Keywords: agent,audit,compliance,guardrails,hipaa,llm,policy,safety,soc2
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: click>=8.0
Requires-Dist: networkx>=3.0
Requires-Dist: openai>=1.0
Requires-Dist: pydantic>=2.0
Requires-Dist: python-dotenv>=1.2.1
Provides-Extra: crewai
Requires-Dist: crewai>=0.100.0; extra == 'crewai'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.0; extra == 'langchain'
Provides-Extra: openai-agents
Requires-Dist: openai-agents>=0.1.0; extra == 'openai-agents'
Description-Content-Type: text/markdown

# troy

troy audits agent execution traces against configurable policies. It provides post-hoc auditing with LLM explanations, real-time interception via the guard SDK, and framework adapters for LangChain, OpenAI Agents, and CrewAI.

## Installation

Requires Python 3.11+.

```bash
pip install troy
```

With framework adapters:

```bash
pip install "troy[langchain]"      # LangChain callback handler
pip install "troy[openai-agents]"  # OpenAI Agents SDK hooks
pip install "troy[crewai]"         # CrewAI global hooks
```

### From source

```bash
git clone https://github.com/sentosa-ai/troy.git
cd troy
uv sync
```

### LLM credentials (for `audit` / `audit-batch` commands)

```bash
cp .env.example .env
# Edit .env with your API key, base URL, and model
```

Alternatively, pass them as CLI flags or environment variables (see [Configuration](#configuration)). The `check`, `replay`, and `policies` commands do not require an LLM.

## Quick Start

```bash
# Audit a single trace
uv run troy audit traces/agent_run.json examples/policy.json

# Batch audit every trace in a directory
uv run troy audit-batch traces/ examples/policy.json

# Replay a previous audit interactively
uv run troy replay logs/2026-02-15/trace3/audit.json

# Replay with a different policy (no LLM calls, instant)
uv run troy replay logs/2026-02-15/trace3/audit.json --policy examples/policy.json

# Dump replay to stdout for piping / CI
uv run troy replay logs/2026-02-15/trace3/audit.json --policy examples/policy.json --no-interactive
```

## Commands

### `audit` — Single trace audit

Runs the full pipeline: graph building, LLM explanation, policy evaluation, scoring, and reporting.

```bash
uv run troy audit <trace_file> <policy_file> [OPTIONS]
```

| Option | Env var | Default | Description |
|---|---|---|---|
| `--output`, `-o` | — | `logs/{date}/report.md` | Markdown report output path |
| `--json-output`, `-j` | — | `logs/{date}/audit.json` | JSON report output path |
| `--model`, `-m` | `TROY_MODEL` | `gpt-4o-mini` | LLM model name |
| `--base-url` | `OPENAI_BASE_URL` | — | API base URL |
| `--api-key` | `OPENAI_API_KEY` | — | API key |

**Output files:**

```
logs/{date}/
├── report.md              # Markdown audit report
├── audit.json             # Full audit result (replayable)
└── llm_responses/         # Raw LLM responses for audit trail
    ├── step_s1.txt
    ├── step_s2.txt
    └── trace_summary.txt
```

### `audit-batch` — Batch audit

Audits all `.json` trace files in a directory concurrently (up to 5 in parallel).

```bash
uv run troy audit-batch <trace_dir> <policy_file> [OPTIONS]
```

Accepts the same LLM options as `audit` (`--model`, `--base-url`, `--api-key`). Generates per-trace reports plus a batch summary:

```
logs/{date}/
├── summary.md             # Table of all traces with violation counts
├── batch.json             # Full batch result
├── trace1/
│   ├── report.md
│   └── audit.json
└── trace2/
    ├── report.md
    └── audit.json
```
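
The "up to 5 in parallel" behaviour described above can be pictured as a semaphore-bounded fan-out. The sketch below is only an illustration of that pattern, not troy's implementation: `audit_one` and `audit_batch` are hypothetical names, and the sleep stands in for the real per-trace work.

```python
import asyncio
from pathlib import Path


async def audit_one(trace_path: Path) -> dict:
    """Hypothetical stand-in for auditing a single trace file."""
    await asyncio.sleep(0.01)  # placeholder for the real LLM-backed work
    return {"trace": trace_path.name, "violations": 0}


async def audit_batch(trace_paths: list[Path], max_parallel: int = 5) -> list[dict]:
    """Audit every trace, never running more than max_parallel at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(path: Path) -> dict:
        # Each task waits here until one of the max_parallel slots is free.
        async with semaphore:
            return await audit_one(path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(bounded(p) for p in trace_paths))


paths = [Path(f"traces/trace{i}.json") for i in range(12)]
results = asyncio.run(audit_batch(paths))
```

In a real script the path list would more likely come from something like `sorted(Path(trace_dir).glob("*.json"))`.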

### `replay` — Interactive audit replay

Replays a previously-generated `audit.json` in the terminal. No LLM calls needed.

```bash
uv run troy replay <audit_file> [OPTIONS]
```

| Option | Description |
|---|---|
| `--policy <file>` | Re-evaluate with a different policy file (pure computation, instant) |
| `--no-interactive` | Dump full replay to stdout instead of interactive mode |

**Interactive controls:**

| Key | Action |
|---|---|
| `→` / `n` | Next step |
| `←` / `p` | Previous step |
| `d` | Step detail view |
| `s` | Trace summary view |
| `v` | Violations view |
| `j` / `k` | Jump to next / previous violation |
| `q` | Quit |

### `check` — Single action policy check

Evaluates a single action against a policy without calling an LLM. Prints the result as JSON and exits with code 0 if the action is allowed, or 2 if it is blocked.

```bash
troy check policy.json -a search -i '{"query": "SELECT * FROM users"}'
troy check policy.json -a send_email --mode monitor
troy check policy.json -a bash --metadata '{"permission_level": "admin"}'
```

### `policies` — Browse and use policy templates

```bash
# List all bundled policy templates
troy policies list

# Show rules in a specific policy
troy policies show soc2

# Copy a template to your project
troy policies copy hipaa -o my_policy.json

# Combine multiple templates into one policy
troy policies init -t soc2 -t hipaa -o policy.json
```

Available templates: `minimal`, `agent_safety`, `owasp_llm_top10`, `data_protection`, `safe_browsing`, `soc2`, `hipaa`.

## How It Works

1. **Ingestion** — Loads and validates trace JSON using Pydantic models
2. **Graph Building** — Constructs a directed execution graph (NetworkX) representing step dependencies
3. **Explanation** — Sends each step + surrounding context to an LLM to infer *why* the agent made each decision, what data it accessed, what alternatives existed, and what risks are present
4. **Policy Evaluation** — Evaluates Python expression conditions against each step with full cross-step context
5. **Scoring** — Computes a risk score: `min(100, sum(weights of violated rules))`
6. **Reporting** — Generates markdown and JSON audit reports
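
The scoring formula in step 5 is small enough to write out. This is a sketch of the stated formula only; `risk_score` is a hypothetical name, and whether a rule violated on several steps contributes its weight once or once per step is not specified here, so the function simply sums whatever weights it is given.

```python
def risk_score(violated_weights: list[int]) -> int:
    """Sum the weights of violated rules and cap the total at 100."""
    return min(100, sum(violated_weights))


# One critical rule (weight 50) plus one rule at the default weight (10).
moderate = risk_score([50, 10])

# Three heavy violations would sum to 150, so the cap applies.
capped = risk_score([50, 50, 50])
```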

## Trace Format

troy consumes traces — it doesn't generate them. Your agent logging system needs to produce JSON in this format:

```json
{
  "trace_id": "trace-001",
  "agent_name": "my-agent",
  "steps": [
    {
      "step_id": "step-1",
      "type": "tool_call",
      "description": "Fetch user profile from database",
      "input": { "user_id": "usr_882" },
      "output": { "name": "Jane Doe", "email": "jane@example.com" },
      "metadata": { "data_classification": "pii" },
      "timestamp": "2026-02-15T11:15:00Z",
      "parent_step_id": null
    }
  ],
  "metadata": {
    "environment": "production",
    "permission_level": "user"
  }
}
```

### Step fields

| Field | Required | Description |
|---|---|---|
| `step_id` | Yes | Unique identifier referenced in violations and reports |
| `type` | Yes | One of `llm_call`, `tool_call`, `decision`, `observation` |
| `description` | Yes | Human-readable description of what the step does |
| `input` | Yes | Full inputs — prompts, tool args, queries. Without this, auditing is blind |
| `output` | Yes | Full outputs — responses, return values. Needed to verify what actually happened |
| `metadata` | No | Labels like `data_classification`, `network_zone`, `permission_level`, `requires_approval`. Used by policy rules |
| `timestamp` | No | ISO 8601 timestamp for ordering and timeline analysis |
| `parent_step_id` | No | For nested/branching execution (e.g. sub-agent calls) |
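
If your agent does not already emit this format, a thin logging shim is usually enough. The sketch below builds a trace with the fields from the table using only the standard library; `make_step` and `make_trace` are hypothetical helpers, not part of troy.

```python
import json
from datetime import datetime, timezone

VALID_STEP_TYPES = {"llm_call", "tool_call", "decision", "observation"}


def make_step(step_id, step_type, description, input_data, output_data,
              metadata=None, parent_step_id=None):
    """Build one step dict with the required and optional fields."""
    if step_type not in VALID_STEP_TYPES:
        raise ValueError(f"unknown step type: {step_type!r}")
    return {
        "step_id": step_id,
        "type": step_type,
        "description": description,
        "input": input_data,      # log full inputs, not summaries
        "output": output_data,    # log full outputs
        "metadata": metadata or {},
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "parent_step_id": parent_step_id,
    }


def make_trace(trace_id, agent_name, steps, metadata=None):
    """Wrap a list of steps in the top-level trace structure."""
    return {
        "trace_id": trace_id,
        "agent_name": agent_name,
        "steps": steps,
        "metadata": metadata or {},
    }


step = make_step(
    "step-1", "tool_call", "Fetch user profile from database",
    {"user_id": "usr_882"}, {"name": "Jane Doe"},
    metadata={"data_classification": "pii"},
)
trace = make_trace("trace-001", "my-agent", [step], {"environment": "production"})
trace_json = json.dumps(trace, indent=2)
```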

### Metadata conventions

Policy rules reference these metadata keys. Annotate your steps with them to enable detection:

| Key | Values | Used by |
|---|---|---|
| `data_classification` | `pii`, `internal`, `public` | PII exfiltration detection |
| `network_zone` | `external`, `internal` | External data transmission detection |
| `permission_level` | `user`, `admin` | Privilege escalation detection |
| `requires_approval` | `true` / `false` | Mandatory approval checks |
| `approval_token` | token string or `null` | Approval verification |
| `category` | `communication`, etc. | Communication channel controls |

The more context you log per step, the better the audit. At minimum: capture full inputs and outputs. The auditor infers *why* the agent made each decision by analyzing the execution chain — what came before, what came after, and how data flowed between steps.

## Policy Format

Policies are JSON files containing a list of rules. Each rule has a `condition` — a Python expression that returns `True` when the rule is **violated**.

```json
{
  "policy_id": "my-policy",
  "description": "Safety policy for production agents",
  "rules": [
    {
      "rule_id": "pii-exfiltration-protection",
      "description": "Detects PII handling followed by transmission to external endpoints",
      "condition": "get(step, 'metadata.data_classification') == 'pii' and any_next(lambda s: s['type'] == 'tool_call' and get(s, 'metadata.network_zone') == 'external')",
      "severity": "critical",
      "weight": 50
    }
  ]
}
```

### Rule fields

| Field | Required | Default | Description |
|---|---|---|---|
| `rule_id` | Yes | — | Unique identifier for the rule |
| `description` | Yes | — | Human-readable description shown in reports |
| `condition` | Yes | — | Python expression (see below). `True` = violated |
| `severity` | No | `medium` | `critical`, `high`, `medium`, `low` |
| `weight` | No | `10` | Points added to risk score when violated |
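
To make the defaults in the table concrete, here is a sketch of parsing a policy file with the standard library. It is an illustration under assumptions, not troy's loader: the `Rule` dataclass and `parse_policy` are hypothetical names, and rejecting unknown severities is a choice made for this example.

```python
import json
from dataclasses import dataclass

SEVERITIES = ("critical", "high", "medium", "low")


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    condition: str
    severity: str = "medium"  # documented default
    weight: int = 10          # documented default


def parse_policy(text: str) -> list[Rule]:
    """Parse policy JSON and fill in defaults for optional rule fields."""
    rules = []
    for raw in json.loads(text)["rules"]:
        rule = Rule(**raw)  # raises TypeError on a missing required field
        if rule.severity not in SEVERITIES:
            raise ValueError(f"{rule.rule_id}: unknown severity {rule.severity!r}")
        rules.append(rule)
    return rules


policy_text = json.dumps({
    "policy_id": "my-policy",
    "description": "Safety policy for production agents",
    "rules": [
        {"rule_id": "r1", "description": "explicit", "condition": "False",
         "severity": "critical", "weight": 50},
        {"rule_id": "r2", "description": "defaults", "condition": "False"},
    ],
})
rules = parse_policy(policy_text)
```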

### Writing conditions

Conditions are Python expressions evaluated per-step with these variables and helpers in scope:

**Variables:**

| Variable | Type | Description |
|---|---|---|
| `step` | `dict` | Current step being evaluated |
| `steps` | `list[dict]` | All steps in the trace |
| `step_index` | `int` | Current step's index |
| `prev_steps` | `list[dict]` | Steps before the current one |
| `next_steps` | `list[dict]` | Steps after the current one |
| `trace` | `dict` | Trace-level info: `trace_id`, `agent_name`, `metadata` |
| `agent` | `dict` | Agent info: `name`, `metadata` (from trace) |

**Helper functions:**

| Function | Description |
|---|---|
| `get(d, 'a.b.c', default)` | Safe nested dict access via dot-separated path. Returns `default` (or `None`) if any key is missing |
| `matches(text, pattern)` | Case-insensitive regex search. Returns truthy if pattern is found |
| `any_step(fn)` | `True` if `fn(step_dict)` is true for any step in the trace |
| `any_next(fn)` | `True` if `fn(step_dict)` is true for any step after the current one |
| `any_prev(fn)` | `True` if `fn(step_dict)` is true for any step before the current one |

**Example conditions:**

```python
# PII data followed by an external tool call
"get(step, 'metadata.data_classification') == 'pii' and any_next(lambda s: s['type'] == 'tool_call' and get(s, 'metadata.network_zone') == 'external')"

# Prompt injection patterns in step input
"matches(str(step.get('input', {})), r'ignore previous instructions|system update|run as admin')"

# Raw SQL in tool call inputs
"step['type'] == 'tool_call' and matches(get(step, 'input.query', ''), r'SELECT|INSERT|UPDATE|DELETE|DROP|UNION')"

# Missing approval token on steps that require approval
"get(step, 'metadata.requires_approval') is True and get(step, 'metadata.approval_token') is None"

# Non-admin agent accessing admin-level step
"get(agent, 'metadata.permission_level') != 'admin' and get(step, 'metadata.permission_level') == 'admin'"

# Any tool call categorized as communication
"step['type'] == 'tool_call' and get(step, 'metadata.category') == 'communication'"
```

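Conditions that fail to parse or raise an error during evaluation are silently skipped rather than crashing the engine. As a consequence, a typo in a condition means the rule never fires, so it is worth testing each new rule against a trace that should violate it.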
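
The per-step evaluation and the skip-on-error behaviour can be sketched with `eval`. This is an assumption about the general approach, not a description of troy's engine: `evaluate_rule`, the `SAFE_BUILTINS` whitelist, and the reduced set of variables are all hypothetical. The names are passed as `eval` globals so that lambdas inside a condition can still see helpers such as `get`.

```python
def get(d, path, default=None):
    """Illustrative stand-in for the documented get() helper."""
    for key in path.split("."):
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return d


# Assumption: a small whitelist of builtins; eval is not a security sandbox.
SAFE_BUILTINS = {"str": str, "len": len, "any": any, "all": all}


def evaluate_rule(condition: str, steps: list[dict]) -> list[str]:
    """Return the step_ids for which the condition evaluates truthy."""
    violating = []
    for index, step in enumerate(steps):
        scope = {
            "__builtins__": SAFE_BUILTINS,
            "step": step,
            "steps": steps,
            "step_index": index,
            "prev_steps": steps[:index],
            "next_steps": steps[index + 1:],
            "get": get,
            "any_next": lambda fn, i=index: any(fn(s) for s in steps[i + 1:]),
        }
        try:
            if eval(condition, scope):
                violating.append(step["step_id"])
        except Exception:
            # Malformed or erroring conditions are skipped, not raised.
            continue
    return violating


steps = [
    {"step_id": "s1", "type": "tool_call", "metadata": {"data_classification": "pii"}},
    {"step_id": "s2", "type": "tool_call", "metadata": {"network_zone": "external"}},
]
pii_rule = (
    "get(step, 'metadata.data_classification') == 'pii' and "
    "any_next(lambda s: s['type'] == 'tool_call' and "
    "get(s, 'metadata.network_zone') == 'external')"
)
flagged = evaluate_rule(pii_rule, steps)
skipped = evaluate_rule("get(step,", steps)  # syntax error, so no violations
```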

## Framework Adapters

All adapters accept an optional `metadata_fn` callback that maps each action to security metadata. This enables metadata-based policy rules (e.g. `network_zone`, `data_classification`, `approval_token`) when using framework integrations.

```python
from troy.models import StepType

def my_metadata(action: str, input_data: dict, step_type: StepType) -> dict:
    """Map tool/LLM calls to security metadata for policy evaluation."""
    meta = {"network_zone": "internal", "data_classification": "public"}
    if action in ("send_email", "http_request"):
        meta["network_zone"] = "external"
    if step_type == StepType.TOOL_CALL and "pii" in str(input_data):
        meta["data_classification"] = "pii"
    return meta
```

**LangChain:**
```python
from troy.adapters.langchain import TroyHandler
handler = TroyHandler(policy="policy.json", metadata_fn=my_metadata)
```

**OpenAI Agents SDK:**
```python
from troy.adapters.openai_agents import TroyHooks
hooks = TroyHooks(policy="policy.json", metadata_fn=my_metadata)
```

**CrewAI:**
```python
from troy.adapters.crewai import enable_troy
guard = enable_troy(policy="policy.json", metadata_fn=my_metadata)
```

Without `metadata_fn`, adapters pass no step metadata — only tool-name and input-content policy rules will fire.

## Configuration

LLM settings can be configured in three ways, listed from highest to lowest precedence:

1. **CLI flags:** `--model`, `--base-url`, `--api-key`
2. **Environment variables:** `TROY_MODEL`, `OPENAI_BASE_URL`, `OPENAI_API_KEY`
3. **`.env` file** (loaded automatically via python-dotenv)

troy uses the OpenAI client library, so it works with any OpenAI-compatible API (OpenAI, Azure, local models via LiteLLM/Ollama, etc.).
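
The precedence order above amounts to a simple fallback chain. The sketch below shows that logic with the standard library only; `resolve_setting` is a hypothetical helper, not troy's code. The `.env` file fits at the bottom of the chain because python-dotenv's `load_dotenv()` does not override variables that are already set in the environment unless `override=True` is passed.

```python
import os


def resolve_setting(flag_value, env_name, default=None, environ=None):
    """Return the CLI flag if given, else the environment variable, else the default."""
    environ = os.environ if environ is None else environ
    if flag_value is not None:
        return flag_value
    return environ.get(env_name, default)


fake_env = {"TROY_MODEL": "my-local-model"}

# A flag wins over the environment.
from_flag = resolve_setting("gpt-4o", "TROY_MODEL", "gpt-4o-mini", fake_env)

# No flag: the environment variable is used.
from_env = resolve_setting(None, "TROY_MODEL", "gpt-4o-mini", fake_env)

# Neither flag nor variable: fall back to the documented default model.
from_default = resolve_setting(None, "TROY_MODEL", "gpt-4o-mini", {})
```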

## Testing

```bash
uv run pytest tests/ -v
```

## Roadmap

- **Drift detection** — Detect when agent behavior drifts from established baselines
- **Regression comparison** — Compare audit results across trace versions to catch regressions
- **Structured semantic diffing** — Diff two traces at the semantic level, not just textual
- **Risk dashboards** — Visual dashboard for risk scores and violation trends over time
- **RBAC** — Role-based access control for multi-user audit workflows
- **Persistence** — SQLite trace storage for trend analysis and cross-session querying
