Metadata-Version: 2.4
Name: traqo
Version: 0.2.0
Summary: Structured tracing for applications. JSONL files, hierarchical spans, zero infrastructure.
Project-URL: Homepage, https://github.com/Cecuro/traqo
Project-URL: Repository, https://github.com/Cecuro/traqo
Author: Cecuro
License-Expression: MIT
License-File: LICENSE
Keywords: jsonl,llm,observability,spans,tracing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Typing :: Typed
Requires-Python: >=3.10
Provides-Extra: all
Requires-Dist: anthropic>=0.40; extra == 'all'
Requires-Dist: boto3>=1.28; extra == 'all'
Requires-Dist: google-cloud-storage>=2.10; extra == 'all'
Requires-Dist: google-genai>=1.0; extra == 'all'
Requires-Dist: langchain-core>=0.3; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40; extra == 'anthropic'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Provides-Extra: gcs
Requires-Dist: google-cloud-storage>=2.10; extra == 'gcs'
Provides-Extra: gemini
Requires-Dist: google-genai>=1.0; extra == 'gemini'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3; extra == 'langchain'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: s3
Requires-Dist: boto3>=1.28; extra == 's3'
Description-Content-Type: text/markdown

<p align="center">
  <img src="traqo/ui/static/favicon.svg" width="96" height="96" alt="Pedro the Raccoon — traqo mascot">
</p>

# traqo

Structured tracing for applications. JSONL files, hierarchical spans, zero infrastructure.

```python
from traqo import Tracer, trace
from pathlib import Path

@trace
def classify(text: str) -> str:
    response = llm.chat(text)  # llm is a placeholder for your own LLM client
    return response

with Tracer(Path("traces/run.jsonl"), input={"query": "Is this a bug?"}):
    result = classify("Is this a bug?")
```

Your traces are just `.jsonl` files. Read them with `grep`, query them with DuckDB, or hand them to an AI assistant.

## Why traqo?

- **Zero infrastructure** -- no server, no database, no account. `pip install traqo` and go.
- **AI-first** -- JSONL is text. AI assistants read your traces directly, no browser needed.
- **Hierarchical spans** -- not flat logs. Reconstruct the full call tree across functions and files.
- **Everything is a span** -- LLM calls, DB queries, HTTP requests. All spans with metadata.
- **Zero dependencies** -- stdlib only. Integrations are optional extras.
- **Transparent** -- traces are portable files. No vendor lock-in, no proprietary format.

## Install

```bash
pip install traqo                   # Core (zero dependencies)
pip install traqo[openai]           # + OpenAI integration
pip install traqo[anthropic]        # + Anthropic integration
pip install traqo[langchain]        # + LangChain integration
pip install traqo[gemini]           # + Google Gemini integration
pip install traqo[all]              # Everything
```

## Quick Start

### 1. Trace a function

```python
from traqo import Tracer, trace
from pathlib import Path

@trace
def summarize(text: str) -> str:
    summary = text[:100]  # placeholder: replace with your own logic
    return summary

@trace
def pipeline(docs: list[str]) -> list[str]:
    return [summarize(doc) for doc in docs]

with Tracer(
    Path("traces/my_run.jsonl"),
    input={"docs": ["doc1", "doc2"]},
    tags=["production"],
) as tracer:
    results = pipeline(["doc1", "doc2"])
    tracer.set_output({"count": len(results)})
```

`@trace` works with sync functions, async functions, and generators. It detects which kind it is wrapping and handles each one automatically.
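
To picture how a decorator can tell these function kinds apart, here is a minimal sketch built on the standard `inspect` module. It illustrates the general technique only and is not traqo's actual implementation; `sketch_trace` and the `calls` list are hypothetical names invented for this example.

```python
import asyncio
import functools
import inspect

calls = []  # hypothetical stand-in for a trace file: (function name, kind) pairs


def sketch_trace(fn):
    """Pick a wrapper that matches the kind of function being decorated."""
    if inspect.isasyncgenfunction(fn):
        @functools.wraps(fn)
        async def async_gen_wrapper(*args, **kwargs):
            calls.append((fn.__name__, "async_generator"))
            async for item in fn(*args, **kwargs):
                yield item
        return async_gen_wrapper

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            calls.append((fn.__name__, "async"))
            return await fn(*args, **kwargs)
        return async_wrapper

    if inspect.isgeneratorfunction(fn):
        @functools.wraps(fn)
        def gen_wrapper(*args, **kwargs):
            # Runs lazily: the record is written when iteration starts.
            calls.append((fn.__name__, "generator"))
            yield from fn(*args, **kwargs)
        return gen_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        calls.append((fn.__name__, "sync"))
        return fn(*args, **kwargs)
    return sync_wrapper


@sketch_trace
def double(x):
    return x * 2


@sketch_trace
async def double_async(x):
    return x * 2


@sketch_trace
def count_up(n):
    yield from range(n)


print(double(2), asyncio.run(double_async(3)), list(count_up(3)))
print(calls)
```

The order of the checks matters: async generators and coroutines are tested before plain generators and ordinary functions so that each function gets a wrapper of the same kind.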

### 2. Auto-trace LLM calls

```python
from traqo.integrations.openai import traced_openai
from openai import OpenAI

client = traced_openai(OpenAI(), operation="summarize")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this..."}],
)
# Token usage, model, input/output all captured automatically as span metadata
```

Works the same way for Anthropic, Gemini, and LangChain:

```python
from traqo.integrations.anthropic import traced_anthropic
from traqo.integrations.gemini import traced_gemini
from traqo.integrations.langchain import traced_model
```

All integrations automatically capture token usage, model parameters, tool calls, and streaming responses, including time to first token (TTFT).

### 3. Use metadata, tags, and kind

```python
from traqo import Tracer, LLM, TOOL
from pathlib import Path

with Tracer(Path("traces/run.jsonl"), tags=["prod"]) as tracer:
    with tracer.span(
        "classify",
        input={"text": "Is this a bug?"},
        metadata={"model": "gpt-4o", "provider": "openai"},
        tags=["llm"],
        kind=LLM,
    ) as span:
        result = call_llm(...)
        span.set_metadata("token_usage", {"input_tokens": 100, "output_tokens": 50})
        span.set_output(result)
```

Kind constants: `LLM`, `TOOL`, `RETRIEVER`, `CHAIN`, `AGENT`, `EMBEDDING`, `GUARDRAIL` (or use any string).

### 4. Access the current span from anywhere

```python
from traqo import trace, update_current_span

@trace
def classify(text: str) -> str:
    result = llm.chat(text)  # llm is a placeholder for your own LLM client
    update_current_span(metadata={"confidence": 0.95, "model": "gpt-4o"})
    return result
```

`update_current_span()` is a convenience helper that does nothing when no span is active. For full control, use `get_current_span()` directly.

### 5. Read your traces

```bash
# Last line is always trace_end with summary stats
tail -1 traces/my_run.jsonl | jq .

# All LLM spans
grep '"kind":"llm"' traces/my_run.jsonl | jq .

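# Filter by tag (here: events tagged "production")
jq -c 'select(any(.tags[]?; . == "production"))' traces/my_run.jsonl

# Errors (in bash, ** only recurses after `shopt -s globstar`)
grep '"status":"error"' traces/**/*.jsonl

# Token usage from span metadata (-h drops the filename prefix so jq receives plain JSON)
grep -h '"token_usage"' traces/**/*.jsonl | jq '.metadata.token_usage'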
```
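
The same lookups can be done from Python with only the standard library. The sketch below writes a small made-up trace to a temporary file and reads it back; the field names follow the event table in the JSONL Format section, while the sample values, the `"ok"` status string, and the `read_events` helper are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Made-up events. Field names follow this README; the values are invented.
SAMPLE = [
    {"type": "trace_start", "tags": ["production"]},
    {"type": "span_end", "id": "a", "name": "classify", "kind": "llm", "status": "ok",
     "metadata": {"token_usage": {"input_tokens": 100, "output_tokens": 50}}},
    {"type": "span_end", "id": "b", "name": "fetch", "kind": "tool", "status": "error"},
    {"type": "trace_end", "duration_s": 1.5},
]


def read_events(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp:
    trace_file = Path(tmp) / "my_run.jsonl"
    trace_file.write_text(
        "\n".join(json.dumps(e) for e in SAMPLE) + "\n", encoding="utf-8"
    )
    events = list(read_events(trace_file))

summary = events[-1]                                       # like `tail -1`
llm_spans = [e for e in events if e.get("kind") == "llm"]
errors = [e for e in events if e.get("status") == "error"]
usage = [
    e["metadata"]["token_usage"]
    for e in events
    if "token_usage" in (e.get("metadata") or {})
]

print(summary["type"], len(llm_spans), [e["name"] for e in errors], usage)
```

Parsing each line with `json.loads` avoids depending on the exact whitespace of the serialized JSON, which the `grep` patterns above do rely on.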

## Trace Viewer UI

Browse and inspect traces in your browser. Zero dependencies — uses Python's built-in HTTP server.

```bash
traqo ui ./traces                  # Serve traces on http://localhost:7600
traqo ui ./traces --port 8080     # Custom port
traqo ui s3://my-bucket/traces/   # Browse traces from S3
traqo ui gs://my-bucket/traces/   # Browse traces from GCS
python -m traqo ui ./traces       # Alternative invocation
```

Cloud sources list files instantly via API, then download on click. Previously viewed traces show full summary data (duration, stats, tags) on the next page load.

Features: folder navigation, search/filter, span tree with waterfall timing, JSON viewer with syntax highlighting, token usage visualization, keyboard shortcuts (Escape to go back, ? for help).

## API Reference

### `Tracer(path, *, input=None, metadata=None, tags=None, thread_id=None, capture_content=True, backends=None)`

Creates a trace session writing to a JSONL file. Use as a context manager.

```python
with Tracer(
    Path("traces/run.jsonl"),
    input={"query": "What is the weather?"},
    metadata={"run_id": "abc123"},
    tags=["production", "chatbot"],
    thread_id="conv-456",
    capture_content=False,  # Integrations omit LLM input/output
) as tracer:
    result = my_pipeline()
    tracer.set_output({"response": result})
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `path` | `Path` | required | JSONL file path. Parent dirs created automatically. |
| `input` | `Any` | `None` | Trace input, written to `trace_start`. |
| `metadata` | `dict` | `None` | Arbitrary metadata written to `trace_start`. |
| `tags` | `list[str]` | `None` | Tags for filtering/categorization, written to `trace_start`. |
| `thread_id` | `str` | `None` | Conversation/thread grouping ID, written to `trace_start`. |
| `capture_content` | `bool` | `True` | If `False`, integration wrappers omit LLM message inputs/outputs. The `@trace` decorator has separate `capture_input`/`capture_output` flags. |
| `backends` | `list[Backend]` | `None` | Storage backends notified on events and trace completion. The local JSONL file is always written regardless. |

**Methods:**

| Method | Description |
|---|---|
| `span(name, *, input=, metadata=, tags=, kind=)` | Span context manager. Yields a `Span` object. |
| `set_output(value)` | Set trace-level output (written to `trace_end`). |
| `log(name, data)` | Write a custom event. |
| `child(name, path)` | Create a child tracer writing to a separate file. |

### `Span`

Mutable handle yielded by `tracer.span()`. Set output and metadata during execution.

```python
with tracer.span("my_step", input=data, tags=["important"], kind="tool") as span:
    result = do_work()
    span.set_output(result)
    span.set_metadata("latency_ms", 42)
    span.update_metadata({"extra": "info"})
```

| Method | Description |
|---|---|
| `set_output(value)` | Set span output (written to `span_end`) |
| `set_metadata(key, value)` | Set a metadata key |
| `update_metadata(dict)` | Merge a dict into metadata |

### `@trace`

Decorator that wraps a function in a span. Works with sync/async functions and generators.

```python
from traqo import trace, TOOL

@trace
def my_step(data: list) -> dict:
    return process(data)

@trace("custom_name", capture_input=False, kind=TOOL)
def sensitive_step(secret: str) -> str:
    return handle(secret)

@trace(ignore_arguments=["password"], kind=TOOL)
def login(user: str, password: str) -> bool:
    return authenticate(user, password)
```

Parameters: `name`, `capture_input`, `capture_output`, `ignore_arguments`, `metadata`, `tags`, `kind`.

When no tracer is active, `@trace` is a pure passthrough with zero overhead.

### `get_current_span() -> Span | None`

Returns the current active span, or `None`.

### `update_current_span(*, output=, metadata=, tags=, **kw_metadata)`

Convenience helper to update the active span. No-op when no span is active.

```python
from traqo import trace, update_current_span

@trace
def my_function(text: str) -> str:
    update_current_span(metadata={"custom_key": "custom_value"})
    return process(text)
```

### `get_tracer() -> Tracer | None`

Returns the active tracer for the current context, or `None`.

```python
from traqo import get_tracer

tracer = get_tracer()
if tracer:
    tracer.log("checkpoint", {"count": len(results)})
```

### `disable()` / `enable()`

```python
import traqo
traqo.disable()  # All tracing becomes no-op
traqo.enable()   # Re-enable
```

Tracing can also be disabled with an environment variable: `TRAQO_DISABLED=1`

## Child Tracers

Use child tracers for concurrent agents or workers that produce many events. Each child writes to its own file and is linked to the parent.

```python
with Tracer(Path("traces/pipeline.jsonl")) as tracer:
    child = tracer.child("reentrancy_agent", Path("traces/agents/reentrancy.jsonl"))
    with child:
        run_agent(...)
```

The parent trace records `child_started` / `child_ended` events and includes child summaries in `trace_end`.

## JSONL Format

Every line is a self-contained JSON object. Five event types:

| Type | When | Key Fields |
|---|---|---|
| `trace_start` | Tracer enters | `tracer_version`, `input`, `metadata`, `tags`, `thread_id` |
| `span_start` | Span begins | `id`, `parent_id`, `name`, `input`, `metadata`, `tags`, `kind` |
| `span_end` | Span ends | `id`, `duration_s`, `status`, `output`, `metadata`, `tags`, `kind` |
| `event` | Custom checkpoint | `name`, `data` |
| `trace_end` | Tracer exits | `duration_s`, `output`, `stats`, `children` |

The `kind` field categorizes spans (e.g. `"llm"`, `"tool"`, `"retriever"`). The `tags` field is a list of strings for filtering. Both are omitted when not set.

The `metadata` dict is the universal extension point. LLM-specific data like `model`, `provider`, and `token_usage` are stored there.
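
Because every `span_start` carries `id` and `parent_id`, the call tree can be rebuilt with a few lines of standard-library Python. The sketch below uses made-up events and assumes that root spans have a `parent_id` of `None`; `build_children` and `render` are hypothetical helper names.

```python
from collections import defaultdict

# Made-up span_start events; ids and names are invented for illustration.
SPANS = [
    {"type": "span_start", "id": "1", "parent_id": None, "name": "pipeline"},
    {"type": "span_start", "id": "2", "parent_id": "1", "name": "summarize"},
    {"type": "span_start", "id": "3", "parent_id": "2", "name": "llm_call", "kind": "llm"},
    {"type": "span_start", "id": "4", "parent_id": "1", "name": "summarize"},
]


def build_children(events):
    """Group span_start events by parent_id, keeping file order."""
    children = defaultdict(list)
    for event in events:
        if event.get("type") == "span_start":
            children[event.get("parent_id")].append(event)
    return children


def render(children, parent_id=None, depth=0):
    """Return the tree as indented lines, depth-first."""
    lines = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + span["name"])
        lines.extend(render(children, span["id"], depth + 1))
    return lines


tree_lines = render(build_children(SPANS))
print("\n".join(tree_lines))
```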

## Query with DuckDB

```sql
-- All LLM spans with token usage
SELECT metadata->>'model' as model,
       count(*) as calls,
       sum((metadata->'token_usage'->>'input_tokens')::int) as total_in,
       sum((metadata->'token_usage'->>'output_tokens')::int) as total_out,
       avg(duration_s) as avg_duration
FROM read_json('traces/**/*.jsonl')
WHERE kind = 'llm'
  AND type = 'span_end'  -- span_start also carries kind; count each span once
GROUP BY model;

-- All traces for a conversation thread
SELECT * FROM read_json('traces/**/*.jsonl')
WHERE thread_id = 'conv-123'
AND type = 'trace_start';
```
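
If DuckDB is not available, the first aggregation can be approximated in plain Python. This is a sketch over made-up events, not a drop-in replacement for the query; `usage_by_model` is a hypothetical helper, and it counts only `span_end` events so that each span contributes once.

```python
from collections import defaultdict

# Made-up span_end events; field names follow the JSONL Format table.
EVENTS = [
    {"type": "span_end", "kind": "llm", "duration_s": 1.0,
     "metadata": {"model": "gpt-4o",
                  "token_usage": {"input_tokens": 100, "output_tokens": 50}}},
    {"type": "span_end", "kind": "llm", "duration_s": 3.0,
     "metadata": {"model": "gpt-4o",
                  "token_usage": {"input_tokens": 20, "output_tokens": 10}}},
    {"type": "span_end", "kind": "tool", "duration_s": 0.5, "metadata": {}},
]


def usage_by_model(events):
    """Aggregate call count, token totals, and mean duration per model."""
    totals = defaultdict(
        lambda: {"calls": 0, "total_in": 0, "total_out": 0, "seconds": 0.0}
    )
    for event in events:
        if event.get("type") != "span_end" or event.get("kind") != "llm":
            continue
        meta = event.get("metadata") or {}
        usage = meta.get("token_usage") or {}
        row = totals[meta.get("model")]
        row["calls"] += 1
        row["total_in"] += usage.get("input_tokens", 0)
        row["total_out"] += usage.get("output_tokens", 0)
        row["seconds"] += event.get("duration_s", 0.0)
    return {
        model: {
            "calls": row["calls"],
            "total_in": row["total_in"],
            "total_out": row["total_out"],
            "avg_duration": row["seconds"] / row["calls"],
        }
        for model, row in totals.items()
    }


report = usage_by_model(EVENTS)
print(report)
```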

## License

MIT
