Metadata-Version: 2.4
Name: llm-observatory
Version: 0.2.4
Summary: Context0 SDK — one import, all calls logged
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: openai>=1.0.0
Requires-Dist: requests>=2.28.0
Requires-Dist: anthropic>=0.18.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-mock>=3.10; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: responses>=0.23; extra == "dev"

# llm-observatory

Drop-in LLM observability for Python. One import change, all calls logged automatically.

## Install

```bash
pip install llm-observatory
```

## Usage

```python
from llm_observatory import configure, OpenAI

configure(api_key="your-api-key", base_url="https://your-endpoint.com")

client = OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```

Supports `OpenAI`, `AsyncOpenAI`, `AzureOpenAI`, `AsyncAzureOpenAI`, `Anthropic`, and `AsyncAnthropic`.

## Wrapping third-party clients

Already using another LLM wrapper (e.g. Langfuse)? Use `wrap()` to add Context0 observability on top of any OpenAI- or Anthropic-compatible client — no need to change your existing setup:

```python
from langfuse.openai import OpenAI as LangfuseOpenAI
from llm_observatory import configure, wrap

configure(api_key="your-api-key", base_url="https://your-endpoint.com")

client = wrap(LangfuseOpenAI(api_key="sk-..."))

# Both Langfuse AND Context0 capture this call
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```

`wrap()` works with any client that has a `.chat` (OpenAI-shaped) or `.messages` (Anthropic-shaped) interface. If the client is already a Context0 wrapper, it's returned as-is — no double-wrapping.
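
The shape check described above can be pictured as plain duck typing. The following is a minimal sketch under stated assumptions, not the SDK's implementation: `detect_client_shape` and the `_is_observed` marker attribute are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def detect_client_shape(client: object) -> str:
    """Classify a client by the attributes it exposes (illustrative only)."""
    # Hypothetical marker: the real SDK may recognise its own wrappers differently.
    if getattr(client, "_is_observed", False):
        return "already-wrapped"
    if hasattr(client, "chat"):
        return "openai"      # OpenAI-shaped: client.chat.completions.create(...)
    if hasattr(client, "messages"):
        return "anthropic"   # Anthropic-shaped: client.messages.create(...)
    raise TypeError(f"unsupported client type: {type(client).__name__}")


# Stand-in objects so the sketch runs without any LLM library installed.
openai_like = SimpleNamespace(chat=object())
anthropic_like = SimpleNamespace(messages=object())
wrapped = SimpleNamespace(chat=object(), _is_observed=True)

print(detect_client_shape(openai_like))     # openai
print(detect_client_shape(anthropic_like))  # anthropic
print(detect_client_shape(wrapped))         # already-wrapped
```

Checking for the wrapper marker first is what would make a repeated `wrap()` call a no-op in a design like this.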

## Tracing

Group related LLM calls into traces to see your full pipeline as a tree.

### `@observe` decorator

Creates a trace when the decorated function is called at the top level, or a span when it is called inside an existing trace. Function arguments are captured as the input and the return value as the output.

```python
import llm_observatory as obs

client = obs.OpenAI()

@obs.observe
def retrieve_docs(query: str):
    return vector_db.search(query)  # `vector_db` is a placeholder for your own retrieval client

@obs.observe
def rag_pipeline(question: str):
    docs = retrieve_docs(question)       # child span
    response = client.chat.completions.create(  # auto-captured as generation
        model="gpt-4o",
        messages=[{"role": "user", "content": f"{docs}\n\n{question}"}],
    )
    return response.choices[0].message.content
```

Options:
- `@obs.observe(name="custom-name")` — override the span name (default: function name)
- `@obs.observe(capture_input=False)` — don't log args (for sensitive data)
- `@obs.observe(capture_output=False)` — don't log return value

Works with both sync and async functions.

### Context managers

Use `trace()` and `span()` for inline code blocks that aren't standalone functions:

```python
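import llm_observatory as obs

# `client`, `q`, and `search` are assumed to be defined elsewhere in your application
with obs.trace(name="rag-pipeline", input={"query": q}) as t: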
    with obs.span(name="vector-search") as s:
        results = search(q)
        s.set_output(results)

    response = client.chat.completions.create(model="gpt-4o", messages=[...])
    t.set_output(response.choices[0].message.content)
```

Mix freely — `@observe` and `with span()` compose within the same trace.

### Cross-service propagation

Pass trace context between services (for example, from an API to an SQS worker) as a W3C `traceparent` string:

```python
import json

import llm_observatory as obs

# Producer: read the traceparent inside an active trace and send it with the message
with obs.trace(name="api-request"):
    traceparent = obs.get_current_traceparent()
    sqs.send_message(Body=json.dumps({"traceparent": traceparent, ...}))

# Consumer: restore the context from the message body
msg = json.loads(sqs_message["Body"])
with obs.trace(name="worker", traceparent=msg["traceparent"]):
    client.chat.completions.create(...)  # appears as a child in the same trace
```
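
For reference, a version `00` W3C `traceparent` value has four dash-separated lowercase hex fields: version, trace ID (32 characters), parent ID (16 characters), and flags (2 characters). The sketch below validates and splits such a string, which can help when debugging messages that carry one. It does not use the SDK; `parse_traceparent` and `TraceParent` are hypothetical helper names, and only the version `00` layout is handled.

```python
import re
from typing import NamedTuple

# version - trace-id - parent-id - flags, all lowercase hex
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(value: str) -> TraceParent:
    """Split a traceparent string into its fields, raising ValueError if malformed."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        raise ValueError(f"invalid traceparent: {value!r}")
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


example = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
parsed = parse_traceparent(example)
print(parsed.trace_id)  # 4bf92f3577b34da6a3ce929d0e0e4736
print(parsed.sampled)   # True
```

Validating the value on the consumer side before restoring context makes a malformed or truncated message fail loudly instead of silently starting an unrelated trace.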

Use `obs.emit_completed_span()` to backfill spans with explicit start/end times (e.g., queue wait duration).
