Metadata-Version: 2.4
Name: thinkhive
Version: 4.2.1
Summary: AI agent observability SDK with business metrics, ROI analytics, and 25+ trace format support
Home-page: https://github.com/Abdul-Omira/ThinkHiveMind
Author: ThinkHive
Author-email: support@thinkhive.ai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: opentelemetry-api>=1.20.0
Requires-Dist: opentelemetry-sdk>=1.20.0
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20.0
Requires-Dist: requests>=2.28.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# ThinkHive Python SDK v4.2.1

OpenTelemetry-based observability SDK for AI agents, supporting 25+ trace formats including LangSmith, Langfuse, Opik, Braintrust, Datadog, and MLflow.

## Installation

```bash
pip install thinkhive
```

## Quick Start

```python
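import openai    # third-party client used in the LLM example below
import requests  # used in the tool example below
import thinkhive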

# Initialize SDK (sets up both OTLP tracing and REST client)
thinkhive.init(
    api_key="your-api-key",  # or set THINKHIVE_API_KEY
    agent_id="my-bot",
    service_name="my-ai-agent"
)

# Trace LLM calls
@thinkhive.trace_llm(name="generate-response", model_name="gpt-4", provider="openai")
def call_llm(prompt):
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response

# Trace retrieval operations
@thinkhive.trace_retrieval(name="search-kb", query="refund policy")
def search_documents(query):
    # vector_db is a placeholder for your own vector store client
    results = vector_db.search(query)
    return results

# Trace tool calls
@thinkhive.trace_tool(name="lookup-order", tool_name="web_search")
def search_web(query):
    # Pass the query via params so requests URL-encodes it
    return requests.get("https://api.example.com/search", params={"q": query})
```

## HTTP-Based Trace Creation

For direct trace creation with evaluation control, use the traces API:

```python
from thinkhive import traces

# Create trace with automatic evaluation
result = traces.create(
    agent_id="agent-123",
    user_message="What is the return policy?",
    agent_response="Items can be returned within 30 days.",
    outcome="success",
    run_evaluation=True  # Force evaluation on this trace
)

print(f"Trace ID: {result.id}")
if result.evaluation_queued:
    print("Evaluation will run asynchronously")

# Skip evaluation even if agent has auto_evaluate enabled
result = traces.create(
    agent_id="agent-123",
    user_message="Hello!",
    agent_response="Hi there!",
    run_evaluation=False
)

# Use agent's default auto_evaluate setting
result = traces.create(
    agent_id="agent-123",
    user_message="What are your hours?",
    agent_response="We are open 9 AM to 5 PM."
    # run_evaluation omitted - uses agent's setting
)
```

### run_evaluation Parameter

The `run_evaluation` parameter controls whether traces are automatically evaluated:

| Value | Behavior |
|-------|----------|
| `True` | Force evaluation on this trace |
| `False` | Skip evaluation even if the agent has `auto_evaluate` enabled |
| `None` (default) | Use the agent's `auto_evaluate` setting |
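
The three cases in the table reduce to a simple precedence rule: an explicit value wins, and an omitted value falls back to the agent's setting. The sketch below illustrates that rule in isolation. `should_evaluate` and its `agent_auto_evaluate` argument are hypothetical names used for illustration only; they are not part of the SDK, and the real decision may well be made server-side.

```python
from typing import Optional


def should_evaluate(run_evaluation: Optional[bool], agent_auto_evaluate: bool) -> bool:
    """Hypothetical helper mirroring the run_evaluation table (not an SDK function)."""
    if run_evaluation is None:
        # Parameter omitted: fall back to the agent's auto_evaluate setting.
        return agent_auto_evaluate
    # An explicit True or False overrides the agent's setting.
    return run_evaluation


print(should_evaluate(True, agent_auto_evaluate=False))   # forced on
print(should_evaluate(False, agent_auto_evaluate=True))   # forced off
print(should_evaluate(None, agent_auto_evaluate=True))    # agent default applies
```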

## Environment Variables

- `THINKHIVE_API_KEY`: Your ThinkHive API key (read automatically by `thinkhive.init()` if not passed explicitly)
- `THINKHIVE_AGENT_ID`: Your agent ID (read automatically; used alongside the API key for agent-scoped operations)
- `THINKHIVE_ENDPOINT`: Custom API endpoint (read automatically; default: `https://app.thinkhive.ai`). Override it by passing `endpoint=` to `thinkhive.init()`
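
The list above implies a precedence order: an explicitly passed argument first, then the environment variable, then the built-in default. The following is a minimal sketch of that lookup order, not the SDK's actual implementation. `resolve_setting` and its `env` parameter are hypothetical, and the example URLs are placeholders.

```python
import os
from typing import Mapping, Optional

DEFAULT_ENDPOINT = "https://app.thinkhive.ai"  # default endpoint stated in the list above


def resolve_setting(
    explicit: Optional[str],
    env_var: str,
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical helper: explicit argument, then environment variable, then default."""
    if explicit is not None:
        return explicit
    return env.get(env_var, default)


# A plain dict stands in for the process environment in this demo.
fake_env = {"THINKHIVE_ENDPOINT": "https://thinkhive.internal.example"}

print(resolve_setting(None, "THINKHIVE_ENDPOINT", DEFAULT_ENDPOINT, env=fake_env))
print(resolve_setting("https://override.example", "THINKHIVE_ENDPOINT", DEFAULT_ENDPOINT, env=fake_env))
print(resolve_setting(None, "THINKHIVE_ENDPOINT", DEFAULT_ENDPOINT, env={}))
```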

## Issues API (Clustered Failure Patterns)

The Issues API provides access to clustered failure patterns:

```python
from thinkhive import issues

# List issues for an agent
all_issues = issues.list(agent_id="agent-123")

# Get a specific issue
issue = issues.get(issue_id="issue-456")

# Create a new issue
new_issue = issues.create(
    agent_id="agent-123",
    title="Refund policy confusion",
    type="hallucination",
    severity="high"
)

# Update an issue
issues.update(
    issue_id="issue-456",
    status="in_progress",
    assignee="developer@example.com"
)

# Get fixes for an issue
fixes = issues.get_fixes(issue_id="issue-456")
```

## Analyzer API (User-Selected Analysis)

The Analyzer API runs analysis on traces you select and can estimate the cost before you run it:

```python
from thinkhive import analyzer

# Estimate cost before running analysis
estimate = analyzer.estimate_cost(
    trace_ids=["trace-1", "trace-2"],
    tier="standard"
)
print(f"Estimated cost: ${estimate['estimated_cost']}")

# Analyze specific traces
analysis = analyzer.analyze(
    trace_ids=["trace-1", "trace-2"],
    tier="standard",
    include_root_cause=True
)

# Get aggregated insights
summary = analyzer.summarize(
    agent_id="agent-123",
    start_date="2024-01-01",
    end_date="2024-01-31"
)
```

## Business Metrics API

The Business Metrics API provides industry-driven metrics with historical tracking and external data support:

```python
from thinkhive.api import business_metrics

# Get current metric value with status
metric = business_metrics.get_current("agent-123", metric_name="Deflection Rate")
print(f"{metric.metric_name}: {metric.value_formatted}")

if metric.status == "insufficient_data":
    needed = metric.min_trace_threshold - metric.trace_count
    print(f"Need {needed} more traces")

# Get historical data for graphing
history = business_metrics.get_history(
    "agent-123",
    "Deflection Rate",
    start_date="2024-01-01T00:00:00Z",
    end_date="2024-01-31T23:59:59Z",
    granularity="daily"
)

print(f"{len(history.data_points)} data points")
print(f"Change: {history.summary.change_percent}%")

# Record external metric values (from CRM, surveys, etc.)
result = business_metrics.record_value(
    "agent-123",
    metric_name="CSAT/NPS",
    value=4.5,
    period_start="2024-01-01T00:00:00Z",
    period_end="2024-01-07T23:59:59Z",
    unit="score",
    source="survey_system",
    source_details={"survey_id": "survey_456", "response_count": 150}
)
print(f"Recorded: {result.id}")
```

### Metric Status Types

| Status | Description |
|--------|-------------|
| `ready` | Metric calculated and ready to display |
| `insufficient_data` | Need more traces before calculation |
| `awaiting_external` | External data source not connected |
| `stale` | Data is older than expected |

### Helper Functions

```python
from thinkhive import (
    is_metric_ready,
    needs_more_traces,
    awaiting_external_data,
    is_metric_stale,
    get_metric_status_message,
    format_metric_value
)

# Check metric status
if is_metric_ready(metric):
    print(f"Value: {metric.value_formatted}")
elif needs_more_traces(metric):
    print(get_metric_status_message(metric))
```

## ROI Analytics API

```python
from thinkhive.api import roi_analytics

# Get ROI summary
summary = roi_analytics.get_summary(
    start_date="2024-01-01T00:00:00Z",
    end_date="2024-01-31T23:59:59Z",
    agent_id="agent-123"
)
print(f"Revenue protected: ${summary.revenue_protected}")

# Get trends over time
trends = roi_analytics.get_trends(agent_id="agent-123")
for day in trends:
    print(f"{day.date}: {day.success_rate}% success")

# Calculate impact for specific data
impact = roi_analytics.calculate(
    user_message="Help me cancel my subscription",
    agent_response="I can help with that...",
    industry_config={"industry": "saas", "avg_customer_ltv": 10000}
)
```

## Calibration API

```python
from thinkhive.api import calibration

# Get calibration status for a prediction type
status = calibration.get_status("agent-123", "churn_risk")
print(f"Brier score: {status.brier_score}")
print(f"Is calibrated: {status.is_calibrated}")

# Get all calibration metrics
metrics = calibration.get_metrics("agent-123")
for m in metrics:
    print(f"{m.prediction_type}: Brier={m.brier_score:.4f}")

# Trigger recalibration
result = calibration.retrain("agent-123", prediction_types=["churn_risk"])
```
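
The `brier_score` field is presumably the standard Brier score: the mean squared difference between predicted probabilities and binary outcomes, where 0 is perfect and lower is better. How the SDK computes it, or what threshold sets `is_calibrated`, is not documented here. The sketch below shows only the textbook definition; the `brier_score` function and the sample data are illustrative and not taken from the SDK.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)


# Illustrative churn predictions and what actually happened (1 = churned).
predicted = [0.9, 0.2, 0.7, 0.1]
actual = [1, 0, 0, 0]
print(f"Brier score: {brier_score(predicted, actual):.4f}")
```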

## API Key Management

```python
from thinkhive import api_keys

# Create a scoped API key
result = api_keys.create(
    name="Production Key",
    scope_type="agent",
    permissions={"read": True, "write": True, "delete": False},
    environment="production",
    allowed_agent_ids=["agent-123"]
)
print(f"Key: {result['key']}")  # Only shown once!

# List all keys
keys = api_keys.list()  # or api_keys.list_keys()

# Revoke a key
api_keys.revoke("key-id")

# Test connection
api_keys.test_connection()
```

## Quality Metrics (RAG & Hallucination Detection)

```python
from thinkhive.api import quality_metrics

# Get RAG quality scores for a trace
scores = quality_metrics.get_rag_scores("trace-123")
print(f"Groundedness: {scores.groundedness}")
print(f"Overall: {scores.overall_score} ({scores.grade})")

# Detect hallucinations
report = quality_metrics.get_hallucination_report("trace-123")
if report.has_hallucinations:
    print(f"Risk: {report.risk_level}")
    for h in report.instances:
        print(f"  - {h.type}: {h.text}")

# Ad-hoc RAG evaluation
result = quality_metrics.evaluate_rag(
    query="What is the refund policy?",
    response="You can return items within 30 days.",
    contexts=["Our refund policy allows returns within 30 days of purchase."],
    agent_id="agent-123"
)
```

## Ticket Linking

```python
from thinkhive.api import linking

# Link a run to a support ticket
linking.link_run_to_ticket(
    run_id="run-456",
    ticket_id="TICKET-789",
    method="sdk_explicit",
    platform="zendesk"
)

# Get linked ticket for a run
ticket = linking.get_linked_ticket("run-456")
print(f"Ticket: {ticket.ticket_id} (confidence: {ticket.confidence})")

# Generate Zendesk marker
marker = linking.generate_zendesk_marker("run-456")
```

## Customer Context

```python
from thinkhive.api import customer_context
from thinkhive.api import runs

# Create a customer context snapshot
snapshot = customer_context.create_snapshot(
    customer_id="cust-123",
    arr=50000,
    health_score=85,
    segment="enterprise"
)

# Use it when creating a run
run = runs.create(
    agent_id="agent-123",
    conversation_messages=[...],
    customer_context=snapshot
)

# Calculate ARR at risk from failed runs
risk = customer_context.calculate_arr_at_risk("cust-123", agent_id="agent-123")
```

## ROI Analytics V3 (Configurable)

```python
from thinkhive.api import roi_analytics

# Get/create ROI configuration
config = roi_analytics.get_config("agent-123")
roi_analytics.create_config("agent-123", {
    "costPerTask": 5.83,
    "tasksPerHour": 6,
    "deflectionRate": 0.63
})

# Calculate ROI with V3 engine
result = roi_analytics.calculate_v3("agent-123")

# Get ROI trend over time
trend = roi_analytics.get_trend_v3("agent-123", interval="weekly")
```

## Error Handling

```python
from thinkhive import (
    ThinkHiveApiError,
    PermissionDeniedError,
    AgentScopeError,
    RateLimitError,
    IpWhitelistError,
    NotFoundError,
)

from thinkhive.api import runs

try:
    result = runs.create(agent_id="agent-123", conversation_messages=[...])
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}ms")  # or e.retry_after_ms
except AgentScopeError as e:
    print(f"No access to agent. Allowed: {e.allowed_agents}")
except NotFoundError as e:
    print(f"{e.resource_type} not found: {e.resource_id}")
except PermissionDeniedError:
    print("Insufficient permissions")
except ThinkHiveApiError as e:
    print(f"API error {e.status_code}: {e.message}")
```
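
A common follow-up to catching `RateLimitError` is to wait for the suggested interval and retry. The sketch below shows one way to do that, assuming `retry_after` is in milliseconds as the example above suggests. `call_with_retry` is a hypothetical helper, and the `RateLimitError` class here is a local stand-in so the snippet runs on its own; in real code you would import the SDK's exception instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Stand-in for thinkhive.RateLimitError so this sketch is self-contained."""

    def __init__(self, retry_after: int):
        super().__init__(f"rate limited, retry after {retry_after}ms")
        self.retry_after = retry_after  # assumed to be milliseconds


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry fn when it raises RateLimitError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(e.retry_after / 1000)  # convert milliseconds to seconds
    raise ValueError("max_attempts must be at least 1")


# Demo: the first call is rate limited, the second succeeds.
attempts = []


def flaky() -> str:
    attempts.append(1)
    if len(attempts) == 1:
        raise RateLimitError(retry_after=10)
    return "ok"


result = call_with_retry(flaky)
print(result, "after", len(attempts), "attempts")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.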

## API Reference

### Core APIs

| Module | Description |
|--------|-------------|
| `traces` | Create and manage traces |
| `runs` | Run-centric trace management |
| `claims` | Facts vs inferences management |
| `calibration` | Prediction accuracy tracking |
| `issues` | Clustered failure patterns |
| `analyzer` | User-selected trace analysis |
| `business_metrics` | Industry-driven metrics with history |
| `roi_analytics` | Business ROI and financial impact (V1 + V3) |

### Quality & Integration APIs

| Module | Description |
|--------|-------------|
| `api_keys` | API key management (create, list, revoke) |
| `quality_metrics` | RAG evaluation and hallucination detection |
| `linking` | Ticket linking for runs |
| `customer_context` | Customer context snapshots and ARR tracking |
| `guardrails` | Content safety scanning |
| `eval_runs` | Evaluation run management |
| `signals` | Behavioral signal configuration and querying |
| `notifications` | Alert rules and notification management |
| `documents` | Agent document management (RAG) |
| `shadow_tests` | Shadow test execution |
| `sessions` | Trace session grouping |
| `drift` | Model/behavior drift detection |
| `llm_costs` | LLM usage cost tracking |

### Evaluation APIs

| Module | Description |
|--------|-------------|
| `human_review` | Human-in-the-loop review queues |
| `nondeterminism` | Multi-sample reliability testing |
| `eval_health` | Evaluation metric health monitoring |
| `deterministic_graders` | Rule-based evaluation |
| `conversation_eval` | Multi-turn conversation evaluation |
| `transcript_patterns` | Pattern detection in transcripts |

## Eval Runs

```python
from thinkhive.api import eval_runs

# Create and run an evaluation
run = eval_runs.create(agent_id="agent-123", confidence_level="medium")
print(f"Run ID: {run['id']}, Status: {run['status']}")

# Get results
results = eval_runs.get_results(run["id"], limit=50)

# Estimate cost before running
cost = eval_runs.estimate_cost(agent_id="agent-123", confidence_level="high")
print(f"Estimated cost: {cost['estimatedCredits']} credits")
```

## Signals

```python
from thinkhive.api import signals

# List all signals
all_signals = signals.list()  # or signals.list_signals()

# Create a custom signal
signals.create(
    name="Escalation Request",
    group="negative",
    detection_config={"type": "keywords", "keywords": ["speak to manager", "escalate"]}
)

# Get signal stats
stats = signals.get_stats(agent_id="agent-123", start_date="2026-03-01")
```

## Notifications

```python
from thinkhive import notifications

# Create an alert rule
notifications.create_rule({
    "agentId": "agent-123",
    "name": "High failure rate",
    "eventType": "failure_spike",
    "condition": {"threshold": 0.3},
    "channel": "email",
    "target": "team@company.com"
})

# List notifications
alerts = notifications.list("agent-123", unread_only=True)
# or: notifications.list_notifications("agent-123", unread_only=True)
```

## Documents (RAG)

```python
from thinkhive.api import documents

# Upload a document for RAG
documents.upload("agent-123", file_name="faq.txt", file_type="text/plain", file_size=1024)

# List documents
docs = documents.list("agent-123")  # or documents.list_documents("agent-123")
```

## Shadow Tests

```python
from thinkhive.api import shadow_tests

# Create a shadow test
shadow_tests.create(
    fix_id="fix-456",
    agent_id="agent-123",
    test_name="Refund policy test",
    input_data={"message": "How do I get a refund?"},
    expected_output="You can request a refund within 30 days."
)
```

## Sessions

```python
from thinkhive.api import sessions

# List conversation sessions
all_sessions = sessions.list("agent-123", limit=20)
# or: sessions.list_sessions("agent-123", limit=20)

# Get all traces in a session
# Named session_traces to avoid shadowing the thinkhive `traces` module
session_traces = sessions.get_session_traces("session-789", "agent-123")
```

## Drift Detection

```python
from thinkhive.api import drift
from thinkhive import has_drift, get_drift_severity

# Detect drift for an agent
report = drift.detect("agent-123")
if has_drift(report):
    print(f"Drift severity: {get_drift_severity(report)}")
```

## LLM Costs

```python
from thinkhive.api import llm_costs
from thinkhive import format_cost

# Get cost summary
summary = llm_costs.summary(period="30d")
# or: llm_costs.get_summary(period="30d")
print(f"Total cost: {format_cost(summary.get('totalCost', 0))}")

# Get per-agent breakdown
breakdown = llm_costs.get_breakdown("agent-123")

# Get optimization savings
savings = llm_costs.get_savings()
```

## Error Classes

| Error | Status | When |
|-------|--------|------|
| `ThinkHiveApiError` | Any | Base API error |
| `ThinkHiveValidationError` | N/A | Client-side validation |
| `PermissionDeniedError` | 403 | Insufficient permissions |
| `AgentScopeError` | 403 | API key can't access agent |
| `RateLimitError` | 429 | Too many requests |
| `IpWhitelistError` | 403 | IP not whitelisted |
| `NotFoundError` | 404 | Resource not found |

## Upgrading

### v4.2.0 → v4.2.1

Improvements in v4.2.1:
- **Decorators** now accept `name=` parameter for custom span names (`@trace_llm(name="my-llm")`)
- **`configure()`** now accepts `service_name` parameter
- **`RateLimitError`** has `retry_after` alias (in addition to `retry_after_ms`)
- **Rule helpers** (`create_regex_rule`, etc.) now return named rule objects with `{type, name, config}` structure
- **`calculate_pass_at_k`** now supports both `(pass_rate, k)` and `(n, c, k)` calling conventions
- **`aggregate_worst`** / **`aggregate_average`** accept simple `[float]` lists in addition to `[dict]`
- **Method aliases added** for cross-SDK consistency:
  - `runs.list()` / `runs.delete()`
  - `calibration.status()` / `calibration.all_metrics()`
  - `signals.list()` / `signals.delete()`
  - `api_keys.list()`
  - `linking.create()` / `linking.get_for_run()` / `linking.stats()` / `linking.verify()` / `linking.delete()`
  - `roi_analytics.summary()` / `roi_analytics.correlations()`
  - `business_metrics.current()` / `business_metrics.history()` / `business_metrics.record()`
  - `quality_metrics.evaluate()`
  - `llm_costs.summary()`
  - `notifications.list()`
  - `documents.list()` / `documents.delete()`
  - `shadow_tests.list()`
  - `sessions.list()`
  - `customer_context.capture()`
  - `guardrails.evaluate()`
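
The two calling conventions listed for `calculate_pass_at_k` plausibly correspond to two common pass@k formulas: `1 - (1 - p)^k` for a known per-sample pass rate `p`, and the estimator `1 - C(n-c, k) / C(n, k)` for `n` samples of which `c` passed. Whether the SDK uses exactly these formulas is an assumption. The function names below are hypothetical and kept separate from the SDK's own helper.

```python
from math import comb


def pass_at_k_from_rate(pass_rate: float, k: int) -> float:
    """Probability that at least one of k independent samples passes."""
    return 1.0 - (1.0 - pass_rate) ** k


def pass_at_k_from_counts(n: int, c: int, k: int) -> float:
    """Estimate from n samples of which c passed: 1 - C(n-c, k) / C(n, k)."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k_from_rate(0.5, 2))
print(pass_at_k_from_counts(10, 3, 1))
```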

### v4.1.0 → v4.2.0

New in v4.2.0:
- **`eval_runs`** — create, list, and manage evaluation runs programmatically
- **`signals`** — create and query behavioral pattern signals
- **`notifications`** — configure alert rules and manage notifications
- **`documents`** — upload and manage RAG documents for agents
- **`shadow_tests`** — create and run shadow tests
- **`sessions`** — list and query trace sessions
- **`drift`** — detect model/behavior drift with helpers
- **`llm_costs`** — track LLM usage costs and optimization savings

### v4.0.1 → v4.1.0

New in v4.1.0:
- **`api_keys`** module for API key management
- **`quality_metrics`** module for RAG evaluation and hallucination detection
- **`linking`** module for ticket linking
- **`customer_context`** module for customer context tracking
- **ROI V3** endpoints added to `roi_analytics` (configurable ROI)
- **5 new error classes** for granular error handling
- All new modules follow existing SDK patterns

### v3 → v4.0.0

- **Single `init()` call** sets up both OTLP tracing and REST client
- **Default endpoint** changed to `https://app.thinkhive.ai`
- **OTLP init** now requires `api_key`

## License

MIT
