Metadata-Version: 2.4
Name: session-doctor
Version: 0.1.1
Summary: Automatic error recovery for AI agent sessions: detects stuck loops, rate limits, token overflows, and more.
Author-email: Water Woods <woodwater2026@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/woodwater2026/session-doctor
Project-URL: Repository, https://github.com/woodwater2026/session-doctor
Project-URL: Bug Tracker, https://github.com/woodwater2026/session-doctor/issues
Keywords: ai,agents,llm,monitoring,recovery,safety,watchdog
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Dynamic: license-file

# session-doctor

[![PyPI version](https://badge.fury.io/py/session-doctor.svg)](https://pypi.org/project/session-doctor/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Automatic error recovery for AI agent sessions.

**Loop detection. Rate limit handling. Token overflow recovery. Budget guards.**

Zero dependencies. Works with any AI agent framework.

---

## The problem

AI agent sessions get stuck and stay stuck:

| Failure | Symptom | Without Session Doctor |
|---|---|---|
| Rate limit (429) | Same API error on every retry | Burns budget, hits cap |
| Token overflow | Context window exceeded | Session crashes silently |
| JSON parse error | Malformed tool call response | Agent loops forever |
| Timeout spiral | Tool call never returns | Session hangs indefinitely |
| Auth error (401) | Invalid key / expired token | 100 retries before anyone notices |
| Budget overrun | Cost cap hit | Silent death |

Session Doctor runs alongside your agent, detects these failure patterns, and recovers from them automatically, before they become expensive.

---

## Install

```bash
pip install session-doctor
```

---

## Usage

### As a library

```python
from session_doctor import SessionDoctor, Config

# Custom config (all fields optional)
config = Config(
    error_repeat_count=3,      # trigger recovery after 3x same error
    error_window_sec=300,      # within a 5-minute window
    budget_cap=2.00,           # halt if budget exceeds $2
    max_auto_retries=2,        # max recovery attempts before HALT
    notifications_enabled=True,
    notifications_min_severity="MEDIUM",
)

doctor = SessionDoctor(config=config)
doctor.register_session("my-run", label="coding-agent")

# Feed events from your agent's output / logs
doctor.ingest_event("my-run", "Error: 429 rate limit exceeded")
doctor.ingest_event("my-run", "Error: context length exceeded max_tokens", budget_used=0.75)
```
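
The `error_repeat_count` and `error_window_sec` settings describe a "same error N times within T seconds" trigger. As a rough illustration of that idea only, here is a minimal sliding-window sketch. The `RepeatWindow` class and its method names are hypothetical and are not part of the session-doctor API; the library's internal logic may differ.

```python
from collections import deque


class RepeatWindow:
    """Hypothetical helper: flag an error that repeats inside a time window."""

    def __init__(self, repeat_count: int = 3, window_sec: float = 300.0) -> None:
        self.repeat_count = repeat_count
        self.window_sec = window_sec
        self._seen = deque()  # timestamps of recent occurrences

    def record(self, timestamp: float) -> bool:
        """Record one occurrence; return True once the threshold is reached."""
        self._seen.append(timestamp)
        # Drop occurrences that have fallen out of the window.
        while self._seen and timestamp - self._seen[0] > self.window_sec:
            self._seen.popleft()
        return len(self._seen) >= self.repeat_count


window = RepeatWindow(repeat_count=3, window_sec=300)
results = [window.record(t) for t in (0, 100, 200)]
print(results)  # [False, False, True]
```

Errors spaced further apart than the window never accumulate, so an occasional transient failure would not trigger recovery under this scheme.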

### Recovery strategies by error type

| Error | Severity | Strategy |
|---|---|---|
| HTTP 400 / Bad Request | HIGH | Reset session |
| Token / context limit | HIGH | Compact context |
| JSON parse failure | MEDIUM | Retry with prompt fix |
| Timeout / ETIMEDOUT | MEDIUM | Exponential backoff |
| Repeated tool call | HIGH | Break loop + notify |
| Rate limit (429) | LOW | Exponential backoff |
| Auth error (401/403) | CRITICAL | Halt + notify |
| Budget exceeded | CRITICAL | Halt + notify |

### Backoff behavior

Exponential backoff doubles the delay on each retry and caps it at 32 seconds: `2s → 4s → 8s → 16s → 32s`.  
After `max_auto_retries` failed recovery attempts for an error of any non-CRITICAL severity, Session Doctor escalates to HALT.
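
The schedule and escalation rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `backoff_delays` and `next_action` are hypothetical names, and treating "failures" as a simple count compared against `max_auto_retries` is an assumption.

```python
def backoff_delays(attempts: int, base: float = 2.0, cap: float = 32.0) -> list:
    """Return the delay in seconds before each retry, doubling up to the cap."""
    return [min(base * 2 ** i, cap) for i in range(attempts)]


def next_action(failed_attempts: int, max_auto_retries: int = 2) -> str:
    """Assumed rule: retry until max_auto_retries attempts have failed, then halt."""
    return "HALT" if failed_attempts >= max_auto_retries else "RETRY"


print(backoff_delays(6))  # [2.0, 4.0, 8.0, 16.0, 32.0, 32.0]
print(next_action(1))     # RETRY
print(next_action(2))     # HALT
```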

### As a CLI

```bash
# Show health dashboard for all registered sessions
session-doctor status

# Show recovery history (last 20)
session-doctor report

# Inject a test error to verify behavior
session-doctor inject my-session token_limit

# Start monitoring daemon (polls every 30s)
session-doctor monitor

# Run self-contained demo
session-doctor demo
```

Or run it as a Python module:

```bash
python -m session_doctor status
python -m session_doctor demo
```

---

## Dashboard

```
╔══════════════════════════════════════════════════════════════╗
║             Session Doctor — Health Dashboard                ║
╠══════════════════╦════════════╦══════════╦═════════╦════════╣
║ Session          ║ Status     ║ Errors   ║ Budget  ║ Label  ║
╠══════════════════╬════════════╬══════════╬═════════╬════════╣
║ heartbeat        ║ 🟢 ok      ║ 0        ║ $0.12   ║ heart  ║
║ session-abc123   ║ 🟡 warn    ║ 2        ║ $0.34   ║ gh-iss ║
║ session-def456   ║ 🔴 error   ║ 5        ║ $1.43   ║ coding ║
╚══════════════════╩════════════╩══════════╩═════════╩════════╝

Last check: 2026-03-05 09:00:00 PST
```

---

## State

Session Doctor persists its state to a local SQLite database and writes two log files alongside it:

- `~/.openclaw/workspace/projects/session-doctor/state.db` — sessions, errors, recoveries
- `~/.openclaw/workspace/projects/session-doctor/session_doctor.log` — event log
- `~/.openclaw/workspace/projects/session-doctor/notifications.log` — alert history
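
The table layout of `state.db` is not documented here, so the sketch below avoids assuming one. It only lists the tables that exist, using the standard-library `sqlite3` module and a read-only connection. `list_tables` is a hypothetical helper, not part of session-doctor.

```python
import sqlite3
from pathlib import Path

DB_PATH = Path.home() / ".openclaw/workspace/projects/session-doctor/state.db"


def list_tables(db_path) -> list:
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    # Open read-only so inspection cannot modify or create the database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if DB_PATH.exists():
    print(list_tables(DB_PATH))
```

From there, `PRAGMA table_info(<table>)` shows the columns of any table you want to query.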

---

## Advanced: components

```python
from session_doctor import (
    Detector,       # classify raw text → error_type
    Recoverer,      # execute recovery strategies
    StateStore,     # SQLite persistence
    Notifier,       # structured alert delivery
    ERROR_PATTERNS, # regex patterns dict
    ERROR_POLICY,   # error_type → (Severity, Strategy)
    Severity,
    Strategy,
    SessionStatus,
)

# Custom detector usage
detector = Detector()
error_type = detector.classify("Error: 429 Too Many Requests")
# → "rate_limit"

# Direct policy lookup
severity, strategy = ERROR_POLICY["token_limit"]
# → Severity.HIGH, Strategy.COMPACT_CONTEXT
```
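
To make the "raw text → error_type" step concrete, here is a self-contained sketch of regex-based classification. The patterns in `EXAMPLE_PATTERNS` are invented for illustration and are not the contents of the library's `ERROR_PATTERNS`. The standalone `classify` function is likewise hypothetical and only mirrors the idea behind `Detector.classify`.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration; not the library's ERROR_PATTERNS.
EXAMPLE_PATTERNS = {
    "rate_limit": re.compile(r"\b429\b|rate limit|too many requests", re.IGNORECASE),
    "token_limit": re.compile(r"context length|max_tokens|token limit", re.IGNORECASE),
    "auth_error": re.compile(r"\b40[13]\b|unauthorized|invalid api key", re.IGNORECASE),
}


def classify(text: str) -> Optional[str]:
    """Return the first error type whose pattern matches, or None."""
    for error_type, pattern in EXAMPLE_PATTERNS.items():
        if pattern.search(text):
            return error_type
    return None


print(classify("Error: 429 Too Many Requests"))              # rate_limit
print(classify("Error: context length exceeded max_tokens"))  # token_limit
print(classify("All good"))                                   # None
```

Because the first match wins, the order of the patterns matters when a message could fit more than one category.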

---

## Integration

Session Doctor is designed to complement, not replace, existing tools:

- **[agent-watchdog](https://pypi.org/project/agent-watchdog/)** — loop detection + circuit breaker (run both together)
- **agent-budget-guard** — budget tracking skill (Session Doctor can subscribe to budget events)

---

## Development

```bash
git clone https://github.com/woodwater2026/session-doctor
cd session-doctor
pip install -e ".[dev]"
pytest tests/ -v
```

---

## Roadmap

- [ ] Real Watchdog hook integration
- [ ] Context compaction via OpenClaw runtime API
- [ ] Telegram notification via `message` tool
- [ ] Session process control (SIGTERM / restart via OpenClaw CLI)
- [ ] Config file loading (`config.yaml`)
- [ ] Budget Guard event subscription

---

## License

MIT © 2026 Water Woods
