Metadata-Version: 2.4
Name: veronica-core
Version: 0.7.1
Summary: Enforcement hooks for LLM agent runs. Budget limits, concurrency gating, and degradation control.
Project-URL: Repository, https://github.com/amabito/veronica-core
Project-URL: Issues, https://github.com/amabito/veronica-core/issues
Author: amabito
License-Expression: MIT
License-File: LICENSE
Keywords: agent,budget,enforcement,guardrails,llm,runtime
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Provides-Extra: dev
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# VERONICA

## LLM agents don't fail because of prompts. They fail because nothing stops them.

You don't lose money because your model hallucinated.
You lose money because it retried itself 3,000 times.

---

## The $12K Weekend Problem

It's Monday morning.

Your agent:
- hit a transient API failure
- retried with exponential backoff
- spawned subcalls
- looped on tool failures
- ignored budget signals

Observability tells you what happened.

**VERONICA makes sure it never happens.**

---

## What VERONICA Actually Does

VERONICA sits between your agent and the model.

It enforces execution safety.

- **Hard budget enforcement** (org / team / user / service)
- **Circuit breaker** on model instability
- **Retry containment**
- **Loop termination**
- **Tool timeout enforcement**
- **Degrade levels** (NORMAL / SOFT / HARD / EMERGENCY)

Not logging.
Not tracing.
**Stopping.**

---

## Quickstart (5 minutes)

### Install

```bash
pip install veronica-core
```

### All features, all opt-in

Every feature is disabled by default. Enable only what you need.

```python
from veronica_core import (
    ShieldConfig,
    BudgetWindowHook,
    AdaptiveBudgetHook,
    TimeAwarePolicy,
)
from veronica_core.shield import Decision, ShieldPipeline, ToolCallContext

config = ShieldConfig()

config.budget_window.enabled = True         # Call-count ceiling
config.budget_window.max_calls = 5
config.budget_window.window_seconds = 60.0

config.token_budget.enabled = True          # Token ceiling
config.token_budget.max_output_tokens = 500

config.input_compression.enabled = True     # Compress oversized inputs

config.adaptive_budget.enabled = True       # Auto-tighten on repeated HALTs

config.time_aware_policy.enabled = True     # Weekend / off-hour reduction

# Wire hooks
budget_hook = BudgetWindowHook(
    max_calls=config.budget_window.max_calls,
    window_seconds=config.budget_window.window_seconds,
)
adaptive = AdaptiveBudgetHook(base_ceiling=config.budget_window.max_calls)
time_policy = TimeAwarePolicy()
pipe = ShieldPipeline(pre_dispatch=budget_hook)

# Simulate: agent tries 6 calls (ceiling is 5)
for i in range(6):
    ctx = ToolCallContext(request_id=f"call-{i+1}", tool_name="llm")
    decision = pipe.before_llm_call(ctx)
    print(f"Call {i+1}: {decision.name}")
    if decision == Decision.HALT:
        break

# Safety events generated by the pipeline
for ev in pipe.get_events():
    print(f"  -> {ev.event_type} / {ev.decision.value}")

# Feed events into adaptive hook
for ev in pipe.get_events():
    adaptive.feed_event(ev)
result = adaptive.adjust()
print(f"Adaptive: {result.action}, ceiling={result.adjusted_ceiling}")

# Export state for observability dashboards
state = adaptive.export_control_state(
    time_multiplier=time_policy.evaluate(ctx).multiplier,
)
print(f"State: ceiling={state['adjusted_ceiling']}, "
      f"multiplier={state['effective_multiplier']}")
```

### Expected output

```
Call 1: ALLOW
Call 2: ALLOW
Call 3: ALLOW
Call 4: ALLOW
Call 5: DEGRADE
Call 6: HALT
  -> BUDGET_WINDOW_EXCEEDED / DEGRADE
  -> BUDGET_WINDOW_EXCEEDED / HALT
Adaptive: hold, ceiling=5
State: ceiling=5, multiplier=1.0
```

Output varies by time of day. During off-hours (outside 09:00-18:00 UTC) or weekends,
`TimeAwarePolicy` applies a multiplier < 1.0, which reduces `ceiling` in the exported state.
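
As a rough illustration of how a time multiplier can shrink a ceiling, here is a minimal standalone sketch. It does not use `veronica_core`; the function names and the `0.5` multipliers are assumptions chosen for the example, and only the 09:00-18:00 UTC business window comes from the text above.

```python
from datetime import datetime, timezone

# Hypothetical multipliers for illustration; veronica_core's real values may differ.
OFF_HOUR_MULTIPLIER = 0.5
WEEKEND_MULTIPLIER = 0.5


def time_multiplier(now: datetime) -> float:
    """Return 1.0 on weekdays between 09:00 and 18:00 UTC, else a value below 1.0."""
    now = now.astimezone(timezone.utc)
    if now.weekday() >= 5:  # Saturday is 5, Sunday is 6
        return WEEKEND_MULTIPLIER
    if not (9 <= now.hour < 18):
        return OFF_HOUR_MULTIPLIER
    return 1.0


def effective_ceiling(base_ceiling: int, now: datetime) -> int:
    """Scale the base ceiling by the time multiplier, never dropping below 1."""
    return max(1, int(base_ceiling * time_multiplier(now)))
```

Passing an explicit timezone-aware `datetime` keeps the sketch deterministic, which is also how you would test a time-dependent policy.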

Events you may see in production:
- `BUDGET_WINDOW_EXCEEDED` -- call ceiling reached (DEGRADE or HALT)
- `TOKEN_BUDGET_EXCEEDED` -- token ceiling reached
- `TIME_POLICY_APPLIED` -- weekend or off-hour multiplier active
- `INPUT_COMPRESSED` / `INPUT_TOO_LARGE` -- input size gate triggered
- `ADAPTIVE_ADJUSTMENT` -- ceiling auto-adjusted (tighten / loosen)

See [docs/adaptive-control.md](docs/adaptive-control.md) for the full event reference.
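
One way to consume these events is to count them by type and decision and alert on any HALT. The sketch below uses stand-in `SafetyEvent` and `Decision` classes so it runs without the library; only the `event_type` and `decision.value` attribute names are taken from the Quickstart, and the helper names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    """Stand-in for veronica_core.shield.Decision (illustration only)."""
    DEGRADE = "DEGRADE"
    HALT = "HALT"


@dataclass(frozen=True)
class SafetyEvent:
    """Stand-in carrying only the two fields used in the Quickstart."""
    event_type: str
    decision: Decision


def summarize(events):
    """Count events per (event_type, decision) pair."""
    return Counter((ev.event_type, ev.decision.value) for ev in events)


def has_halt(events) -> bool:
    """Return True if any event recorded a HALT decision."""
    return any(ev.decision is Decision.HALT for ev in events)
```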

### What to read next

- [docs/cookbook.md](docs/cookbook.md) -- copy-paste recipes for common patterns
- [docs/adaptive-control.md](docs/adaptive-control.md) -- full engineering doc for v0.7.0 stabilization
- [examples/adaptive_demo.py](examples/adaptive_demo.py) -- v0.7.0 demo (cooldown, direction lock, anomaly, replay)
- [examples/token_budget_minimal_demo.py](examples/token_budget_minimal_demo.py) -- token ceiling + minimal response
- [examples/budget_degrade_demo.py](examples/budget_degrade_demo.py) -- call ceiling + model fallback
- [examples/input_compression_skeleton_demo.py](examples/input_compression_skeleton_demo.py) -- input compression

---

## Ship Readiness (v0.7.1)

- [x] BudgetWindow stops runaway execution (ceiling enforced)
- [x] SafetyEvent records structured evidence for non-ALLOW decisions
- [x] DEGRADE supported (fallback at threshold, HALT at ceiling)
- [x] TokenBudgetHook: cumulative output/total token ceiling with DEGRADE zone
- [x] MinimalResponsePolicy: opt-in conciseness constraints for system messages
- [x] InputCompressionHook: real compression with Compressor protocol + safety guarantees (v0.5.1)
- [x] AdaptiveBudgetHook: auto-adjusts ceiling based on SafetyEvent history (v0.6.0)
- [x] TimeAwarePolicy: weekend/off-hours budget multipliers (v0.6.0)
- [x] Adaptive stabilization: cooldown, smoothing, floor/ceiling, direction lock (v0.7.0)
- [x] Anomaly tightening: spike detection with temporary ceiling reduction (v0.7.0)
- [x] Deterministic replay: export/import control state for observability (v0.7.0)
- [x] PyPI auto-publish on GitHub Release
- [x] Everything is opt-in & non-breaking (default behavior unchanged)

590 tests passing. Minimum production use-case: runaway containment + graceful degrade + auditable events + token budgets + input compression + adaptive ceiling + time-aware scheduling + anomaly detection.

---

## Token Budget + Minimal Response Demo (30 seconds)

```bash
pip install -e .
python examples/token_budget_minimal_demo.py
```

```
--- TokenBudgetHook demo ---
  Tokens used:    0 / 100  -> ALLOW
  Tokens used:   70 / 100  -> ALLOW
  Tokens used:   80 / 100  -> DEGRADE  (80% threshold reached)
  Tokens used:   95 / 100  -> DEGRADE
  Tokens used:  100 / 100  -> HALT  (ceiling reached)

  SafetyEvent: TOKEN_BUDGET_EXCEEDED / DEGRADE / TokenBudgetHook
  SafetyEvent: TOKEN_BUDGET_EXCEEDED / DEGRADE / TokenBudgetHook
  SafetyEvent: TOKEN_BUDGET_EXCEEDED / HALT    / TokenBudgetHook

--- MinimalResponsePolicy demo ---
  [disabled] system message unchanged: You are a helpful assistant.
  [enabled]  system message with constraints injected
```
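
The decisions in the output above follow a simple threshold rule. A minimal sketch of that rule, assuming an 80% DEGRADE threshold as shown in the demo output (the function and parameter names are hypothetical, not the `TokenBudgetHook` API):

```python
def token_decision(used: int, ceiling: int, degrade_percent: int = 80) -> str:
    """Classify cumulative token usage against a ceiling.

    Integer arithmetic avoids floating-point surprises at the threshold.
    """
    if used >= ceiling:
        return "HALT"
    if used * 100 >= ceiling * degrade_percent:
        return "DEGRADE"
    return "ALLOW"
```

Checking HALT before DEGRADE matters: usage at or above the ceiling also satisfies the DEGRADE condition, so the stricter decision has to win.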

---

## Input Compression Skeleton Demo (30 seconds)

```bash
pip install -e .
python examples/input_compression_skeleton_demo.py
```

```
--- InputCompressionHook demo ---
  Short input (22 tokens)  -> ALLOW
  Medium input (750 tokens) -> DEGRADE  (compression suggested)
  Large input (1250 tokens)  -> HALT  (input too large)

  Evidence (HALT):
    estimated_tokens: 1250
    input_sha256: c59d3c04...  (raw text NOT stored)
    decision: HALT
```
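
The evidence record stores a hash of the input rather than the raw text. The sketch below shows that idea with the standard library; the two thresholds and the function names are assumptions picked so that the demo's three inputs land in the same zones, not values read from `InputCompressionHook`.

```python
import hashlib

# Hypothetical thresholds for illustration only.
COMPRESS_ABOVE_TOKENS = 500
HALT_ABOVE_TOKENS = 1000


def gate_input(estimated_tokens: int) -> str:
    """Map an estimated token count to ALLOW, DEGRADE, or HALT."""
    if estimated_tokens > HALT_ABOVE_TOKENS:
        return "HALT"
    if estimated_tokens > COMPRESS_ABOVE_TOKENS:
        return "DEGRADE"
    return "ALLOW"


def build_evidence(text: str, estimated_tokens: int) -> dict:
    """Build an audit record that identifies the input without storing it."""
    return {
        "estimated_tokens": estimated_tokens,
        "input_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "decision": gate_input(estimated_tokens),
    }
```

Storing only the digest lets you confirm later that two records refer to the same input without keeping potentially sensitive prompt text in logs.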

---

## Budget + Degrade Demo (30 seconds)

```bash
pip install -e .
python examples/budget_degrade_demo.py
```

```
Call  1 / model=gpt-4        -> ALLOW
Call  2 / model=gpt-4        -> ALLOW
Call  3 / model=gpt-4        -> ALLOW
Call  4 / model=gpt-4        -> ALLOW
Call  5 / model=gpt-4        -> DEGRADE (fallback to gpt-3.5-turbo)
Call  6 / model=gpt-3.5-turbo -> HALT
SafetyEvent: BUDGET_WINDOW_EXCEEDED / DEGRADE / BudgetWindowHook
SafetyEvent: BUDGET_WINDOW_EXCEEDED / HALT   / BudgetWindowHook
```
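
On the caller side, reacting to a DEGRADE decision can be as small as swapping the model name. This is a hedged sketch with a hypothetical `choose_model` helper; the model names are the ones printed by the demo, and the decision is passed as a plain string to keep the example independent of the library.

```python
from typing import Optional


def choose_model(
    decision: str,
    primary: str = "gpt-4",
    fallback: str = "gpt-3.5-turbo",
) -> Optional[str]:
    """Pick the model for the next call, or None if the call must not happen."""
    if decision == "HALT":
        return None  # hard stop: do not dispatch the call at all
    if decision == "DEGRADE":
        return fallback  # keep working, but on the cheaper model
    return primary
```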

---

## Runaway Loop Demo (veronica_core)

```bash
pip install -e .
python examples/budget_degrade_demo.py
```

```python
from veronica_core import BudgetWindowHook
from veronica_core.shield import Decision, ToolCallContext

# 5 calls per minute hard limit
hook = BudgetWindowHook(max_calls=5, window_seconds=60.0)

for i in range(100):  # agent would loop forever
    ctx = ToolCallContext(request_id=f"call-{i+1}", tool_name="llm")
    decision = hook.before_llm_call(ctx)
    if decision == Decision.HALT:
        print(f"HALTED after {i} calls")
        break
    print(f"Call {i+1}: ALLOW" if decision is None else f"Call {i+1}: {decision.name}")
```

```
Call 1: ALLOW
Call 2: ALLOW
Call 3: ALLOW
Call 4: ALLOW
Call 5: DEGRADE
HALTED after 5 calls
```

Without VERONICA: infinite retries, $12,000 bill.
With VERONICA: 5 calls, hard stop, zero damage.
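
For intuition, a call-count window of this kind can be sketched with a deque of timestamps. This is not the `BudgetWindowHook` implementation; the class name, the 80% DEGRADE threshold, and the injectable `clock` are assumptions made so the example reproduces the output above and stays testable.

```python
import time
from collections import deque


class SlidingWindowBudget:
    """Allow at most max_calls within the trailing window_seconds."""

    def __init__(self, max_calls, window_seconds, degrade_percent=80,
                 clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.degrade_percent = degrade_percent
        self._clock = clock
        self._stamps = deque()  # timestamps of calls that were let through

    def check(self) -> str:
        now = self._clock()
        # Drop calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        used = len(self._stamps)
        if used >= self.max_calls:
            return "HALT"  # ceiling reached: the call is not recorded
        self._stamps.append(now)
        if used * 100 >= self.max_calls * self.degrade_percent:
            return "DEGRADE"
        return "ALLOW"
```

With `max_calls=5`, the first four checks return ALLOW, the fifth returns DEGRADE, and the sixth returns HALT until the oldest calls fall out of the window.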

---

## Full Demo (Adaptive Budget)

```bash
python examples/adaptive_demo.py
```

| Demo | What happens |
|------|-------------|
| Basic tighten/loosen | Budget-exceeded events reduce the ceiling; a period with no events loosens it again |
| Cooldown window | Rapid adjustments are rate-limited |
| Direction lock | Prevents premature loosening after tighten |
| Anomaly spike | Sudden event burst triggers temporary ceiling reduction |
| Export/import state | Full control state round-trip for observability |
| Event audit trail | All adjustment decisions recorded as SafetyEvents |
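
The anomaly-spike row can be pictured as a burst detector that temporarily lowers the ceiling. The following is a minimal sketch under stated assumptions: the class name, the threshold of 3 events per 60 seconds, the 0.5 tighten factor, and the 300-second hold are all hypothetical and are not taken from `AdaptiveBudgetHook`.

```python
from collections import deque


class SpikeDetector:
    """Temporarily tighten a ceiling when events arrive in a burst."""

    def __init__(self, window_seconds=60.0, spike_threshold=3,
                 tighten_factor=0.5, hold_seconds=300.0):
        self.window_seconds = window_seconds
        self.spike_threshold = spike_threshold
        self.tighten_factor = tighten_factor
        self.hold_seconds = hold_seconds
        self._events = deque()
        self._tightened_until = None

    def record_event(self, now: float) -> None:
        """Record one safety event and arm the tightening if a burst is seen."""
        self._events.append(now)
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        if len(self._events) >= self.spike_threshold:
            self._tightened_until = now + self.hold_seconds

    def ceiling(self, base: int, now: float) -> int:
        """Return the reduced ceiling while tightened, else the base ceiling."""
        if self._tightened_until is not None and now < self._tightened_until:
            return max(1, int(base * self.tighten_factor))
        return base
```

Recovery is automatic in this sketch: once the hold period passes without a new burst, `ceiling` returns the base value again.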

---

## Observability vs Enforcement

|                    | Observability Tools | VERONICA |
|--------------------|---------------------|----------|
| Acts when          | After failure       | **Before damage** |
| Prevents cost loss | No                  | **Yes** |
| Stops runaway loop | No                  | **Yes** |
| Circuit breaker    | No                  | **Yes** |
| Hard budget stop   | No                  | **Yes** |

Observability explains the fire.

**VERONICA pulls the fuse.**

---

## Integration

VERONICA runs as a hook-based pipeline between your agent and the model.

```python
from veronica_core import ShieldConfig, VeronicaIntegration

# Configure all shields declaratively
config = ShieldConfig()
config.budget_window.enabled = True
config.budget_window.max_calls = 100
config.token_budget.enabled = True
config.token_budget.max_output_tokens = 50_000

# Alternatively, load the settings from YAML/JSON.
# Uncommenting this replaces the config built above.
# config = ShieldConfig.from_yaml("shield.yaml")

# Wire into your agent
integration = VeronicaIntegration(shield=config)
```

Drop-in enforcement layer. All features opt-in and disabled by default.

---

## Why This Category Matters

As agents become autonomous, retries compound.

A single transient failure can:
- explode cost
- cascade into recursive calls
- bypass soft limits
- create orphan state
- burn through budget at 3 AM with no one watching

This is not a prompt problem.

It is an **execution control** problem.

VERONICA defines the Enforcement Layer category.
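
To make "retries compound" concrete, here is a deliberately simplified model: every call is attempted `1 + retries` times, and every attempt triggers one subcall at the next level. The function name and the model itself are assumptions for illustration, not a description of any particular agent framework.

```python
def worst_case_calls(retries_per_call: int, depth: int) -> int:
    """Total calls when every attempt at each level fails and is retried.

    Level k contributes (1 + retries_per_call) ** k calls.
    """
    attempts = 1 + retries_per_call
    return sum(attempts ** level for level in range(1, depth + 1))


def capped_calls(retries_per_call: int, depth: int, max_calls: int) -> int:
    """The same workload when a hard ceiling stops dispatch at max_calls."""
    return min(worst_case_calls(retries_per_call, depth), max_calls)
```

Under this model, 3 retries across a chain 5 levels deep yields 1,364 calls from a single failing request, while a ceiling of 5 caps it at 5.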

---

## Roadmap

**v0.8.x**
- OpenTelemetry export (opt-in SafetyEvent export)
- Multi-agent coordination (shared budget pools)
- Webhook notifications on HALT/DEGRADE

**v0.9.x**
- Redis-backed distributed budget enforcement
- Middleware mode (ASGI/WSGI)
- Dashboard for real-time shield status

---

## Install

```bash
# From a source checkout (editable install)
pip install -e .

# With dev tools, then run the test suite
pip install -e ".[dev]"
pytest
```

![Tests](https://img.shields.io/badge/tests-590%20passing-brightgreen)
![Coverage](https://img.shields.io/badge/coverage-92%25-brightgreen)
![Python](https://img.shields.io/badge/python-3.10%2B-blue)

---

## Release Notes

### v0.7.0 — Adaptive Budget Stabilization

Adaptive budget control with production-grade stabilization.
[Full engineering doc](docs/adaptive-control.md)

New features:
- **Cooldown window**: minimum interval between adjustments (prevents oscillation)
- **Adjustment smoothing**: per-step cap on multiplier change (gradual convergence)
- **Hard floor/ceiling**: absolute bounds on multiplier
- **Direction lock**: blocks loosening after a tighten until exceeded events clear
- **Anomaly tightening**: spike detection with temporary ceiling reduction + auto-recovery
- **Deterministic replay**: export/import control state for observability dashboards

```bash
python examples/adaptive_demo.py
```
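
The first four features above interact, so a compact sketch may help. This is one possible arrangement under stated assumptions, not the library's implementation: the class name, the parameter names, and the default values are all hypothetical.

```python
class StabilizedMultiplier:
    """Budget multiplier with cooldown, step cap, hard bounds, and direction lock."""

    def __init__(self, cooldown_seconds=60.0, max_step=0.1,
                 floor=0.5, ceiling=1.5):
        self.cooldown_seconds = cooldown_seconds
        self.max_step = max_step
        self.floor = floor
        self.ceiling = ceiling
        self.value = 1.0
        self._last_adjust = None
        self._locked = False  # set after a tighten, cleared when events stop

    def adjust(self, target: float, now: float, exceeded_events: int) -> float:
        if exceeded_events == 0:
            self._locked = False  # direction lock clears once events clear
        if (self._last_adjust is not None
                and now - self._last_adjust < self.cooldown_seconds):
            return self.value  # cooldown: too soon after the last change
        # Smoothing: move toward the target by at most max_step.
        step = max(-self.max_step, min(self.max_step, target - self.value))
        if step == 0 or (step > 0 and self._locked):
            return self.value  # nothing to do, or loosening is locked
        # Hard floor/ceiling: clamp the result to absolute bounds.
        self.value = min(self.ceiling, max(self.floor, self.value + step))
        self._last_adjust = now
        if step < 0:
            self._locked = True
        return self.value
```

Passing `now` explicitly rather than reading a clock inside the class is also what makes replaying a recorded sequence of adjustments deterministic.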

---

### v0.4.0 — Execution Shield Foundation

Design and diagrams:
[docs/v0.4.0-technical-artifacts.md](docs/v0.4.0-technical-artifacts.md)

- SafeModeHook and BudgetWindowHook are optional and disabled by default.
- DEGRADE support allows model fallback before a hard stop.
- See the [Execution Boundary concept](docs/execution-boundary.md).

---

## License

MIT
