Metadata-Version: 2.4
Name: langgraph-checkpoint-timeseries
Version: 0.2.0
Summary: Time-series and event-streaming checkpointers for LangGraph using TimescaleDB, QuestDB, and Kafka
Author: jersobh
License-Expression: MIT
Project-URL: Homepage, https://github.com/jersobh/langgraph-checkpoint-timeseries
Project-URL: Bug Tracker, https://github.com/jersobh/langgraph-checkpoint-timeseries/issues
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langgraph-checkpoint>=2.0.0
Requires-Dist: psycopg>=3.1.0
Requires-Dist: psycopg-pool>=3.2.0
Requires-Dist: confluent-kafka>=2.3.0
Dynamic: license-file

# LangGraph Time-Series Checkpointers

Custom [LangGraph](https://github.com/langchain-ai/langgraph) checkpointer implementations optimized for **Time-Series Databases and event-streaming backends**. Drop-in replacements for `PostgresSaver` that deliver superior write throughput and richer state history for AI agent workloads.

## Why Time-Series Memory?

Standard checkpointers (`PostgresSaver`) store state using row-level locking and WAL-based durability. This is ideal for simple apps, but **becomes a bottleneck** when you have:

- 🏭 **Multiple agents writing concurrently** (fleet management, parallel pipelines)
- 📈 **High-frequency state updates** (trading bots, real-time monitoring)
- 🔄 **Long-running agents** accumulating millions of checkpoints over weeks
- ⚡ **Event-sourced pipelines** where checkpoints are durable, replayable events

Time-Series databases and event-streaming systems are purpose-built for these exact workloads.

## Installation

```bash
pip install langgraph-checkpoint-timeseries
```

## Supported Backends

### 1. TimescaleDB (`TimescaleDBSaver`)
Perfect for relational agent memory interspersed with IoT or metrics data. Uses UNLOGGED tables and pipeline mode for high-throughput. Full PostgreSQL ecosystem compatibility.

### 2. QuestDB (`QuestDBSaver`)
Extremely high-throughput ingestion using `timestamp(ts) PARTITION BY DAY`. Ideal for write-heavy, append-only workloads with sub-millisecond query latency.

### 3. Kafka (`KafkaSaver`)
Event-sourced agent memory on Apache Kafka. Writes are fire-and-forget produce calls (no round-trip wait), making this the fastest option for pure write throughput. Ideal for event-sourced architectures where checkpoints fan out to downstream consumers.

## Quick Start

### TimescaleDB

```python
from langgraph_checkpoint_timeseries import TimescaleDBSaver

with TimescaleDBSaver.from_conn_string("postgresql://postgres:postgres@localhost:5432/postgres") as saver:
    saver.setup()
    app = workflow.compile(checkpointer=saver)
```

### QuestDB

```python
from langgraph_checkpoint_timeseries import QuestDBSaver

with QuestDBSaver.from_conn_string("postgresql://admin:quest@localhost:8812/qdb") as saver:
    saver.setup()
    app = workflow.compile(checkpointer=saver)
```

### Kafka

```python
from langgraph_checkpoint_timeseries import KafkaSaver

with KafkaSaver.from_bootstrap_servers("localhost:9092") as saver:
    saver.setup()  # creates topics, replays existing state
    app = workflow.compile(checkpointer=saver)
```

## Docker

All backends are available via Docker Compose:

```bash
docker compose up -d
```

| Service | URL |
| :--- | :--- |
| TimescaleDB | `localhost:5432` |
| QuestDB | `localhost:9000` (UI), `localhost:8812` (PG wire) |
| Kafka | `localhost:9092` |
| Kafka UI | `localhost:8080` |

## Benchmarks

Benchmarked against the standard `PostgresSaver` on the same hardware (2026-03-24):

| Scenario | PostgresSaver | TimescaleDB | QuestDB | KafkaSaver | 🏆 Winner |
| :--- | ---: | ---: | ---: | ---: | :--- |
| Sequential Writes (1K) | 345 ops/s | 411 ops/s | 341 ops/s | **37,959 ops/s** | KafkaSaver |
| Concurrent Writes (15T×200) | 333 ops/s | 1,192 ops/s | 1,145 ops/s | **20,616 ops/s** | KafkaSaver |
| High-Volume Writes (5K) | 332 ops/s | 375 ops/s | 378 ops/s | **36,044 ops/s** | KafkaSaver |
| History Query (list 100) | 6,788 ops/s | 1,259 ops/s | 577 ops/s | **93,186 ops/s** | KafkaSaver |

### Concurrent Writes — Where TSDB Shines Over Postgres

```
  PostgresSaver    333 ops/s
  TimescaleDB     ██ 1,192 ops/s
  QuestDB         ██ 1,145 ops/s
  KafkaSaver      ████████████████████████████████████████ 20,616 ops/s
```

> **KafkaSaver is ~110x faster** on write throughput. **TimescaleDB and QuestDB are ~3.5x faster** than PostgresSaver under concurrent load.

> **Note on Kafka read numbers**: The history query advantage for Kafka reflects an in-memory read projection (state is replayed from the topic at startup). Writes are genuinely faster due to async produce eliminating network round-trips. For production, pair Kafka with a secondary read store (Redis, DuckDB) for durable cross-restart reads.

Full results: see [benchmark_results.md](benchmark_results.md).

## When to Use This

| Use Case | Recommended Backend |
| :--- | :--- |
| Simple apps, prototyping | `PostgresSaver` |
| **Multi-agent, high concurrency** | **`TimescaleDBSaver`** |
| **Maximum write throughput** (IoT, trading) | **`QuestDBSaver`** |
| Full PostgreSQL ecosystem + time-series | **`TimescaleDBSaver`** |
| **Event-sourced agents, audit log, fan-out** | **`KafkaSaver`** |

### 🏭 Multi-Agent / High-Concurrency Systems
When you have **multiple AI agents writing state simultaneously** (fleet of IoT monitoring agents, parallel customer service bots), TimescaleDB and QuestDB handle write contention far better than standard Postgres. Our benchmarks show **~3.5x throughput** under 15-thread concurrent load.

### 📈 High-Frequency Decision Agents
Trading bots, real-time bidding agents, or any system making **hundreds of decisions per second** benefit from the optimized ingestion pipelines of time-series databases. UNLOGGED tables and disabled synchronous commit eliminate WAL overhead entirely.

### ⚡ Event-Sourced & Streaming Pipelines
When agent checkpoints need to be consumed by multiple downstream services (analytics, alerting, replay), `KafkaSaver` makes each state transition a first-class Kafka event. Any consumer group can subscribe independently — no DB access needed.

### 🔄 Long-Running Agents with Data Retention
Agents that run for **weeks or months** accumulate millions of checkpoints. Time-series databases offer efficient partition-based cleanup (`DROP PARTITION`) instead of expensive `DELETE` operations, keeping performance stable over time.

### 🔍 Debugging & Compliance Auditing
When you need to answer *"What was the agent thinking at 14:03:22?"*, time-series databases provide native timestamp-indexed queries. Correlate agent decisions with real-world events stored in the same database.

## Examples

See the `examples/` directory for practical demos:

- **IoT Monitoring Agent** (`examples/timescaledb_iot_agent.py`) — Streaming sensor data with time-series checkpointing
- **Algorithmic Trading Agent** (`examples/questdb_trading_agent.py`) — High-frequency state updates and rapid decision preservation
- **Event-Sourced Moderation Pipeline** (`examples/kafka_event_sourced_agent.py`) — Parallel moderation agents with Kafka audit log and fan-out

## Running Tests

```bash
# Ensure Docker services are up
docker compose up -d

# Install dependencies
pip install -e ".[dev]"

# Run all tests
pytest tests/ -v

# Run Kafka tests only
pytest tests/test_kafka.py -v
```

## License

MIT — see `LICENSE` for details.
