Built on Sandstorm

Stop babysitting your AI agents.

Sandcastle runs your agent workflows so you don't have to. Parallel execution, budget guardrails, human-in-the-loop approvals, policy enforcement - all out of the box. Just run sandcastle serve and you're live.

Quick Start
$ pip install sandcastle-ai
$ sandcastle serve

That's it. Two commands. No PostgreSQL, Redis, or S3 needed - Sandcastle detects an empty config and starts in local mode with SQLite and an in-process queue.

Or from source:
$ git clone https://github.com/gizmax/Sandcastle.git && cd Sandcastle && uv sync && uv run sandcastle serve

  • SQLite database
  • In-process queue
  • Filesystem storage
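The local-mode fallback described above can be pictured roughly as follows. This is a minimal sketch, not Sandcastle's actual detection logic: the setting names DATABASE_URL, REDIS_URL, and STORAGE_BUCKET and the Backends type are hypothetical, chosen only to illustrate "empty config means local backends".

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Backends:
    database: str
    queue: str
    storage: str


def pick_backends(config: Mapping[str, str]) -> Backends:
    """Use the local backend for every setting that is missing or empty.

    The setting names below are assumptions made for this sketch.
    """
    return Backends(
        database="postgresql" if config.get("DATABASE_URL") else "sqlite",
        queue="redis" if config.get("REDIS_URL") else "in-process",
        storage="s3" if config.get("STORAGE_BUCKET") else "filesystem",
    )


# An empty config selects the three local backends.
local = pick_backends({})
```

Deciding per setting, rather than with one global switch, is a design choice of this sketch; the real project may well treat local and production as a single mode.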

Why Sandcastle?

Sandstorm is a brilliant piece of engineering - one API call, a full agent, completely sandboxed. It nails the core problem: giving agents full system access without worrying about what they do with it.

But when you start building real products on top of it, the glue code piles up fast. You need orchestration, retries, scheduling, cost controls, approvals, and monitoring - none of which are Sandstorm's job.

Sandstorm = the engine.

Sandcastle = the product you build with it.

  • Multi-step pipelines - "Step A scrapes, step B enriches, step C scores."
  • Parallel fan-out - "Fan out over 50 leads, then merge results."
  • Cost tracking - "Bill per enrichment, track spend per run."
  • Retry with backoff - "Alert me if it fails, retry automatically."
  • Scheduling - "Run this every 6 hours, POST results to Slack."
  • Human approvals - "A human should review this before continuing."
  • Policy enforcement - "Block the output if it contains PII."
  • Model routing - "Pick the cheapest model that meets quality SLOs."

Everything you need for production agents

Sandcastle adds the orchestration layer Sandstorm doesn't need to have.

DAG Workflow Engine

Define multi-step pipelines in YAML. Dependencies, parallel branches, data passing between steps.

Parallel Execution

Steps at the same DAG layer run concurrently. Fan out over lists with configurable concurrency.
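To make "same DAG layer" concrete, here is a small sketch using only the Python standard library. It is not Sandcastle's scheduler: dag_layers, fan_out, and the step ids in the example are illustrative names. The first function groups steps into layers from their depends_on lists; the second runs a worker over a list with a concurrency cap.

```python
import asyncio
from graphlib import TopologicalSorter


def dag_layers(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group step ids into layers; a step's dependencies all sit in earlier layers."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every step that can run right now
        layers.append(ready)
        sorter.done(*ready)
    return layers


async def fan_out(items, worker, max_concurrency: int):
    """Run worker over items with at most max_concurrency in flight.

    Results come back in the same order as items.
    """
    gate = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with gate:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


# Hypothetical workflow: two steps depend on "enrich" and share a layer.
layers = dag_layers(
    {"scrape": [], "enrich": ["scrape"], "score": ["enrich"], "notify": ["enrich"]}
)
```

Steps inside one layer have no dependencies on each other, which is what makes it safe to run them concurrently.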

โช

Time Machine

Replay from any step or fork with changes. Checkpoints restore prior outputs so you only pay for what re-runs.
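One way to picture replay from a checkpoint is sketched below. It assumes each step's output was saved under its step id and is not Sandcastle's implementation: steps_to_rerun, replay, and run_step are hypothetical names. Only the chosen step and the steps downstream of it run again; everything else is restored from its checkpoint.

```python
def steps_to_rerun(depends_on: dict[str, list[str]], start: str) -> set[str]:
    """Return start plus every step that depends on it, directly or transitively."""
    rerun = {start}
    changed = True
    while changed:
        changed = False
        for step, deps in depends_on.items():
            if step not in rerun and rerun.intersection(deps):
                rerun.add(step)
                changed = True
    return rerun


def replay(depends_on, order, checkpoints, start, run_step):
    """Re-execute start and its downstream steps; reuse checkpoints for the rest.

    order must list the steps so that dependencies come first.
    run_step(step_id, inputs) is a caller-supplied function (hypothetical here).
    """
    rerun = steps_to_rerun(depends_on, start)
    outputs = {}
    for step in order:
        if step in rerun:
            inputs = {dep: outputs[dep] for dep in depends_on[step]}
            outputs[step] = run_step(step, inputs)
        else:
            outputs[step] = checkpoints[step]  # restored, so nothing is re-billed
    return outputs
```

Replaying from "enrich" in a scrape, enrich, score pipeline would reuse the saved scrape output and pay only for enrich and score.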

Approval Gates

Pause workflows for human review. Approve, reject, or edit the data before the next step runs.

AutoPilot

A/B test models and prompts per step. Automatic quality evaluation and best-variant deployment.

Policy Engine

Declarative rules for PII redaction, secret blocking, cost guards. Applied per step or globally.
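As a rough illustration of declarative rules, the sketch below applies a redact rule and a block rule to a step's text output. The Rule shape, the rule names, and the regular expressions are assumptions made for this example, not Sandcastle's policy format, and the patterns are deliberately simple rather than production-grade PII detection.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    action: str  # "redact" replaces matches, "block" rejects the whole output


class PolicyViolation(Exception):
    """Raised when a blocking rule matches."""


# Illustrative rules only; real policies would be loaded from configuration.
RULES = [
    Rule("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "redact"),
    Rule("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "block"),
]


def apply_policies(text: str, rules=RULES) -> str:
    """Return text with redactions applied, or raise if a block rule matches."""
    for rule in rules:
        if rule.action == "block" and rule.pattern.search(text):
            raise PolicyViolation(rule.name)
        if rule.action == "redact":
            text = rule.pattern.sub(f"[{rule.name.upper()} REDACTED]", text)
    return text
```

Applying the same function to one step's output or to every step's output is the difference between a per-step and a global policy.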

Cost-Latency Optimizer

SLO-based model routing. Define quality, cost, and latency constraints - let the optimizer pick the model.
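A simple reading of SLO-based routing is "filter by the constraints, then take the cheapest survivor". The sketch below does exactly that. It is an assumption about the idea, not the optimizer Sandcastle ships; ModelStats and pick_model are hypothetical names, and any numbers passed in would come from your own measurements.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ModelStats:
    name: str
    quality: float    # observed quality score between 0 and 1
    cost_usd: float   # average cost per step run
    latency_s: float  # average latency per step run


def pick_model(
    candidates: Iterable[ModelStats],
    min_quality: float,
    max_cost_usd: float,
    max_latency_s: float,
) -> Optional[ModelStats]:
    """Return the cheapest candidate that meets all three constraints, else None."""
    eligible = [
        m
        for m in candidates
        if m.quality >= min_quality
        and m.cost_usd <= max_cost_usd
        and m.latency_s <= max_latency_s
    ]
    return min(eligible, key=lambda m: m.cost_usd, default=None)
```

Returning None when nothing qualifies leaves the fallback decision (fail the step, relax an SLO, use the default model) to the caller.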

Zero-Config Local Mode

Just run sandcastle serve. Sandcastle auto-detects local vs production - SQLite or PostgreSQL, in-process queue or Redis.

Workflows as YAML

No SDKs, no boilerplate. Declare your pipeline, Sandcastle handles the rest.

lead-enrichment.yaml
name: "Lead Enrichment"
sandstorm_url: "${SANDSTORM_URL}"
default_model: sonnet
default_timeout: 300

steps:
  - id: "scrape"
    prompt: |
      Visit {input.target_url} and extract:
      company name, employees, product, contact.
    output_schema:
      type: object
      properties:
        company_name: { type: string }
        employees: { type: integer }

  - id: "enrich"
    depends_on: ["scrape"]
    prompt: |
      Given: {steps.scrape.output}
      Research revenue, industry, decision makers.
    retry:
      max_attempts: 3
      backoff: exponential

  - id: "score"
    depends_on: ["enrich"]
    prompt: "Score this lead 1-100: {steps.enrich.output}"
    model: haiku  # cheaper for simple scoring

on_complete:
  storage_path: "leads/{run_id}/result.json"

Three steps. One file. Production-ready.

Each step runs inside a Sandstorm sandbox - full system access, completely isolated. Sandcastle handles the orchestration between them.

  • Dependencies - steps run in order or parallel based on depends_on
  • Data passing - reference prior outputs with {steps.id.output}
  • Schema validation - enforce structured output per step
  • Automatic retries - exponential backoff on failure
  • Per-step models - use expensive models only where it matters
  • Persistent storage - results saved to disk (local) or S3 (production)
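Two of the behaviours in the list above, data passing and automatic retries, can be sketched in a few lines of Python. This is an illustration under assumptions, not Sandcastle's code: render_prompt, backoff_delays, and run_with_retry are hypothetical helpers, and the 1-second base delay with doubling is one common reading of "backoff: exponential".

```python
import json
import re
import time

# Matches {input.<key>} and {steps.<id>.output}, as used in the YAML above.
PLACEHOLDER = re.compile(r"\{(?:input\.(\w+)|steps\.(\w+)\.output)\}")


def render_prompt(template: str, run_input: dict, step_outputs: dict) -> str:
    """Substitute placeholders; non-string values are inserted as JSON."""

    def resolve(match: re.Match) -> str:
        input_key, step_id = match.group(1), match.group(2)
        value = run_input[input_key] if input_key is not None else step_outputs[step_id]
        return value if isinstance(value, str) else json.dumps(value)

    return PLACEHOLDER.sub(resolve, template)


def backoff_delays(max_attempts: int, base_seconds: float = 1.0, factor: float = 2.0):
    """Waits between attempts: base, base*factor, base*factor**2, ..."""
    return [base_seconds * factor**i for i in range(max_attempts - 1)]


def run_with_retry(call, max_attempts: int, base_seconds: float = 1.0, sleep=time.sleep):
    """Call until it succeeds or max_attempts is reached; re-raise the last error."""
    delays = backoff_delays(max_attempts, base_seconds)
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(delays[attempt])
```

Passing sleep as a parameter keeps the retry loop testable without real waiting.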

See everything. Control everything.

Runs, costs, schedules, dead letters, approvals, experiments, policy violations - all in one place. Includes a visual workflow builder for drag-and-drop pipeline design.

Boring tech, reliable results

No exotic dependencies. Battle-tested tools you already know. Local mode needs zero infrastructure.

  • Python 3.12 - API Server
  • FastAPI - REST + SSE
  • SQLite / PostgreSQL - Local / Production DB
  • In-process / Redis - Local / Production Queue
  • Filesystem / S3 - Local / Production Storage
  • React + TS - Dashboard
  • Tailwind CSS - Styling
  • Sandstorm - Agent Runtime

Free forever. Seriously.

Sandcastle is fully open source under the MIT license. No paid tiers, no feature gates, no "contact sales" buttons. Every feature you see on this page ships in the free version - because there is only one version.

Buy me a coffee

I built Sandcastle in my free time and I plan to keep improving it - new features, better docs, bug fixes, community support. If it saves you time or you just think it's cool, a coffee goes a long way.

It keeps me caffeinated and motivated to ship the next update.

  • Fuels new features
  • Faster bug fixes
  • Better documentation