Metadata-Version: 2.4
Name: bunkervm
Version: 0.9.4
Summary: Hardware-isolated Linux sandbox for AI agents — Firecracker MicroVM + MCP
Author: Ashish
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/ashishgituser/bunkervm
Project-URL: Documentation, https://github.com/ashishgituser/bunkervm#readme
Project-URL: Issues, https://github.com/ashishgituser/bunkervm/issues
Keywords: mcp,sandbox,firecracker,ai,agent,microvm,isolation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: System :: Emulators
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: langgraph
Requires-Dist: langchain-openai>=0.1.0; extra == "langgraph"
Requires-Dist: langgraph>=0.1.0; extra == "langgraph"
Requires-Dist: langchain-core>=0.1.0; extra == "langgraph"
Requires-Dist: python-dotenv>=1.0.0; extra == "langgraph"
Provides-Extra: openai-agents
Requires-Dist: openai-agents>=0.1.0; extra == "openai-agents"
Requires-Dist: python-dotenv>=1.0.0; extra == "openai-agents"
Provides-Extra: crewai
Requires-Dist: crewai>=0.50.0; extra == "crewai"
Requires-Dist: python-dotenv>=1.0.0; extra == "crewai"
Provides-Extra: all
Requires-Dist: langchain-openai>=0.1.0; extra == "all"
Requires-Dist: langgraph>=0.1.0; extra == "all"
Requires-Dist: langchain-core>=0.1.0; extra == "all"
Requires-Dist: openai-agents>=0.1.0; extra == "all"
Requires-Dist: crewai>=0.50.0; extra == "all"
Requires-Dist: python-dotenv>=1.0.0; extra == "all"
Dynamic: license-file

<p align="center">
  <img src="docs/logo.svg" alt="BunkerVM" width="120" />
</p>

<h1 align="center">BunkerVM</h1>

<p align="center">
  <strong>Time-travel debugging for AI agent sandboxes.</strong><br>
  Hardware-isolated Firecracker microVMs with snapshot, replay, and diff — not containers.
</p>

<p align="center">
  <a href="https://pypi.org/project/bunkervm/"><img src="https://img.shields.io/pypi/v/bunkervm?color=7c5cfc" alt="PyPI"></a>
  <a href="https://github.com/ashishgituser/bunkervm/actions/workflows/ci.yml"><img src="https://github.com/ashishgituser/bunkervm/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://github.com/ashishgituser/bunkervm"><img src="https://img.shields.io/github/stars/ashishgituser/bunkervm?style=social" alt="Stars"></a>
  <img src="https://img.shields.io/badge/isolation-hardware%20(KVM)-22d3ee" alt="Isolation">
  <img src="https://img.shields.io/badge/python-3.10+-blue" alt="Python">
  <a href="https://github.com/ashishgituser/bunkervm/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache--2.0-green" alt="License"></a>
</p>

---

## The problem

AI agents execute code on your machine. When something goes wrong — and it will — you have no way to see **what the agent actually did**, rewind to the moment **before** it broke, or compare **why one agent succeeded and another failed**.

Containers share your kernel ([escapes are real](https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=docker+escape)).  
Cloud sandboxes send your data to someone else's server.  
Neither gives you observability into agent behaviour.

**BunkerVM** solves all three: isolation, observability, and time-travel.

---

## What it does

Each sandbox is a [Firecracker](https://firecracker-microvm.github.io/) microVM — the same technology behind AWS Lambda. Own kernel, own filesystem, hardware-level (KVM) isolation. Not a container.

On top of that, BunkerVM adds recording, rewind, change tracking, and diff:

### Record every execution

```python
from bunkervm import Sandbox

with Sandbox(record=True) as sb:
    sb.run("import pandas as pd")
    sb.run("df = pd.read_csv('/data/input.csv')")
    sb.run("df['total'] = df.price * df.qty")
    sb.run("df.to_csv('/output/result.csv')")

# Every step recorded: command, output, filesystem changes, VM snapshot
```

### Rewind to any point

```python
sb.restore(step=2)  # VM state rewinds to after read_csv
sb.run("df.describe()")  # explore from that exact point
```

The VM's memory, CPU registers, filesystem — everything reverts to exactly what it was after step 2. Not a re-run. An actual restore from a Firecracker snapshot.

### See what changed

```python
for cp in sb.history():
    print(f"step {cp['step']}: {cp['command']}")
    if cp['trace']:
        for f in cp['trace']['files_created']:
            print(f"  + {f['path']} ({f['size']} bytes)")
```

```
step 1: import pandas as pd
step 2: df = pd.read_csv('/data/input.csv')
step 3: df['total'] = df.price * df.qty
step 4: df.to_csv('/output/result.csv')
  + /output/result.csv (1247 bytes)
```
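The loop above prints only created files. A minimal sketch of a formatter that also covers modified and deleted entries is shown here. It assumes each history entry is a dict with `step`, `command`, and `trace` keys as in the loop above, and that `trace` uses the keys shown in the filesystem tracing section of this README. `describe_checkpoint` is a hypothetical helper, not part of bunkervm, and the shape of `files_deleted` entries is a guess.

```python
def describe_checkpoint(cp):
    """Return printable lines for one history entry (hypothetical helper)."""
    lines = [f"step {cp['step']}: {cp['command']}"]
    trace = cp.get("trace") or {}  # untraced steps may carry no trace
    for f in trace.get("files_created", []):
        lines.append(f"  + {f['path']} ({f['size']} bytes)")
    for f in trace.get("files_modified", []):
        lines.append(f"  ~ {f['path']} ({f['old_size']} -> {f['new_size']} bytes)")
    for f in trace.get("files_deleted", []):
        # Assumption: deleted entries are dicts with "path", or bare strings.
        path = f["path"] if isinstance(f, dict) else f
        lines.append(f"  - {path}")
    return lines


# Illustrative entry, shaped like the examples in this README.
sample = {
    "step": 4,
    "command": "df.to_csv('/output/result.csv')",
    "trace": {
        "files_created": [{"path": "/output/result.csv", "size": 1247}],
        "files_modified": [],
        "files_deleted": [],
    },
}

for line in describe_checkpoint(sample):
    print(line)
```

With the real API you would call it as `describe_checkpoint(cp)` for each `cp` in `sb.history()`.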

### Compare two agents

```bash
bunkervm diff session-abc session-def
```

```
Agent Diff
  Session A: abc  (12 steps, 3400ms)
  Session B: def  (8 steps, 1200ms)

  Files only in A:  /tmp/debug.log, /tmp/retry_3.py
  Files only in B:  /output/result.csv

  step  1  [same]  import pandas as pd
  step  2  [same]  df = pd.read_csv('/data/input.csv')
  step  3  [diff]
    A: df = df.dropna()
    B: df = df.fillna(0)
  step  4  [diff]
    A: # crashed — KeyError: 'total'
    B: df['total'] = df.price * df.qty  ← OK
```

Agent A dropped rows with missing values and then crashed with a `KeyError`. Agent B filled the missing values and succeeded. Without the diff, the divergence at step 3 would be hard to spot.

---

## Quick start

```bash
pip install bunkervm
```

```python
from bunkervm import run_code

result = run_code("print('Hello from a microVM!')")
print(result)  # Hello from a microVM!
```

VM boots, code runs, VM dies. Your host was never touched.

---

## How it works

```
AI Agent
   │
   ▼
bunkervm (host)  ──vsock──▶  Firecracker MicroVM
   │                          ┌────────────────────┐
   │  record=True             │  Alpine Linux       │
   │  ─────────▶              │  Own kernel         │
   │  snapshot()              │  exec_agent.py      │
   │  trace()                 │  (filesystem trace) │
   │  restore()               └────────────────────┘
   │                          KVM hardware isolation
   ▼
~/.bunkervm/sessions/         ~/.bunkervm/snapshots/
  session-abc.json              step1/ vmstate + memory
  session-def.json              step2/ vmstate + memory
```

**Firecracker** provides the isolation. BunkerVM adds the instrumentation layer:

| Layer | What it does |
|---|---|
| **exec_agent** (inside VM) | Traces filesystem changes per command — files created, modified, deleted, bytes written |
| **Firecracker API** (host→VM) | Pauses VM, snapshots CPU + memory state to disk, resumes — all via Firecracker's built-in snapshot API |
| **Snapshot manager** (host) | Stores and indexes snapshots at `~/.bunkervm/snapshots/`, manages lifecycle |
| **Session recorder** (host) | Chains commands → traces → snapshots into a replayable session JSON |

No custom kernel modules. No eBPF. No ptrace. The VM is the isolation boundary; the API socket is the control plane. Pure Python, stdlib-only transport.

---

## The four capabilities

### 1. Filesystem tracing

Every command execution can return a trace of what changed on disk.

```python
result = client.exec("python3 train.py", trace=True)
print(result["trace"])
# {
#   "files_created": [{"path": "/output/model.pkl", "size": 4820}],
#   "files_modified": [{"path": "/tmp/loss.log", "old_size": 0, "new_size": 312}],
#   "files_deleted": [],
#   "bytes_written": 5132
# }
```

This happens inside the VM — a pre/post filesystem snapshot diff. No host-side hooks, no strace, no overhead on non-traced commands.
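The pre/post diff idea can be sketched in a few lines of standard-library Python. This is an illustration of the technique, not the code in `exec_agent.py`: the function names `scan` and `diff_scans` are hypothetical, and computing `bytes_written` as created sizes plus new sizes of modified files is an assumption (it happens to match the example above, 4820 + 312 = 5132).

```python
import os
import tempfile


def scan(root):
    """Map every file path under root to (size, mtime_ns)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared while scanning
            state[path] = (st.st_size, st.st_mtime_ns)
    return state


def diff_scans(before, after):
    """Compare two scans and build a trace dict like the one shown above."""
    created = [{"path": p, "size": after[p][0]}
               for p in sorted(after.keys() - before.keys())]
    deleted = [{"path": p} for p in sorted(before.keys() - after.keys())]
    modified = [{"path": p, "old_size": before[p][0], "new_size": after[p][0]}
                for p in sorted(before.keys() & after.keys())
                if before[p] != after[p]]
    # Assumption: count full sizes of created files and new sizes of modified ones.
    written = sum(f["size"] for f in created) + sum(f["new_size"] for f in modified)
    return {"files_created": created, "files_modified": modified,
            "files_deleted": deleted, "bytes_written": written}


# Demo on a throwaway directory standing in for the VM filesystem.
with tempfile.TemporaryDirectory() as root:
    log = os.path.join(root, "loss.log")
    open(log, "w").close()                      # pre-existing empty file
    before = scan(root)
    with open(os.path.join(root, "model.pkl"), "wb") as fh:
        fh.write(b"x" * 10)                     # the "command" creates a file...
    with open(log, "w") as fh:
        fh.write("12345")                       # ...and grows an existing one
    after = scan(root)
    trace = diff_scans(before, after)

print(trace["bytes_written"])
```

A scan-based diff only sees the state before and after the command, so a file that is created and deleted within one command leaves no entry.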

### 2. VM snapshots

Full VM state (CPU, memory, filesystem) saved to disk. Restore boots a new Firecracker process from that state instead of cold-booting.

```python
from bunkervm import Sandbox

with Sandbox() as sb:
    sb.run("import torch; model = torch.load('bert.pt')")
    sb.checkpoint("model-loaded")         # snapshot: 45ms
    sb.run("output = model(bad_input)")   # crashes
    sb.restore(step=1)                    # restore: <100ms
    sb.run("output = model(good_input)")  # works
```

Snapshot = Firecracker's native `PUT /snapshot/create`. Not a filesystem copy. The memory file is sparse and CoW-friendly.
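For readers who want to see what the pause, snapshot, resume sequence looks like on the wire, here is a rough sketch against Firecracker's HTTP-over-Unix-socket API. It is not BunkerVM's snapshot manager: `UnixHTTPConnection`, `snapshot_requests`, and `create_snapshot` are hypothetical names, and the socket and file paths are placeholders. The endpoints and JSON fields follow Firecracker's published API.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP client that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def snapshot_requests(snapshot_path, mem_file_path):
    """Return the three API calls, in order, as (method, path, body)."""
    return [
        ("PATCH", "/vm", {"state": "Paused"}),
        ("PUT", "/snapshot/create", {
            "snapshot_type": "Full",
            "snapshot_path": snapshot_path,
            "mem_file_path": mem_file_path,
        }),
        ("PATCH", "/vm", {"state": "Resumed"}),
    ]


def create_snapshot(api_socket, snapshot_path, mem_file_path):
    """Send the sequence to a running Firecracker process (needs a live VM)."""
    for method, path, body in snapshot_requests(snapshot_path, mem_file_path):
        conn = UnixHTTPConnection(api_socket)
        try:
            conn.request(method, path, body=json.dumps(body),
                         headers={"Content-Type": "application/json"})
            resp = conn.getresponse()
            payload = resp.read()
            if resp.status >= 300:
                raise RuntimeError(f"{method} {path} failed: {resp.status} {payload!r}")
        finally:
            conn.close()


# Inspect the requests without contacting a VM (placeholder paths).
for method, path, body in snapshot_requests("step1/vmstate", "step1/memory"):
    print(method, path, json.dumps(body))
```

The VM has to be paused before `/snapshot/create` is accepted, which is why the sequence is wrapped in two `PATCH /vm` calls.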

### 3. Session recording & replay

`record=True` automatically chains traces and snapshots into a session timeline.

```python
with Sandbox(record=True) as sb:
    sb.run("x = 42")
    sb.run("print(x * 2)")
    sb.run("open('/output/result.txt', 'w').write(str(x))")

# Session auto-saved to ~/.bunkervm/sessions/
```

```bash
bunkervm replay a1b2c3 --trace
```

```
Session: a1b2c3
  Steps: 3
  Recorded: 2026-03-29 14:30

Timeline:

  📸 step   1  [ok]     12ms  python3 /tmp/_runner.py
  📸 step   2  [ok]      8ms  python3 /tmp/_runner.py
  📸 step   3  [ok]     15ms  python3 /tmp/_runner.py
            + 1 files created (42 bytes)
              + /output/result.txt (42b)
```

Each 📸 = a restorable VM snapshot. You can `restore(step=2)` and branch from there.

### 4. Agent diff

Run the same task with two different agents (or prompts, or models). Record both. Diff.

```bash
bunkervm diff session-gpt4 session-claude --format json
```

The diff shows: which files each agent created, which steps diverged, which agent was faster, and where failures happened. This is how you debug agent quality — not by reading logs, but by comparing filesystem-level behaviour.
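The comparison itself is simple to picture. The sketch below diffs two sessions step by step and by created files. The session layout (a `steps` list whose items have `command` and `trace`) is an assumption made for illustration, not the documented session JSON schema, and `diff_sessions` is a hypothetical function rather than what `bunkervm diff` runs.

```python
from itertools import zip_longest


def created_files(session):
    """Collect every path that any step of the session created."""
    paths = set()
    for step in session["steps"]:
        trace = step.get("trace") or {}
        paths.update(f["path"] for f in trace.get("files_created", []))
    return paths


def diff_sessions(a, b):
    """Compare two sessions (assumed layout) by commands and created files."""
    files_a, files_b = created_files(a), created_files(b)
    steps = []
    # zip_longest pads the shorter session with None so extra steps show up.
    for n, (sa, sb) in enumerate(zip_longest(a["steps"], b["steps"]), start=1):
        cmd_a = sa["command"] if sa else None
        cmd_b = sb["command"] if sb else None
        steps.append({"step": n,
                      "status": "same" if cmd_a == cmd_b else "diff",
                      "a": cmd_a, "b": cmd_b})
    return {"files_only_in_a": sorted(files_a - files_b),
            "files_only_in_b": sorted(files_b - files_a),
            "steps": steps}


# Illustrative sessions, loosely following the diff example in this README.
session_a = {"steps": [
    {"command": "import pandas as pd", "trace": None},
    {"command": "df = pd.read_csv('/data/input.csv')", "trace": None},
    {"command": "df = df.dropna()",
     "trace": {"files_created": [{"path": "/tmp/debug.log", "size": 80}]}},
]}
session_b = {"steps": [
    {"command": "import pandas as pd", "trace": None},
    {"command": "df = pd.read_csv('/data/input.csv')", "trace": None},
    {"command": "df = df.fillna(0)", "trace": None},
    {"command": "df.to_csv('/output/result.csv')",
     "trace": {"files_created": [{"path": "/output/result.csv", "size": 1247}]}},
]}

result = diff_sessions(session_a, session_b)
for s in result["steps"]:
    print(s["step"], s["status"])
```

Comparing by position keeps the sketch short. A real diff might align steps with a sequence matcher so that one inserted retry does not mark every later step as different.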

---

## Framework integrations

Every integration auto-boots a VM and exposes 6 sandboxed tools. One base class, identical behaviour across frameworks.

<details>
<summary><strong>LangChain / LangGraph</strong></summary>

```bash
pip install "bunkervm[langgraph]"  # the extra already pulls in langchain-openai
```

```python
from bunkervm.langchain import BunkerVMToolkit

with BunkerVMToolkit() as toolkit:
    tools = toolkit.get_tools()  # run_command, write_file, read_file, ...
    # pass tools to your agent
```

</details>

<details>
<summary><strong>OpenAI Agents SDK</strong></summary>

```bash
pip install "bunkervm[openai-agents]"
```

```python
from bunkervm.openai_agents import BunkerVMTools

tools = BunkerVMTools()
agent_tools = tools.get_tools()
# ...
tools.stop()
```

</details>

<details>
<summary><strong>CrewAI</strong></summary>

```bash
pip install "bunkervm[crewai]"
```

```python
from bunkervm.crewai import BunkerVMCrewTools

tools = BunkerVMCrewTools()
crew_tools = tools.get_tools()
# ...
tools.stop()
```

</details>

<details>
<summary><strong>Claude Desktop / VS Code Copilot (MCP)</strong></summary>

```bash
bunkervm vscode-setup     # generates .vscode/mcp.json, works on Windows WSL2
bunkervm server            # stdio for Claude Desktop
bunkervm server --transport sse  # SSE for web
```

8 MCP tools: `sandbox_exec`, `sandbox_write_file`, `sandbox_read_file`, `sandbox_list_dir`, `sandbox_upload_file`, `sandbox_download_file`, `sandbox_status`, `sandbox_reset`.

</details>

```bash
pip install "bunkervm[all]"  # all framework integrations
```

---

## Install

```bash
pip install bunkervm
```

**Requirements:** Linux with `/dev/kvm`, or Windows WSL2 ([enable nested virtualization](https://learn.microsoft.com/en-us/windows/wsl/wsl-config#main-wsl-settings)). Python 3.10+.

The Firecracker binary, kernel, and rootfs (~100 MB) download automatically on first run. You can also fetch them manually from [Releases](https://github.com/ashishgituser/bunkervm/releases).
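Before installing, you can check the `/dev/kvm` requirement yourself. The snippet below is a small stand-alone sketch using only the standard library. `check_kvm` is a hypothetical helper and is not how `bunkervm info` is implemented.

```python
import os


def check_kvm(device="/dev/kvm"):
    """Report whether the KVM device exists and is usable by this user."""
    exists = os.path.exists(device)
    return {
        "device": device,
        "exists": exists,
        # Firecracker needs to open the device for reading and writing.
        "read_write": exists and os.access(device, os.R_OK | os.W_OK),
    }


print(check_kvm())
```

If `exists` is true but `read_write` is false, the usual cause is that your user is not in the `kvm` group.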

<details>
<summary><strong>WSL2 setup (Windows)</strong></summary>

Add to `%USERPROFILE%\.wslconfig`:
```ini
[wsl2]
nestedVirtualization=true
```
Then: `wsl --shutdown`

</details>

<details>
<summary><strong>Troubleshooting</strong></summary>

| Problem | Fix |
|---|---|
| `/dev/kvm` not found | `sudo modprobe kvm_intel` (or `kvm_amd`), or enable nested virtualization |
| Permission denied | `sudo usermod -aG kvm $USER` then re-login |
| Bundle download fails | Manual download from [Releases](https://github.com/ashishgituser/bunkervm/releases) → `~/.bunkervm/bundle/` |
| VM won't start | `bunkervm info` — diagnoses all prerequisites |

</details>

<details>
<summary><strong>Build from source</strong></summary>

```bash
git clone https://github.com/ashishgituser/bunkervm.git
cd bunkervm
sudo bash build/setup-firecracker.sh
sudo bash build/build-sandbox-rootfs.sh
pip install -e ".[dev]"
pytest tests/
```

</details>

---

## CLI

```
bunkervm demo                              # see it in action
bunkervm run script.py                     # run a script in a sandbox
bunkervm run -c "print(42)"               # inline code
bunkervm replay <session-id> --trace       # replay recorded session
bunkervm diff <session-a> <session-b>      # compare two agent runs
bunkervm snapshot list                     # list VM snapshots
bunkervm snapshot delete <name>            # delete a snapshot
bunkervm server --transport sse            # MCP server
bunkervm info                              # system readiness check
```

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## Security

See [SECURITY.md](SECURITY.md).

## License

Apache-2.0

---

<p align="center">
  <strong>If BunkerVM helps you build safer agents, <a href="https://github.com/ashishgituser/bunkervm">star the repo</a></strong>
</p>
