Metadata-Version: 2.4
Name: tappi
Version: 0.8.2
Summary: Lightweight CDP browser control for Python — with an AI agent that can browse, read PDFs, manage files, and automate tasks.
Project-URL: Homepage, https://github.com/shaihazher/tappi
Project-URL: Repository, https://github.com/shaihazher/tappi
Project-URL: Issues, https://github.com/shaihazher/tappi/issues
Author-email: Azeruddin Sheik <shaihazher@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: agent,ai,automation,browser,cdp,chrome,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Requires-Dist: apscheduler>=3.10.0
Requires-Dist: boto3>=1.28.0
Requires-Dist: fastapi>=0.110.0
Requires-Dist: litellm>=1.40.0
Requires-Dist: mcp>=1.0
Requires-Dist: openpyxl>=3.1.0
Requires-Dist: pymupdf>=1.24.0
Requires-Dist: uvicorn[standard]>=0.27.0
Requires-Dist: weasyprint>=62.0
Requires-Dist: websockets>=12.0
Description-Content-Type: text/markdown

# tappi

**Your own AI agent that controls a real browser and manages files — running entirely on your machine.**

🌐 **[tappi.synthworx.com](https://tappi.synthworx.com)** — Official home page & docs. Tappi is and will always be fully open source (MIT).

Give it a task in plain English. It opens your browser, navigates pages, clicks buttons, fills forms, reads content, creates PDFs, updates spreadsheets, and schedules recurring jobs. All your logins and cookies carry over. Everything stays local — your data never leaves your machine.

Think of it as a personal automation assistant with two superpowers: **browser control** and **file management**, sandboxed to one directory. Secure enough for work. Powerful enough to replace most browser automation scripts you've ever written.

### Why tappi?

Every AI browser tool today pays a tax — either in tokens or in capability:

- **Screenshot-based agents** (Operator, Computer Use) send full page images to the LLM. The model squints at pixels, guesses coordinates, and prays it clicks the right button. A single interaction can burn 5-10K tokens on vision alone.
- **DOM/accessibility tree tools** (Playwright MCP, browser tools) dump the entire page structure into context. A single Reddit page can produce 50K+ tokens of nested elements. The LLM reads a novel just to find a button.

Tappi does neither. It indexes interactive elements into a compact numbered list:

```
[0] (link) Homepage → https://github.com/
[1] (button) Sign in
[2] (link) Explore → /explore
[3] (button) Submit Order
```

The LLM says `click 3`. Done. **~200 tokens instead of 5-50K.** That's the difference.

- **10x more token-efficient** than both screenshot-based and DOM-dump approaches. Structured element lists give the model exactly what it needs — nothing more.
- **Better LLM decisions.** Numbered elements with semantic labels (`[3] (button) Submit Order`) are unambiguous. No hallucinated CSS selectors. No coordinate guessing. No wading through thousands of DOM nodes.
- **Real browser, real sessions.** Connects to Chrome via CDP — your saved logins, cookies, and extensions are all there. Log in once, automate forever.
- **Sandboxed by design.** One workspace directory. One browser. No filesystem access beyond the sandbox. Safe for corporate environments where you can't install full automation platforms.
- **Works everywhere.** Linux, macOS, Windows. Python 3.10+. Single `pip install`.

```bash
pip install tappi            # Everything: CDP + MCP server + AI agent
```

---

## Table of Contents

- [Installation](#installation)
- [Quick Start](#quick-start)
- [AI Agent Mode](#ai-agent-mode) ← **New**
- [Web UI](#web-ui) ← **New**
- [Tutorial: Your First Automation](#tutorial-your-first-automation)
- [How It Works](#how-it-works)
- [Python Library](#using-as-a-python-library)
- [CLI Reference](#cli-reference)
- [Profiles](#profiles)
- [Shadow DOM Support](#shadow-dom-support)
- [MCP Server](#mcp-server) ← **New**
- [FAQ](#faq)
- [License](#license)

---

## Installation

### One-Line Installer (recommended)

Downloads Python if needed, creates a virtual environment, installs tappi, and drops a **"Launch tappi"** shortcut on your Desktop. Inspect the scripts first if you like — they're [in the repo](install/).

**macOS:**
```bash
curl -fsSL https://raw.githubusercontent.com/shaihazher/tappi/main/install/install-macos.sh | bash
```

**Linux (Debian/Ubuntu, Fedora, Arch):**
```bash
curl -fsSL https://raw.githubusercontent.com/shaihazher/tappi/main/install/install-linux.sh | bash
```

**Windows (PowerShell):**
```powershell
irm https://raw.githubusercontent.com/shaihazher/tappi/main/install/install-windows.ps1 | iex
```

After install, double-click **"Launch tappi"** on your Desktop — it starts the browser, launches the web UI, and opens it automatically. Pick your AI provider and API key in the Settings page on first launch. See the [Web UI Tutorial](docs/blog/tappi-web-ui-tutorial.md) for a visual walkthrough.

### Manual Install (with venv)

If you prefer to set things up yourself. Requires Python 3.10+.

<details>
<summary><b>macOS</b></summary>

```bash
# Install Python 3.13 (skip if you already have 3.10+)
brew install python@3.13

# Create and activate a virtual environment
python3.13 -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate

# Install tappi
pip install --upgrade pip
pip install tappi

# Verify
bpy --version
```

To auto-activate on every new terminal, add to your `~/.zshrc` (or `~/.bash_profile`):
```bash
source ~/.tappi-venv/bin/activate
```

</details>

<details>
<summary><b>Linux (Debian/Ubuntu)</b></summary>

```bash
# Install Python 3.13 and venv (skip if you already have 3.10+)
sudo apt update
sudo apt install -y python3 python3-pip python3-venv

# Create and activate a virtual environment
python3 -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate

# Install tappi
pip install --upgrade pip
pip install tappi

# Verify
bpy --version
```

To auto-activate on every new terminal, add to your `~/.bashrc`:
```bash
source ~/.tappi-venv/bin/activate
```

</details>

<details>
<summary><b>Linux (Fedora/RHEL)</b></summary>

```bash
# Install Python 3.13 (skip if you already have 3.10+)
sudo dnf install -y python3 python3-pip

# Create and activate a virtual environment
python3 -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate

# Install tappi
pip install --upgrade pip
pip install tappi

# Verify
bpy --version
```

</details>

<details>
<summary><b>Linux (Arch)</b></summary>

```bash
# Install Python (skip if you already have 3.10+)
sudo pacman -Sy python python-pip

# Create and activate a virtual environment
python -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate

# Install tappi
pip install --upgrade pip
pip install tappi

# Verify
bpy --version
```

</details>

<details>
<summary><b>Windows</b></summary>

```powershell
# Install Python 3.13 (skip if you already have 3.10+)
winget install Python.Python.3.13

# Create and activate a virtual environment
python -m venv $env:USERPROFILE\.tappi-venv
& "$env:USERPROFILE\.tappi-venv\Scripts\Activate.ps1"

# Install tappi
pip install --upgrade pip
pip install tappi

# Verify
bpy --version
```

To auto-activate on every new terminal, add to your PowerShell profile (`notepad $PROFILE`):
```powershell
. "$env:USERPROFILE\.tappi-venv\Scripts\Activate.ps1"
```

</details>

### Quick Install (no venv)

If you just want to get going and don't care about virtual environments:

```bash
pip install tappi
```

---

## Quick Start

If you used the one-line installer, just double-click **"Launch tappi"** on your Desktop. Done.

From the terminal:

```bash
# Launch browser + web UI (opens http://127.0.0.1:8321)
bpy launch && bpy serve

# Or use the CLI agent directly
bpy setup                    # one-time: pick provider + API key
bpy launch                   # start the browser
bpy agent "Go to github.com and find today's trending Python repos"
```

---

## AI Agent Mode

The agent is an LLM with 6 tools that can browse the web, read/write files, create PDFs, manage spreadsheets, run shell commands, and schedule recurring tasks — all within a sandboxed workspace directory.

### Setup

```bash
bpy setup
```

The wizard walks you through:

1. **LLM Provider** — OpenRouter, Anthropic, Claude Max (OAuth), OpenAI, AWS Bedrock, Azure, Google Vertex
2. **API Key** — paste your key (or OAuth token for Claude Max)
3. **Model** — defaults per provider, fully configurable
4. **Workspace** — sandboxed directory for all file operations
5. **Browser Profile** — which browser profile the agent uses
6. **Shell Access** — toggle on/off

All config lives in `~/.tappi/config.json`.

### Providers

| Provider | Auth | Status |
|----------|------|--------|
| **OpenRouter** | API key | ✅ Ready |
| **Anthropic** | API key | ✅ Ready |
| **Claude Max (OAuth)** | OAuth token (`sk-ant-oat01-...`) | ✅ Ready |
| **OpenAI** | API key | ✅ Ready |
| **AWS Bedrock** | AWS credentials | ✅ Ready (via LiteLLM) |
| **Azure OpenAI** | API key + endpoint | ✅ Ready (via LiteLLM) |
| **Google Vertex AI** | Service account | ✅ Ready (via LiteLLM) |

All providers work through [LiteLLM](https://github.com/BerriAI/litellm) — one interface, any model.

#### Claude Max (OAuth) — Use Your Subscription

If you have a Claude Pro/Max subscription ($20-200/mo), you can use your **OAuth token** instead of paying per-API-call. This is the same token Claude Code uses.

```bash
bpy setup
# Choose "Claude Max (OAuth)"
# Paste your token: sk-ant-oat01-...
```

**Where to find your token:**

- If you use Claude Code: check your credentials file or environment
- The token format is `sk-ant-oat01-...` (different from API keys which are `sk-ant-api03-...`)
- It works as a drop-in replacement — no proxy, no special config

### CLI Usage

#### Interactive mode

```bash
bpy agent
```

```
tappi agent (type 'quit' to exit, 'reset' to clear)

You: Go to hacker news and find the top post about AI
  🔧 browser → launch
  🔧 browser → open
  🔧 browser → elements
  🔧 browser → text

Agent: The top AI-related post on Hacker News right now is "GPT-5 Released"
with 342 points. It links to openai.com/blog/gpt5 and the discussion has
127 comments. Want me to read the article or the comments?
```

#### One-shot mode

```bash
bpy agent "Create a PDF report of today's weather in Houston"
```

The agent figures out the steps: open a weather site → extract data → create HTML → convert to PDF → save to workspace.

### Tools

The agent has 6 tools, each exposed as a JSON schema the LLM calls natively:

| Tool | What it does |
|------|-------------|
| **browser** | Navigate, click, type, read pages, screenshots, tab management. Uses your real browser with saved logins. |
| **files** | Read, write, list, move, copy, delete files — sandboxed to workspace. |
| **pdf** | Read text from PDFs (PyMuPDF), create PDFs from HTML (WeasyPrint). |
| **spreadsheet** | Read/write CSV and Excel (.xlsx) files, create new ones with headers. |
| **shell** | Run shell commands (cwd = workspace). Can be disabled in settings. |
| **cron** | Schedule recurring tasks with cron expressions or intervals. |

### How the Agent Loop Works

```
User message
    ↓
┌──────────────────────────┐
│   LLM (via LiteLLM)      │ ◄── Sees all 6 tools as JSON schemas
│   Decides what to do      │
└──────────┬───────────────┘
           │
           ▼
    ┌─ Tool calls? ──┐
    │                 │
   Yes               No → Return text response
    │
    ▼
Execute each tool call
    │
    ▼
Append results to conversation
    │
    ▼
Loop back to LLM ────────────►  (max 50 iterations)
```

The loop is synchronous — each tool call blocks until complete. No timeouts. The LLM sees tool results and decides the next step, just like a human would.

### Cron (Scheduled Tasks)

Tell the agent to schedule recurring tasks:

```
You: Schedule a job to check trending repos on GitHub every morning at 9 AM
Agent: Done. Created job "GitHub Trends" with schedule "0 9 * * *".
```

Jobs are stored in `~/.tappi/jobs.json` and persist across restarts. When `bpy serve` is running, APScheduler fires each job in its own agent session.

```bash
# Via CLI
bpy agent "List my scheduled jobs"
bpy agent "Pause the GitHub Trends job"
bpy agent "Remove job abc123"
```

---

## Web UI

```bash
bpy serve                    # http://127.0.0.1:8321
bpy serve --port 9000        # custom port
```

📖 **[Full visual walkthrough →](docs/blog/tappi-web-ui-tutorial.md)**

The web UI has 4 sections:

### 💬 Chat

Full chat interface with live tool call visibility. As the agent works, you see each tool call and its result in real-time via WebSocket.

### 🌍 Browser Profiles

View and create browser profiles. Each profile has its own Chrome sessions (cookies, logins) and CDP port. Create profiles for different use cases — work, personal, social media.

### ⏰ Scheduled Jobs

View all cron jobs with their schedule, status (active/paused), and task description. Jobs are created via chat ("schedule a task to...").

### ⚙️ Settings

- **Model** — change the LLM model
- **Browser Profile** — select which profile the agent uses
- **Shell Access** — enable/disable shell commands
- **Workspace** — view the sandboxed directory

> **Note:** Provider and API key changes require `bpy setup` (CLI) — these aren't exposed in the web UI for security.

---

## Tutorial: Your First Automation

### Step 1: Launch the browser

```bash
bpy launch
```

```
✓ Chrome launched on port 9222
  Profile: ~/.tappi/profiles/default

⚡ First launch — a fresh Chrome window opened.
   Log into the sites you want to automate (Gmail, GitHub, etc.).
   Those sessions will persist for all future launches.
```

**First time only:** A fresh Chrome window opens. Log into the websites you want to automate. Close the window when done. Your sessions are saved in the profile.

### Step 2: Control it

```bash
bpy open github.com         # Navigate
bpy elements                # See what's clickable
bpy click 3                 # Click element [3]
bpy type 5 "hello world"    # Type into element [5]
bpy text                    # Read the page
bpy screenshot page.png     # Screenshot
```

Every interactive element gets a number. Use that number with `click` and `type`.

---

## How It Works

### The connection

```
┌─────────────┐     CDP (WebSocket)     ┌──────────────────┐
│  tappi  │ ◄──────────────────────► │  Chrome/Chromium  │
│  (your code) │     localhost:9222       │  (your sessions)  │
└─────────────┘                          └──────────────────┘
```

`bpy launch` starts Chrome with `--remote-debugging-port=9222` and a persistent `--user-data-dir`. All commands connect to that port via WebSocket.

### Real mouse events

`click` uses CDP's `Input.dispatchMouseEvent` — real mouse presses, not `.click()`. Works with React, Vue, Angular, and every framework.

### Shadow DOM piercing

The element scanner recursively enters every shadow root. Reddit, GitHub, Salesforce, Angular Material — all work automatically.

### Framework-aware typing

`type` dispatches proper `input` and `change` events using React's native value setter. SPAs with controlled components get the value update correctly.

---

## Using as a Python Library

```python
from tappi import Browser

Browser.launch()              # Start Chrome
b = Browser()                 # Connect

b.open("https://github.com")
elements = b.elements()       # List interactive elements
b.click(1)                    # Click by index
b.type(2, "search query")     # Type into input
text = b.text()               # Read visible text
b.screenshot("page.png")      # Screenshot
b.upload("~/file.pdf")        # Upload file
```

### Profile management

```python
from tappi.profiles import create_profile, list_profiles, get_profile

create_profile("work")        # → port 9222
create_profile("personal")    # → port 9223

# Run multiple simultaneously
work = get_profile("work")
Browser.launch(port=work["port"], user_data_dir=work["path"])
b = Browser(f"http://127.0.0.1:{work['port']}")
```

### Agent as a library

```python
from tappi.agent.loop import Agent

agent = Agent(
    browser_profile="default",
    on_tool_call=lambda name, params, result: print(f"🔧 {name}"),
)

response = agent.chat("Go to github.com and find trending repos")
print(response)

# Multi-turn
response = agent.chat("Now check the first one and summarize the README")
print(response)

# Reset conversation
agent.reset()
```

---

## CLI Reference

### Agent Commands

| Command | Description |
|---------|-------------|
| `bpy setup` | Configure LLM provider, workspace, browser |
| `bpy agent [message]` | Chat with the agent (interactive or one-shot) |
| `bpy serve [--port 8321]` | Start the web UI |

### Browser Commands

| Command | Description |
|---------|-------------|
| `bpy launch [name]` | Start Chrome with a named profile |
| `bpy launch new [name]` | Create a new profile |
| `bpy launch list` | List all profiles |
| `bpy launch --default <name>` | Set the default profile |

### Navigation

| Command | Description |
|---------|-------------|
| `bpy open <url>` | Navigate to URL |
| `bpy url` | Print current URL |
| `bpy back` / `forward` / `refresh` | History navigation |

### Interaction

| Command | Description |
|---------|-------------|
| `bpy elements [selector]` | List interactive elements (numbered) |
| `bpy click <index>` | Click element by number |
| `bpy type <index> <text>` | Type into element |
| `bpy upload <path> [selector]` | Upload file |

### Content

| Command | Description |
|---------|-------------|
| `bpy text [selector]` | Extract visible text |
| `bpy html <selector>` | Get element HTML |
| `bpy eval <js>` | Run JavaScript |
| `bpy screenshot [path]` | Save screenshot |

### Other

| Command | Description |
|---------|-------------|
| `bpy tabs` / `tab <n>` / `newtab` / `close` | Tab management |
| `bpy scroll <dir> [px]` | Scroll the page |
| `bpy wait <ms>` | Wait (for scripts) |

---

## Profiles

Each profile is a separate Chrome session with its own logins, cookies, and CDP port.

```bash
bpy launch                  # Default profile (port 9222)
bpy launch new work         # Create "work" (port 9223)
bpy launch work             # Launch it
bpy launch list             # See all profiles
bpy launch --default work   # Set default
bpy launch delete old       # Remove a profile

# Run multiple simultaneously
bpy launch                  # Terminal 1: default on 9222
bpy launch work             # Terminal 2: work on 9223
CDP_URL=http://127.0.0.1:9223 bpy tabs   # Control work profile
```

Profiles live at `~/.tappi/profiles/<name>/`. Config at `~/.tappi/config.json`.

---

## Shadow DOM Support

tappi automatically pierces shadow DOM boundaries. No configuration needed.

```bash
bpy open reddit.com
bpy elements        # Finds elements inside shadow roots
bpy click 5         # Works normally
```

---

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CDP_URL` | CDP endpoint URL | `http://127.0.0.1:9222` |
| `NO_COLOR` | Disable colored output | (unset) |
| `ANTHROPIC_API_KEY` | Anthropic/Claude Max key | (from config) |
| `OPENROUTER_API_KEY` | OpenRouter key | (from config) |
| `OPENAI_API_KEY` | OpenAI key | (from config) |

---

## MCP Server

tappi includes a built-in MCP (Model Context Protocol) server, so you can use it with **Claude Desktop**, **Cursor**, **Windsurf**, **OpenClaw**, or any MCP-compatible AI agent.

### Claude Desktop — One-Click Install (.mcpb)

The easiest way to add tappi to Claude Desktop is the **`.mcpb` bundle** — a single file that installs everything:

1. Download [`tappi-0.5.1.mcpb`](https://github.com/shaihazher/tappi/releases/latest) from the latest release
2. Double-click it — Claude Desktop installs the extension automatically
3. Start Chrome with `tappi launch` or `--remote-debugging-port=9222`
4. Ask Claude to browse the web

No `pip install`. No config editing. No Python on your PATH. The bundle includes all source code and dependencies — Claude Desktop manages the runtime via `uv`.

> **See it in action:** [Real Claude Desktop conversation using tappi MCP](https://claude.ai/share/c8a162bd-d35b-4db6-a704-c3bc57ee0498)

### Manual Setup (pip)

If you prefer manual installation or use other MCP clients:

```bash
pip install tappi
```

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "tappi": {
      "command": "tappi",
      "args": ["mcp"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}
```

**Don't want to install anything?** Use `uvx` (comes with [uv](https://docs.astral.sh/uv/)):

```json
{
  "mcpServers": {
    "tappi": {
      "command": "uvx",
      "args": ["tappi", "mcp"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}
```

**Prefer npm?** There's a thin wrapper that delegates to the Python server:

```bash
npx tappi-mcp
```

Claude Desktop config with npx:

```json
{
  "mcpServers": {
    "tappi": {
      "command": "npx",
      "args": ["tappi-mcp"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}
```

### Cursor / Windsurf

Same config format — add the `tappi` server to your MCP settings with the command above.

### OpenClaw

tappi is available as an [OpenClaw](https://openclaw.ai) skill on [ClawHub](https://clawhub.com):

```bash
clawhub install tappi
```

### HTTP/SSE Transport

For MCP clients that prefer HTTP instead of stdio:

```bash
tappi mcp --sse                    # default: 127.0.0.1:8377
tappi mcp --sse --port 9000        # custom port
```

### Available Tools

The MCP server exposes 23 tools:

| Tool | Description |
|------|-------------|
| `tappi_open` | Navigate to a URL |
| `tappi_elements` | List interactive elements (numbered, shadow DOM piercing) |
| `tappi_click` | Click element by index |
| `tappi_type` | Type into element by index |
| `tappi_text` | Extract visible page text |
| `tappi_eval` | Run JavaScript in page context |
| `tappi_screenshot` | Capture page screenshot |
| `tappi_tabs` | List open tabs |
| `tappi_tab` | Switch tab |
| `tappi_scroll` | Scroll page |
| `tappi_upload` | Upload file (bypasses OS dialog) |
| `tappi_click_xy` | Click at coordinates (cross-origin iframes) |
| `tappi_iframe_rect` | Get iframe bounding box |
| ... and 10 more | `newtab`, `close`, `url`, `back`, `forward`, `refresh`, `html`, `hover_xy`, `drag_xy`, `wait` |

### How It's Different

Unlike Playwright MCP or browser tool ARIA snapshots, tappi's MCP server:

- **Connects to your existing Chrome** — all sessions, cookies, extensions carry over
- **Pierces shadow DOM** — Gmail, Reddit, GitHub all work natively
- **Returns compact indexed output** — `[3] (button) Submit` instead of a 50K-token accessibility tree
- **Uses 3-10x fewer tokens** per interaction
- **No headless browser** — runs in your real Chrome, invisible to bot detection

### Prerequisites

Start Chrome with remote debugging enabled:

```bash
# Option 1: tappi launch (manages profiles for you)
tappi launch

# Option 2: Manual
google-chrome --remote-debugging-port=9222
```

Set `CDP_URL` in your MCP config to point to your Chrome instance (default: `http://127.0.0.1:9222`).

---

## FAQ

**Q: What's the difference between `bpy agent` and `bpy` commands?**
`bpy agent` talks to an LLM that decides what to do. `bpy click 3` directly executes a browser command. Use agent mode for complex multi-step tasks; use direct commands for scripting.

**Q: Can I use my Claude Max subscription instead of paying per-API-call?**
Yes. Choose "Claude Max (OAuth)" during `bpy setup` and paste your OAuth token (`sk-ant-oat01-...`). Same token Claude Code uses.

**Q: Do I need to log in every time?**
No. Log in once during your first `bpy launch`. Sessions persist in the profile directory.

**Q: What browsers are supported?**
Chrome, Chromium, Brave, Microsoft Edge — anything Chromium-based with CDP support.

**Q: Does it work headless?**
Yes. `bpy launch --headless` runs without a visible window. Log in with a visible window first to set up sessions.

**Q: Is my data safe?**
File operations are sandboxed to your workspace directory. The agent cannot access files outside it. Shell access can be disabled. API keys are stored locally in `~/.tappi/config.json`.

**Q: How is this different from Selenium/Playwright?**

| | tappi | Selenium | Playwright |
|---|:---:|:---:|:---:|
| Session reuse | ✅ | ❌ | Partial |
| AI agent | ✅ | ❌ | ❌ |
| Shadow DOM | ✅ | ❌ | ❌ |
| Dependencies | 1 (core) | Heavy | Heavy |
| Install size | ~100KB | ~50MB | ~200MB+ |

---

## Architecture

```
tappi/
├── tappi/
│   ├── core.py                 # CDP engine (Phase 1)
│   ├── cli.py                  # bpy CLI
│   ├── profiles.py             # Named profile management
│   ├── js_expressions.py       # Injected JS for element scanning
│   ├── agent/
│   │   ├── loop.py             # Agentic while-loop (LiteLLM)
│   │   ├── config.py           # Provider/workspace/model config
│   │   ├── setup.py            # Interactive setup wizard
│   │   └── tools/
│   │       ├── browser.py      # Browser tool (wraps core.py)
│   │       ├── files.py        # Sandboxed file ops
│   │       ├── pdf.py          # PDF read (PyMuPDF) + create (WeasyPrint)
│   │       ├── spreadsheet.py  # CSV + Excel (openpyxl)
│   │       ├── shell.py        # Sandboxed shell execution
│   │       └── cron.py         # APScheduler cron jobs
│   └── server/
│       └── app.py              # FastAPI web UI + API
└── pyproject.toml
```

---

## Blog Posts

- 🏆 [Tappi Is the Most Token-Efficient Browser Tool for AI Agents](https://dev.to/azeruddin_sheikh_f75230b5/tappi-is-the-most-token-efficient-browser-tool-for-ai-agents-nothing-else-comes-close-33gk) — Deep competitive analysis with live benchmarks vs Agent-Browser, Playwright CLI, and more
- 🚀 [Every AI Browser Tool Is Broken Except One](https://dev.to/azeruddin_sheikh_f75230b5/every-ai-browser-tool-is-broken-except-one-33i0) — The original benchmark (59K tokens 3/3 ✅ vs 252K for browser tools)
- 🔌 [Tappi MCP Is Live](https://dev.to/azeruddin_sheikh_f75230b5/tappi-mcp-is-live-give-claude-desktop-a-real-browser-1kk4) — MCP server for Claude Desktop
- 🖥️ [Tappi Web UI Tutorial](https://dev.to/azeruddin_sheikh_f75230b5/tappi-web-ui-the-complete-setup-guide-4o5p) — Visual walkthrough

---

## License

MIT
