Metadata-Version: 2.4
Name: pocket-analyst
Version: 0.1.0
Summary: Embeddable database exploration toolkit — connect, explore, and query any SQL database in Python.
License: MIT
Project-URL: Homepage, https://github.com/MattMcnally118/pocket-analyst-containerized
Keywords: database,sql,duckdb,sqlite,postgresql,analytics
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: sqlalchemy>=2.0.0
Requires-Dist: duckdb-engine>=0.13.0
Requires-Dist: psycopg2-binary>=2.9.0
Requires-Dist: pymysql>=1.1.0
Requires-Dist: pytz>=2024.1
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"

# Pocket Analyst

**Connect any database. Ask questions in plain English. Get answers backed by real SQL.**

Pocket Analyst is an AI-powered database analyst you run on your own infrastructure. Your data stays in your container, on your machine or server; only the plain-text question and a schema summary are sent to the LLM API.

---

## See It In Action

> *Connect a database → ask "which products had the highest revenue last quarter?" → get a plain-English answer with the SQL shown*

The web UI is the fastest way to see what Pocket Analyst does. Run the full package (four commands, below) and open your browser.

---

## Get Started — Choose Your Setup

### Full Package — open browser, connect a database, done

No coding required. Everything included.

```bash
git clone https://github.com/MattMcnally118/pocket-analyst-containerized.git
cd pocket-analyst-containerized
cp .env.example .env        # add your ANTHROPIC_API_KEY
docker compose up --build
```

Open [http://localhost:8080](http://localhost:8080)

---

### MCP Add-on — use Pocket Analyst inside Claude Desktop

Already using Claude Desktop? Add Pocket Analyst as a tool. The backend runs in Docker, and the connector takes about two minutes to set up.

**Step 1 — start the backend:**
```bash
docker compose -f docker-compose.core.yml up --build
```

**Step 2 — install the MCP connector:**
```bash
pip install -r requirements-mcp.txt
```

**Step 3 — add to Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "pocket-analyst": {
      "command": "/path/to/your/python3",
      "args": ["/path/to/pocket-analyst-containerized/mcp_server.py"]
    }
  }
}
```

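Restart Claude Desktop. The Pocket Analyst tools then appear automatically. The config path in Step 3 is the macOS location; on Windows the file is `%APPDATA%\Claude\claude_desktop_config.json`.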
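Step 3 can also be scripted, which avoids overwriting other MCP servers already listed in the file. The sketch below is one way to do it, using only the standard library. The function name `add_mcp_server` is hypothetical and not part of this repository, and the paths you pass in are your own.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, python_bin: str, server_script: str) -> dict:
    """Merge a "pocket-analyst" entry into a Claude Desktop config file.

    Hypothetical helper: not shipped with Pocket Analyst.
    """
    # Start from the existing config (if any) so other servers are preserved.
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Add or replace only the pocket-analyst entry.
    servers = config.setdefault("mcpServers", {})
    servers["pocket-analyst"] = {"command": python_bin, "args": [server_script]}

    # Write the merged config back, creating the directory if needed.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Call it with the config file path from Step 3, the Python interpreter that has `requirements-mcp.txt` installed, and the absolute path to `mcp_server.py`.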

---

### Core API — build your own interface on top

REST API access to all database tools. No UI, no assumptions about how you'll use it. Connect from R Shiny, Python, curl, or any HTTP client.

```bash
docker compose -f docker-compose.core.yml up --build
```

Key endpoints at `http://localhost:8080`:

| Method | Endpoint | What it does |
|--------|----------|--------------|
| `POST` | `/api/v1/connect` | Connect to a database |
| `GET` | `/api/v1/schema` | Full schema introspection |
| `GET` | `/api/v1/tables` | List all tables |
| `GET` | `/api/v1/tables/{name}` | Describe a table |
| `POST` | `/api/v1/query` | Run a SELECT query |
| `POST` | `/api/v1/disconnect` | End session |

Full API docs at [http://localhost:8080/docs](http://localhost:8080/docs) when running.

---

## Supported Databases

| Database | Connection string format |
|----------|--------------------------|
| DuckDB | `duckdb:////data/mydb.duckdb` |
| SQLite | `sqlite:///mydb.sqlite` |
| PostgreSQL | `postgresql://user:pass@host/dbname` |
| MySQL | `mysql+pymysql://user:pass@host/dbname` |

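Drop `.duckdb` or `.sqlite` files into `datasets_safe/` and they'll be available inside the container at `/data/`. Reference them by absolute path, which takes four slashes (for example `sqlite:////data/mydb.sqlite`); the three-slash SQLite form in the table is a path relative to the working directory.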
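The mapping from a file in `datasets_safe/` to its in-container connection string can be sketched as a small helper. The function name `container_url` is hypothetical; the `/data` mount point and the URL schemes are taken from the text above.

```python
from pathlib import PurePosixPath

# File suffix -> URL scheme, per the supported-databases table.
SCHEMES = {".duckdb": "duckdb", ".sqlite": "sqlite"}


def container_url(filename: str, mount: str = "/data") -> str:
    """Return the connection string for a file dropped into datasets_safe/."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix not in SCHEMES:
        raise ValueError(f"unsupported file type: {filename!r}")
    absolute = PurePosixPath(mount) / PurePosixPath(filename).name
    # "scheme:///" introduces the path; an absolute path brings its own
    # leading slash, which is why the result has four slashes in a row.
    return f"{SCHEMES[suffix]}:///{absolute}"
```

For example, `container_url("mydb.duckdb")` produces the DuckDB string shown in the table.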

---

## Data Privacy

Pocket Analyst runs on your infrastructure. Your database connection, queries, and results stay inside your container. The only data that leaves is the plain-text question and a schema summary, which are sent to the LLM API you choose (Anthropic, AWS Bedrock, or GCP Vertex AI).

---

## LLM Providers

Set `LLM_PROVIDER` in your `.env`:

**Anthropic (default)**
```env
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL_ID=claude-sonnet-4-6
```

**AWS Bedrock**
```env
LLM_PROVIDER=bedrock
BEDROCK_REGION=us-east-1
BEDROCK_MODEL_ID=anthropic.claude-sonnet-4-6-v1:0
```

**GCP Vertex AI**
```env
LLM_PROVIDER=vertex
VERTEX_PROJECT_ID=my-gcp-project
VERTEX_LOCATION=us-central1
VERTEX_MODEL_ID=claude-sonnet-4-6@20251001
```
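Before starting the container it can help to check that the variables for the chosen provider are set. The sketch below infers a "required" list from the three example blocks above. That list is an assumption, since the application may supply defaults or need additional cloud credentials, and the names `REQUIRED` and `missing_provider_vars` are hypothetical.

```python
from typing import Mapping

# ASSUMPTION: inferred from the example .env blocks, not from app/config.py.
REQUIRED = {
    "anthropic": ["ANTHROPIC_API_KEY"],
    "bedrock": ["BEDROCK_REGION", "BEDROCK_MODEL_ID"],
    "vertex": ["VERTEX_PROJECT_ID", "VERTEX_LOCATION", "VERTEX_MODEL_ID"],
}


def missing_provider_vars(env: Mapping[str, str]) -> list[str]:
    """Return the variables that are unset or empty for the selected provider."""
    provider = env.get("LLM_PROVIDER", "anthropic")  # documented default
    if provider not in REQUIRED:
        raise ValueError(f"unknown LLM_PROVIDER: {provider!r}")
    return [name for name in REQUIRED[provider] if not env.get(name)]
```

Passing `os.environ` checks the current shell; passing a dict parsed from `.env` checks the file without exporting anything.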

---

## Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| `LLM_PROVIDER` | `anthropic`, `bedrock`, or `vertex` | `anthropic` |
| `ANTHROPIC_API_KEY` | Anthropic API key | — |
| `ANTHROPIC_MODEL_ID` | Model to use | `claude-sonnet-4-6` |
| `MAX_AGENT_TURNS` | Max tool-calling rounds per message | `10` |
| `MAX_RESPONSE_TOKENS` | Max tokens per response | `4096` |
| `LOG_LEVEL` | `DEBUG`, `INFO`, `WARNING` | `INFO` |
| `PORT` | Port to serve on | `8080` |
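To make the defaults concrete, the table can be mirrored as a small settings object that reads from the environment. This is only an illustration of how the documented defaults behave. `Settings` and `load_settings` are hypothetical names, and the real `app/config.py` may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Defaults copied from the configuration table (illustrative only)."""
    llm_provider: str = "anthropic"
    anthropic_model_id: str = "claude-sonnet-4-6"
    max_agent_turns: int = 10
    max_response_tokens: int = 4096
    log_level: str = "INFO"
    port: int = 8080


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the documented defaults."""
    d = Settings()
    return Settings(
        llm_provider=env.get("LLM_PROVIDER", d.llm_provider),
        anthropic_model_id=env.get("ANTHROPIC_MODEL_ID", d.anthropic_model_id),
        # Numeric values arrive as strings from the environment.
        max_agent_turns=int(env.get("MAX_AGENT_TURNS", d.max_agent_turns)),
        max_response_tokens=int(env.get("MAX_RESPONSE_TOKENS", d.max_response_tokens)),
        log_level=env.get("LOG_LEVEL", d.log_level),
        port=int(env.get("PORT", d.port)),
    )
```

`ANTHROPIC_API_KEY` is left out on purpose: it has no default and is better kept out of objects that might be logged.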

---

## Deployment

### AWS ECS (Fargate)

See [`cloudformation/`](cloudformation/) for ready-to-use CloudFormation templates.

### GCP Cloud Run

See [`scripts/cloud-run-service.yaml`](scripts/cloud-run-service.yaml) and [`scripts/setup-gcp-iam.sh`](scripts/setup-gcp-iam.sh).

---

## Project Structure

```
├── app/
│   ├── agent/loop.py          # Agentic tool-calling loop (LLM decides what to query)
│   ├── llm/                   # LLM clients (Anthropic, Bedrock, Vertex)
│   ├── routers/               # FastAPI routes (chat, REST API, health)
│   ├── tools/                 # Tool registry and executor
│   ├── config.py              # Settings
│   ├── session.py             # Per-session state (connection, history)
│   └── main.py                # App entry point
├── db/
│   ├── connection.py          # SQLAlchemy engine factory
│   └── schema.py              # Schema introspection (all 4 dialects)
├── tools_impl/
│   ├── connect.py             # connect / disconnect / set_context
│   ├── explore.py             # explore_schema / describe_table
│   └── query.py               # run_query / list_tables
├── static/index.html          # Web UI
├── datasets_safe/             # Drop database files here (gitignored)
├── mcp_server.py              # Claude Desktop MCP connector
├── docker-compose.yml         # Full package (same as docker-compose.full.yml)
├── docker-compose.full.yml    # Tier 3: full package
├── docker-compose.core.yml    # Tier 1: core API only
└── ARCHITECTURE.md            # Layer diagrams and design decisions
```

---

## Architecture

See [ARCHITECTURE.md](ARCHITECTURE.md) for the full layer diagram, 3-tier distribution model, and explanation of how the agent, REST API, and MCP server relate to each other.
