Metadata-Version: 2.4
Name: pymetabase
Version: 0.1.1
Summary: A Python client and CLI for the Metabase API with automatic chunking for large dataset exports
Project-URL: Homepage, https://github.com/mokarimi19/pymetabase
Project-URL: Documentation, https://github.com/mokarimi19/pymetabase#readme
Project-URL: Repository, https://github.com/mokarimi19/pymetabase
Project-URL: Issues, https://github.com/mokarimi19/pymetabase/issues
Author-email: 0xmrk <mmrzk1995@gmail.com>
License: MIT
License-File: LICENSE
Keywords: analytics,api,business-intelligence,cli,data,etl,export,metabase
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.25.0
Requires-Dist: sqlglot>=20.0.0
Provides-Extra: dev
Requires-Dist: black>=23.0; extra == 'dev'
Requires-Dist: isort>=5.0; extra == 'dev'
Requires-Dist: mypy>=1.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: excel
Requires-Dist: openpyxl>=3.1.5; extra == 'excel'
Requires-Dist: pandas>=2.0.3; extra == 'excel'
Provides-Extra: parquet
Requires-Dist: pyarrow>=10.0.0; extra == 'parquet'
Description-Content-Type: text/markdown

<p align="center">
  <h1 align="center">PyMetabase</h1>
  <p align="center">
    <strong>The most complete Python API wrapper and CLI for Metabase</strong>
  </p>
  <p align="center">
    <a href="https://pypi.org/project/pymetabase/"><img alt="PyPI" src="https://img.shields.io/pypi/v/pymetabase?color=blue&label=PyPI"></a>
    <a href="https://pypi.org/project/pymetabase/"><img alt="Python" src="https://img.shields.io/pypi/pyversions/pymetabase"></a>
    <a href="https://github.com/mokarimi19/pymetabase/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/mokarimi19/pymetabase"></a>
    <a href="https://github.com/mokarimi19/pymetabase/actions"><img alt="Tests" src="https://img.shields.io/badge/tests-194%20passed-brightgreen"></a>
    <a href="https://pypi.org/project/pymetabase/"><img alt="Downloads" src="https://img.shields.io/pypi/dm/pymetabase?color=orange"></a>
  </p>
</p>

---

Export millions of rows, manage cards, dashboards, collections, execute parameterized queries, and access every Metabase API endpoint — from Python or the command line.

## 🔍 Why PyMetabase?

| Feature | PyMetabase | Other Libraries |
|---------|:----------:|:---------------:|
| Full Metabase API coverage | ✅ | Partial |
| CLI tool | ✅ | ❌ |
| Automatic chunking for large exports | ✅ | ❌ |
| Parameterized queries (`{{param}}`) | ✅ | ❌ |
| Session token persistence & reuse | ✅ | ❌ |
| Resume interrupted exports | ✅ | ❌ |
| rclone-style remote management | ✅ | ❌ |
| Cards, Dashboards, Collections API | ✅ | Rare |
| Generic API access (any endpoint) | ✅ | Rare |
| Search, Users, Tables/Fields API | ✅ | ❌ |

## 📦 Installation

```bash
pip install pymetabase
```

**Optional extras:**
```bash
pip install pymetabase[excel]    # Excel export (pandas + openpyxl)
pip install pymetabase[parquet]  # Parquet export (pyarrow)
```

## ⚡ Quick Start

### Python — Export your first dataset in 4 lines

```python
from pymetabase import Metabase

with Metabase(url="https://metabase.example.com", username="analyst@company.com", password="pass") as mb:
    result = mb.export(database="Production", query="SELECT * FROM customers", output="customers.jsonl")
    print(f"Exported {result.total_rows:,} rows in {result.duration_seconds:.1f}s")
```

### CLI — Zero code, full power

```bash
# Step 1: Configure your Metabase server (one-time setup)
pymetabase config

# Step 2: Start working
pymetabase export -d Production -q "SELECT * FROM customers" -o customers.jsonl
pymetabase cards                    # see all saved questions
pymetabase search "revenue"         # find anything across your instance
pymetabase query -d Production -q "SELECT COUNT(*) FROM orders WHERE status = {{status}}" -p status=completed
```

---

## 📖 Features

### 🔌 Generic API Access

Hit **any** Metabase endpoint — no wrapper method needed:

```python
with Metabase(url=url, username=user, password=pwd) as mb:
    # Fetch all cards
    all_cards = mb.get("/api/card")

    # Create a new question
    mb.post("/api/card", json={
        "name": "Daily Active Users",
        "dataset_query": {"database": 1, "type": "native", "native": {"query": "SELECT ..."}},
        "display": "line",
        "visualization_settings": {},
    })

    # Update and delete
    mb.put("/api/card/1", json={"name": "Weekly Active Users"})
    mb.delete_resource("/api/card/1")
```

```bash
# Same from the command line
pymetabase api get /api/card
pymetabase api post /api/card --json-body '{"name": "New Question", ...}'
```

### 📋 Cards (Saved Questions)

```python
with Metabase(...) as mb:
    # Browse saved questions
    cards = mb.list_cards()
    card = mb.get_card(42)

    # Run a saved question with parameters
    result = mb.execute_card(42, parameters={"date_filter": "2024-01-01"})

    # Create a new question programmatically
    mb.create_card(
        name="Monthly Revenue by Region",
        dataset_query={
            "database": 1,
            "type": "native",
            "native": {"query": "SELECT region, SUM(amount) FROM sales GROUP BY region"},
        },
        collection_id=5,
    )

    # Housekeeping
    mb.update_card(42, name="Monthly Revenue by Region (v2)")
    mb.delete_card(99)
```

### 📊 Dashboards

```python
with Metabase(...) as mb:
    dashboards = mb.list_dashboards()
    dash = mb.get_dashboard(1)   # includes all cards on the dashboard

    mb.create_dashboard("Q4 Executive Report", collection_id=3, description="KPIs and trends")
    mb.update_dashboard(1, name="Q4 Executive Report — Final")
    mb.delete_dashboard(1)
```

### 📁 Collections

```python
with Metabase(...) as mb:
    collections = mb.list_collections()
    items = mb.get_collection_items("root")   # list everything at the top level
    items = mb.get_collection_items(7)         # or inside a specific collection

    mb.create_collection("Data Engineering", parent_id=1, color="#509EE3")
```

### 🔎 Search

Find cards, dashboards, collections, and tables in one call:

```python
results = mb.search("revenue", models=["card", "dashboard"])
```

```bash
pymetabase search "revenue"
```

### 👤 Users

```python
me = mb.get_current_user()               # who am I?
user = mb.get_user(1)                     # specific user
users = mb.list_users()                   # all users (admin only)
```

```bash
pymetabase whoami
pymetabase users
```

### 🗄️ Tables & Fields

Explore your database schema without SQL:

```python
tables = mb.list_tables("Production")     # all tables in a database
table = mb.get_table(1)                   # table details
metadata = mb.get_table_metadata(1)       # columns, types, FKs
field = mb.get_field(42)                  # single field info
values = mb.get_field_values(42)          # distinct values (great for filters)
```

---

## 🧩 Parameterized Queries

Use `{{param}}` placeholders in SQL — PyMetabase handles the Metabase template-tag wiring for you:

```python
with Metabase(...) as mb:
    # Filter orders by status and minimum amount
    rows = mb.query(
        "SELECT * FROM orders WHERE status = {{status}} AND total > {{min_total}}",
        database="Production",
        parameters={"status": "shipped", "min_total": 500},
    )

    # Export with parameters — same syntax
    mb.export(
        database="Production",
        query="SELECT * FROM signups WHERE created_at > {{since}}",
        output="recent_signups.jsonl",
        parameters={"since": "2024-01-01"},
    )
```

```bash
# CLI — pass parameters with -p key=value
pymetabase query -d Production \
    -q "SELECT * FROM orders WHERE status = {{status}}" \
    -p status=shipped

pymetabase export -d Production \
    -q "SELECT * FROM users WHERE role = {{role}}" \
    -o admins.jsonl -p role=admin
```

---

## 📤 Export

### Large Dataset Export

Automatic chunking for datasets with millions of rows. No memory issues, no timeouts:

```python
with Metabase(...) as mb:
    result = mb.export(
        database="Production",
        query="SELECT * FROM events",       # 10M+ rows? No problem
        output="events.jsonl",
        chunk_size=500_000,                  # fetch 500K rows at a time
        checkpoint_file="progress.json",     # resume if interrupted
        progress_callback=lambda cur, total, rate: print(f"{cur:,}/{total:,} rows ({rate:.0f}/s)"),
    )
    print(f"Done — {result.total_rows:,} rows, {result.chunks} chunks, {result.duration_seconds:.0f}s")
```

**How it works under the hood:**
1. Counts total rows with `SELECT COUNT(*)`
2. Wraps your query with `ROW_NUMBER() OVER ()` and fetches in offset chunks
3. Retries failed chunks with exponential backoff
4. Saves progress to a checkpoint file so you can resume after crashes
5. Works with any SQL — CTEs, JOINs, subqueries, window functions

### Export Formats

Output format is detected from the file extension:

```bash
pymetabase export -d DB -q "SELECT * FROM users" -o users.jsonl   # JSON Lines (default)
pymetabase export -d DB -q "SELECT * FROM users" -o users.json    # JSON array
pymetabase export -d DB -q "SELECT * FROM users" -o users.csv     # CSV
```

### Table Export with Filtering

Export a table directly — no SQL needed:

```python
with Metabase(...) as mb:
    mb.export_table(
        database="Production",
        table="users",
        output="active_users.csv",
        columns=["id", "name", "email", "plan"],
        where="active = true AND plan = 'pro'",
        order_by="created_at DESC",
        limit=10_000,
    )
```

---

## 🌐 Remote Management

Manage multiple Metabase servers like rclone manages cloud storage:

```bash
# Configure a remote (interactive)
pymetabase config

# List all configured remotes
pymetabase listremotes
# Configured remotes (2):
#   - production (default): https://metabase.company.com
#   - staging: https://staging-metabase.company.com

# Use a specific remote for any command
pymetabase -r staging export -d DB -q "SELECT 1" -o test.jsonl

# Manage remotes
pymetabase testremote production     # verify connection
pymetabase showremote production     # show config details
pymetabase selectremote staging      # change default
pymetabase deleteremote old-server   # remove a remote

# Quick identity check
pymetabase whoami
```

> 🔑 **Session tokens are persisted** — PyMetabase caches your auth token and reuses it across runs. No re-login on every command.

Config location:
- **macOS / Linux:** `~/.config/pymetabase/config.json`
- **Windows:** `%APPDATA%\pymetabase\config.json`

---

## ⚙️ Configuration

PyMetabase supports multiple configuration methods. Highest priority wins:

### 1. Direct Parameters

```python
mb = Metabase(url="https://metabase.company.com", username="analyst@company.com", password="...")
```

### 2. Remotes (recommended for CLI)

```bash
pymetabase config           # interactive setup
pymetabase -r myserver export ...
```

### 3. Credentials File

```json
[{"SERVER_NAME": "https://metabase.company.com", "USERNAME": "analyst@company.com", "PASSWORD": "..."}]
```

```python
mb = Metabase(credentials_file="credentials.json")
```

### 4. Config File (YAML)

```yaml
metabase:
  url: https://metabase.company.com
  username: analyst@company.com
  password: secret

defaults:
  chunk_size: 500000
  format: jsonl

retry:
  max_retries: 3
  delay: 1.0
  timeout: 600
```

### 5. Environment Variables

```bash
export METABASE_URL=https://metabase.company.com
export METABASE_USERNAME=analyst@company.com
export METABASE_PASSWORD=secret
```

---

## 📘 CLI Reference

| Command | Description |
|---------|-------------|
| `export` | Export SQL query results to file |
| `export-table` | Export a table with optional column/filter selection |
| `query` | Execute SQL and print results (`-f table\|json\|jsonl\|csv`) |
| `cards` | List all saved questions |
| `card <id>` | Show card details |
| `execute-card <id>` | Run a saved question (with optional `-p` params) |
| `dashboards` | List all dashboards |
| `dashboard <id>` | Show dashboard details and its cards |
| `collections` | List all collections |
| `collection <id>` | Show items inside a collection |
| `search <query>` | Search cards, dashboards, collections, tables |
| `users` | List all users (admin only) |
| `whoami` | Show current authenticated user |
| `list-databases` | List available databases |
| `list-tables` | List tables in a database |
| `api <method> <endpoint>` | Raw API request (GET/POST/PUT/DELETE) |
| `config` | Add, edit, or delete remotes |
| `listremotes` | List configured remotes |
| `selectremote` | Set the default remote |
| `showremote <name>` | Show remote configuration |
| `deleteremote <name>` | Delete a remote |
| `testremote [name]` | Test connection to a remote |

---

## 📘 Python API Reference

### Metabase Client

```python
Metabase(
    url: str = None,
    username: str = None,
    password: str = None,
    config_file: str = None,
    credentials_file: str = None,
    remote_name: str = None,     # for token persistence
    persist_token: bool = True,  # reuse sessions across runs
)
```

### Methods

| Category | Methods |
|----------|---------|
| **Generic API** | `get(endpoint)` `post(endpoint, **kwargs)` `put(endpoint, **kwargs)` `delete_resource(endpoint)` |
| **Cards** | `list_cards()` `get_card(id)` `create_card(...)` `update_card(id, ...)` `delete_card(id)` `execute_card(id, parameters=)` |
| **Dashboards** | `list_dashboards()` `get_dashboard(id)` `create_dashboard(...)` `update_dashboard(id, ...)` `delete_dashboard(id)` |
| **Collections** | `list_collections()` `get_collection(id)` `get_collection_items(id)` `create_collection(...)` |
| **Search** | `search(query, models=, limit=)` |
| **Users** | `get_current_user()` `list_users()` `get_user(id)` |
| **Tables/Fields** | `list_tables(db)` `get_table(id)` `get_table_metadata(id)` `get_field(id)` `get_field_values(id)` |
| **Queries** | `query(sql, database, parameters=)` |
| **Export** | `export(query, output, database, parameters=, chunk_size=, ...)` `export_table(table, output, ...)` |
| **Connection** | `connect()` `disconnect()` |

### ExportResult

```python
result.total_rows         # int — total rows exported
result.chunks             # int — number of chunks used
result.duration_seconds   # float — wall-clock time
result.output_file        # str — path to output file
result.rate_per_second    # float — rows/sec throughput
result.format             # str — "jsonl", "json", or "csv"
```

---

## 🛠️ Requirements

- Python 3.9+
- `requests` `sqlglot` `pyyaml`

---

## 🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request.

```bash
git clone https://github.com/mokarimi19/pymetabase.git
cd pymetabase
pip install -e ".[dev]"
pytest
```

## 📄 License

[MIT](LICENSE)
