Metadata-Version: 2.4
Name: dbt-rabbit-bigquery
Version: 1.1.1.1.9.2
Summary: A dbt adapter that automatically optimizes BigQuery job costs using the Rabbit API
Author-email: Rabbit Team <success@followrabbit.ai>
Maintainer-email: Rabbit Team <success@followrabbit.ai>
License: Apache License 2.0
Project-URL: Homepage, https://followrabbit.ai
Project-URL: Documentation, https://followrabbit.ai/docs/dbt-adapter
Project-URL: Source Code, https://github.com/followtherabbit/dbt-rabbit-bigquery
Project-URL: Bug Tracker, https://github.com/followtherabbit/dbt-rabbit-bigquery/issues
Project-URL: Changelog, https://github.com/followtherabbit/dbt-rabbit-bigquery/blob/master/CHANGELOG.md
Keywords: dbt,bigquery,google-cloud,data-warehouse,optimization,cost-optimization,analytics,data-engineering,sql,rabbit
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Office/Business
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dbt-bigquery==1.9.2
Requires-Dist: rabbit-bq-job-optimizer>=0.1.12
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: pytest-mock>=3.10.0; extra == "dev"
Requires-Dist: black>=24.1.0; extra == "dev"
Requires-Dist: flake8>=7.0.0; extra == "dev"
Requires-Dist: mypy>=1.8.0; extra == "dev"
Requires-Dist: pydocstyle>=6.3.0; extra == "dev"
Requires-Dist: pre-commit>=3.5.0; extra == "dev"
Requires-Dist: types-requests>=2.31.0; extra == "dev"
Requires-Dist: build>=1.0.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Requires-Dist: wheel>=0.42.0; extra == "dev"
Requires-Dist: tomli>=2.0.0; python_version < "3.11" and extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.5.0; extra == "docs"
Requires-Dist: mkdocs-material>=9.5.0; extra == "docs"
Dynamic: license-file

# dbt-rabbit-bigquery

[![PyPI version](https://badge.fury.io/py/dbt-rabbit-bigquery.svg)](https://badge.fury.io/py/dbt-rabbit-bigquery)
[![Python versions](https://img.shields.io/pypi/pyversions/dbt-rabbit-bigquery.svg)](https://pypi.org/project/dbt-rabbit-bigquery/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![dbt-core](https://img.shields.io/badge/dbt--core-%E2%89%A51.5.0-orange.svg)](https://github.com/dbt-labs/dbt-core)

**Automatically optimize your BigQuery costs in dbt without changing a single line of SQL.**

The dbt-rabbit-bigquery adapter is a drop-in replacement for dbt-bigquery that intelligently routes your queries to the most cost-effective BigQuery resources using the Rabbit optimization platform. Save up to 60% on BigQuery costs while maintaining full compatibility with your existing dbt projects.

## 🎯 Why Use This Adapter?

### The Problem
BigQuery offers multiple pricing options (on-demand, flat-rate slots, reservations), but choosing the right option for each query is complex and time-consuming. Most teams either:
- Overpay by using on-demand pricing for everything
- Underutilize expensive slot commitments
- Spend engineering time manually optimizing queries

### The Solution
This adapter automatically analyzes each query and assigns it to the optimal BigQuery pricing model, ensuring you always get the best performance at the lowest cost—without any code changes.

### Key Benefits
- ✅ **Zero Code Changes**: Drop-in replacement for dbt-bigquery
- 💰 **Automatic Cost Optimization**: Save up to 60% on BigQuery costs
- 🚀 **No Performance Impact**: Sub-second API overhead
- 🛡️ **Production Ready**: Graceful fallback if optimization fails
- 📊 **Full Transparency**: Detailed logging and cost tracking

---

## 📦 Installation

**Important:** You must install the version that matches your `dbt-bigquery` version.

### Step 1: Check Your dbt-bigquery Version

```bash
pip show dbt-bigquery
# Look for: Version: 1.8.3 (or 1.9.2, 1.10.3, etc.)
```

### Step 2: Install the Matching Adapter Version

The adapter version format is `{base_version}.{dbt-bigquery_version}`. Install the version that matches your `dbt-bigquery`:

```bash
# For dbt-bigquery 1.8.3
pip install dbt-rabbit-bigquery==1.1.0.1.8.3

# For dbt-bigquery 1.9.2
pip install dbt-rabbit-bigquery==1.1.0.1.9.2

# For dbt-bigquery 1.10.3
pip install dbt-rabbit-bigquery==1.1.0.1.10.3
```

### Step 3: Verify Installation

```bash
dbt --version
# Should show: rabbitbigquery: 1.1.0
```

### Installation in Requirements Files

**requirements.txt:**
```txt
dbt-bigquery==1.8.3
dbt-rabbit-bigquery==1.1.0.1.8.3
```

**pyproject.toml:**
```toml
[project]
dependencies = [
    "dbt-bigquery==1.8.3",
    "dbt-rabbit-bigquery==1.1.0.1.8.3",
]
```

**Poetry (pyproject.toml):**
```toml
[tool.poetry.dependencies]
dbt-bigquery = "1.8.3"
dbt-rabbit-bigquery = "1.1.0.1.8.3"
```

### Supported Versions

| dbt-bigquery | dbt-rabbit-bigquery | Status |
|--------------|---------------------|--------|
| 1.8.3        | 1.1.0.1.8.3         | ✅ Supported |
| 1.9.2        | 1.1.0.1.9.2         | ✅ Supported |
| 1.10.3       | 1.1.0.1.10.3        | ✅ Supported |

**Note:** Always use the exact version match. The adapter version must match your `dbt-bigquery` version for compatibility.

### Upgrading

When upgrading `dbt-bigquery`, always upgrade `dbt-rabbit-bigquery` to the matching version:

```bash
# Upgrade dbt-bigquery
pip install --upgrade dbt-bigquery==1.10.3

# Upgrade adapter to matching version
pip install --upgrade dbt-rabbit-bigquery==1.1.0.1.10.3
```

### Understanding the Version Format

The adapter publishes separate versions to PyPI for each supported `dbt-bigquery` version. The version format encodes compatibility:

**Format:** `{base_version}.{dbt-bigquery_version}`

**Examples:**
- `1.1.0.1.8.3` - Compatible with `dbt-bigquery==1.8.3`
- `1.1.0.1.9.2` - Compatible with `dbt-bigquery==1.9.2`
- `1.1.0.1.10.3` - Compatible with `dbt-bigquery==1.10.3`
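
The format can be sketched in a few lines of Python. The helper names below are hypothetical and not part of the adapter, and the split assumes the base version always has three dot-separated components, as in the examples above.

```python
from typing import Tuple


def compose_adapter_version(base_version: str, dbt_bigquery_version: str) -> str:
    """Join the adapter base version and the dbt-bigquery version with a dot."""
    return f"{base_version}.{dbt_bigquery_version}"


def split_adapter_version(adapter_version: str, base_parts: int = 3) -> Tuple[str, str]:
    """Split a published version into (base_version, dbt_bigquery_version).

    Assumption: the base version has exactly `base_parts` components.
    """
    parts = adapter_version.split(".")
    if len(parts) <= base_parts:
        raise ValueError(f"not a combined adapter version: {adapter_version!r}")
    return ".".join(parts[:base_parts]), ".".join(parts[base_parts:])


print(compose_adapter_version("1.1.0", "1.10.3"))
print(split_adapter_version("1.1.0.1.8.3"))
```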

### Why This Versioning Approach?

The adapter needs to:
1. **Encode compatibility** - The PyPI version must indicate which `dbt-bigquery` version it's compatible with
2. **Satisfy PyPI requirements** - PyPI doesn't allow local version identifiers (`+`)
3. **Satisfy dbt validation** - dbt requires valid semantic versions (X.Y.Z format) in adapter files

The hybrid approach:
- Uses `{base}.{dbt-bigquery-version}` in `pyproject.toml` for PyPI publishing
- Uses `{base}` in `__version__.py` and `dbt_project.yml` for dbt validation
- Allows users to install the correct version for their `dbt-bigquery` setup

This allows the adapter to support multiple `dbt-bigquery` versions while satisfying both PyPI and dbt requirements.

---

## 🚀 Quick Start

### 1. Get Your Rabbit API Key

Sign up for Rabbit and get your API key: [https://followrabbit.ai](https://followrabbit.ai)

Contact: success@followrabbit.ai

### 2. Update Your `profiles.yml`

Change your profile type from `bigquery` to `rabbitbigquery`:

```yaml
my_project:
  target: dev
  outputs:
    dev:
      type: rabbitbigquery  # Changed from 'bigquery'
      method: service-account
      project: my-gcp-project
      dataset: my_dataset
      threads: 4
      keyfile: /path/to/service-account.json
      location: US

      # Rabbit configuration (3 lines added)
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand
      rabbit_reservation_ids: "project:us.my-reservation"
```

### 3. Set Environment Variables

```bash
export RABBIT_API_KEY="your-api-key-here"
```

### 4. Run dbt as Usual

```bash
dbt run
```

That's it! All your queries are now automatically optimized. 🎉

---

## 📖 Configuration

### Complete Configuration Example

```yaml
my_project:
  target: prod
  outputs:
    prod:
      # Standard BigQuery configuration (unchanged)
      type: rabbitbigquery
      method: service-account
      project: my-gcp-project
      dataset: analytics
      threads: 8
      keyfile: /path/to/service-account.json
      location: US

      # Rabbit optimization configuration
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand  # or 'slot_based'

      # Multiple reservations (comma-separated or list format)
      rabbit_reservation_ids: "project:us.res1,project:eu.res2"
      # Or as a list:
      # rabbit_reservation_ids:
      #   - "project:us.res1"
      #   - "project:eu.res2"

      # Optional: Custom Rabbit API URL (for enterprise)
      rabbit_base_url: https://api.followrabbit.ai

      # Optional: Disable optimization temporarily
      rabbit_enabled: true
```
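
Because `rabbit_reservation_ids` accepts either a comma-separated string or a list, tooling that reads the profile may want a single shape. The function below is a minimal sketch of that normalization; it is hypothetical and may differ from how the adapter parses the value internally.

```python
from typing import List, Optional, Union


def normalize_reservation_ids(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Return reservation IDs as a clean list, whichever form was configured."""
    if value is None:
        return []
    # A single string may hold several comma-separated IDs.
    items = value.split(",") if isinstance(value, str) else list(value)
    # Strip surrounding whitespace and drop empty entries (e.g. a trailing comma).
    return [item.strip() for item in items if item and item.strip()]


print(normalize_reservation_ids("project:us.res1, project:eu.res2"))
print(normalize_reservation_ids(["project:us.res1", "project:eu.res2"]))
```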

### Configuration Parameters

#### Required Parameters

| Parameter | Description | Example |
|-----------|-------------|---------|
| `rabbit_api_key` | Your Rabbit API key | `"rb_1234..."` |
| `rabbit_default_pricing_mode` | Default pricing model | `"on_demand"` or `"slot_based"` |
| `rabbit_reservation_ids` | BigQuery reservation IDs | `"project:us.res1,project:eu.res2"` |

#### Optional Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `rabbit_base_url` | Production API | Custom API endpoint for enterprise |
| `rabbit_enabled` | `true` | Enable/disable optimization |

### Reservation ID Format

Reservation IDs should follow the BigQuery format:

```
project-id:location.reservation-name
```

Examples:
- `my-project:us-central1.reservation1`
- `my-project:us.default-reservation`
- `my-project:europe-west1.batch-processing`
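
To catch malformed IDs early, the format can be parsed into its three parts. The sketch below uses hypothetical names and only checks the shape of the string; it does not verify that the reservation exists in BigQuery.

```python
from typing import NamedTuple


class ReservationId(NamedTuple):
    project: str
    location: str
    name: str


def parse_reservation_id(value: str) -> ReservationId:
    """Split 'project-id:location.reservation-name' into its three parts."""
    # Split on the last colon, then on the first dot after it.
    project, colon, rest = value.strip().rpartition(":")
    location, dot, name = rest.partition(".")
    if not (colon and dot and project and location and name):
        raise ValueError(f"expected 'project-id:location.reservation-name', got {value!r}")
    return ReservationId(project, location, name)


print(parse_reservation_id("my-project:us-central1.reservation1"))
```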

---

## 💡 How It Works

```mermaid
graph LR
    A[dbt SQL Model] --> B[Rabbit Adapter]
    B --> C{Analyze Query}
    C --> D[Rabbit API]
    D --> E{Optimize}
    E --> F[Assign Optimal Reservation]
    F --> G[BigQuery]
    G --> H[Results]
```

1. **Intercept**: The adapter captures each BigQuery job configuration
2. **Analyze**: Sends metadata to Rabbit API (query, project, reservations)
3. **Optimize**: Rabbit analyzes query characteristics and assigns optimal pricing
4. **Execute**: Job runs on BigQuery with optimized configuration
5. **Track**: View savings and performance in Rabbit dashboard

**What Gets Sent to Rabbit?**
- SQL query text
- Job configuration (not your data)
- Available reservation options

**What Rabbit Returns:**
- Optimized job configuration
- Assigned reservation
- Expected cost savings
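
The flow above, including the fallback to the original configuration when optimization fails, can be sketched as follows. Everything here is illustrative: the request keys, the `optimize` callable, and the `reservation` key in the stub are assumptions made for the example, not the adapter's actual code or the Rabbit API's payload format.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("rabbit_sketch")
JobConfig = Dict[str, Any]


def build_request(job_config: JobConfig, default_pricing_mode: str,
                  reservation_ids: List[str]) -> Dict[str, Any]:
    """Assemble the metadata listed above (hypothetical key names)."""
    return {
        "job_configuration": job_config,  # contains the SQL text, never table data
        "default_pricing_mode": default_pricing_mode,
        "reservation_ids": reservation_ids,
    }


def optimize_or_fallback(job_config: JobConfig,
                         optimize: Callable[[Dict[str, Any]], JobConfig],
                         default_pricing_mode: str,
                         reservation_ids: List[str]) -> JobConfig:
    """Return the optimized configuration, or the original one if optimization fails."""
    request = build_request(job_config, default_pricing_mode, reservation_ids)
    try:
        return optimize(request)
    except Exception as exc:  # any failure must not block the dbt run
        logger.warning("Optimization failed, using original configuration: %s", exc)
        return job_config


def stub_optimizer(request: Dict[str, Any]) -> JobConfig:
    """Stand-in for the remote call: assigns the first available reservation."""
    return {**request["job_configuration"], "reservation": request["reservation_ids"][0]}


original = {"query": {"query": "SELECT 1"}}
print(optimize_or_fallback(original, stub_optimizer, "on_demand", ["project:us.res1"]))
```

Passing the optimizer in as a callable keeps the sketch free of network access and makes the fallback path easy to exercise with a function that raises.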

---

## 📊 Monitoring & Verification

### View Logs

#### Console Output
```bash
dbt run --debug
```

#### Log File
```bash
grep "RabbitBigQuery" logs/dbt.log
```

### Example Log Output

```
INFO: Rabbit optimization enabled | Default pricing mode: on_demand | Reservations: ['project:us.res1']
INFO: Optimized Job executed successfully
```
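
Where `grep` is not available (for example in a CI step written in Python), the same filtering can be done in a few lines. This sketch matches any line that mentions "rabbit" regardless of case; the sample lines are made up for the demonstration.

```python
from typing import Iterable, List


def rabbit_log_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention Rabbit, ignoring case."""
    return [line.rstrip("\n") for line in lines if "rabbit" in line.lower()]


sample = [
    "INFO: Rabbit optimization enabled | Default pricing mode: on_demand\n",
    "INFO: Running with dbt\n",
    "DEBUG: RabbitBigQuery adapter loaded\n",
]
for line in rabbit_log_lines(sample):
    print(line)
```

To scan a real log, pass an open file handle instead of the sample list, for example `rabbit_log_lines(open("logs/dbt.log", encoding="utf-8"))`.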

### Verify in Rabbit Dashboard

The easiest way to verify optimization and see cost savings:
1. Log in to your Rabbit dashboard
2. View optimized jobs in real-time
3. See cost savings per query
4. Track monthly savings trends

---

## ❓ FAQ & Common Concerns

### Security & Privacy

**Q: Does Rabbit have access to my data?**
A: No. Rabbit only receives job metadata (SQL queries and configuration), not your actual data. Your data never leaves BigQuery.

**Q: How is my API key stored?**
A: We recommend using environment variables (`{{ env_var('RABBIT_API_KEY') }}`) to keep keys out of version control. The adapter never logs API keys.

**Q: Is this SOC 2 compliant?**
A: Yes. Rabbit is SOC 2 Type II certified. Contact us for compliance documentation.

### Performance

**Q: What's the performance overhead?**
A: Typical API latency is 100-500ms per query. For long-running queries (>10 seconds), this is negligible (<5% overhead). For very short queries, the overhead is still minimal.
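
The arithmetic behind that figure is simply the API latency divided by the query's own runtime. A short sketch, using the worst-case 500 ms latency quoted above:

```python
def overhead_percent(api_latency_s: float, query_runtime_s: float) -> float:
    """API latency expressed as a percentage of the query's own runtime."""
    return 100.0 * api_latency_s / query_runtime_s


for runtime_s in (2, 10, 60, 600):
    pct = overhead_percent(0.5, runtime_s)
    print(f"{runtime_s:>4} s query: {pct:.2f}% overhead at 500 ms latency")
```

At exactly 10 seconds the overhead is 5%, so any query that runs longer than that stays below 5% even at the upper end of the latency range.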

**Q: Can I disable optimization for specific models?**
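A: Yes, at the profile or run level. Set `rabbit_enabled: false` in your profile, or use an environment variable to disable optimization for a specific run; combine this with `dbt run --select` to limit that run to specific models.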

**Q: Does this affect dbt's multi-threading?**
A: No. dbt's threading works exactly as before. Optimization happens independently per thread.

### Cost & ROI

**Q: How much does Rabbit cost?**
A: Pricing is based on BigQuery spend or queries processed. Most customers save 5-10x more than the Rabbit fee. Contact success@followrabbit.ai for pricing.

**Q: What if optimization makes things more expensive?**
A: Rabbit's algorithm is designed to always reduce costs. If optimization fails or would increase costs, it falls back to your original configuration.

**Q: Can I see cost savings before committing?**
A: Yes. Use the Rabbit dashboard to see potential savings based on your query patterns. We also offer free trials.

### Reliability

**Q: What happens if Rabbit API is down?**
A: The adapter falls back to your original configuration gracefully. Your dbt jobs continue running normally with a warning logged.

**Q: Will this break my existing dbt project?**
A: No. This is a drop-in replacement for dbt-bigquery. All standard dbt functionality works identically.

**Q: Can I roll back quickly?**
A: Yes. Simply change `type: rabbitbigquery` back to `type: bigquery` in your `profiles.yml`. No code changes needed.

### Integration

**Q: Does this work with dbt Cloud?**
A: Currently, this adapter is designed for dbt Core. Contact us for dbt Cloud integration options.

**Q: Can I use this with other dbt packages?**
A: Yes. This adapter is fully compatible with all dbt packages and features.

**Q: Does this work with Airflow/Dagster/Prefect?**
A: Yes. Any orchestration tool that runs dbt Core will work seamlessly.

---

## 🔧 Advanced Usage

### Disable Optimization for Specific Runs

```bash
# Via environment variable
DBT_RABBIT_ENABLED=false dbt run

# Or set this in profiles.yml instead:
#   rabbit_enabled: false
```

### Multiple Environments

```yaml
my_project:
  target: prod
  outputs:
    dev:
      type: bigquery  # Standard adapter in dev
      # ... config ...

    prod:
      type: rabbitbigquery  # Optimize in production
      # ... config ...
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
```

### Regional Reservations

Use different reservations per region:

```yaml
rabbit_reservation_ids:
  - "my-project:us-central1.us-reservation"
  - "my-project:europe-west1.eu-reservation"
  - "my-project:asia-east1.asia-reservation"
```

Rabbit automatically selects the optimal reservation based on data location.

### Debug Mode

Enable detailed logging:

```bash
dbt run --debug 2>&1 | tee dbt-debug.log
grep "RabbitBigQuery" dbt-debug.log
```

---

## 🧪 Examples

See the [`examples/`](./examples/) directory for:
- Complete dbt project setup
- Multi-environment configurations
- CI/CD integration examples
- Custom optimization scenarios

Quick test:

```bash
cd examples/
./setup.sh
dbt run --select test_simple_query
```

---

## 📊 Compatibility

| Component | Version |
|-----------|---------|
| dbt-core | ≥1.5.0 |
| dbt-bigquery | 1.8.3, 1.9.2, 1.10.3 (see [Installation](#-installation) for exact version matching) |
| Python | ≥3.8 |
| BigQuery API | v2 |

**Version Matching:** You must install the `dbt-rabbit-bigquery` version that matches your `dbt-bigquery` version. See the [Installation](#-installation) section above for details.

---

## 🐛 Troubleshooting

### Adapter Not Found

**Error:** `No module named 'dbt.adapters.rabbitbigquery'`

**Solution:**
```bash
# First, check your dbt-bigquery version
pip show dbt-bigquery

# Install the matching adapter version (replace with your dbt-bigquery version)
pip install dbt-rabbit-bigquery==1.1.0.1.8.3  # For dbt-bigquery 1.8.3
# Or: pip install dbt-rabbit-bigquery==1.1.0.1.9.2  # For dbt-bigquery 1.9.2
# Or: pip install dbt-rabbit-bigquery==1.1.0.1.10.3  # For dbt-bigquery 1.10.3

dbt debug  # Verify installation
```

### Version Mismatch or Validation Error

**Error:** `"1.1.0.1.8.3" is not a valid semantic version` or version validation errors

**Solution:**
- This error occurs if dbt tries to validate the concatenated version
- Ensure you've installed the correct adapter version matching your `dbt-bigquery` version
- Verify versions match:
  ```bash
  pip show dbt-bigquery dbt-rabbit-bigquery
  ```
- The adapter version format is `{base}.{dbt-bigquery-version}` (e.g., `1.1.0.1.8.3` for `dbt-bigquery==1.8.3`)
- If versions don't match, reinstall with the correct version:
  ```bash
  pip install --upgrade dbt-rabbit-bigquery==1.1.0.1.8.3  # Use your dbt-bigquery version
  ```

**Version Mismatch:**
- If you upgrade `dbt-bigquery`, you must also upgrade `dbt-rabbit-bigquery` to the matching version
- Example:
  ```bash
  pip install --upgrade dbt-bigquery==1.10.3
  pip install --upgrade dbt-rabbit-bigquery==1.1.0.1.10.3
  ```
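
The same check can be scripted with the standard library's `importlib.metadata`. This is a sketch, assuming the published adapter version always ends with the `dbt-bigquery` version it targets, as described in the version format above; the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def versions_match(adapter_version: str, dbt_bigquery_version: str) -> bool:
    """True when the adapter version ends with '.<dbt-bigquery version>'."""
    return adapter_version.endswith("." + dbt_bigquery_version)


adapter = installed_version("dbt-rabbit-bigquery")
base = installed_version("dbt-bigquery")
if adapter is None or base is None:
    print("dbt-bigquery and/or dbt-rabbit-bigquery is not installed")
elif versions_match(adapter, base):
    print(f"OK: {adapter} matches dbt-bigquery {base}")
else:
    print(f"Mismatch: {adapter} does not target dbt-bigquery {base}")
```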

### API Key Issues

**Error:** `Rabbit optimization failed: Invalid API key`

**Solution:**
1. Verify environment variable is set: `echo $RABBIT_API_KEY`
2. Check profiles.yml uses correct env_var syntax: `"{{ env_var('RABBIT_API_KEY') }}"`
3. Ensure no trailing spaces or special characters
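
The first and third checks can be automated without ever printing the key itself. The helper below is a hypothetical sketch; it only inspects the value's shape and cannot tell whether the key is actually valid for the Rabbit API.

```python
import os
from typing import List, Optional


def diagnose_api_key(value: Optional[str]) -> List[str]:
    """Return likely problems with the key value; an empty list means none were found."""
    if value is None:
        return ["RABBIT_API_KEY is not set in this environment"]
    problems = []
    if not value.strip():
        problems.append("RABBIT_API_KEY is empty")
    elif value != value.strip():
        problems.append("RABBIT_API_KEY has leading or trailing whitespace")
    # Stray quotes usually come from quoting the value twice in a shell or .env file.
    if any(quote in value for quote in "\"'"):
        problems.append("RABBIT_API_KEY contains quote characters")
    return problems


found = diagnose_api_key(os.environ.get("RABBIT_API_KEY"))
for message in found or ["no obvious problems found"]:
    print(message)
```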

### Optimization Not Working

**Symptoms:** No cost savings, Rabbit dashboard shows no activity

**Debug steps:**
```bash
# 1. Enable debug logging
dbt run --debug --select your_model

# 2. Check for Rabbit log entries
grep -i rabbit logs/dbt.log

# 3. Verify configuration
dbt debug

# 4. Test API connectivity
python3 -c "from rabbit_bq_job_optimizer import RabbitBQJobOptimizer; \
  client = RabbitBQJobOptimizer(api_key='YOUR_KEY'); print('✓ Connected')"
```

### Performance Issues

If you experience unusual slowness:
1. Check Rabbit API status: [status.followrabbit.ai](https://followrabbit.ai)
2. Temporarily disable: `rabbit_enabled: false`
3. Contact support with job IDs: success@followrabbit.ai

---

## 🤝 Contributing

We welcome contributions! See [CONTRIBUTING.md](./CONTRIBUTING.md) for:
- Development setup
- Code style guidelines
- Testing requirements
- Pull request process

### Quick Start for Contributors

```bash
# Clone and setup
git clone https://github.com/your-username/dbt-rabbit-bigquery.git
cd dbt-rabbit-bigquery
python3 -m venv venv
source venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Set up pre-commit hooks (runs checks automatically before each commit)
pre-commit install

# Run all checks manually
pre-commit run --all-files
```

Pre-commit hooks automatically run:
- **Black** - Code formatting (config: `pyproject.toml`)
- **Flake8** - Linting (config: `.flake8`)
- **MyPy** - Type checking (config: `pyproject.toml`)
- **Pydocstyle** - Docstring validation (config: `pyproject.toml`)
- **File checks** - Trailing whitespace, EOF, YAML validation

**Note**: All linting and formatting tools use centralized configuration files that are shared between pre-commit hooks and CI/CD pipelines, ensuring consistency.

---

## 📚 Additional Resources

- **Rabbit Documentation**: [docs.followrabbit.ai](https://followrabbit.ai)
- **dbt Documentation**: [docs.getdbt.com](https://docs.getdbt.com)
- **BigQuery Reservations**: [cloud.google.com/bigquery/docs/reservations](https://cloud.google.com/bigquery/docs/reservations)
- **Blog: Optimizing BigQuery Costs in dbt**: [Coming Soon]

---

## 📞 Support

- **Email**: success@followrabbit.ai
- **Issues**: [GitHub Issues](https://github.com/followtherabbit/dbt-rabbit-bigquery/issues)
- **Documentation**: This README + [examples/](./examples/)
- **Security Issues**: security@followrabbit.ai

---

## 📄 License

Apache License 2.0 - see [LICENSE](./LICENSE) for details.

---

## 🗺️ Roadmap

- [ ] Support for dbt Cloud
- [ ] Additional optimization strategies (query rewriting, caching)
- [ ] Real-time cost dashboards in dbt docs
- [ ] Integration with Snowflake and Redshift
- [ ] Auto-detection of optimal pricing models

---

## ⭐ Show Your Support

If this adapter saves you money, give us a star! ⭐

It helps others discover cost optimization for their dbt projects.

---

## 🙏 Acknowledgments

Built with ♥️ by the Rabbit team. Powered by:
- [dbt](https://www.getdbt.com/) - The best data transformation tool
- [BigQuery](https://cloud.google.com/bigquery) - Google's data warehouse
- [rabbit-bq-job-optimizer](https://pypi.org/project/rabbit-bq-job-optimizer/) - Core optimization library
