Metadata-Version: 2.4
Name: goquality
Version: 0.2.0
Summary: AI-Native Data Governance: TypeScript for Databases
Author: GoQuality
License: MIT
Keywords: ai,data-quality,governance,postgres,validation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: anthropic>=0.18.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: ibis-framework[duckdb,postgres]>=9.0.0
Requires-Dist: openai>=1.0.0
Requires-Dist: psycopg2-binary>=2.9.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0.0
Requires-Dist: sqlparse>=0.4.4
Requires-Dist: tomli>=2.0.0; python_version < '3.11'
Requires-Dist: typer>=0.9.0
Provides-Extra: all-connectors
Requires-Dist: databricks-sql-connector>=2.0.0; extra == 'all-connectors'
Requires-Dist: google-cloud-bigquery>=3.0.0; extra == 'all-connectors'
Requires-Dist: ibis-framework[bigquery]>=9.0.0; extra == 'all-connectors'
Requires-Dist: ibis-framework[mssql]>=9.0.0; extra == 'all-connectors'
Requires-Dist: ibis-framework[mysql]>=9.0.0; extra == 'all-connectors'
Requires-Dist: ibis-framework[snowflake]>=9.0.0; extra == 'all-connectors'
Requires-Dist: pymysql>=1.0.0; extra == 'all-connectors'
Requires-Dist: pyodbc>=4.0.0; extra == 'all-connectors'
Requires-Dist: snowflake-connector-python>=3.0.0; extra == 'all-connectors'
Provides-Extra: bigquery
Requires-Dist: google-cloud-bigquery>=3.0.0; extra == 'bigquery'
Requires-Dist: ibis-framework[bigquery]>=9.0.0; extra == 'bigquery'
Provides-Extra: databricks
Requires-Dist: databricks-sql-connector>=2.0.0; extra == 'databricks'
Provides-Extra: dev
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pandas>=2.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: mssql
Requires-Dist: ibis-framework[mssql]>=9.0.0; extra == 'mssql'
Requires-Dist: pyodbc>=4.0.0; extra == 'mssql'
Provides-Extra: mysql
Requires-Dist: ibis-framework[mysql]>=9.0.0; extra == 'mysql'
Requires-Dist: pymysql>=1.0.0; extra == 'mysql'
Provides-Extra: snowflake
Requires-Dist: ibis-framework[snowflake]>=9.0.0; extra == 'snowflake'
Requires-Dist: snowflake-connector-python>=3.0.0; extra == 'snowflake'
Description-Content-Type: text/markdown

# GoQuality CLI

**AI-Native Data Governance: TypeScript for Databases**

GoQuality brings type safety to your data. Define types once, validate everywhere. AI generates the types; you govern the rules.

```
┌─────────────────────────────────────────────────────────────────┐
│  Database  →  AI Inference  →  YAML Types  →  Validation  →  ✓ │
│                                                                 │
│  "email"      Email           pattern: ^...   99.8% valid      │
│  "amount"     USD             min: 0          100% valid       │
│  "status"     OrderStatus     enum: [...]     98.2% valid      │
└─────────────────────────────────────────────────────────────────┘
```

## Installation

```bash
# Basic installation (includes PostgreSQL and DuckDB)
pip install goquality

# Cloud Data Warehouses
pip install goquality[snowflake]     # Snowflake
pip install goquality[bigquery]      # Google BigQuery
pip install goquality[databricks]    # Databricks SQL & Unity Catalog

# Traditional Databases
pip install goquality[mysql]         # MySQL / MariaDB
pip install goquality[mssql]         # Microsoft SQL Server / Azure SQL

# All database connectors
pip install goquality[all-connectors]

# Development installation
pip install goquality[dev]
```

## Quick Start

```bash
# 1. Initialize a new project
goquality init

# 2. Generate types from your database using AI
goquality generate --source postgres://user:pass@localhost/mydb

# 3. Review and edit the generated goquality.yaml

# 4. Run validation checks
goquality check --source postgres://user:pass@localhost/mydb

# 5. Diagnose any issues
goquality doctor --source postgres://user:pass@localhost/mydb
```

## Commands

### `goquality init`

Initialize a new GoQuality configuration file.

```bash
goquality init [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--source` | `-s` | Database connection string to test |
| `--path` | `-p` | Path for configuration file (default: `goquality.yaml`) |

**Examples:**
```bash
# Create default config
goquality init

# Create config and test database connection
goquality init --source postgres://localhost/mydb

# Create config at custom path
goquality init --path config/goquality.yaml
```

---

### `goquality generate`

Generate type mappings using AI inference. Profiles your database schema and uses an LLM to suggest appropriate types for each column.

```bash
goquality generate [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--source` | `-s` | Database connection string (required) |
| `--output` | `-o` | Output file path (default: `goquality.yaml`) |
| `--schema` | | Database schema to profile |
| `--provider` | | LLM provider: `openai`, `anthropic`, `ollama` (default: `openai`) |

**Environment Variables:**
- `OPENAI_API_KEY` - Required for OpenAI provider
- `ANTHROPIC_API_KEY` - Required for Anthropic provider
- `OLLAMA_HOST` - Ollama server URL (default: `http://localhost:11434`)

**Examples:**
```bash
# Generate using OpenAI (default)
goquality generate --source postgres://localhost/mydb

# Generate using Anthropic Claude
goquality generate --source postgres://localhost/mydb --provider anthropic

# Generate for specific schema
goquality generate --source postgres://localhost/mydb --schema public

# Generate using local Ollama
OLLAMA_HOST=http://localhost:11434 goquality generate \
  --source postgres://localhost/mydb \
  --provider ollama
```

---

### `goquality check`

Run validation checks against your database. This is the core command that validates your data against the defined types.

```bash
goquality check [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--config` | `-c` | Configuration file path (default: `goquality.yaml`) |
| `--source` | `-s` | Database connection string |
| `--table` | `-t` | Only check this specific table |
| `--output` | `-o` | Output format: `table`, `json`, `yaml`, `csv`, `markdown`, `junit` |
| `--fail-threshold` | | Percentage of failures allowed (0-100) |
| `--fail-on-error/--no-fail-on-error` | | Exit with error code on failures (default: true) |
| `--quiet` | `-q` | Only show errors and summary |
| `--skip-references` | | Skip reference (FK) validation |
| `--skip-contracts` | | Skip contract validation |
| `--skip-freshness` | | Skip freshness validation |
| `--skip-volume` | | Skip volume (row count) validation |
| `--only` | | Only run specific validation: `types`, `references`, `contracts`, `freshness`, `volume` |
| `--parallel` | | Run table validations in parallel for faster execution |
| `--workers` | | Number of parallel workers (default: 4, only used with --parallel) |
| `--sample-size` | | Validate only N random rows per table (for large tables) |
| `--sample-percent` | | Validate only X% of rows per table (0.0-100.0) |
| `--notify/--no-notify` | | Send notifications configured in goquality.yaml (default: true) |
| `--webhook` | | Send results to this webhook URL |
| `--slack-webhook` | | Send Slack notification to this webhook URL |
| `--metrics-file` | | Save metrics to a JSON file |
| `--profile` | | Enable performance profiling |

**Exit Codes:**
- `0` - All checks passed (or within threshold)
- `1` - Validation failures detected (above threshold)

**Examples:**
```bash
# Basic check
goquality check --source postgres://localhost/mydb

# Check specific table
goquality check --source postgres://localhost/mydb --table users

# Output as JSON (for CI/CD pipelines)
goquality check --source postgres://localhost/mydb --output json

# Output as JUnit XML (for CI/CD test reporting)
goquality check --source postgres://localhost/mydb --output junit > results.xml

# Allow up to 5% failures
goquality check --source postgres://localhost/mydb --fail-threshold 5

# Generate markdown report
goquality check --source postgres://localhost/mydb --output markdown > report.md

# Send results to a webhook
goquality check --source postgres://localhost/mydb --webhook https://your-api.com/results

# Send Slack notification
goquality check --source postgres://localhost/mydb --slack-webhook https://hooks.slack.com/services/xxx

# Quiet mode for scripts
goquality check --source postgres://localhost/mydb --quiet

# Don't fail on errors (always exit 0)
goquality check --source postgres://localhost/mydb --no-fail-on-error

# Run in parallel for faster validation of many tables
goquality check --source postgres://localhost/mydb --parallel --workers 8

# Only run type validation (skip references and contracts)
goquality check --source postgres://localhost/mydb --only types

# Only run reference/FK validation
goquality check --source postgres://localhost/mydb --only references

# Only run freshness checks
goquality check --source postgres://localhost/mydb --only freshness

# Only run volume checks
goquality check --source postgres://localhost/mydb --only volume

# Use sampling for large tables (10,000 rows per table)
goquality check --source postgres://localhost/mydb --sample-size 10000

# Sample 1% of each table for quick validation
goquality check --source postgres://localhost/mydb --sample-percent 1

# Save metrics to JSON file
goquality check --source postgres://localhost/mydb --metrics-file metrics.json

# Enable performance profiling
goquality check --source postgres://localhost/mydb --profile

# Both metrics and profiling
goquality check --source postgres://localhost/mydb --metrics-file metrics.json --profile
```
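
The flags and exit codes above can also be driven from a Python script, for example in a scheduled job. The following is a minimal sketch that uses only flags listed in the options table; the helper names `check_command` and `run_check` are hypothetical and not part of GoQuality.

```python
import subprocess


def check_command(source, *, output="json", fail_threshold=None, only=None):
    """Build the argv list for `goquality check` from documented flags."""
    cmd = ["goquality", "check", "--source", source, "--output", output]
    if fail_threshold is not None:
        cmd += ["--fail-threshold", str(fail_threshold)]
    if only is not None:
        cmd += ["--only", only]
    return cmd


def run_check(source, **options):
    """Run the CLI and return True when it exits with code 0."""
    completed = subprocess.run(
        check_command(source, **options), capture_output=True, text=True
    )
    return completed.returncode == 0


# Building the command does not require GoQuality to be installed.
print(" ".join(check_command("postgres://localhost/mydb", fail_threshold=5)))
```

Passing the arguments as a list rather than a single shell string keeps characters such as `&` in a connection string from being interpreted by the shell.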

**Output Formats:**

*Table (default):*
```
┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Column   ┃ Type   ┃ Rows   ┃ Valid % ┃ Status   ┃ Details       ┃
┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ email    │ Email  │ 10,000 │ 99.8%   │ ✓ PASS   │               │
│ status   │ Status │ 10,000 │ 98.2%   │ ✗ FAIL   │ 180 invalid   │
└──────────┴────────┴────────┴─────────┴──────────┴───────────────┘
```

*JSON:*
```json
{
  "summary": {
    "total_checks": 5,
    "passed": 4,
    "failed": 1,
    "failure_rate": 20.0,
    "threshold": 0.0,
    "threshold_passed": false
  },
  "tables": [...]
}
```
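
A script consuming this JSON might recompute the headline numbers before acting on them. The sketch below assumes that `failure_rate` is `failed / total_checks` expressed as a percentage and that the threshold passes when the rate does not exceed it; both assumptions are consistent with the sample above but are not confirmed by it. The `summarize` helper is hypothetical.

```python
import json

# Sample payload copied from the JSON output above, with "tables" left empty.
REPORT_TEXT = """
{
  "summary": {
    "total_checks": 5,
    "passed": 4,
    "failed": 1,
    "failure_rate": 20.0,
    "threshold": 0.0,
    "threshold_passed": false
  },
  "tables": []
}
"""


def summarize(report):
    """Return (failure rate in percent, whether it is within the threshold)."""
    summary = report["summary"]
    total = summary["total_checks"]
    rate = 100.0 * summary["failed"] / total if total else 0.0
    return rate, rate <= summary["threshold"]


report = json.loads(REPORT_TEXT)
rate, within_threshold = summarize(report)
print(f"failure rate: {rate:.1f}%, within threshold: {within_threshold}")
```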

*JUnit XML:*
```xml
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="GoQuality Data Validation" tests="5" failures="1">
  <testsuite name="public.users" tests="3" failures="0">
    <testcase name="email (Email)" classname="public.users"/>
    <testcase name="id (UUID)" classname="public.users"/>
  </testsuite>
  <testsuite name="public.orders" tests="2" failures="1">
    <testcase name="total (USD)" classname="public.orders">
      <failure message="3 rows failed validation" type="ValidationError">
        Column: total
        Type: USD
        Invalid rows: 3 (0.03%)
      </failure>
    </testcase>
  </testsuite>
</testsuites>
```
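
Because the report is plain JUnit XML, failing columns can be pulled out with the standard library. This is a small sketch against an abridged copy of the sample above (the XML declaration is omitted); the `failed_cases` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the JUnit report shown above.
JUNIT_XML = """
<testsuites name="GoQuality Data Validation" tests="5" failures="1">
  <testsuite name="public.users" tests="3" failures="0">
    <testcase name="email (Email)" classname="public.users"/>
    <testcase name="id (UUID)" classname="public.users"/>
  </testsuite>
  <testsuite name="public.orders" tests="2" failures="1">
    <testcase name="total (USD)" classname="public.orders">
      <failure message="3 rows failed validation" type="ValidationError"/>
    </testcase>
  </testsuite>
</testsuites>
"""


def failed_cases(xml_text):
    """Return (suite, testcase, message) for every testcase with a <failure>."""
    root = ET.fromstring(xml_text.strip())
    failures = []
    for suite in root.iter("testsuite"):
        for case in suite.iter("testcase"):
            failure = case.find("failure")
            if failure is not None:
                failures.append(
                    (suite.get("name"), case.get("name"), failure.get("message"))
                )
    return failures


for suite, case, message in failed_cases(JUNIT_XML):
    print(f"{suite}: {case}: {message}")
```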

---

### `goquality validate`

Validate configuration file syntax without connecting to a database.

```bash
goquality validate [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--config` | `-c` | Configuration file to validate (default: `goquality.yaml`) |

**Examples:**
```bash
# Validate default config
goquality validate

# Validate specific config
goquality validate --config staging.yaml
```

---

### `goquality types`

List and search available types in the standard library.

```bash
goquality types [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--search` | `-s` | Search types by name or description |
| `--tag` | `-t` | Filter by tag |
| `--show` | | Show details for a specific type |

**Examples:**
```bash
# List all types
goquality types

# Search for email types
goquality types --search email

# Filter by tag
goquality types --tag finance
goquality types --tag healthcare
goquality types --tag regional

# Show type details
goquality types --show Email
goquality types --show CreditCardNumber
```

**Available Tags:**
- `core` - Basic string/number types
- `finance` - Currency, banking, payments
- `healthcare` - Medical codes, identifiers
- `ecommerce` - Products, orders, shipping
- `saas` - API keys, tokens, SaaS identifiers
- `regional` - Country-specific formats
- `analytics` - Metrics, percentages, scores
- `iot` - Sensors, devices, protocols
- `pii` - Personally identifiable information

---

### `goquality doctor`

Diagnose your GoQuality environment and configuration.

```bash
goquality doctor [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--config` | `-c` | Configuration file to check |
| `--source` | `-s` | Database connection to test |
| `--verbose` | `-v` | Show detailed information |

**Checks Performed:**
- Python version compatibility
- Core dependencies installed
- Database drivers available
- LLM providers configured
- Type library loading
- Configuration file validity
- Database connectivity
- Environment variables

**Examples:**
```bash
# Basic diagnostics
goquality doctor

# Check with database connection
goquality doctor --source postgres://localhost/mydb

# Verbose output
goquality doctor --verbose
```

---

### `goquality stats`

Show statistics about the type library and configuration.

```bash
goquality stats [OPTIONS]
```

**Options:**
| Option | Short | Description |
|--------|-------|-------------|
| `--config` | `-c` | Configuration file path |

**Examples:**
```bash
goquality stats
```

---

### `goquality version`

Show version information.

```bash
goquality version
```

---

### `goquality connections`

Manage database connections configured in `goquality.toml`.

```bash
# List configured connections
goquality connections list

# Test a specific connection
goquality connections test dev

# Test all connections
goquality connections test-all

# Show connection details (credentials masked)
goquality connections show dev
```

---

### `goquality config`

Manage project configuration (`goquality.toml`).

```bash
# Create a new goquality.toml
goquality config init

# Show current configuration
goquality config show

# Validate configuration
goquality config validate

# Show config file path
goquality config path
```

---

## Configuration File

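GoQuality reads type definitions, model mappings, and checks from a YAML file; the default is `goquality.yaml`. Connections and AI provider settings live separately in `goquality.toml` (see Project Configuration).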

### Full Example

```yaml
# GoQuality Configuration
# https://goquality.dev/docs

# Custom type definitions (extend or override stdlib)
types:
  # Simple type with pattern
  - name: EmployeeId
    description: "Internal employee identifier"
    base: String
    pattern: "^EMP-[0-9]{6}$"
    min_length: 10
    max_length: 10

  # Type extending stdlib
  - name: CorporateEmail
    description: "Company email address"
    base: String
    extends: Email
    pattern: "^[a-z.]+@acme\\.com$"

  # Numeric type with range
  - name: DiscountPercent
    description: "Discount percentage"
    base: Decimal
    min: 0
    max: 100
    precision: 2

  # Enum type
  - name: Department
    description: "Company department"
    base: String
    enum: ["engineering", "sales", "marketing", "hr", "finance"]

  # Type with uniqueness constraint
  - name: ProductSKU
    description: "Unique product SKU"
    base: String
    pattern: "^[A-Z]{2}-[0-9]{6}$"
    unique: true

# Model mappings (table → column types)
models:
  - table: public.users
    columns:
      - name: id
        type: UUID
      - name: email
        type: CorporateEmail
      - name: employee_id
        type: EmployeeId
      - name: department
        type: Department
      - name: created_at
        type: Timestamp
    # Volume check: ensure table isn't empty
    volume:
      min_rows: 1

  - table: public.orders
    columns:
      - name: id
        type: UUID
      - name: user_id
        type: UUID
      - name: total_amount
        type: USD
      - name: discount
        type: DiscountPercent
        allow_null: true  # Override type's nullability
      - name: status
        type: OrderStatus
    # Freshness check: ensure recent data
    freshness:
      column: created_at
      warn_after:
        hours: 1
      error_after:
        hours: 6
    # Volume check: bounded row count
    volume:
      min_rows: 1000
      max_rows: 50000000

  # Pattern matching - apply to all audit tables
  - table: "*_audit"
    columns:
      - name: "*_at"       # Match created_at, updated_at, deleted_at
        type: Timestamp
      - name: "*_by"       # Match created_by, updated_by
        type: UUID

# Explicit relationships (FK validation)
relationships:
  - from: orders.user_id
    to: users.id

  - from: orders.shipping_address_id
    to: addresses.id
    name: "Order Shipping Address"
    nullable: true

  - from: order_items.order_id
    to: orders.id

# Ad-hoc checks (quick SQL rules)
checks:
  - "on": orders
    name: "Order integrity"
    rules:
      - "total_amount >= 0"
      - "created_at <= NOW()"
      - "status IS NOT NULL"

  - "on": users
    name: "User constraints"
    rules:
      - "email IS NOT NULL"
      - "created_at <= NOW()"

# SQL contracts (complex cross-table validation)
contracts:
  - name: order_items_sum_matches_total
    description: "Order items should sum to order total"
    sql: |
      SELECT o.id, o.total_amount, SUM(oi.quantity * oi.unit_price) as items_sum
      FROM orders o
      JOIN order_items oi ON o.id = oi.order_id
      GROUP BY o.id, o.total_amount
      HAVING ABS(o.total_amount - SUM(oi.quantity * oi.unit_price)) > 0.01
    expect: empty
    severity: error

  - name: recent_orders_exist
    description: "Should have orders in the last 24 hours"
    sql: |
      SELECT 1 FROM orders
      WHERE created_at > NOW() - INTERVAL '24 hours'
      LIMIT 1
    expect: not_empty
    severity: warning

# Notifications (optional)
notifications:
  # Slack notification on failures
  - type: slack
    url: ${SLACK_WEBHOOK_URL}
    trigger: failure
    mention_on_failure:
      - U12345678  # Slack user ID

  # Webhook for custom integrations
  - type: webhook
    url: https://your-api.com/goquality-results
    trigger: always
    headers:
      Authorization: "Bearer ${API_TOKEN}"
```
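
To make the constraint fields in the `types` section concrete, here is a small in-memory sketch of how a single value could be tested against a type definition. It is an illustration only and not GoQuality's implementation; the `violations` function and the dictionaries are hypothetical, and only `pattern`, `min_length`, `max_length`, `min`, `max`, `enum`, and `allow_null` are covered.

```python
import re
from decimal import Decimal


def violations(value, type_def):
    """Return the reasons `value` fails `type_def`; an empty list means it passes."""
    if value is None:
        return [] if type_def.get("allow_null", False) else ["NULL not allowed"]
    problems = []
    if isinstance(value, str):
        pattern = type_def.get("pattern")
        if pattern and not re.search(pattern, value):
            problems.append(f"does not match {pattern}")
        if len(value) < type_def.get("min_length", 0):
            problems.append("shorter than min_length")
        if len(value) > type_def.get("max_length", len(value)):
            problems.append("longer than max_length")
    else:
        if "min" in type_def and value < type_def["min"]:
            problems.append(f"below minimum {type_def['min']}")
        if "max" in type_def and value > type_def["max"]:
            problems.append(f"above maximum {type_def['max']}")
    if "enum" in type_def and value not in type_def["enum"]:
        problems.append("not in enum")
    return problems


# Definitions mirroring three of the custom types in the YAML above.
EMPLOYEE_ID = {"pattern": r"^EMP-[0-9]{6}$", "min_length": 10, "max_length": 10}
DISCOUNT = {"min": 0, "max": 100}
DEPARTMENT = {"enum": ["engineering", "sales", "marketing", "hr", "finance"]}

print(violations("EMP-123456", EMPLOYEE_ID))
print(violations(Decimal("150"), DISCOUNT))
print(violations("legal", DEPARTMENT))
```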

### Notifications Configuration

Configure notifications to alert your team when validation runs complete.

| Field | Type | Description |
|-------|------|-------------|
| `type` | string | Notification type: `webhook`, `slack` |
| `url` | string | Webhook URL (use `${ENV_VAR}` for secrets) |
| `trigger` | string | When to notify: `always`, `failure`, `success`, `threshold_exceeded` |
| `headers` | object | Custom HTTP headers (optional) |
| `auth_token` | string | Bearer token for Authorization header (optional) |
| `timeout_seconds` | float | Request timeout (default: 30) |
| `retry_count` | int | Number of retries on failure (default: 3) |
| `include_samples` | bool | Include sample failure values (default: true) |
| `max_failures_shown` | int | Maximum failures to show (default: 10) |
| `channel` | string | Slack channel override (Slack only) |
| `mention_on_failure` | array | Slack user IDs to mention on failure (Slack only) |

### Type Definition Fields

| Field | Type | Description |
|-------|------|-------------|
| `name` | string | PascalCase type name (required) |
| `description` | string | Human-readable description (required) |
| `base` | string | Base type: `String`, `Integer`, `Decimal`, `Boolean`, `Date`, `Timestamp` |
| `extends` | string | Parent type to inherit from |
| `pattern` | string | Regex pattern (String types) |
| `min_length` | int | Minimum string length |
| `max_length` | int | Maximum string length |
| `not_empty` | bool | Reject empty/whitespace strings |
| `min` | number | Minimum value (numeric types) |
| `max` | number | Maximum value (numeric types) |
| `precision` | int | Decimal places (Decimal type) |
| `enum` | array | Allowed values |
| `allow_null` | bool | Whether NULL is permitted (default: false) |
| `unique` | bool | Values must be unique |
| `references` | string | FK reference as `table.column` (for cross-table validation) |
| `where` | string | SQL WHERE clause filter on referenced table |
| `tags` | array | Searchable tags |
| `examples` | array | Example valid values |
| `deprecated` | bool | Mark as deprecated |

---

## Connection Strings

GoQuality supports multiple database backends via connection strings.

### PostgreSQL

```bash
# Full format
postgres://user:password@host:port/database

# Examples
postgres://postgres:secret@localhost:5432/mydb
postgresql://user:pass@db.example.com/production
postgres://localhost/mydb  # Local with defaults
```
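
Connection strings in this form are ordinary URLs, so the standard library can split them, for instance to log which database a run targeted without leaking the password. The sketch below uses `urllib.parse.urlsplit`; the `describe` helper is hypothetical and says nothing about how GoQuality itself parses the string.

```python
from urllib.parse import urlsplit


def describe(connection_string):
    """Split a connection string into its parts, masking the password."""
    parts = urlsplit(connection_string)
    return {
        "dialect": parts.scheme,
        "user": parts.username,
        "password": "***" if parts.password else None,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


print(describe("postgres://postgres:secret@localhost:5432/mydb"))
```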

### DuckDB

```bash
# In-memory database
duckdb://:memory:

# File database
duckdb:///path/to/database.db

# CSV/Parquet files (auto-detected)
/path/to/data.csv
/path/to/data.parquet
./relative/path/data.csv
```

### Snowflake

```bash
# Full format
snowflake://user@account/database/schema?warehouse=WAREHOUSE

# Examples
snowflake://john@xy12345/analytics/public?warehouse=COMPUTE_WH
snowflake://user@account/db/schema?warehouse=WH&role=ANALYST
```

**Environment Variables:**
```bash
export SNOWFLAKE_ACCOUNT=xy12345
export SNOWFLAKE_USER=john
export SNOWFLAKE_PASSWORD=secret
export SNOWFLAKE_DATABASE=analytics
export SNOWFLAKE_SCHEMA=public
export SNOWFLAKE_WAREHOUSE=COMPUTE_WH
```

### BigQuery

```bash
# Format
bigquery://project-id/dataset

# Examples
bigquery://my-project/analytics
bigquery://prod-data-warehouse/sales
```

**Environment Variables:**
```bash
export GOOGLE_CLOUD_PROJECT=my-project
export BIGQUERY_DATASET=analytics
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```

### MySQL / MariaDB

```bash
# Full format
mysql://user:password@host:port/database

# Examples
mysql://root:secret@localhost:3306/mydb
mysql://user:pass@db.example.com/production
mariadb://user:pass@localhost/mydb  # MariaDB compatible

# Cloud providers
mysql://admin:pass@mydb.cluster-xxxxx.us-east-1.rds.amazonaws.com:3306/mydb  # AWS RDS
mysql://user:pass@34.xxx.xxx.xxx:3306/mydb                                    # Cloud SQL
```

**Environment Variables:**
```bash
export MYSQL_HOST=localhost
export MYSQL_PORT=3306
export MYSQL_USER=myuser
export MYSQL_PASSWORD=secret
export MYSQL_DATABASE=mydb
```

### Microsoft SQL Server

```bash
# Full format
mssql://user:password@host:port/database

# Examples
mssql://sa:Password123@localhost:1433/mydb
sqlserver://user:pass@server.database.windows.net/mydb  # Azure SQL

# With schema
mssql://user:pass@localhost/mydb?schema=dbo

# Windows Authentication
mssql://localhost/mydb?trusted_connection=true
```

**Environment Variables:**
```bash
export MSSQL_HOST=localhost
export MSSQL_PORT=1433
export MSSQL_USER=sa
export MSSQL_PASSWORD=secret
export MSSQL_DATABASE=mydb
export MSSQL_SCHEMA=dbo
```

### Databricks

```bash
# Full format
databricks://hostname/schema?http_path=/sql/...&catalog=main&access_token=xxx

# Examples
databricks://my-workspace.cloud.databricks.com/default?http_path=/sql/1.0/warehouses/abc123&access_token=dapiXXX

# Azure Databricks
databricks://adb-123456789.7.azuredatabricks.net/default?http_path=/sql/1.0/warehouses/abc&access_token=dapiXXX

# With Unity Catalog
databricks://hostname/myschema?http_path=/sql/1.0/warehouses/abc&catalog=production&access_token=dapiXXX
```

**Environment Variables:**
```bash
export DATABRICKS_HOST=my-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapiXXXXXXXXXX
export DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/abc123
export DATABRICKS_CATALOG=main
export DATABRICKS_SCHEMA=default
```

---

## Project Configuration (goquality.toml)

Store database connections, AI provider settings, and environments in a TOML configuration file.

### File Location

GoQuality looks for project config in:
1. `./goquality.toml` (project root)
2. `./.goquality/config.toml`
3. `~/.config/goquality/config.toml` (user-level defaults)

### Quick Start

```bash
# Create a new goquality.toml
goquality config init

# List configured connections
goquality connections list

# Test a connection
goquality connections test dev

# Validate configuration
goquality config validate
```

### Full Example

```toml
# goquality.toml - Project configuration

[project]
name = "My Data Project"

#──────────────────────────────────────────────────────────────────────────────
# DATABASE CONNECTIONS
#──────────────────────────────────────────────────────────────────────────────

[connections]
default = "dev"  # Default connection when --source not specified

[connections.local]
connection_string = "duckdb://:memory:"
description = "Local testing with DuckDB"

[connections.dev]
dialect = "postgres"
host = "localhost"
port = 5432
database = "myapp_dev"
user = "${DB_USER}"           # Environment variable interpolation
password = "${DB_PASSWORD}"
description = "Development database"

[connections.staging]
dialect = "postgres"
host = "${STAGING_DB_HOST}"
database = "myapp_staging"
user = "${STAGING_DB_USER}"
password = "${STAGING_DB_PASSWORD}"
description = "Staging environment"

[connections.prod]
connection_string = "postgres://${PROD_USER}:${PROD_PASS}@prod.example.com/myapp"
description = "Production database (read-only)"

[connections.warehouse]
dialect = "snowflake"
host = "xy12345.snowflakecomputing.com"
database = "analytics"
schema = "public"
user = "${SNOWFLAKE_USER}"
password = "${SNOWFLAKE_PASSWORD}"
description = "Snowflake data warehouse"

[connections.warehouse.options]
warehouse = "COMPUTE_WH"
role = "ANALYST"

#──────────────────────────────────────────────────────────────────────────────
# AI / LLM CONFIGURATION (for `goquality generate`)
#──────────────────────────────────────────────────────────────────────────────

[ai]
default = "openai"  # Default AI provider

[ai.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4o"  # Optional: override default model

[ai.anthropic]
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4-20250514"

[ai.ollama]
host = "http://localhost:11434"
model = "llama3"

#──────────────────────────────────────────────────────────────────────────────
# ENVIRONMENTS (bundle connection + AI + settings per environment)
#──────────────────────────────────────────────────────────────────────────────

[environments]
default = "dev"

[environments.dev]
connection = "dev"
ai = "ollama"              # Use local LLM in dev
fail_threshold = 10        # More lenient in development

[environments.staging]
connection = "staging"
ai = "openai"
fail_threshold = 5

[environments.prod]
connection = "prod"
ai = "openai"
fail_threshold = 0         # Zero tolerance in production

#──────────────────────────────────────────────────────────────────────────────
# DEFAULT CLI OPTIONS
#──────────────────────────────────────────────────────────────────────────────

[defaults]
parallel = true
workers = 4
output = "table"
notify = true
```
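
The following sketch shows one way a named connection from this file could be resolved into a connection string, including `${VAR}` interpolation. It is an illustration under assumptions: the `interpolate` and `connection_url` helpers are hypothetical, unset variables are treated as errors, and GoQuality's own resolution rules may differ.

```python
import os
import re
from urllib.parse import quote

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # Python 3.10
    import tomli as tomllib  # the backport GoQuality depends on for 3.10

# Trimmed copy of the [connections] tables from the example above.
CONFIG_TEXT = """
[connections]
default = "dev"

[connections.local]
connection_string = "duckdb://:memory:"

[connections.dev]
dialect = "postgres"
host = "localhost"
port = 5432
database = "myapp_dev"
user = "${DB_USER}"
password = "${DB_PASSWORD}"
"""

ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value, env):
    """Expand ${NAME} references in strings; raise KeyError if NAME is unset."""
    if not isinstance(value, str):
        return value
    return ENV_REF.sub(lambda match: env[match.group(1)], value)


def connection_url(config, name=None, env=None):
    """Resolve a named connection (or the default) to a connection string."""
    env = os.environ if env is None else env
    connections = config["connections"]
    conn = connections[name or connections["default"]]
    if "connection_string" in conn:
        return interpolate(conn["connection_string"], env)
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    user = quote(interpolate(conn["user"], env), safe="")
    password = quote(interpolate(conn["password"], env), safe="")
    host = interpolate(conn["host"], env)
    port = f":{conn['port']}" if "port" in conn else ""
    return f"{conn['dialect']}://{user}:{password}@{host}{port}/{conn['database']}"


config = tomllib.loads(CONFIG_TEXT)
print(connection_url(config, env={"DB_USER": "app", "DB_PASSWORD": "s3cret"}))
```

Failing loudly on an unset variable is a deliberate choice in this sketch: a silently empty password tends to surface later as a confusing authentication error.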

### Using Named Connections

```bash
# Use default connection (from goquality.toml)
goquality check

# Use named connection
goquality check --source dev
goquality check --source staging
goquality check --source warehouse

# Explicit connection string still works
goquality check --source postgres://user:pass@localhost/mydb
```

### Using Environments

```bash
# Use default environment
goquality check

# Use named environment (bundles connection + AI + settings)
goquality check --env prod
goquality generate --env dev

# Environment via environment variable
export GOQUALITY_ENV=staging
goquality check
```

### Connection Management Commands

```bash
# List all configured connections
goquality connections list
goquality connections list --verbose

# Test a specific connection
goquality connections test dev
goquality connections test  # Tests default connection

# Test all connections
goquality connections test-all

# Show connection details (credentials masked)
goquality connections show dev
```

### Config Management Commands

```bash
# Create new config file
goquality config init
goquality config init --name "My Project"

# Show current configuration
goquality config show
goquality config show --verbose

# Validate configuration
goquality config validate

# Show config file path
goquality config path
```

---

## CI/CD Integration

### GitHub Actions

```yaml
name: Data Quality

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 6 * * *'  # Daily at 06:00 UTC

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install GoQuality
        run: pip install goquality  # PostgreSQL support is included in the base install

      - name: Validate Configuration
        run: goquality validate

      - name: Run Data Quality Checks
        run: |
          goquality check \
            --source "${{ secrets.DATABASE_URL }}" \
            --output junit \
            --fail-threshold 1 \
            > results.xml

      - name: Upload Test Results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: quality-report
          path: results.xml

      - name: Publish Test Results
        uses: dorny/test-reporter@v1
        if: always()
        with:
          name: GoQuality Results
          path: results.xml
          reporter: java-junit
```

### GitHub Actions with Slack Notifications

```yaml
name: Data Quality with Notifications

on:
  schedule:
    - cron: '0 6 * * *'  # Daily at 06:00 UTC

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install GoQuality
        run: pip install goquality  # PostgreSQL support is included in the base install

      - name: Run Data Quality Checks
        run: |
          goquality check \
            --source "${{ secrets.DATABASE_URL }}" \
            --output junit \
            --slack-webhook "${{ secrets.SLACK_WEBHOOK_URL }}" \
            > results.xml
        env:
          GOQUALITY_SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK_URL }}
```

### GitLab CI

```yaml
data-quality:
  image: python:3.11
  stage: test
  script:
    - pip install goquality  # PostgreSQL support is included in the base install
    - goquality validate
    - goquality check --source "$DATABASE_URL" --output junit > report.xml
  artifacts:
    reports:
      junit: report.xml
    paths:
      - report.xml
    expire_in: 1 week
```

### Pre-commit Hook

```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: goquality-validate
        name: Validate GoQuality Config
        entry: goquality validate
        language: system
        files: goquality\.yaml$
        pass_filenames: false
```

---

## Standard Library Types

GoQuality includes 320+ pre-defined types organized by category.

### Core Types

| Type | Base | Description |
|------|------|-------------|
| `Email` | String | Email address |
| `EmailNullable` | String | Optional email |
| `UUID` | String | UUID v4 |
| `URL` | String | HTTP/HTTPS URL |
| `PhoneNumber` | String | International phone |
| `Hostname` | String | DNS hostname |

### Finance Types

| Type | Base | Description |
|------|------|-------------|
| `USD` | Decimal | US Dollar amount |
| `EUR` | Decimal | Euro amount |
| `CreditCardNumber` | String | Credit card (Luhn) |
| `IBAN` | String | International bank account |
| `BIC` | String | Bank identifier code |
| `ABARoutingNumber` | String | US routing number |

### Healthcare Types

| Type | Base | Description |
|------|------|-------------|
| `ICD10` | String | ICD-10 diagnosis code |
| `CPT` | String | CPT procedure code |
| `NPI` | String | National Provider ID |
| `NDC` | String | National Drug Code |
| `LOINC` | String | Lab test code |

### E-commerce Types

| Type | Base | Description |
|------|------|-------------|
| `SKU` | String | Stock keeping unit |
| `UPC` | String | UPC-A barcode |
| `EAN13` | String | EAN-13 barcode |
| `ASIN` | String | Amazon product ID |
| `ISBN13` | String | Book ISBN-13 |

### Regional Types

| Type | Base | Description |
|------|------|-------------|
| `SSN` | String | US Social Security |
| `USZipCode` | String | US ZIP code |
| `USState` | String | US state code |
| `GermanVATNumber` | String | German VAT |
| `UKPostcode` | String | UK postcode |
| `IndianPAN` | String | Indian tax ID |

### Analytics Types

| Type | Base | Description |
|------|------|-------------|
| `Percentage` | Decimal | 0-100 percentage |
| `Rate` | Decimal | 0-1 rate |
| `Score` | Decimal | 0-100 score |
| `MRR` | Decimal | Monthly recurring revenue |
| `NPSScore` | Integer | Net promoter score |

Browse all types:
```bash
goquality types
goquality types --tag finance
goquality types --search email
```

---

## Custom Validators (Plugins)

GoQuality supports custom validation logic via Python plugins.

### Creating a Validator

```python
# .goquality/plugins/my_validators.py

from goquality.plugins import register_validator

@register_validator("is_palindrome", description="Check if string is palindrome")
def is_palindrome(value: str) -> bool:
    # Compare case-insensitively and ignore spaces
    clean = value.lower().replace(" ", "")
    return clean == clean[::-1]

# The registered name matches what the function actually checks
@register_validator("divisible_by_three", description="Check divisibility by 3")
def divisible_by_three(value: int) -> bool:
    return value % 3 == 0
```
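
A validator is a plain function from a value to `bool`, so it can be unit tested without a database, and a family of similar checks can be generated with a closure. The sketch below is only an illustration: `make_divisible_by` is a hypothetical helper, not part of GoQuality, and the registration call is left as a comment so the snippet runs without GoQuality installed.

```python
from typing import Callable


def make_divisible_by(divisor: int) -> Callable[[int], bool]:
    """Build a validator that checks divisibility by `divisor` (hypothetical helper)."""
    if divisor == 0:
        raise ValueError("divisor must be non-zero")

    def validator(value: int) -> bool:
        return value % divisor == 0

    # Give the generated function a readable name for error messages.
    validator.__name__ = f"divisible_by_{divisor}"
    return validator


divisible_by_five = make_divisible_by(5)

# Plain functions can be tested directly, before any registration.
assert divisible_by_five(10)
assert not divisible_by_five(7)

# Registration would then reuse the decorator shown above, for example:
#   register_validator("divisible_by_5", description="Check divisibility by 5")(divisible_by_five)
```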

### Built-in Advanced Validators

| Validator | Description |
|-----------|-------------|
| `luhn` | Luhn checksum (credit cards) |
| `iban` | IBAN checksum |
| `isbn10` | ISBN-10 checksum |
| `isbn13` | ISBN-13 checksum |
| `ean13` | EAN-13 barcode checksum |
| `upc` | UPC-A barcode checksum |
| `email_format` | Email format validation |
| `ipv4` | IPv4 address format |
| `ipv6` | IPv6 address format |
| `mac_address` | MAC address format |
| `json` | Valid JSON string |
| `base64` | Valid Base64 encoding |
| `future_date` | Date in the future |
| `past_date` | Date in the past |

---

## Security & Observability

### Security Features

GoQuality validates contract SQL before it runs and enforces query timeouts:

#### SQL Injection Prevention

All contract SQL is validated before execution:
- Only SELECT statements are allowed
- Dangerous keywords are blocked (DROP, DELETE, UPDATE, etc.)
- SQL is parsed and validated using AST analysis
- Invalid SQL is rejected at config load time

```yaml
contracts:
  - name: safe_contract
    sql: SELECT * FROM users WHERE active = true  # ✅ Valid
    expect: not_empty
    
  # This would be rejected:
  # sql: DROP TABLE users  # ❌ Rejected
```
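
The rules above can be approximated in a few lines of Python. This is a simplified keyword-based stand-in, not GoQuality's AST-based validator: `is_safe_select` and `BLOCKED_KEYWORDS` are hypothetical names, and every blocklist entry beyond DROP, DELETE, and UPDATE is an assumption.

```python
import re

# Hypothetical blocklist; the full list GoQuality uses is not documented here.
BLOCKED_KEYWORDS = {
    "DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE", "CREATE", "GRANT",
}

# Single-quoted string literals (with '' escapes), line comments, block comments.
_NOISE = re.compile(r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.DOTALL)


def is_safe_select(sql: str) -> bool:
    """Return True if `sql` looks like a single read-only SELECT statement."""
    # Remove literals and comments so their contents cannot hide or fake keywords.
    stripped = _NOISE.sub(" ", sql).strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input, or more than one statement
    words = re.findall(r"\w+", stripped.upper())
    if not words or words[0] != "SELECT":
        return False
    return not BLOCKED_KEYWORDS.intersection(words)


print(is_safe_select("SELECT * FROM users WHERE active = true"))  # True
print(is_safe_select("DROP TABLE users"))                         # False
print(is_safe_select("SELECT 1; DROP TABLE users"))               # False
```

Matching whole words keeps a column such as `updated_at` from being mistaken for `UPDATE`. A real parser is still the safer choice, because keyword matching cannot understand nested or dialect-specific syntax.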

#### Query Timeout

Configure query timeouts to prevent hanging validations:

```bash
# Via environment variable
export GOQUALITY_QUERY_TIMEOUT_SECONDS=600

# Or in .env file
GOQUALITY_QUERY_TIMEOUT_SECONDS=600
```

The default timeout is 300 seconds (5 minutes).

### Observability Features

#### Structured Logging

GoQuality supports structured logging to files:

```bash
# Log to file
export GOQUALITY_LOG_FILE=goquality.log

# JSON format for log aggregation
export GOQUALITY_LOG_JSON=true

# Set log level
export GOQUALITY_LOG_LEVEL=DEBUG
```

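Log files rotate automatically. By default a file rotates at 10 MB and 5 backups are kept; both limits can be changed with `GOQUALITY_LOG_FILE_MAX_BYTES` and `GOQUALITY_LOG_FILE_BACKUP_COUNT`.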
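
For readers who want to see what this configuration amounts to, the following sketch builds an equivalent logger with Python's standard `logging` module. It illustrates the documented behavior (rotation at 10 MB, 5 backups, optional JSON lines) and is not GoQuality's actual logging code; `build_logger`, `JsonFormatter`, the logger name, and the JSON field names are all assumptions.

```python
import json
import logging
from logging.handlers import RotatingFileHandler
from typing import Optional


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(
    log_file: Optional[str] = None,
    json_format: bool = False,
    level: str = "INFO",
    max_bytes: int = 10 * 1024 * 1024,  # 10 MB, the documented default
    backup_count: int = 5,              # the documented default
) -> logging.Logger:
    logger = logging.getLogger("goquality_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(level.upper())
    if log_file:
        handler: logging.Handler = RotatingFileHandler(
            log_file, maxBytes=max_bytes, backupCount=backup_count
        )
    else:
        handler = logging.StreamHandler()  # writes to stderr by default
    if json_format:
        handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

Calling `build_logger(log_file="goquality.log", json_format=True, level="DEBUG")` mirrors the three environment variables shown above.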

#### Metrics Collection

Collect validation metrics for analysis:

```bash
# Save metrics to JSON
goquality check --source postgres://... --metrics-file metrics.json
```

Metrics include:
- Overall statistics (tables, columns, checks, pass rates)
- Type validation metrics
- Reference validation metrics
- Contract validation metrics
- Performance metrics (query times, durations)

Example metrics output:
```json
{
  "run_id": "abc123",
  "timestamp": "2024-01-15T10:30:00Z",
  "total_tables": 10,
  "total_columns": 45,
  "total_checks": 45,
  "passed_checks": 42,
  "failed_checks": 3,
  "duration_seconds": 12.5,
  "pass_rate": 0.933,
  "query_count": 45,
  "avg_query_time_seconds": 0.278
}
```
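
A metrics file like this can gate a pipeline. The sketch below assumes the field names from the example output; `evaluate_metrics`, `load_metrics`, and the 0.95 threshold are hypothetical choices for illustration, not part of GoQuality.

```python
import json
from pathlib import Path


def load_metrics(path: str) -> dict:
    """Read a metrics file written with --metrics-file."""
    return json.loads(Path(path).read_text())


def evaluate_metrics(metrics: dict, min_pass_rate: float = 0.95) -> list[str]:
    """Return a list of problems; an empty list means the run is acceptable."""
    problems = []
    # Sanity check: the counters should add up.
    if metrics["passed_checks"] + metrics["failed_checks"] != metrics["total_checks"]:
        problems.append("passed_checks + failed_checks does not equal total_checks")
    if metrics["pass_rate"] < min_pass_rate:
        problems.append(
            f"pass_rate {metrics['pass_rate']:.3f} is below {min_pass_rate:.3f}"
        )
    return problems


# Values copied from the example output above.
sample = {
    "total_checks": 45,
    "passed_checks": 42,
    "failed_checks": 3,
    "pass_rate": 0.933,
}
print(evaluate_metrics(sample))  # ['pass_rate 0.933 is below 0.950']
```

In CI, a script could call `evaluate_metrics(load_metrics("metrics.json"))` and exit with a non-zero status when the list is not empty.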

#### Performance Profiling

Enable performance profiling to identify bottlenecks:

```bash
goquality check --source postgres://... --profile
```

Profiling shows:
- Query execution times
- Table validation durations
- Slowest tables
- Overall performance summary

Example output:
```
Performance Summary:
  Total duration: 12.50s
  Tables validated: 10
  Total queries: 45
  Total rows: 1,234,567
  Avg query time: 0.278s
  Slowest table: orders (3.45s)
```
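
To make the summary concrete, here is a minimal sketch of how per-table durations and the slowest table could be collected. It is an assumption about the general technique (wall-clock timing around each table's validation), not a description of GoQuality's profiler; `Profiler` and `track` are hypothetical names.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, Optional, Tuple


class Profiler:
    """Accumulate wall-clock time per table."""

    def __init__(self) -> None:
        self.durations: Dict[str, float] = {}

    @contextmanager
    def track(self, table: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if validation raised an error.
            elapsed = time.perf_counter() - start
            self.durations[table] = self.durations.get(table, 0.0) + elapsed

    def slowest(self) -> Optional[Tuple[str, float]]:
        """Return (table, seconds) for the slowest table, or None if empty."""
        if not self.durations:
            return None
        return max(self.durations.items(), key=lambda item: item[1])


profiler = Profiler()
with profiler.track("orders"):
    time.sleep(0.02)  # stands in for running the table's validation queries
with profiler.track("users"):
    pass

table, seconds = profiler.slowest()
print(f"Slowest table: {table} ({seconds:.2f}s)")
```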

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `GOQUALITY_QUERY_TIMEOUT_SECONDS` | Query timeout in seconds | 300 |
| `GOQUALITY_LOG_FILE` | Path to log file | None (stderr) |
| `GOQUALITY_LOG_JSON` | Output logs as JSON | false |
| `GOQUALITY_LOG_LEVEL` | Log level (DEBUG, INFO, WARNING, ERROR) | INFO |
| `GOQUALITY_LOG_FILE_MAX_BYTES` | Max log file size before rotation | 10485760 (10MB) |
| `GOQUALITY_LOG_FILE_BACKUP_COUNT` | Number of backup log files | 5 |
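
The table can be read as a typed settings object. The sketch below loads these variables with the documented defaults using only the standard library; `RuntimeSettings` and `load_settings` are hypothetical names, and treating only the string `true` (case-insensitive) as enabling JSON output is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeSettings:
    query_timeout_seconds: int = 300
    log_file: Optional[str] = None       # None means log to stderr
    log_json: bool = False
    log_level: str = "INFO"
    log_file_max_bytes: int = 10_485_760  # 10 MB
    log_file_backup_count: int = 5


def load_settings(env: Mapping[str, str] = os.environ) -> RuntimeSettings:
    """Build settings from environment variables, falling back to the defaults."""
    return RuntimeSettings(
        query_timeout_seconds=int(env.get("GOQUALITY_QUERY_TIMEOUT_SECONDS", "300")),
        log_file=env.get("GOQUALITY_LOG_FILE") or None,
        # Assumption: only "true" (any case) turns JSON output on.
        log_json=env.get("GOQUALITY_LOG_JSON", "false").strip().lower() == "true",
        log_level=env.get("GOQUALITY_LOG_LEVEL", "INFO").upper(),
        log_file_max_bytes=int(env.get("GOQUALITY_LOG_FILE_MAX_BYTES", "10485760")),
        log_file_backup_count=int(env.get("GOQUALITY_LOG_FILE_BACKUP_COUNT", "5")),
    )


print(load_settings({"GOQUALITY_QUERY_TIMEOUT_SECONDS": "600"}).query_timeout_seconds)  # 600
```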

---

## Troubleshooting

### Common Issues

**"Config file not found"**
```bash
# Create a config file
goquality init

# Or specify path
goquality check --config path/to/config.yaml
```

**"Unknown type: X"**
```bash
# List available types
goquality types --search X

# Check if custom type is defined in config
goquality validate
```

**"Connection failed"**
```bash
# Run diagnostics
goquality doctor --source YOUR_CONNECTION_STRING

# Check if driver is installed
# (quotes keep shells such as zsh from expanding the brackets)
pip install "goquality[postgres]"  # or [snowflake], [bigquery]
```

**"LLM API error"**
```bash
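# Check API key is set (without printing the secret)
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set" || echo "OPENAI_API_KEY is not set"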

# Try different provider
goquality generate --source ... --provider anthropic
goquality generate --source ... --provider ollama
```

### Debug Mode

```bash
# Enable verbose logging
GOQUALITY_DEBUG=1 goquality check --source ...

# Or use log level
GOQUALITY_LOG_LEVEL=DEBUG goquality check --source ...
```

### Logging to File

```bash
# Log to file
export GOQUALITY_LOG_FILE=goquality.log
goquality check --source postgres://localhost/mydb

# JSON format for log aggregation
export GOQUALITY_LOG_JSON=true
export GOQUALITY_LOG_FILE=goquality.log
goquality check --source postgres://localhost/mydb
```

### Notification Environment Variables

```bash
# Webhook URL for notifications (alternative to --webhook flag)
export GOQUALITY_WEBHOOK_URL=https://your-api.com/goquality-results

# Slack webhook URL for notifications (alternative to --slack-webhook flag)
export GOQUALITY_SLACK_WEBHOOK=https://hooks.slack.com/services/xxx/yyy/zzz
```

### Getting Help

```bash
# General help
goquality --help

# Command-specific help
goquality check --help
goquality generate --help
```

---

## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

Contributions welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## Links

- Documentation: https://goquality.dev/docs
- GitHub: https://github.com/goquality/goquality
- PyPI: https://pypi.org/project/goquality/
