Metadata-Version: 2.4
Name: databridge-ai-pro
Version: 0.49.2
Summary: DataBridge AI Pro - Advanced data reconciliation, AI agents, and enterprise features
Project-URL: Homepage, https://github.com/datanexum/DATABRIDGE_AI
Project-URL: Documentation, https://github.com/datanexum/DATABRIDGE_AI/wiki/Pro-Features
Project-URL: Repository, https://github.com/datanexum/DATABRIDGE_AI
Project-URL: Issues, https://github.com/datanexum/DATABRIDGE_AI/issues
Author-email: Datanexum Consulting LLC <legal@datanexum.com>
License: Proprietary
License-File: LICENSE
Keywords: ai,analytics,cortex,data,data-catalog,data-mart,dbt,enterprise,graphrag,hierarchy,lineage,mcp,observability,reconciliation,snowflake
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Information Technology
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.10
Requires-Dist: databridge-ai>=0.40.0
Requires-Dist: networkx>=3.0.0
Requires-Dist: sentence-transformers>=2.0.0
Requires-Dist: snowflake-connector-python>=3.0.0
Provides-Extra: all
Requires-Dist: chromadb>=0.4.0; extra == 'all'
Requires-Dist: langchain>=0.1.0; extra == 'all'
Requires-Dist: prometheus-client>=0.17.0; extra == 'all'
Requires-Dist: snowflake-snowpark-python>=1.0.0; extra == 'all'
Provides-Extra: cortex
Requires-Dist: snowflake-snowpark-python>=1.0.0; extra == 'cortex'
Provides-Extra: dev
Requires-Dist: build>=1.0.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: twine>=4.0.0; extra == 'dev'
Provides-Extra: examples
Requires-Dist: databridge-ai-examples>=0.40.0; extra == 'examples'
Provides-Extra: graphrag
Requires-Dist: chromadb>=0.4.0; extra == 'graphrag'
Requires-Dist: langchain>=0.1.0; extra == 'graphrag'
Provides-Extra: observability
Requires-Dist: prometheus-client>=0.17.0; extra == 'observability'
Description-Content-Type: text/markdown

# DataBridge AI Pro

*Enterprise-grade data reconciliation, AI agents, and advanced analytics — ~297 tools.*

---

## Overview

**DataBridge AI Pro** extends the Community Edition (~106 tools) with **19 additional modules** and ~191 tools for enterprise data management. Pro includes everything in Community Edition plus:

### Data Infrastructure
| Module | Tools | Description |
|--------|-------|-------------|
| **Hierarchy Builder** | 49 | Multi-level hierarchy projects (up to 15 levels) for financial reporting and organizational structures |
| **Wright Pipeline** | 31 | 4-object data mart factory (VW_1 → DT_2 → DT_3A → DT_3) with hierarchy integration |
| **Cortex AI** | 26 | Snowflake Cortex integration — natural language to SQL, AI reasoning loops, semantic models |
| **Data Catalog** | 19 | Centralized metadata registry with business glossary and automatic lineage detection |
| **Faux Objects** | 18 | Domain persona-based hierarchy generation and semantic modeling |
| **Connections** | 16 | Multi-database connectivity management for Snowflake, PostgreSQL, MySQL, and more |
| **Hierarchy-Graph Bridge** | 5 | Event-driven sync between hierarchies, GraphRAG vector store, and lineage graph |

### AI & Automation
| Module | Tools | Description |
|--------|-------|-------------|
| **AI Orchestrator** | 16 | Multi-agent task coordination, event publishing, and workflow management |
| **PlannerAgent** | 11 | AI-powered workflow planning, agent suggestions, and execution optimization |
| **GraphRAG Engine** | 10 | Anti-hallucination layer with graph + vector retrieval-augmented generation |
| **Unified AI Agent** | 10 | Cross-system operations with Book/Librarian/Researcher pattern |
| **Smart Recommendations** | 5 | Context-aware feature suggestions and guided workflows |

### Governance & Operations
| Module | Tools | Description |
|--------|-------|-------------|
| **Data Observability** | 15 | Real-time metrics, alerting, anomaly detection, and health scoring |
| **Data Versioning** | 12 | Semantic versioning, snapshots, rollback, and diff for all data objects |
| **Git/CI-CD** | 12 | Automated git workflows, GitHub PR creation, and CI/CD pipeline generation |
| **Lineage Tracking** | 11 | Column-level lineage from SQL/dbt with impact analysis |
| **Console Dashboard** | 5 | Real-time broadcast messaging and system monitoring |
| **Schema Matcher** | 5 | Cross-database schema comparison and fuzzy column mapping |
| **Data Matcher** | 4 | Row-level data comparison across database connections |

## Requirements

- DataBridge AI Community Edition >= 0.40.0
- Valid Pro or Enterprise license key
- Python 3.10+
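
The requirements above can be checked with a short preflight script before installing. This is a minimal sketch, not part of DataBridge: `check_requirements` and `parse_version` are hypothetical helpers, and the license check only looks at the process environment, not at a `.env` file.

```python
import os
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)
MIN_CE = (0, 40, 0)


def parse_version(text):
    """Turn '0.49.2' into (0, 49, 2), keeping only the leading digits of each part."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)


def check_requirements(env=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []

    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.10+ is required")

    try:
        installed = metadata.version("databridge-ai")
    except metadata.PackageNotFoundError:
        problems.append("databridge-ai (Community Edition) is not installed")
    else:
        if parse_version(installed) < MIN_CE:
            problems.append(f"databridge-ai {installed} is older than 0.40.0")

    if not env.get("DATABRIDGE_LICENSE_KEY"):
        problems.append("DATABRIDGE_LICENSE_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(f"- {problem}")
```

An empty result only means the prerequisites look present; it does not validate the license key itself.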

## Installation

### Step 1: Set Your License Key

```bash
# Set environment variable
export DATABRIDGE_LICENSE_KEY="DB-PRO-YOURCOMPANY-20260101-yoursignature"

# Or add to .env file
echo 'DATABRIDGE_LICENSE_KEY=DB-PRO-YOURCOMPANY-20260101-yoursignature' >> .env
```

### Step 2: Install from GitHub Packages

```bash
# Install Pro package (replace ghp_TOKEN with your GitHub personal access token)
pip install databridge-ai-pro --extra-index-url https://ghp_TOKEN@raw.githubusercontent.com/datanexum/DATABRIDGE_AI/main/
```

### Step 3: Verify Installation

```python
from databridge_ai_pro import get_pro_status

status = get_pro_status()
print(f"License valid: {status['license_valid']}")
print(f"Features: {status['features']}")
```

## Pro Examples Add-on

The **Pro Examples** package (`databridge-ai-examples`) provides comprehensive tests and tutorials:

| Category | Contents | Count |
|----------|----------|-------|
| Beginner Use Cases | Pizza, friends, school, sports tutorials | 4 cases |
| Financial Use Cases | SEC EDGAR, Apple, Microsoft analysis | 7 cases |
| Faux Objects Use Cases | Domain persona tutorials | 8 cases |
| CE Test Suite | Data loading, hashing, fuzzy, dbt, quality, diff | ~12 files |
| Pro Test Suite | Hierarchy, cortex, catalog, versioning, wright | ~15 files |
| Shared Fixtures | conftest.py, sample data | 2 files |

```bash
# Install CE tests + beginner tutorials
pip install databridge-ai-examples

# Install with Pro tests + advanced tutorials (requires Pro key)
# Quote the extra so shells such as zsh do not expand the brackets
pip install "databridge-ai-examples[pro]"
```

## Feature Highlights

### Cortex AI Agent

AI-powered data analysis using Snowflake Cortex:

```python
# Via MCP tools
cortex_complete(prompt="Analyze sales trends", model="mistral-large")
cortex_reason(question="Why did revenue drop in Q3?", max_steps=5)

# Cortex Analyst — natural language to SQL
analyst_ask(question="What was total revenue by region?",
            semantic_model_file="@ANALYTICS.PUBLIC.MODELS/sales.yaml")
```

### Hierarchy Builder

Multi-level hierarchy management for financial reporting:

```python
# Create and manage hierarchies
create_hierarchy_project(name="Revenue P&L", description="Revenue hierarchy")
create_hierarchy(project_id="...", name="Product Revenue", parent_id="...")
add_source_mapping(hierarchy_id="...", source_column="ACCOUNT_CODE", source_uid="41%")

# Export and deploy
export_hierarchy_csv(project_id="...")
generate_hierarchy_scripts(project_id="...")
```
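
The `source_uid="41%"` value in the mapping above resembles a SQL `LIKE` pattern. Assuming that is how it is interpreted (the README does not confirm this), the sketch below shows which account codes such a pattern would select. `like_matches` is a hypothetical helper written for illustration, not a DataBridge function.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern ('%' = any run of characters, '_' = one character) into a regex."""
    pieces = []
    for ch in pattern:
        if ch == "%":
            pieces.append(".*")
        elif ch == "_":
            pieces.append(".")
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces), re.DOTALL)


def like_matches(pattern, value):
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


# Hypothetical account codes, used only to illustrate the matching rule
account_codes = ["4100", "4150", "4200", "5100"]
selected = [code for code in account_codes if like_matches("41%", code)]
print(selected)
```

Under this assumption, a mapping with `41%` would attach every account code beginning with `41` to the hierarchy node.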

### Wright Pipeline

Generate complete data mart structures, with direct hierarchy integration:

```python
# Create a data mart configuration
create_mart_config(
    project_name="upstream_gross",
    report_type="GROSS",
    hierarchy_table="TBL_0_GROSS_LOS_REPORT_HIERARCHY"
)

# Generate the full 4-object pipeline
generate_mart_pipeline(config_name="upstream_gross")

# Generate Wright pipeline directly from a hierarchy project
wright_from_hierarchy(project_id="revenue-pl", report_type="GROSS")

# Sync Wright mart config when hierarchy changes
wright_hierarchy_sync(config_name="upstream_gross", project_id="revenue-pl")
```

### Hierarchy-Graph Bridge

Event-driven sync between hierarchies and downstream subsystems:

```python
# Check bridge sync status
hierarchy_graph_status(project_id="revenue-pl")

# Reindex hierarchy into vector store for RAG search
hierarchy_reindex(project_id="revenue-pl")

# Build lineage graph from hierarchy relationships
hierarchy_lineage_build(project_id="revenue-pl")

# Search hierarchies via RAG-powered vector index
hierarchy_rag_search(query="Which hierarchies map to ACCOUNT_CODE?", top_k=5)

# Analyze downstream impact of hierarchy changes
hierarchy_impact_analysis(project_id="revenue-pl", node_id="h-42")
```

### GraphRAG Engine

Validate AI outputs against your data:

```python
# Search with context
results = rag_search(query="revenue by region", top_k=5)

# Validate AI-generated content
validation = rag_validate_output(content="Revenue increased 20%", sources=results)
```

### Data Observability

Monitor data quality in real time:

```python
# Record metrics
obs_record_metric(name="hierarchy.validation.success_rate", value=98.5,
                  type="gauge", tags='{"project_id": "revenue-pl"}')

# Create alert rules
obs_create_alert_rule(name="row_count_drop",
                      metric_name="row_count", threshold=900000,
                      comparison="<", severity="critical")

# Get asset health
obs_get_asset_health(asset_id="revenue-pl", asset_type="hierarchy_project")
```
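
To make the `threshold` and `comparison` parameters of the alert rule concrete, here is a stand-alone sketch of how such a rule could be evaluated against a metric value. It is an illustration only, not DataBridge's implementation: `rule_fires` is a hypothetical helper, and the set of supported comparison operators is an assumption.

```python
import operator

# Assumed set of comparison operators; the actual supported set may differ
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def rule_fires(value, threshold, comparison):
    """Return True when `value <comparison> threshold` holds, meaning the alert would trigger."""
    try:
        compare = COMPARISONS[comparison]
    except KeyError:
        raise ValueError(f"unsupported comparison: {comparison!r}") from None
    return compare(value, threshold)


# With the row_count_drop rule above: alert when row_count < 900000
print(rule_fires(850_000, 900_000, "<"))
print(rule_fires(950_000, 900_000, "<"))
```

Read this way, the `row_count_drop` rule triggers a critical alert whenever the recorded `row_count` falls below 900,000.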

### Data Catalog

Comprehensive metadata management:

```python
# Scan a connection for metadata
catalog_scan_connection(connection_id="snowflake_prod")

# Search the catalog
results = catalog_search(query="customer dimension")

# Get automatic lineage from SQL
lineage = catalog_auto_lineage_from_sql(sql="SELECT * FROM dim_customer")
```

### Lineage Tracking

Column-level lineage and impact analysis:

```python
# Track lineage from SQL
catalog_auto_lineage_from_sql(sql="INSERT INTO fact_sales SELECT ...")

# Analyze change impact
catalog_impact_from_asset(asset_id="dim_customer")
```

## License Tiers

| Feature | Community | Pro | Pro Examples | Enterprise |
|---------|:---------:|:---:|:------------:|:----------:|
| Data Reconciliation (~106 tools) | ✅ | ✅ | | ✅ |
| Hierarchy Builder (49 tools) | | ✅ | | ✅ |
| Wright Pipeline (31 tools) | | ✅ | | ✅ |
| Cortex AI Agent (26 tools) | | ✅ | | ✅ |
| Data Catalog (19 tools) | | ✅ | | ✅ |
| Faux Objects (18 tools) | | ✅ | | ✅ |
| Connections (16 tools) | | ✅ | | ✅ |
| AI Orchestrator (16 tools) | | ✅ | | ✅ |
| Data Observability (15 tools) | | ✅ | | ✅ |
| Data Versioning (12 tools) | | ✅ | | ✅ |
| Git/CI-CD (12 tools) | | ✅ | | ✅ |
| Lineage Tracking (11 tools) | | ✅ | | ✅ |
| PlannerAgent (11 tools) | | ✅ | | ✅ |
| GraphRAG Engine (10 tools) | | ✅ | | ✅ |
| Unified AI Agent (10 tools) | | ✅ | | ✅ |
| Hierarchy-Graph Bridge (5 tools) | | ✅ | | ✅ |
| Console Dashboard (5 tools) | | ✅ | | ✅ |
| Schema Matcher (5 tools) | | ✅ | | ✅ |
| Smart Recommendations (5 tools) | | ✅ | | ✅ |
| Data Matcher (4 tools) | | ✅ | | ✅ |
| 47 Tests + 19 Tutorials | | | ✅ | |
| Custom Agents | | | | ✅ |
| White-label | | | | ✅ |
| SLA Support | | | | ✅ |
| On-premise Deploy | | | | ✅ |

**License Key Format:** `DB-{TIER}-{CUSTOMER_ID}-{EXPIRY}-{SIGNATURE}`
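
The format above can be split into its fields with a few lines of Python. This is a sketch under stated assumptions: the expiry is taken to be `YYYYMMDD` (inferred from the `20260101` in the installation example), the customer ID is assumed to contain no hyphens, and the signature is not verified. `LicenseKey` and `parse_license_key` are hypothetical names, not part of the package.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class LicenseKey:
    tier: str
    customer_id: str
    expiry: date
    signature: str


def parse_license_key(key):
    """Split a DB-{TIER}-{CUSTOMER_ID}-{EXPIRY}-{SIGNATURE} key into its fields.

    Only the shape is checked; the signature is returned as-is and not verified.
    """
    # Split at most 4 times so a signature containing hyphens stays in one piece
    parts = key.split("-", 4)
    if len(parts) != 5 or parts[0] != "DB":
        raise ValueError("expected DB-{TIER}-{CUSTOMER_ID}-{EXPIRY}-{SIGNATURE}")
    _, tier, customer_id, expiry_text, signature = parts
    # Assumption: expiry is encoded as YYYYMMDD; strptime raises ValueError otherwise
    expiry = datetime.strptime(expiry_text, "%Y%m%d").date()
    return LicenseKey(tier, customer_id, expiry, signature)


def is_expired(key, today=None):
    """Return True when the key's expiry date is before `today` (defaults to the current date)."""
    today = date.today() if today is None else today
    return parse_license_key(key).expiry < today


parsed = parse_license_key("DB-PRO-YOURCOMPANY-20260101-yoursignature")
print(parsed.tier, parsed.customer_id, parsed.expiry.isoformat())
```

A check like this is useful for catching a mistyped or truncated key early; whether a key is actually accepted is decided by the package's own license validation.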

## Package Distribution

| Package | Location | Install |
|---------|----------|---------|
| `databridge-ai` | PyPI (public) | `pip install databridge-ai` |
| `databridge-ai-pro` | GitHub Packages (private) | `pip install databridge-ai-pro` (+ license key) |
| `databridge-ai-examples` | GitHub Packages (private) | `pip install databridge-ai-examples` (+ license key) |

## Support

- **Pro License**: Email support (support@databridge.ai)
- **Enterprise License**: Priority support with SLA

## Contact

- Sales: sales@databridge.ai
- Support: support@databridge.ai

## License

Proprietary - see [LICENSE](LICENSE) for details.
