Metadata-Version: 2.4
Name: databridge-ai
Version: 0.32.0
Summary: DataBridge AI - A headless, MCP-native data reconciliation engine with tools for hierarchy management, data quality, and analytics
Project-URL: Homepage, https://github.com/your-username/Databridge_AI
Project-URL: Documentation, https://github.com/your-username/Databridge_AI#readme
Project-URL: Repository, https://github.com/your-username/Databridge_AI.git
Project-URL: Issues, https://github.com/your-username/Databridge_AI/issues
Author-email: Your Name <your-email@example.com>
License: DATABRIDGE AI SOFTWARE LICENSE AGREEMENT
        
        Copyright (c) 2024-2026 DataBridge AI Team. All Rights Reserved.
        
        IMPORTANT: BY INSTALLING, COPYING, OR USING THIS SOFTWARE, YOU AGREE TO BE
        BOUND BY THE TERMS OF THIS LICENSE AGREEMENT. IF YOU DO NOT AGREE TO THESE
        TERMS, DO NOT INSTALL OR USE THE SOFTWARE.
        
        1. GRANT OF LICENSE
        Subject to the terms of this Agreement, DataBridge AI Team grants you a
        non-exclusive, non-transferable, limited license to use the Software for
        your internal business purposes.
        
        2. CONFIDENTIALITY
        a) The Software, including all source code, algorithms, designs, documentation,
           and any related materials, constitutes confidential and proprietary
           information of DataBridge AI Team ("Confidential Information").
        
        b) You agree to:
           - Maintain the confidentiality of all Confidential Information
           - Not disclose Confidential Information to any third party without prior
             written consent from DataBridge AI Team
           - Use Confidential Information solely for the purposes permitted under
             this Agreement
           - Protect Confidential Information using at least the same degree of care
             used to protect your own confidential information, but in no event less
             than reasonable care
           - Promptly notify DataBridge AI Team of any unauthorized disclosure or use
        
        c) Confidentiality obligations survive termination of this Agreement for a
           period of five (5) years.
        
        3. RESTRICTIONS
        You may NOT:
        a) Copy, modify, or distribute the Software except as expressly permitted
        b) Reverse engineer, decompile, or disassemble the Software
        c) Remove or alter any proprietary notices, labels, or marks
        d) Sublicense, rent, lease, or lend the Software to third parties
        e) Use the Software to develop competing products or services
        f) Share API keys, credentials, or access tokens with unauthorized parties
        
        4. DATA PROCESSING
        a) Any data processed through this Software remains your property
        b) DataBridge AI Team does not collect, store, or transmit your data
        c) You are responsible for ensuring compliance with applicable data
           protection laws (GDPR, CCPA, etc.)
        
        5. INTELLECTUAL PROPERTY
        All intellectual property rights in the Software remain with DataBridge AI
        Team. This Agreement does not grant you any rights to trademarks, service
        marks, or logos of DataBridge AI Team.
        
        6. WARRANTY DISCLAIMER
        THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. DATABRIDGE AI
        TEAM DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
        TO WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
        NON-INFRINGEMENT.
        
        7. LIMITATION OF LIABILITY
        IN NO EVENT SHALL DATABRIDGE AI TEAM BE LIABLE FOR ANY INDIRECT, INCIDENTAL,
        SPECIAL, CONSEQUENTIAL, OR PUNITIVE DAMAGES ARISING OUT OF OR RELATED TO
        THIS AGREEMENT OR THE USE OF THE SOFTWARE.
        
        8. TERMINATION
        This Agreement terminates automatically if you breach any of its terms.
        Upon termination, you must destroy all copies of the Software and certify
        such destruction in writing.
        
        9. GOVERNING LAW
        This Agreement shall be governed by and construed in accordance with the
        laws of the State of Texas, USA, without regard to its conflict of laws
        principles.
        
        10. ACCEPTANCE
        BY INSTALLING OR USING THIS SOFTWARE, YOU ACKNOWLEDGE THAT YOU HAVE READ
        THIS AGREEMENT, UNDERSTAND IT, AND AGREE TO BE BOUND BY ITS TERMS AND
        CONDITIONS. YOU ALSO AGREE THAT THIS AGREEMENT IS THE COMPLETE AND EXCLUSIVE
        STATEMENT OF THE AGREEMENT BETWEEN YOU AND DATABRIDGE AI TEAM.
        
        For licensing inquiries: support@databridge.ai
License-File: LICENSE
Keywords: ai,analytics,cortex,data,data-catalog,data-mart,data-quality,dbt,etl,hierarchy,lineage,mcp,reconciliation,snowflake
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Information Technology
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.10
Requires-Dist: fastmcp>=2.0.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: rapidfuzz>=3.0.0
Requires-Dist: sqlalchemy>=2.0.0
Provides-Extra: all
Requires-Dist: pillow>=10.0.0; extra == 'all'
Requires-Dist: pypdf>=3.0.0; extra == 'all'
Requires-Dist: pytesseract>=0.3.10; extra == 'all'
Requires-Dist: snowflake-connector-python>=3.0.0; extra == 'all'
Provides-Extra: dev
Requires-Dist: build>=1.0.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: twine>=4.0.0; extra == 'dev'
Provides-Extra: ocr
Requires-Dist: pillow>=10.0.0; extra == 'ocr'
Requires-Dist: pytesseract>=0.3.10; extra == 'ocr'
Provides-Extra: pdf
Requires-Dist: pypdf>=3.0.0; extra == 'pdf'
Provides-Extra: snowflake
Requires-Dist: snowflake-connector-python>=3.0.0; extra == 'snowflake'
Description-Content-Type: text/markdown

# DataBridge AI

[![PyPI version](https://badge.fury.io/py/databridge-ai.svg)](https://badge.fury.io/py/databridge-ai)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: Proprietary](https://img.shields.io/badge/License-Proprietary-red.svg)](LICENSE)

**DataBridge AI** is a headless, MCP-native data reconciliation engine with **292 tools** for hierarchy management, data quality, and analytics.

---

## ⚠️ CONFIDENTIALITY NOTICE

**BY INSTALLING THIS SOFTWARE, YOU AGREE TO THE FOLLOWING:**

1. This software contains **CONFIDENTIAL AND PROPRIETARY** information
2. You agree to maintain strict confidentiality of all source code, algorithms, and documentation
3. Unauthorized disclosure or distribution is **STRICTLY PROHIBITED**
4. You accept the terms of the [License Agreement](LICENSE)

**If you do not agree to these terms, do not install or use this software.**

---

## Features

- **Data Reconciliation** - Compare and validate data from CSV, SQL, PDF, and JSON sources
- **Hierarchy Builder** - Create and manage multi-level hierarchy projects (up to 15 levels)
- **Wright Module** - Hierarchy-driven data mart generation with a 4-object pipeline
- **Cortex AI Integration** - Snowflake Cortex AI with natural-language-to-SQL support
- **Data Catalog** - Centralized metadata registry with business glossary
- **Data Quality** - Expectation suites and data contracts
- **Lineage Tracking** - Column-level lineage and impact analysis
- **Git/CI-CD** - Automated workflows and GitHub integration
- **dbt Integration** - Generate dbt projects from hierarchies

## Installation

**By installing, you accept the [License Agreement](LICENSE) including confidentiality obligations.**

```bash
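# Basic installation
pip install databridge-ai

# With PDF support (quotes keep shells such as zsh from expanding the brackets)
pip install "databridge-ai[pdf]"

# With OCR support
pip install "databridge-ai[ocr]"

# With Snowflake support
pip install "databridge-ai[snowflake]"

# Full installation
pip install "databridge-ai[all]"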
```
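
After installing, it can be useful to confirm which optional extras are actually importable. The following is a minimal sketch, not part of DataBridge AI: the helper names (`module_available`, `installed_extras`) are hypothetical, and the module names are assumptions based on the usual import names of the distributions listed for each extra (for example, `pillow` is imported as `PIL`).

```python
from importlib.util import find_spec

# Extra name -> modules it is expected to make importable.
# Assumed mapping, derived from the usual import names of the
# distributions behind each extra; not taken from the DataBridge AI source.
EXTRA_MODULES = {
    "pdf": ["pypdf"],
    "ocr": ["PIL", "pytesseract"],
    "snowflake": ["snowflake.connector"],
}


def module_available(name: str) -> bool:
    """Return True if `name` can be located in the current environment."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package
        # (e.g. `snowflake` for `snowflake.connector`) is not installed.
        return False


def installed_extras() -> dict[str, bool]:
    """Report, per extra, whether all of its modules are importable."""
    return {
        extra: all(module_available(mod) for mod in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ok in installed_extras().items():
        print(f"{extra}: {'available' if ok else 'missing'}")
```

An extra reported as missing can be added later by re-running `pip install` with that extra, without reinstalling the base package.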

## Quick Start

### As MCP Server (Claude Desktop)

Add the following entry to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "DataBridge_AI": {
      "command": "python",
      "args": ["-m", "src.server"]
    }
  }
}
```

### Programmatic Usage

```python
from src.server import mcp

# Run as MCP server
mcp.run()
```

### Available Tools (292)

| Category | Count | Examples |
|----------|-------|----------|
| Data Reconciliation | 20+ | `load_csv`, `compare_hashes`, `fuzzy_match_columns` |
| Hierarchy Builder | 44 | `create_hierarchy_project`, `import_hierarchy_csv` |
| Wright (Mart Factory) | 18 | `create_mart_config`, `generate_mart_pipeline` |
| Cortex AI | 22 | `cortex_complete`, `analyst_ask`, `cortex_reason` |
| Data Catalog | 15 | `catalog_create_asset`, `catalog_search` |
| Versioning | 12 | `version_create`, `version_rollback` |
| Lineage | 11 | `track_column_lineage`, `analyze_change_impact` |
| Git/CI-CD | 12 | `git_commit`, `github_create_pr` |
| dbt Integration | 8 | `create_dbt_project`, `generate_dbt_model` |
| Data Quality | 7 | `generate_expectation_suite`, `run_validation` |

## Tool Categories

### Data Reconciliation
- Load and profile data from CSV, JSON, and SQL sources
- Compare datasets with hash-based matching
- Fuzzy matching for deduplication
- PDF text extraction and OCR
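
To illustrate what hash-based matching means in general, here is a minimal standard-library sketch. It is not the implementation behind `compare_hashes`; the function names (`row_hash`, `compare_by_hash`), the choice of SHA-256, and the sample rows are all assumptions made for illustration.

```python
import hashlib


def row_hash(row: dict, columns: list[str]) -> str:
    """Hash the selected columns of a row into a stable hex digest."""
    # A unit-separator character between values keeps ("ab", "c") and
    # ("a", "bc") from producing the same payload.
    payload = "\x1f".join(str(row.get(col, "")) for col in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compare_by_hash(source: list[dict], target: list[dict],
                    key: str, columns: list[str]) -> dict[str, list]:
    """Classify keys as only in source, only in target, or changed."""
    src = {row[key]: row_hash(row, columns) for row in source}
    tgt = {row[key]: row_hash(row, columns) for row in target}
    return {
        "only_in_source": sorted(src.keys() - tgt.keys()),
        "only_in_target": sorted(tgt.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical sample data: row 2 differs, row 3 and row 4 are unmatched.
source = [
    {"id": 1, "amount": "100.00"},
    {"id": 2, "amount": "250.00"},
    {"id": 3, "amount": "75.50"},
]
target = [
    {"id": 1, "amount": "100.00"},
    {"id": 2, "amount": "205.00"},
    {"id": 4, "amount": "10.00"},
]

result = compare_by_hash(source, target, key="id", columns=["amount"])
print(result)
```

Comparing one digest per row avoids a column-by-column comparison, at the cost of only reporting that a row changed rather than which column changed.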

### Hierarchy Builder
- Create multi-level hierarchy projects
- Define source mappings to database columns
- Build calculation formulas (SUM, SUBTRACT, MULTIPLY, DIVIDE)
- Export to CSV/JSON and generate deployment scripts
- Deploy hierarchies to Snowflake
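
The four formula types named above can be pictured as a left-to-right fold over a node's operands. The sketch below is only an assumption about how such formulas might behave, not the Hierarchy Builder's actual engine; `evaluate`, the `OPERATIONS` table, and the sample figures are hypothetical.

```python
import operator
from functools import reduce

# Formula type -> binary operation, applied left to right.
# Assumed semantics for illustration only.
OPERATIONS = {
    "SUM": operator.add,
    "SUBTRACT": operator.sub,
    "MULTIPLY": operator.mul,
    "DIVIDE": operator.truediv,
}


def evaluate(formula_type: str, operands: list[float]) -> float:
    """Fold `operands` with the operation named by `formula_type`."""
    if formula_type not in OPERATIONS:
        raise ValueError(f"Unknown formula type: {formula_type!r}")
    if not operands:
        raise ValueError("A formula needs at least one operand")
    return reduce(OPERATIONS[formula_type], operands)


# Hypothetical leaf values for a small income-statement hierarchy.
values = {"revenue": 1200.0, "cogs": 700.0, "opex": 300.0}

gross_profit = evaluate("SUBTRACT", [values["revenue"], values["cogs"]])
operating_income = evaluate("SUBTRACT", [gross_profit, values["opex"]])
operating_margin = evaluate("DIVIDE", [operating_income, values["revenue"]])

print(gross_profit, operating_income, round(operating_margin, 4))
```

Because each calculated node only depends on the values of other nodes, formulas can be chained: here `operating_income` reuses the result of `gross_profit`.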

### Wright Module (Data Mart Factory)
- 4-object pipeline: VW_1 → DT_2 → DT_3A → DT_3
- 7 configuration variables for parameterization
- AI-powered hierarchy discovery via Cortex
- 5-level formula precedence engine

### Cortex AI Integration
- Snowflake Cortex functions (COMPLETE, SUMMARIZE, SENTIMENT, TRANSLATE)
- Natural language to SQL via semantic models
- Orchestrated reasoning loop (Observe → Plan → Execute → Reflect)

## Configuration

Create a `.env` file or set environment variables:

```env
# Data directory
DATA_DIR=./data

# NestJS backend (optional)
NESTJS_BACKEND_URL=http://localhost:8001
NESTJS_API_KEY=your-api-key

# Snowflake (optional)
SNOWFLAKE_ACCOUNT=your-account
SNOWFLAKE_USER=your-user
SNOWFLAKE_PASSWORD=your-password
```
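
As a rough picture of how these variables might be consumed, the sketch below reads them from the process environment with the standard library only. It is an assumption for illustration, not DataBridge AI's own settings code (the package depends on `pydantic-settings`, which may be what it uses internally). The `Settings` class, `load_settings` function, and the `./data` default are hypothetical, and the sketch does not parse a `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables shown above."""
    data_dir: str = "./data"  # assumed default, mirroring the example value
    nestjs_backend_url: Optional[str] = None
    nestjs_api_key: Optional[str] = None
    snowflake_account: Optional[str] = None
    snowflake_user: Optional[str] = None
    snowflake_password: Optional[str] = None

    @property
    def snowflake_configured(self) -> bool:
        """True only when all three Snowflake values are present."""
        return all([self.snowflake_account, self.snowflake_user,
                    self.snowflake_password])


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping; empty strings count as unset."""
    def get(name: str) -> Optional[str]:
        return env.get(name) or None

    return Settings(
        data_dir=get("DATA_DIR") or "./data",
        nestjs_backend_url=get("NESTJS_BACKEND_URL"),
        nestjs_api_key=get("NESTJS_API_KEY"),
        snowflake_account=get("SNOWFLAKE_ACCOUNT"),
        snowflake_user=get("SNOWFLAKE_USER"),
        snowflake_password=get("SNOWFLAKE_PASSWORD"),
    )


if __name__ == "__main__":
    settings = load_settings()
    print(f"data_dir={settings.data_dir}")
    print(f"snowflake_configured={settings.snowflake_configured}")
```

Passing the environment as a mapping keeps the loader easy to test, and the `snowflake_configured` check makes it explicit that the optional Snowflake settings are only usable as a complete set.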

## License

**Proprietary License** - This software is confidential and proprietary.

See [LICENSE](LICENSE) for the complete terms including:
- Confidentiality obligations
- Usage restrictions
- Non-disclosure requirements

Copyright (c) 2024-2026 DataBridge AI Team. All Rights Reserved.

## Support

For licensing inquiries: support@databridge.ai
