Metadata-Version: 2.4
Name: QuerySUTRA
Version: 0.4.1
Summary: SUTRA: AI-powered data analysis with automatic MySQL export
Author: Aditya Batta
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.3.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: openai>=1.0.0
Requires-Dist: plotly>=5.0.0
Requires-Dist: matplotlib>=3.3.0
Requires-Dist: PyPDF2>=3.0.0
Requires-Dist: python-docx>=0.8.11
Requires-Dist: openpyxl>=3.0.0
Provides-Extra: mysql
Requires-Dist: sqlalchemy>=1.4.0; extra == "mysql"
Requires-Dist: mysql-connector-python>=8.0.0; extra == "mysql"
Provides-Extra: postgres
Requires-Dist: sqlalchemy>=1.4.0; extra == "postgres"
Requires-Dist: psycopg2-binary>=2.9.0; extra == "postgres"
Provides-Extra: embeddings
Requires-Dist: sentence-transformers>=2.0.0; extra == "embeddings"
Provides-Extra: all
Requires-Dist: sqlalchemy>=1.4.0; extra == "all"
Requires-Dist: mysql-connector-python>=8.0.0; extra == "all"
Requires-Dist: psycopg2-binary>=2.9.0; extra == "all"
Requires-Dist: sentence-transformers>=2.0.0; extra == "all"
Dynamic: license-file
Dynamic: requires-python

# QuerySUTRA

**SUTRA: Structured-Unstructured-Text-Retrieval-Architecture**

AI-powered data analysis library. Upload PDFs, query with natural language, export to MySQL automatically.

## Installation

```bash
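pip install QuerySUTRA
pip install "QuerySUTRA[mysql]"     # For MySQL export
pip install "QuerySUTRA[postgres]"  # For PostgreSQL support
pip install "QuerySUTRA[all]"       # All optional dependencies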
```

## Quick Start

```python
from sutra import SUTRA

# Create a client with your OpenAI API key
sutra = SUTRA(api_key="your-openai-key")

# Upload a PDF and auto-export to MySQL in one step
sutra.upload("data.pdf", auto_export_mysql={
    'host': 'localhost',
    'user': 'root',
    'password': 'your_password',
    'database': 'my_database'  # Created automatically if it does not exist
})

# Query immediately
result = sutra.ask("Show me all people")
print(result.data)
```
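
Hardcoding a database password in a script makes it easy to leak. The sketch below builds the same dictionary shape that `auto_export_mysql` takes, reading the values from environment variables instead. The `MYSQL_*` variable names and the `mysql_config_from_env` helper are hypothetical and not part of QuerySUTRA.

```python
import os


def mysql_config_from_env(env=None):
    """Build a MySQL settings dict from environment variables.

    The MYSQL_* variable names are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    password = env.get("MYSQL_PASSWORD")
    if password is None:
        # Fail early instead of silently using an empty password.
        raise KeyError("MYSQL_PASSWORD is not set")
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "root"),
        "password": password,
        "database": env.get("MYSQL_DATABASE", "my_database"),
    }


# Demonstrated with an explicit mapping instead of the real environment.
config = mysql_config_from_env({"MYSQL_PASSWORD": "s3cret"})
print(config["host"], config["database"])
```

The resulting dictionary could then be passed as `sutra.upload("data.pdf", auto_export_mysql=config)`.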

## Features

**1. Automatic MySQL Export**

The target database is created automatically if it does not already exist, so the export does not fail on a missing database.

```python
# Upload and export to MySQL automatically
sutra.upload("data.pdf", auto_export_mysql={
    'host': 'localhost',
    'user': 'root',
    'password': 'your_password',
    'database': 'my_new_database'  # Creates automatically
})
```
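
This README does not describe how QuerySUTRA creates the database internally. One common way to get this behaviour in MySQL is to issue a `CREATE DATABASE IF NOT EXISTS` statement before exporting. The hypothetical helper below only builds that statement, validating the name first because identifiers cannot be passed as bound parameters.

```python
import re

# Conservative whitelist (an assumption for this sketch): letters, digits
# and underscores. MySQL itself allows more characters in quoted names.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def create_database_sql(name):
    """Return an idempotent CREATE DATABASE statement for a validated name."""
    if not _VALID_NAME.fullmatch(name):
        raise ValueError(f"unsafe database name: {name!r}")
    return f"CREATE DATABASE IF NOT EXISTS `{name}`"


print(create_database_sql("my_new_database"))
```

Because of `IF NOT EXISTS`, running the statement against a server that already has the database is a no-op rather than an error.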

**2. Complete Data Extraction**

Processes the entire PDF in chunks, so every record is extracted (for example, all employees in a document, not just the first 10).

```python
sutra.upload("large_document.pdf")  # Extracts all 50+ employees
sutra.tables()  # Shows all extracted tables
```
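
To illustrate the general idea of chunked processing, here is a minimal sketch that splits long text into overlapping windows so that a record straddling a boundary still appears whole in at least one chunk. It is not QuerySUTRA's actual implementation; the `chunk_text` name, the chunk size, and the overlap are all assumptions.

```python
def chunk_text(text, chunk_size=2000, overlap=200):
    """Split text into windows of chunk_size characters that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


document = "x" * 5000  # stand-in for text extracted from a PDF
print(len(chunk_text(document)))
```

Each chunk could then be sent for extraction separately and the results merged, which is what lets a long document be processed in full.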

**3. Natural Language Queries**

```python
result = sutra.ask("Show all people from California")
result = sutra.ask("Who has Python skills?", table="skills")
result = sutra.ask("Count employees by state", viz="pie")
```

**4. Custom Visualizations**

```python
result = sutra.ask("Sales by region", viz="pie")
result = sutra.ask("Trends", viz="line")
result = sutra.ask("Compare", viz="bar")
result = sutra.ask("Data", viz="scatter")
```

**5. Load Existing Databases**

```python
# Load SQLite
sutra = SUTRA.load_from_db("data.db", api_key="key")

# Connect to MySQL
sutra = SUTRA.connect_mysql("localhost", "root", "pass", "database")

# Connect to PostgreSQL
sutra = SUTRA.connect_postgres("localhost", "postgres", "pass", "database")
```

**6. Smart Features (Optional)**

```python
sutra = SUTRA(
    api_key="your-key",
    use_embeddings=True,    # Cache similar queries (saves API calls)
    fuzzy_match=True,       # "New York City" matches "New York"
    check_relevance=True,   # Detect irrelevant queries
    cache_queries=True      # Cache exact queries
)
```
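
As a rough illustration of what fuzzy matching means here, the sketch below uses Python's standard `difflib` module to map a user-supplied value to the closest known value. This README does not say how `fuzzy_match` is implemented, so the `best_match` helper and the `0.6` cutoff are assumptions.

```python
import difflib


def best_match(query, candidates, cutoff=0.6):
    """Return the candidate most similar to query, or None below cutoff."""
    # Compare case-insensitively but return the original spelling.
    lowered = {c.lower(): c for c in candidates}
    hits = difflib.get_close_matches(
        query.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None


cities = ["New York", "Los Angeles", "Chicago"]
print(best_match("New York City", cities))
```

Raising the cutoff makes matching stricter; lowering it tolerates more variation at the cost of more false matches.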

**7. Direct SQL (Free, No API Call)**

```python
result = sutra.sql("SELECT * FROM people WHERE state='CA'")
print(result.data)
```
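
To make the query above concrete without QuerySUTRA or an API key, the sketch below runs the same `SELECT` against an in-memory SQLite table using Python's standard `sqlite3` module. The `people` rows are made-up sample data. The `?` placeholder is a general habit for values that come from user input; whether `sutra.sql()` accepts parameters is not stated in this README.

```python
import sqlite3

# Build a tiny in-memory table with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ana", "CA"), ("Ben", "NY"), ("Cleo", "CA")],
)

# Bind the state as a parameter instead of formatting it into the string.
rows = conn.execute(
    "SELECT name FROM people WHERE state = ? ORDER BY name", ("CA",)
).fetchall()
conn.close()

print(rows)
```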

## Complete Workflow

**In Colab:**
```python
from sutra import SUTRA

sutra = SUTRA(api_key="your-key")
sutra.upload("employee_data.pdf")
sutra.tables()  # See extracted tables

# Export and download
sutra.export_db("data.db", format="sqlite")
from google.colab import files
files.download("data.db")
```

**On Windows:**
```python
from sutra import SUTRA

# Load downloaded database
sutra = SUTRA.load_from_db("data.db", api_key="your-key")

# Export to MySQL (auto-creates database)
sutra.save_to_mysql("localhost", "root", "password", "my_database")

# Verify in MySQL
sutra_mysql = SUTRA.connect_mysql("localhost", "root", "password", "my_database")
sutra_mysql.tables()
```

## Export Options

```python
# SQLite
sutra.export_db("backup.db", format="sqlite")

# SQL dump
sutra.export_db("schema.sql", format="sql")

# JSON
sutra.export_db("data.json", format="json")

# Excel
sutra.export_db("data.xlsx", format="excel")

# MySQL (auto-creates database)
sutra.save_to_mysql("localhost", "root", "pass", "new_db")

# PostgreSQL
sutra.save_to_postgres("localhost", "postgres", "pass", "new_db")
```
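
After an export it can be useful to check the file independently of QuerySUTRA. The sketch below assumes that `format="sqlite"` produces a standard SQLite file and lists its tables with Python's `sqlite3` module. The `list_tables` helper is hypothetical, and the temporary database only stands in for a real export.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Stand-in for a file produced by export_db(..., format="sqlite").
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "backup.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE people (name TEXT)")
    conn.execute("CREATE TABLE skills (skill TEXT)")
    conn.commit()
    conn.close()
    tables = list_tables(path)

print(tables)
```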

## API Reference

**Initialize**
```python
SUTRA(api_key, db, use_embeddings, check_relevance, fuzzy_match, cache_queries)
```

**Class Methods**
- `load_from_db(path, api_key)` - Load SQLite
- `connect_mysql(host, user, password, database)` - Connect MySQL
- `connect_postgres(host, user, password, database)` - Connect PostgreSQL

**Instance Methods**
- `upload(data, name, auto_export_mysql)` - Upload with optional auto-export
- `ask(question, viz, table)` - Natural language query
- `sql(query, viz)` - Direct SQL
- `tables()` - List tables
- `schema(table)` - Show schema
- `peek(table, n)` - Preview data
- `export_db(path, format)` - Export database
- `save_to_mysql(host, user, password, database)` - Export to MySQL (auto-creates DB)
- `save_to_postgres(...)` - Export to PostgreSQL
- `backup(path)` - Backup
- `close()` - Close
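
Since the API exposes `close()`, one way to make sure it always runs is `contextlib.closing`, which works with any object that has a `close()` method. This README does not say whether `SUTRA` itself supports the `with` statement, so the sketch uses a hypothetical `FakeSutra` stand-in rather than the real class.

```python
from contextlib import closing


class FakeSutra:
    """Hypothetical stand-in exposing only tables() and close()."""

    def __init__(self):
        self.closed = False

    def tables(self):
        return ["people", "skills"]

    def close(self):
        self.closed = True


session = FakeSutra()
# closing() calls session.close() on exit, even if the body raises.
with closing(session) as s:
    names = s.tables()

print(names, session.closed)
```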

## Troubleshooting

**MySQL database doesn't exist**
- Fixed in v0.4.0: the database is now created automatically
- You no longer need to create it manually

**Only 10 employees extracted from 50-employee PDF**
- Fixed in v0.4.0: the entire PDF is now processed in chunks
- Upgrade: `pip install --upgrade QuerySUTRA`

**`connect_mysql()` not found**
- Update: `pip install --upgrade QuerySUTRA`
- Install MySQL support: `pip install "QuerySUTRA[mysql]"`
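
When diagnosing this kind of problem, it helps to confirm which version is installed and whether the optional MySQL dependencies can be imported. The sketch below uses only the standard library. The `has_module` and `installed_version` helpers are hypothetical; `sqlalchemy` and `mysql.connector` are the import names of the packages listed under the `mysql` extra.

```python
import importlib.util
from importlib import metadata


def has_module(name):
    """Return True if the module can be located, without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "mysql") is missing.
        return False


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("QuerySUTRA version:", installed_version("QuerySUTRA"))
print("sqlalchemy available:", has_module("sqlalchemy"))
print("mysql.connector available:", has_module("mysql.connector"))
```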

## Supported Formats

CSV, Excel, JSON, SQL, PDF, Word, Text, Pandas DataFrame

## Requirements

- Python 3.8+
- OpenAI API key
- MySQL/PostgreSQL (optional)

## License

MIT License

## Changelog

**v0.4.0**
- Auto-creates the MySQL database if it does not exist (no more missing-database errors)
- Complete PDF extraction (all pages, all employees)
- Chunk processing for large documents
- One-line auto-export to MySQL
- Simplified everything

**v0.3.x**
- MySQL/PostgreSQL connectivity
- Embeddings caching
- Fuzzy matching
- Custom visualizations

---

**Made by Aditya Batta**
