Metadata-Version: 2.4
Name: aws-s3-document-connector
Version: 0.1.0
Summary: A highly robust, decoupled library for securely proxying S3 uploads with MySQL metadata.
Author: Vipin
License: MIT
Project-URL: Homepage, https://github.com/yourusername/aws-s3-document-connector
Project-URL: Issues, https://github.com/yourusername/aws-s3-document-connector/issues
Keywords: aws,s3,fastapi,mysql,document-storage
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Intended Audience :: Developers
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: boto3>=1.28.0
Dynamic: license-file

# AWS S3 Document Connector

[![PyPI version](https://badge.fury.io/py/aws-s3-document-connector.svg)](https://badge.fury.io/py/aws-s3-document-connector)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

A decoupled backend Python library that wraps AWS S3 document storage together with MySQL metadata management. It enforces **strict path formatting, input validation, and memory-efficient large-file uploads**, which helps prevent orphaned S3 objects and drift between S3 and the database.

Built for use inside backend services (for example, FastAPI applications) without framework lock-in.

---

## Features
- ✅ **Completely Decoupled**: No dependency on FastAPI, Django, or Pydantic. Use it from any Python module.
- ✅ **Strict Internal Validation**: Enforces a fixed S3 key structure (`tenant/clinic/patient/type/year/month/uuid_file.pdf`).
- ✅ **Chunked Uploads for Gigabyte Files**: Uploads large files in parts through boto3's managed multipart transfer instead of reading them fully into memory.
- ✅ **Automatic Sync Rollbacks**: Removes the uploaded S3 object if the MySQL metadata insert raises an exception.
- ✅ **Custom Metadata Bridging**: Pass custom metadata as a standard `dict`; it is serialized and stored in MySQL.
- ✅ **Security Hardened**: Applies SSE-KMS encryption settings consistently to uploads and handles boto3 `ClientError` exceptions.

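As an illustration of the key layout named in the feature list, here is a minimal sketch of how such a key could be assembled and validated. The function name `build_key`, the allowed character set, and the use of a hex UUID prefix are assumptions made for this example, not the library's actual implementation.

```python
import re
import uuid
from datetime import datetime, timezone

# Hypothetical rule: a segment may contain only letters, digits, "_", "." and "-".
_SEGMENT = re.compile(r"[A-Za-z0-9_.-]+")


def build_key(tenant, clinic_id, patient_id, doc_type, file_name, now=None):
    """Return a key shaped like tenant/clinic/patient/type/year/month/uuid_file."""
    now = now or datetime.now(timezone.utc)
    segments = [str(tenant), str(clinic_id), str(patient_id), str(doc_type)]
    for segment in segments + [file_name]:
        # Reject empty values, slashes, and relative-path segments.
        if not _SEGMENT.fullmatch(segment) or segment in {".", ".."}:
            raise ValueError(f"invalid key segment: {segment!r}")
    # Year, zero-padded month, then a unique prefix on the original file name.
    segments += [f"{now:%Y}", f"{now:%m}", f"{uuid.uuid4().hex}_{file_name}"]
    return "/".join(segments)


print(build_key("acme_tenant", 101, 7777, "IDENTIFICATION", "Patient_ID_Scan.pdf"))
```
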
---

## Installation

You can install this package globally or in your virtual environment via pip:

```bash
pip install aws-s3-document-connector
```

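*(`boto3` is the only declared dependency and is installed automatically. The MySQL driver used in the Quickstart, such as `mysql-connector-python`, must be installed separately.)*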

---

## Quickstart Guide

### 1. Configure the system
Configuration is read entirely from environment variables:

```bash
export S3_BUCKET="my-production-bucket"
export AWS_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="XXXXX"
export AWS_SECRET_ACCESS_KEY="XXXXX"
# Optional overrides:
export S3_KMS_KEY="your-kms-identifier"
export PRESIGNED_EXPIRY="600"
```
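
To make the environment-driven setup concrete, the following sketch shows one way such variables could be read and checked in plain Python. `EnvSettings` and `load_settings` are hypothetical names for illustration and are not the library's `Config` class. Treating `S3_BUCKET` and `AWS_REGION` as required, and 600 as the fallback expiry, are assumptions based on the example values above. The AWS credential variables are left out because boto3 reads them from the environment itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvSettings:  # hypothetical stand-in, not the library's Config
    bucket: str
    region: str
    kms_key: Optional[str]
    presigned_expiry: int


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read settings from a mapping of environment variables."""
    # Assumption: these two variables are mandatory.
    missing = [name for name in ("S3_BUCKET", "AWS_REGION") if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return EnvSettings(
        bucket=env["S3_BUCKET"],
        region=env["AWS_REGION"],
        kms_key=env.get("S3_KMS_KEY") or None,
        # Assumption: 600 mirrors the example value; the real default may differ.
        presigned_expiry=int(env.get("PRESIGNED_EXPIRY", "600")),
    )


settings = load_settings({"S3_BUCKET": "my-production-bucket", "AWS_REGION": "us-east-1"})
print(settings)
```

Passing the mapping explicitly, as in the last lines, keeps the loader easy to test without touching the real process environment.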

### 2. Connect Your MySQL
This library delegates connection management to you. Provide any active MySQL connection that supports `.cursor()` and `.commit()`.

```python
import mysql.connector
from aws_s3_connector.config import Config
from aws_s3_connector.s3_client import S3Client
from aws_s3_connector.repository import DocumentRepository
from aws_s3_connector.storage import DocumentStorage

config = Config()

db_connection = mysql.connector.connect(
    host="your-aws-rds-endpoint",
    user="root",
    password="password",
    database="rcm_database"
)

# Instantiate decoupled modules
storage = DocumentStorage(
    s3_client=S3Client(config),
    repository=DocumentRepository(db_connection),
    config=config
)
```

### 3. Safely Upload a Document

Instead of assembling S3 paths by hand, build a typed request:

```python
from aws_s3_connector.models import UploadRequest

# 1. Structure the payload
request = UploadRequest(
    tenant_name="acme_tenant",
    clinic_id=101,
    patient_id=7777,
    document_type="IDENTIFICATION",
    document_format="pdf",
    file_name="Patient_ID_Scan.pdf",
    content_type="application/pdf",
    custom_metadata={"uploaded_by_user": "Dr. Smith"}  # Stored as JSON in the database
)

# 2. Upload with automatic DB sync and rollback
with open("Patient_ID_Scan.pdf", "rb") as file_stream:
    result = storage.upload_file(file_stream, request)

print(f"Success! Uploaded uniquely to S3 at: {result.key}")
print(f"Integrity Check: {result.etag}")
```
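
The rollback behaviour can be pictured as a compensating delete: write the object, attempt the metadata insert, and remove the object again if the insert fails. The sketch below shows that pattern with an in-memory stand-in for S3 so it runs without AWS access. `FakeObjectStore`, `upload_with_rollback`, and `failing_insert` are hypothetical names, and the library's internals may differ.

```python
class FakeObjectStore:
    """In-memory stand-in for S3 used only for this illustration."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def delete(self, key):
        self.objects.pop(key, None)


def upload_with_rollback(store, save_metadata, key, data):
    """Write the object, then its metadata; undo the write if metadata fails."""
    store.put(key, data)
    try:
        save_metadata(key)
    except Exception:
        # Compensating action: delete the object so no orphan is left behind.
        store.delete(key)
        raise
    return key


def failing_insert(key):
    # Simulates a database error during the metadata insert.
    raise RuntimeError("simulated database error")


store = FakeObjectStore()
try:
    upload_with_rollback(store, failing_insert, "acme_tenant/doc.pdf", b"%PDF")
except RuntimeError:
    pass
print(store.objects)  # empty: the object was removed after the failed insert
```

Re-raising after the delete keeps the original database error visible to the caller, so a failed upload is never reported as a success.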

### 4. Search Valid Active Documents

Listing queries the MySQL metadata table instead of scanning S3, so lookups avoid S3 list requests.

```python
# Fetch metadata for all non-deleted documents belonging to the patient
documents = storage.list_documents(tenant_name="acme_tenant", patient_id=7777)

for doc in documents:
    print(doc["FILE_NAME_ORIGINAL"])
```
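
For readers curious what a "non-deleted documents for a patient" lookup might look like at the SQL level, here is a rough sketch using the standard-library `sqlite3` module as a stand-in for MySQL. Only the `FILE_NAME_ORIGINAL` column appears in this README. The table name `DOCUMENTS`, the other column names, the `IS_DELETED` soft-delete flag, and the helper `list_active` are assumptions for illustration. With `mysql.connector` the placeholder would be `%s` rather than `?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be converted to dicts

# Hypothetical schema; only FILE_NAME_ORIGINAL is taken from the README.
conn.execute(
    "CREATE TABLE DOCUMENTS ("
    "TENANT_NAME TEXT, PATIENT_ID INTEGER, "
    "FILE_NAME_ORIGINAL TEXT, IS_DELETED INTEGER)"
)
conn.executemany(
    "INSERT INTO DOCUMENTS VALUES (?, ?, ?, ?)",
    [
        ("acme_tenant", 7777, "Patient_ID_Scan.pdf", 0),
        ("acme_tenant", 7777, "Old_Referral.pdf", 1),  # soft-deleted
        ("other_tenant", 7777, "Unrelated.pdf", 0),  # different tenant
    ],
)
conn.commit()


def list_active(connection, tenant_name, patient_id):
    """Return non-deleted document rows for one tenant and patient."""
    cursor = connection.cursor()
    # Parameterized query: values are never interpolated into the SQL string.
    cursor.execute(
        "SELECT FILE_NAME_ORIGINAL FROM DOCUMENTS "
        "WHERE TENANT_NAME = ? AND PATIENT_ID = ? AND IS_DELETED = 0",
        (tenant_name, patient_id),
    )
    return [dict(row) for row in cursor.fetchall()]


for doc in list_active(conn, "acme_tenant", 7777):
    print(doc["FILE_NAME_ORIGINAL"])
```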

---

## Contributing
Issues and pull requests are welcome, especially for logging patterns, caching optimizations, and additional metadata bridges.

## License
Provided under the MIT License.
