Metadata-Version: 2.3
Name: pyulysses
Version: 0.9.0
Summary: Library to manage connection to Dremio via Apache Arrow Flight
Author: Wallace Camargo
Author-email: wallace.graca@example.com
Requires-Python: >=3.11,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: certifi (>=2024.8.30,<2025.0.0)
Requires-Dist: duckdb (>=1.1.3,<2.0.0)
Requires-Dist: pyarrow (>=18.1.0,<19.0.0)
Requires-Dist: python-dotenv (>=1.0.0,<2.0.0)
Description-Content-Type: text/markdown

# PyUlysses

**PyUlysses** is a Python library for connecting to Dremio DataHub over Apache Arrow Flight. It provides an interface for executing SQL queries and managing data operations, and it integrates with DuckDB for working with query results.

[![Python Version](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-Apache%202.0-green.svg)](LICENSE)

## Architecture

```
┌─────────────────┐
│  Your Python    │
│  Application    │
└────────┬────────┘
         │
         │ PyUlysses
         ▼
┌─────────────────┐       Arrow Flight       ┌─────────────────┐
│   PyArrow       │◄────────────────────────►│  Dremio Server  │
│   Client        │                          │   (DataHub)     │
└────────┬────────┘                          └─────────────────┘
         │
         │ Zero-copy transfer
         ▼
┌─────────────────┐
│    DuckDB       │
│   Analytics     │
└─────────────────┘
```

## Features

✨ **Key Features:**
- High-performance Arrow Flight connectivity to Dremio
- Native DuckDB integration for query results
- Automatic retry logic with configurable timeouts
- Robust error handling and logging
- Secure authentication via PAT tokens
- Zero-copy data transfer with Apache Arrow
- Direct integration with analytical workflows
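
This README does not describe how the retry logic is implemented or configured. As a generic illustration of the idea only, retrying a transient failure with exponential backoff can be sketched as below. The helper `call_with_retry`, its parameters, and the `flaky_query` stand-in are hypothetical names for this example and are not part of the PyUlysses API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: call `operation`, retrying transient errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the wait after each failed attempt
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Stand-in for a query that fails twice before succeeding
attempts = []


def flaky_query() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping
delays = []
outcome = call_with_retry(flaky_query, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch easy to test, since the waits can be recorded rather than performed.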

## Installation

### Virtual Environment (Recommended)

While optional, we **strongly recommend** using a virtual environment to isolate PyUlysses dependencies from your system Python installation.

#### Linux/macOS (Cortex)

```bash
# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

# Install the library
pip install pyulysses
```

#### Windows

```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment (Command Prompt)
venv\Scripts\activate.bat

# Or PowerShell
venv\Scripts\Activate.ps1

# Install the library
pip install pyulysses
```

### Install dependencies

#### Using pip

```bash
pip install -r requirements.txt
```

#### Using Poetry

```bash
poetry install --all-extras
```

## Quick Start

### 1. Configure Environment Variables

Create a `.env` file in your project root:

```bash
DREMIO_USERNAME=your_username
DREMIO_ACCESS_TOKEN=your_personal_access_token
DREMIO_HOST=dremio.example.com
DREMIO_PORT=9047
```
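
Whether `Client()` validates these variables itself is not stated here. A pre-flight check that fails early with a readable message could be sketched as follows. `DremioSettings` and `load_settings` are hypothetical names for this example, and the sketch assumes the variables are already in the process environment (for instance after calling `load_dotenv()` from python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_KEYS = (
    "DREMIO_USERNAME",
    "DREMIO_ACCESS_TOKEN",
    "DREMIO_HOST",
    "DREMIO_PORT",
)


@dataclass(frozen=True)
class DremioSettings:
    """Hypothetical container for the four variables shown above."""

    username: str
    access_token: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> DremioSettings:
    """Read the settings from `env`, naming any variable that is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return DremioSettings(
        username=env["DREMIO_USERNAME"],
        access_token=env["DREMIO_ACCESS_TOKEN"],
        host=env["DREMIO_HOST"],
        port=int(env["DREMIO_PORT"]),  # raises ValueError if not numeric
    )


# Example with an explicit mapping instead of the real environment
example = load_settings(
    {
        "DREMIO_USERNAME": "your_username",
        "DREMIO_ACCESS_TOKEN": "your_personal_access_token",
        "DREMIO_HOST": "dremio.example.com",
        "DREMIO_PORT": "9047",
    }
)
```

Avoid printing the settings object in logs, since its default representation includes the access token.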

### 2. Basic Usage

```python
from pyulysses.connector.datahub_connector import Client

# Initialize client
client = Client()

# Execute query
result = client.query("SELECT * FROM your_table")

# Display results
print(result)
```

## Documentation

Full documentation is available in the `docs/` directory. To build and serve it locally:

```bash
mkdocs serve
```

Then visit `http://localhost:8000`.


## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

## Authors

- Wallace Camargo - *Initial work*

## Support

For issues, questions, or contributions, please open an issue on GitLab.

