Metadata-Version: 2.4
Name: dremioframe
Version: 0.2.1
Summary: A dataframe-like library for Dremio Cloud & Dremio Software
Author-email: Alex Merced <alexmerced@alexmerced.com>
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
License-File: LICENSE
Requires-Dist: pyarrow>=14.0.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: requests>=2.31.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: polars>=0.20.0
Requires-Dist: matplotlib
Project-URL: Homepage, https://github.com/developer-advocacy-dremio/dremio-cloud-dremioframe
Project-URL: Issues, https://github.com/developer-advocacy-dremio/dremio-cloud-dremioframe/issues

# DremioFrame

DremioFrame is a Python library that provides an Ibis-like dataframe builder interface for Dremio Cloud and Dremio Software. It lets you browse the catalog, query data, perform CRUD operations, and administer Dremio resources through a familiar API.

## Documentation

- [Architecture](architecture.md)
- [Connection Guide](docs/connection.md)
- [Administration](docs/admin.md)
- [Catalog & Admin](docs/catalog.md)
- [Dataframe Builder](docs/builder.md)
- [Aggregation](docs/aggregation.md)
- [Sorting & Distinct](docs/sorting.md)
- [Joins](docs/joins.md)
- [Iceberg Features](docs/iceberg.md)
- [Advanced Features](docs/advanced.md)
- [Charting](docs/charting.md)
- [Data Export](docs/export.md)
- [API Ingestion](docs/ingestion.md)
- [Ingestion Patterns](docs/ingestion_patterns.md)
- [Working with Files](docs/files.md)
- [SQL Functions](docs/functions.md)
    - [Aggregate](docs/functions/aggregate.md)
    - [Math](docs/functions/math.md)
    - [String](docs/functions/string.md)
    - [Date](docs/functions/date.md)
    - [Window](docs/functions/window.md)
    - [Conditional](docs/functions/conditional.md)
    - [AI](docs/functions/ai.md)

## Installation

```bash
pip install dremioframe
```

## Quick Start

### Dremio Cloud

```python
from dremioframe.client import DremioClient

# Assumes DREMIO_PAT and DREMIO_PROJECT_ID are set in env
client = DremioClient()

# Query a table
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").limit(5).collect()
print(df)
```
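The snippet above relies on `DREMIO_PAT` and `DREMIO_PROJECT_ID` being present in the environment. A minimal sketch of a pre-flight check, using only the standard library, might look like the following. The helper name `missing_dremio_env` is hypothetical and not part of DremioFrame.

```python
import os

# Variable names taken from the Quick Start comment above.
REQUIRED_VARS = ("DREMIO_PAT", "DREMIO_PROJECT_ID")


def missing_dremio_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_dremio_env()
    if missing:
        print("Set these before creating DremioClient():", ", ".join(missing))
    else:
        print("Dremio Cloud credentials found in the environment.")
```

Because `python-dotenv` is listed as a dependency, calling its `load_dotenv()` before this check should let a local `.env` file supply the values; whether `DremioClient` does that on its own is not stated here.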

### Dremio Software

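```python
from dremioframe.client import DremioClient

# Example values for a local instance; substitute your own host and credentials
client = DremioClient(
    hostname="localhost",
    port=32010,
    username="admin",
    password="password123",
    tls=False
)
```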
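To keep credentials out of source code, the connection arguments can be assembled from environment variables. This is a sketch under assumptions: the `DREMIO_SOFTWARE_*` variable names and the helper `software_connection_kwargs` are invented for illustration, and only the keyword names (`hostname`, `port`, `username`, `password`, `tls`) come from the example above.

```python
import os


def software_connection_kwargs(env=None):
    """Build keyword arguments for DremioClient from environment variables.

    The DREMIO_SOFTWARE_* names are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    return {
        "hostname": env.get("DREMIO_SOFTWARE_HOST", "localhost"),
        "port": int(env.get("DREMIO_SOFTWARE_PORT", "32010")),
        # No defaults for credentials: a missing value raises KeyError.
        "username": env["DREMIO_SOFTWARE_USER"],
        "password": env["DREMIO_SOFTWARE_PASSWORD"],
        "tls": env.get("DREMIO_SOFTWARE_TLS", "false").lower() == "true",
    }


# Usage (requires dremioframe to be installed):
#   from dremioframe.client import DremioClient
#   client = DremioClient(**software_connection_kwargs())
```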

## Features

```python
from dremioframe.client import DremioClient

client = DremioClient(pat="YOUR_PAT", project_id="YOUR_PROJECT_ID")

# List catalog
print(client.catalog.list_catalog())

# Query data (keep the builder in `df` so later steps can extend it; collect() runs the query)
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").filter("state = 'MA'")
print(df.collect())

# Calculated Columns
df.mutate(total_pop="pop * 2").show()

# Aggregation
df.group_by("state").agg(avg_pop="AVG(pop)").show()

# Joins
df.join("other_table", on="left_tbl.id = right_tbl.id").show()

# Iceberg Time Travel
df.at_snapshot("123456789").show()

# API Ingestion
client.ingest_api(
    url="https://api.example.com/users",
    table_name="users",
    mode="merge",
    pk="id"
)

# Charting
df.chart(kind="bar", x="category", y="sales", save_to="sales.png")

# Export
df.to_csv("data.csv")
df.to_parquet("data.parquet")

# Insert Data (Batched)
import pandas as pd
data = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
client.table("my_table").insert("my_table", data=data, batch_size=1000)

# SQL Functions
from dremioframe import F

(
    client.table("sales")
    .select(
        F.col("dept"),
        F.sum("amount").alias("total_sales"),
        F.rank().over(F.Window.order_by("amount")).alias("rank"),
    )
    .show()
)

# Merge (Upsert)
client.table("target").merge(
    target_table="target",
    on="id",
    matched_update={"name": "source.name"},
    not_matched_insert={"id": "source.id", "val": "source.val"},
    data=data
)

# Data Quality
df.quality.expect_not_null("city")
df.quality.expect_row_count("pop > 1000000", 5, "ge") # Expect at least 5 cities with pop > 1M
```
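The inline comment on `expect_row_count` reads as "count the rows matching a condition, then compare that count with a threshold". The sketch below illustrates that reading in plain Python on in-memory rows. It is not DremioFrame's implementation: in the library the condition is a SQL string, and every name here (`row_count_meets`, `COMPARATORS`, the sample `cities`) is hypothetical.

```python
import operator

# Comparison codes; only "ge" appears in the example above, the rest are assumed.
COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def row_count_meets(rows, predicate, expected, op="eq"):
    """Count rows matching predicate and compare the count with expected."""
    matching = sum(1 for row in rows if predicate(row))
    return COMPARATORS[op](matching, expected)


cities = [
    {"city": "A", "pop": 2_500_000},
    {"city": "B", "pop": 1_200_000},
    {"city": "C", "pop": 300_000},
]

# Two cities exceed one million, so "at least 2" holds and "at least 5" does not.
print(row_count_meets(cities, lambda r: r["pop"] > 1_000_000, 2, "ge"))  # True
print(row_count_meets(cities, lambda r: r["pop"] > 1_000_000, 5, "ge"))  # False
```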

