Metadata-Version: 2.4
Name: turboframe
Version: 0.1.0
Summary: Fast parallel DataFrame library built on PyArrow, optimized for Delta Lake and Microsoft Fabric
Author-email: Kushal <kushal.patiloth@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/jealouscornball/turboframe
Project-URL: Issues, https://github.com/jealouscornball/turboframe/issues
Keywords: dataframe,parallel,arrow,delta,fabric,fast
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyarrow>=12.0.0
Requires-Dist: numpy>=1.24.0
Provides-Extra: delta
Requires-Dist: deltalake>=0.15.0; extra == "delta"
Provides-Extra: excel
Requires-Dist: pandas>=2.0.0; extra == "excel"
Requires-Dist: openpyxl>=3.1.0; extra == "excel"
Provides-Extra: sql
Requires-Dist: pandas>=2.0.0; extra == "sql"
Provides-Extra: polars
Requires-Dist: polars>=0.20.0; extra == "polars"
Provides-Extra: all
Requires-Dist: deltalake>=0.15.0; extra == "all"
Requires-Dist: polars>=0.20.0; extra == "all"
Requires-Dist: pandas>=2.0.0; extra == "all"
Requires-Dist: openpyxl>=3.1.0; extra == "all"
Dynamic: license-file
Dynamic: requires-python

# TurboFrame

A fast, parallel DataFrame library built on PyArrow. Optimized for Delta Lake reads and parallel group-by aggregation in Microsoft Fabric notebooks.

## Install

```bash
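# Core (Parquet, CSV, JSON)
pip install turboframe

# With Delta Lake support
# (quotes keep shells such as zsh from expanding the brackets)
pip install "turboframe[delta]"

# Other optional extras: excel, sql, polars
pip install "turboframe[excel]"

# Everything
pip install "turboframe[all]"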
```

## Quick Start

```python
from turboframe import TurboFrame

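# Build from in-memory data
tf = TurboFrame({"region": ["E", "W", "E"], "sales": [100, 200, 150]})
tf = TurboFrame(my_pandas_df)  # an existing pandas DataFrame
tf = TurboFrame(my_spark_df)   # an existing Spark DataFrame

# Or read from a file, table, or query
tf = TurboFrame.read_csv("data.csv")
tf = TurboFrame.read_delta("/lakehouse/default/Tables/sales")  # delta extra
tf = TurboFrame.read_parquet("data.parquet")
tf = TurboFrame.read_excel("report.xlsx")  # excel extra
# connection is an existing database connection (sql extra)
tf = TurboFrame.read_sql("SELECT * FROM sales", connection)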

# Filter + GroupBy + Sort
result = (
    tf.filter("sales > 100")
      .groupby("region")
      .agg({"sales": "sum"})
      .sort("sales_sum", ascending=False)
)
result.show()
```
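
To make the semantics of the chain above concrete, here is the same filter, group, and sort written in plain Python over the sample data from the first constructor call. This is only a reference sketch of what the pipeline computes, not how TurboFrame implements it; the `sales_sum` key is taken from the column name used in the `sort` call above.

```python
from collections import defaultdict

# Same sample data as the first Quick Start constructor call.
data = {"region": ["E", "W", "E"], "sales": [100, 200, 150]}

# filter("sales > 100"): keep rows whose sales value exceeds 100.
rows = [
    (region, sales)
    for region, sales in zip(data["region"], data["sales"])
    if sales > 100
]

# groupby("region").agg({"sales": "sum"}): one running total per region.
totals = defaultdict(int)
for region, sales in rows:
    totals[region] += sales

# sort("sales_sum", ascending=False): largest total first.
result = sorted(
    ({"region": region, "sales_sum": total} for region, total in totals.items()),
    key=lambda row: row["sales_sum"],
    reverse=True,
)

for row in result:
    print(row)
```

With this data, the `E` row with sales of 100 is filtered out, leaving `W` with a total of 200 followed by `E` with a total of 150.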

## Supported Aggregations

Pass any of these names as the aggregation for a column in `.agg()`: `sum`, `mean`, `min`, `max`, `count`, `std`, `var`, `median`, `nunique`.
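
As a rough guide to what each aggregation name means, the sketch below maps the names to standard-library equivalents. `REFERENCE_AGGS` is a hypothetical name used only for illustration. The sample (n - 1) convention for `std` and `var` is an assumption, since this README does not say which convention TurboFrame uses, and null handling is not modelled.

```python
import statistics

# Hypothetical reference mapping for illustration; not TurboFrame code.
# Assumption: std/var use the sample (n - 1) convention.
REFERENCE_AGGS = {
    "sum": sum,
    "mean": statistics.mean,
    "min": min,
    "max": max,
    "count": len,
    "std": statistics.stdev,
    "var": statistics.variance,
    "median": statistics.median,
    "nunique": lambda values: len(set(values)),  # number of distinct values
}

sales = [100, 200, 150, 150]
summary = {name: fn(sales) for name, fn in REFERENCE_AGGS.items()}

for name, value in summary.items():
    print(f"{name}: {value}")
```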

## License

MIT
