Metadata-Version: 2.4
Name: parquetframe
Version: 2.0.0a7
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Environment :: Console
Classifier: Typing :: Typed
Requires-Dist: pandas>=2.0.0
Requires-Dist: dask[dataframe]>=2024.1.0
Requires-Dist: pyarrow>=10.0.0
Requires-Dist: polars>=0.19.0 ; extra == 'phase2'
Requires-Dist: fastavro>=1.8.0 ; extra == 'phase2'
Requires-Dist: polars>=0.19.0 ; extra == 'engines'
Requires-Dist: pytest>=7.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=4.0 ; extra == 'dev'
Requires-Dist: pytest-mock>=3.10 ; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0 ; extra == 'dev'
Requires-Dist: ruff>=0.1.0 ; extra == 'dev'
Requires-Dist: black>=23.0 ; extra == 'dev'
Requires-Dist: pre-commit>=3.0 ; extra == 'dev'
Requires-Dist: mypy>=1.0 ; extra == 'dev'
Requires-Dist: tox>=4.0 ; extra == 'dev'
Requires-Dist: markdown>=3.0 ; extra == 'dev'
Requires-Dist: mkdocs>=1.5.0 ; extra == 'docs'
Requires-Dist: mkdocs-material>=9.0.0 ; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.24.0 ; extra == 'docs'
Requires-Dist: pytest>=7.0 ; extra == 'test'
Requires-Dist: pytest-cov>=4.0 ; extra == 'test'
Requires-Dist: pytest-mock>=3.10 ; extra == 'test'
Requires-Dist: pytest-asyncio>=0.23.0 ; extra == 'test'
Requires-Dist: fastavro>=1.8.0 ; extra == 'test'
Requires-Dist: s3fs>=2023.1.0 ; extra == 'test'
Requires-Dist: click>=8.0 ; extra == 'cli'
Requires-Dist: rich>=13.0 ; extra == 'cli'
Requires-Dist: psutil>=5.8.0 ; extra == 'cli'
Requires-Dist: pyyaml>=6.0 ; extra == 'cli'
Requires-Dist: duckdb>=0.9.0 ; extra == 'cli'
Requires-Dist: polars>=1.33.1 ; extra == 'cli'
Requires-Dist: duckdb>=0.9.0 ; extra == 'sql'
Requires-Dist: sqlalchemy>=2.0.0 ; extra == 'db'
Requires-Dist: bioframe>=0.4.0 ; extra == 'bio'
Requires-Dist: ollama>=0.1.7 ; extra == 'ai'
Requires-Dist: prompt-toolkit>=3.0.0 ; extra == 'ai'
Requires-Dist: datafusion>=33.0.0 ; extra == 'ai'
Requires-Dist: polars>=0.19.0 ; extra == 'all'
Requires-Dist: fastavro>=1.8.0 ; extra == 'all'
Requires-Dist: click>=8.0 ; extra == 'all'
Requires-Dist: rich>=13.0 ; extra == 'all'
Requires-Dist: psutil>=5.8.0 ; extra == 'all'
Requires-Dist: pyyaml>=6.0 ; extra == 'all'
Requires-Dist: duckdb>=0.9.0 ; extra == 'all'
Requires-Dist: sqlalchemy>=2.0.0 ; extra == 'all'
Requires-Dist: bioframe>=0.4.0 ; extra == 'all'
Requires-Dist: ollama>=0.1.7 ; extra == 'all'
Requires-Dist: prompt-toolkit>=3.0.0 ; extra == 'all'
Provides-Extra: phase2
Provides-Extra: engines
Provides-Extra: dev
Provides-Extra: docs
Provides-Extra: test
Provides-Extra: cli
Provides-Extra: sql
Provides-Extra: db
Provides-Extra: bio
Provides-Extra: ai
Provides-Extra: rust
Provides-Extra: all
License-File: LICENSE
Summary: A universal data processing framework with multi-engine support (pandas, Polars, Dask) and multi-format I/O (CSV, JSON, Parquet, ORC, Avro) with intelligent backend selection
Keywords: pandas,polars,dask,parquet,csv,json,orc,avro,dataframe,big-data,multi-engine,sql,duckdb,bioframe,genomics,cli,multi-format,data-science,analytics,bioinformatics,file-format,entity-framework
Author-email: Christopher Murray <lee.christopher.murray@gmail.com>
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://leechristophermurray.github.io/parquetframe/
Project-URL: Documentation, https://leechristophermurray.github.io/parquetframe/
Project-URL: Repository, https://github.com/leechristophermurray/parquetframe.git
Project-URL: Bug Tracker, https://github.com/leechristophermurray/parquetframe/issues
Project-URL: Changelog, https://github.com/leechristophermurray/parquetframe/blob/main/CHANGELOG.md

# ParquetFrame

**High-performance data analytics with AI/ML capabilities**

[![PyPI version](https://badge.fury.io/py/parquetframe.svg)](https://badge.fury.io/py/parquetframe)
[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)

ParquetFrame is a unified data platform that combines SQL, time series, geospatial, financial analysis, and AI/ML capabilities behind familiar DataFrame interfaces.

## ✨ Features

- **SQL Engine**: Query DataFrames with SQL (DataFusion/DuckDB)
- **Time Series**: `.ts` accessor for resampling and rolling windows
- **GeoSpatial**: `.geo` accessor for spatial operations
- **Financial**: `.fin` accessor for technical indicators
- **AI/ML**: Tetnus ML framework + RAG with Knowlogy knowledge graph
- **Cloud**: S3, GCS, Azure Blob Storage support
- **Interactive CLI**: Rich REPL with syntax highlighting

## 🚀 Quick Start

```bash
pip install parquetframe
```
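Several features depend on optional extras declared in the package metadata (for example `pip install "parquetframe[sql]"` for DuckDB or `pip install "parquetframe[ai]"` for Ollama and DataFusion). The sketch below is one way to check which optional modules are importable in the current environment. The `OPTIONAL_EXTRAS` table and the `missing_modules` helper are hypothetical names used for illustration and are not part of ParquetFrame.

```python
from importlib.util import find_spec

# Import names of optional dependencies, grouped by the extra that installs them.
# The grouping mirrors the package metadata; this table is illustrative only.
OPTIONAL_EXTRAS = {
    "sql": ["duckdb"],
    "cli": ["click", "rich", "psutil", "yaml", "duckdb", "polars"],
    "db": ["sqlalchemy"],
    "bio": ["bioframe"],
    "ai": ["ollama", "prompt_toolkit", "datafusion"],
}


def missing_modules(extra: str) -> list[str]:
    """Return the modules of `extra` that cannot be imported here."""
    return [name for name in OPTIONAL_EXTRAS[extra] if find_spec(name) is None]


if __name__ == "__main__":
    for extra in OPTIONAL_EXTRAS:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"parquetframe[{extra}]: {status}")
```

Note that some import names differ from the distribution names (`pyyaml` is imported as `yaml`, `prompt-toolkit` as `prompt_toolkit`).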

```python
import pandas as pd
import parquetframe as pf
import parquetframe.sql
import parquetframe.time
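import parquetframe.finance

# Illustrative input: a DataFrame with a datetime index and `value` / `close` columns
df = pd.read_csv("prices.csv", index_col="date", parse_dates=True)

# SQL queries
result = pf.sql("SELECT * FROM df WHERE value > 100", df=df)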

# Time series
daily = df.ts.resample('1D', agg='mean')

# Financial indicators
rsi = df.fin.rsi('close', 14)
macd = df.fin.macd('close')
```
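For readers unfamiliar with the MACD indicator used above, the following is a minimal pure-Python sketch of the textbook definition: the difference between a fast and a slow exponential moving average, plus a signal line that smooths that difference. The 12/26/9 spans are the conventional defaults and are an assumption here; this is not ParquetFrame's implementation, and the return shape of `df.fin.macd` may differ.

```python
def ema(values: list[float], span: int) -> list[float]:
    """Exponential moving average seeded with the first value.

    Uses the common smoothing factor alpha = 2 / (span + 1).
    """
    if not values:
        return []
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def macd(close: list[float], fast: int = 12, slow: int = 26, signal: int = 9):
    """Return (macd_line, signal_line, histogram) as equal-length lists."""
    fast_ema = ema(close, fast)
    slow_ema = ema(close, slow)
    # MACD line: fast EMA minus slow EMA.
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    # Signal line: EMA of the MACD line itself.
    signal_line = ema(line, signal)
    # Histogram: distance between the MACD line and its signal line.
    hist = [m - s for m, s in zip(line, signal_line)]
    return line, signal_line, hist
```

A steadily rising series yields a positive MACD line, because the fast average tracks recent prices more closely than the slow one.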

## 📚 Documentation

- [Getting Started](docs/tutorials/getting_started.md)
- [API Reference](docs/api_reference.md)
- [SQL Guide](docs/sql/index.md)
- [Time Series](docs/time/index.md)
- [Financial Analysis](docs/finance/index.md)
- [GeoSpatial](docs/geo/index.md)

## 🎯 Use Cases

### Financial Analysis

```python
import pandas as pd
import parquetframe.finance

prices = pd.read_csv("stock.csv", index_col='date', parse_dates=True)
prices['SMA_20'] = prices.fin.sma('close', 20)
prices['RSI'] = prices.fin.rsi('close', 14)
```
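As a reference for what these two indicators compute, here is a small pure-Python sketch of a simple moving average and of RSI with Wilder smoothing. It follows the standard textbook formulas and is not taken from ParquetFrame's Rust kernels; in particular, returning `None` for warm-up positions and 100 when there are no losses are assumptions made for this example.

```python
def sma(values: list[float], window: int) -> list:
    """Simple moving average; positions without a full window are None."""
    out = [None] * len(values)
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out[i] = running / window
    return out


def rsi(close: list[float], period: int = 14) -> list:
    """Relative Strength Index with Wilder smoothing; warm-up positions are None."""
    out = [None] * len(close)
    if len(close) <= period:
        return out
    changes = [b - a for a, b in zip(close, close[1:])]

    def to_rsi(avg_gain: float, avg_loss: float) -> float:
        if avg_loss == 0:
            return 100.0  # convention chosen here: no losses in the window
        rs = avg_gain / avg_loss
        return 100.0 - 100.0 / (1.0 + rs)

    # Seed with plain averages over the first `period` price changes.
    avg_gain = sum(max(c, 0.0) for c in changes[:period]) / period
    avg_loss = sum(max(-c, 0.0) for c in changes[:period]) / period
    out[period] = to_rsi(avg_gain, avg_loss)

    # Wilder smoothing for every later change.
    for i in range(period, len(changes)):
        gain = max(changes[i], 0.0)
        loss = max(-changes[i], 0.0)
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out[i + 1] = to_rsi(avg_gain, avg_loss)
    return out
```

RSI ranges from 0 to 100: a series that only rises sits at 100 and one that only falls sits at 0.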

### Time Series Resampling and Smoothing

```python
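import pandas as pd
import parquetframe.time

# Illustrative input: sensor readings indexed by timestamp
df = pd.read_csv("sensors.csv", index_col="timestamp", parse_dates=True)

sensor_data = df.ts.resample('1H', agg='mean')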
smoothed = sensor_data.ts.rolling('24H', agg='mean')
```
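The two steps above, bucketing readings into fixed intervals and then averaging over a trailing time window, can be sketched in plain Python as follows. This is only an illustration of the idea under simple assumptions (naive timestamps, input sorted by time, empty buckets omitted); the function names are hypothetical and the `.ts` accessor may handle edge cases differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

EPOCH = datetime(1970, 1, 1)


def resample_mean(samples, bucket: timedelta):
    """Average (timestamp, value) pairs within fixed-width time buckets.

    Buckets that contain no samples are omitted from the result.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Floor the timestamp to the start of its bucket.
        index = (ts - EPOCH) // bucket
        buckets[EPOCH + index * bucket].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


def rolling_mean(series, window: timedelta):
    """Trailing mean over the interval (t - window, t] for sorted (timestamp, value) pairs."""
    out = []
    start = 0
    for i, (ts, _) in enumerate(series):
        # Advance the left edge past samples that fell out of the window.
        while series[start][0] <= ts - window:
            start += 1
        out.append((ts, mean(v for _, v in series[start:i + 1])))
    return out


readings = [
    (datetime(2024, 1, 1, 0, 10), 1.0),
    (datetime(2024, 1, 1, 0, 50), 3.0),
    (datetime(2024, 1, 1, 1, 20), 5.0),
]
hourly = resample_mean(readings, timedelta(hours=1))
smoothed = rolling_mean(hourly, timedelta(hours=24))
```

Here `hourly` holds one averaged value per hour that has data, and `smoothed` averages those hourly values over the preceding 24 hours.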

### GeoSpatial Analysis

```python
import geopandas as gpd
import parquetframe.geo

cities = gpd.read_file("cities.geojson")
buffered = cities.geo.buffer(1000)
```

### AI-Powered RAG

```python
from parquetframe.ai import SimpleRagPipeline
from parquetframe import knowlogy

# Query knowledge graph
formula = knowlogy.get_formula("variance")

# RAG with formula grounding
# (`pipeline` is assumed to be an already configured SimpleRagPipeline; setup is not shown here)
result = pipeline.run_query("Explain variance", user_context="analyst")
```

## 🏗️ Architecture

ParquetFrame combines:
- **Rust Core**: High-performance kernels (pf-time-core, pf-geo-core, pf-fin-core)
- **Python API**: Familiar pandas-style accessors
- **AI/ML**: Tetnus framework + Knowlogy knowledge graph
- **Cloud**: Multi-cloud storage integration

## 🔗 Project Links

- [Documentation](docs/)
- [Examples](examples/)
- [Tutorials](docs/tutorials/)

## 📄 License

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0
International Public License

## 🙏 Acknowledgments

Built on top of:
- Apache Arrow / Polars / pandas
- DataFusion / DuckDB
- GeoPandas / Shapely
- PyTorch (Tetnus)

