Metadata-Version: 2.4
Name: zipstream-ai
Version: 1.0.0
Summary: Stream and query zipped datasets using LLMs
License: MIT
Author: Pranav Nitin Motarwar
Author-email: pranavmotarwar@gmail.com
Requires-Python: >=3.8,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: google-generativeai (>=0.4.1,<0.5.0)
Requires-Dist: openai (>=1.10.0,<2.0.0)
Requires-Dist: pandas (>=2.0,<3.0)
Requires-Dist: pillow (>=10.0.0,<11.0.0)
Requires-Dist: python-dotenv (>=1.0.0,<2.0.0)
Requires-Dist: typer (>=0.9.0,<0.10.0)
Description-Content-Type: text/markdown

# zipstream-ai

![PyPI - Python Version](https://img.shields.io/pypi/pyversions/zipstream-ai)
![PyPI](https://img.shields.io/pypi/v/zipstream-ai)
![License](https://img.shields.io/pypi/l/zipstream-ai)
![Docs](https://img.shields.io/badge/docs-passing-brightgreen)
![Tests](https://img.shields.io/badge/tests-passing-brightgreen)
![mypy](https://img.shields.io/badge/mypy-checked-blue)
![code style: black](https://img.shields.io/badge/code%20style-black-000000)

**Stream, Parse, and Chat with Compressed Datasets Using LLMs**

`zipstream-ai` is a Python package for working with `.zip` and `.tar.gz` archives directly, with no manual extraction step. It combines archive streaming, format detection, data parsing (e.g., CSV, JSON), and natural-language querying with LLMs such as Gemini behind a single interface.

---

## Installation

```bash
pip install zipstream-ai
```

---

## Features

| Feature                     | Description                                                                 |
|----------------------------|-----------------------------------------------------------------------------|
| 📂 Archive Streaming       | Stream `.zip` and `.tar.gz` files without extraction                        |
| 🔍 Format Auto-Detection   | Automatically detects file types (CSV, JSON, TXT, etc.)                     |
| 📊 DataFrame Integration   | Parses tabular data directly into pandas DataFrames                         |
| 💬 LLM Querying            | Ask questions about your data using Gemini (Google's LLM)                   |
| 🧩 Modular Design          | Easily extensible for new formats or models                                 |
| 🖥️ Python + CLI Support    | Use via command line or as a Python package                                 |

---

## Use Case Examples

### 1. Load & Explore ZIP

```python
from zipstream_ai import ZipStreamReader

reader = ZipStreamReader("dataset.zip")
print(reader.list_files())
```

### 2. Parse CSV from ZIP

```python
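from zipstream_ai import FileParser, ZipStreamReader

# Open the archive (the same reader as in example 1)
reader = ZipStreamReader("dataset.zip")

# Parse a CSV member of the archive into a pandas DataFrame
parser = FileParser(reader)
df = parser.load("data.csv")
print(df.head())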
```

### 3. Ask Questions with Gemini

```python
from zipstream_ai import ask

# `df` is the DataFrame loaded in example 2
response = ask(df, "Which 3 rows have the highest 'score'?")
print(response)
```

---

## Why zipstream-ai?

| Traditional Workflow               | With `zipstream-ai`                         |
|-----------------------------------|---------------------------------------------|
| Manually unzip files              | Stream directly from archive                |
| Write boilerplate code to parse   | Built-in file parsers (CSV, JSON, etc.)     |
| Switch between tools for LLMs     | One-liner `ask(df, question)` integration   |

---

## Architecture Diagram

```
         ┌──────────────┐
         │ .zip/.tar.gz │
         └────┬─────────┘
              │
   ┌──────────▼──────────┐
   │  ZipStreamReader    │
   └──────────┬──────────┘
              │
     ┌────────▼────────┐
     │   FileParser    │────> pd.DataFrame
     └────────┬────────┘
              │
     ┌────────▼────────┐
     │     ask()       │────> Gemini LLM Output
     └─────────────────┘
```
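
The three stages in the diagram can be sketched end to end with the standard library alone. This is an illustrative stand-in, not the package's code: `list_members`, `load_csv`, `build_prompt`, and `ask_rows` are hypothetical names, rows are plain dicts rather than a DataFrame, and the LLM is an injected callable (a stub here) so that no API key or network access is needed.

```python
import csv
import io
import zipfile
from typing import Callable, Dict, List

Row = Dict[str, str]


def list_members(archive: bytes) -> List[str]:
    """Stage 1: enumerate the archive's members without extracting them."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.namelist()


def load_csv(archive: bytes, name: str) -> List[Row]:
    """Stage 2: parse one CSV member directly from the archive stream."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        with zf.open(name) as member:
            text = io.TextIOWrapper(member, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def build_prompt(rows: List[Row], question: str, max_rows: int = 20) -> str:
    """Serialize a bounded sample of the table followed by the question."""
    if not rows:
        return f"The table is empty.\n\nQuestion: {question}"
    header = ",".join(rows[0].keys())
    body = "\n".join(",".join(row.values()) for row in rows[:max_rows])
    return f"Table sample:\n{header}\n{body}\n\nQuestion: {question}"


def ask_rows(rows: List[Row], question: str, llm: Callable[[str], str]) -> str:
    """Stage 3: send the prompt to whichever LLM client is injected."""
    return llm(build_prompt(rows, question))


# Build a tiny archive in memory so the example is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.csv", "name,score\nada,9\nbob,7\n")
archive = buf.getvalue()

rows = load_csv(archive, "data.csv")
print(list_members(archive))
print(ask_rows(rows, "Who has the highest score?", llm=lambda prompt: "(stub answer)"))
```

Passing the model in as a callable keeps the pipeline testable offline; a real client call would replace the stub lambda.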





