Metadata-Version: 2.4
Name: gurrt
Version: 1.0.2
Summary: An Intelligent Open-Source Video Understanding System. A different path from traditional Large Video Language Models (LVLMs). Built for modularity, openness, and real-world usability.
Author-email: Mohammad Owais <owaismohammad2515@gmail.com>, Fareha Aslam <farehaaslam57@gmail.com>
License-Expression: MIT
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: opencv-python>=4.13.0.92
Requires-Dist: transformers>=5.1.0
Requires-Dist: accelerate>=1.12.0
Requires-Dist: pillow>=12.1.0
Requires-Dist: chromadb>=1.4.1
Requires-Dist: ollama>=0.6.1
Requires-Dist: langchain>=1.2.9
Requires-Dist: langchain-groq>=1.1.2
Requires-Dist: moviepy>=1.0.3
Requires-Dist: sentence-transformers>=5.2.2
Requires-Dist: tqdm>=4.67.3
Requires-Dist: scenedetect>=0.6.7.1
Requires-Dist: scikit-image>=0.26.0
Requires-Dist: faster-whisper>=1.2.1
Requires-Dist: langchain-text-splitters>=1.1.0
Requires-Dist: fastapi>=0.128.7
Requires-Dist: pydantic>=2.12.5
Requires-Dist: supermemory>=3.24.0
Requires-Dist: platformdirs>=4.5.1
Requires-Dist: typer>=0.21.1
Requires-Dist: opencv-python-headless>=4.13.0.92
Provides-Extra: cuda
Requires-Dist: torch; extra == "cuda"
Requires-Dist: torchvision; extra == "cuda"
Requires-Dist: torchaudio; extra == "cuda"
Dynamic: license-file

[![PyPI version](https://img.shields.io/pypi/v/gurrt)](https://pypi.org/project/gurrt/)
[![Python Versions](https://img.shields.io/pypi/pyversions/gurrt)](https://pypi.org/project/gurrt/)
[![License](https://img.shields.io/pypi/l/gurrt)](https://pypi.org/project/gurrt/)

[![Downloads](https://pepy.tech/badge/gurrt/)](https://pepy.tech/project/gurrt)
[![Twitter Follow](https://img.shields.io/twitter/follow/muffBozo.svg?style=social)](https://twitter.com/muffBozo)
![gurrt](https://raw.githubusercontent.com/owaismohammad/gurrt/main/gurrt.png)

gurrt is an intelligent video understanding system: an open-source alternative to monolithic Large Video Language Models (LVLMs), built out of frustration with how hard those models are to use.

Large Video Language Models are difficult to work with. They are:

- Expensive to set up  
- GPU intensive  
- Slow to experiment with  
- Difficult to run on consumer hardware  
- Often closed or partially restricted  

Most state-of-the-art video models require massive compute clusters and large-scale infrastructure.  
They are impressive — but they are not accessible.

If meaningful video intelligence requires:

- Multiple high-end GPUs  
- Hours of inference time  
- Proprietary model access  

Then it stops feeling truly open.

---

### A Different Philosophy

gurrt is not an attempt to compete with systems like YouTube’s internal models or other large-scale industrial LVLMs trained on massive GPU clusters.
It is an attempt to rethink the approach.
Instead of asking how to build a larger end-to-end video transformer, it explores a different path:

- Smarter frame sampling techniques  
- Stronger and more modular vision models  
- Better structured embedding strategies  
- More efficient and grounded RAG pipelines  
- Persistent memory-driven reasoning  

The guiding question: how can the job get done with minimal effort while still yielding high-end results?

gurrt represents a belief that meaningful video understanding can emerge from:

- Thoughtful engineering  
- Smart sampling  
- Strong modular components  
- Memory-augmented retrieval  

Not just from massive GPU clusters and billion-parameter models.
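
To make "smart sampling" concrete, here is a minimal sketch of one possible approach: keep a frame only when it differs enough from the last frame that was kept. This is an illustration under stated assumptions, not gurrt's actual sampling logic. The names `mean_abs_diff` and `select_keyframes`, the threshold value, and the use of flat lists of grayscale pixels in place of decoded video frames are all hypothetical.

```python
from typing import List, Sequence

# Toy stand-in for a decoded frame: a flat sequence of grayscale pixels (0-255).
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: List[Frame], threshold: float = 20.0) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


# Six tiny "frames": a dark scene, a bright scene, then dark again.
dark, bright = [10] * 16, [200] * 16
video = [dark, dark, [12] * 16, bright, bright, dark]
print(select_keyframes(video))  # → [0, 3, 5]
```

Only three of the six frames would go on to captioning and embedding, which is where the compute savings come from. A real pipeline would compare decoded frames with a more robust measure; the dependency list includes `scenedetect` and `scikit-image`, which suggests scene detection or structural similarity rather than a raw pixel difference.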

## 🌿 Quick Start Guide for the PyPI Package

### 1. Installation

Set up **gurrt** using `uv`. Note: This project requires **Python 3.12** or newer.

```bash
# 1. Install uv, pin the Python version, then create the virtual environment
pip install uv
uv python pin 3.12
uv venv

# 2. Activate environment
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install gurrt (Standard/CPU)
uv pip install gurrt

# 4. OR Install with GPU Support (quotes stop shells such as zsh from expanding the brackets)
uv pip install "gurrt[cuda]" --extra-index-url https://download.pytorch.org/whl/cu121
```

---

### 2. Commands

| Command | Description |
| --- | --- |
| `gurrt init` | Configure API keys (Groq, Supermemory, Ollama). |
| `gurrt models-download` | Download and cache AI models locally. |
| `gurrt index <path>` | Extract frames and audio for search. |
| `gurrt index-ollama <path> <model>` | Index using a specific Ollama model. |
| `gurrt ask "<query>"` | Query your indexed video content. |

The tool disables unnecessary logging and tokenizer parallelism to keep the CLI output clean. Some MoviePy logs still appear; this will be resolved in a future iteration.
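
The kind of setup described above can be sketched in a few lines of Python. This is a hedged illustration, not gurrt's actual code: the function name `quiet_runtime` and the choice of loggers are assumptions. `TOKENIZERS_PARALLELISM` and `TRANSFORMERS_VERBOSITY` are environment variables read by the Hugging Face libraries, so they need to be set before those libraries are imported.

```python
import logging
import os


def quiet_runtime() -> None:
    """Reduce console noise from model libraries (hypothetical helper)."""
    # Read by Hugging Face tokenizers; "false" turns off its internal
    # parallelism and the fork-related warning that comes with it.
    os.environ["TOKENIZERS_PARALLELISM"] = "false"
    # Read by transformers at import time to set its default log level.
    os.environ["TRANSFORMERS_VERBOSITY"] = "error"
    # Raise the level of standard-library loggers used by these packages.
    for name in ("transformers", "sentence_transformers"):
        logging.getLogger(name).setLevel(logging.ERROR)


quiet_runtime()  # call this before importing transformers or sentence_transformers
```

Settings like these may not affect MoviePy's progress output, which does not go through the standard `logging` module. Passing `logger=None` to MoviePy's write methods is one common way to silence it.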

---


### Architecture Overview
```text
Video
  │
  ├── Smart Frame Extraction
  │     └── Captioning + Embeddings
  │
  ├── Audio Extraction
  │     └── Speech-to-Text + Embeddings
  │
  ├── Vector Memory Store
  │
  ├── Supermemory (Persistent Conversation Layer)
  │
  └── LLM Reasoning Engine
```
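
The flow in the diagram can be sketched as a toy retrieval step: caption and transcript chunks share one vector store, the query is matched against both, and the best hits are assembled into a time-ordered context for the LLM. This is a minimal sketch under stated assumptions, not gurrt's implementation. The `Chunk` class, the functions `cosine`, `retrieve`, and `build_context`, and the three-dimensional vectors are hypothetical; per the dependency list, the real system would rely on an embedding model and `chromadb` rather than an in-memory list.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source: str          # "frame" (caption) or "audio" (transcript)
    timestamp: float     # position in the video, in seconds
    text: str
    vector: List[float]  # embedding; hand-written toy values here


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store: List[Chunk], query_vector: List[float], k: int = 2) -> List[Chunk]:
    """Return the k chunks most similar to the query, best match first."""
    ranked = sorted(store, key=lambda c: cosine(c.vector, query_vector), reverse=True)
    return ranked[:k]


def build_context(chunks: List[Chunk]) -> str:
    """Order retrieved chunks by time and tag each with its source."""
    ordered = sorted(chunks, key=lambda c: c.timestamp)
    return "\n".join(f"[{c.timestamp:.1f}s][{c.source}] {c.text}" for c in ordered)


store = [
    Chunk("frame", 12.0, "a person opens a laptop", [0.9, 0.1, 0.0]),
    Chunk("audio", 14.5, "let's start the demo", [0.7, 0.3, 0.1]),
    Chunk("frame", 80.0, "a dog runs on a beach", [0.0, 0.2, 0.9]),
]
print(build_context(retrieve(store, [1.0, 0.2, 0.0])))
```

Tagging each chunk with its timestamp and source lets the reasoning step cite where in the video an answer comes from, and keeps visual and spoken evidence distinguishable in the prompt.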

### Project Setup (using uv)

```bash
# Install uv if you haven't already
pip install uv

# Sync dependencies
uv sync

# Activate environment
source .venv/bin/activate  # Windows: .venv\Scripts\activate
```

### File Structure

```text
gurrt/
├── src/
│   │
│   └── videorag/                      # Core Video-RAG application package
│       │
│       ├── api/
│       │   └── server.py              # API server (exposes endpoints for querying, ingestion, etc.)
│       │
│       ├── cli/
│       │   └── main.py                # CLI entry point (init, models-download, index, ask commands)
│       │
│       ├── config/
│       │   └── config.py              # Configuration management (API keys, paths, environment setup)
│       │
│       ├── core/                      # Core intelligence pipeline
│       │   ├── __init__.py
│       │   ├── asr.py                 # Audio extraction + speech-to-text processing
│       │   ├── embedding.py           # Embedding generation for captions & transcripts
│       │   ├── llm.py                 # LLM interaction and reasoning logic
│       │   ├── models.py              # Model loading and management utilities
│       │   ├── pipeline.py            # End-to-end ingestion + query pipeline orchestration
│       │   ├── prompts.py             # Prompt templates and structured context injection
│       │   ├── search.py              # Retrieval logic (semantic search over stored embeddings)
│       │   └── vectordb.py            # Vector database interface and storage abstraction
│       │
│       └── utils/
│           └── utils.py               # Shared utility functions and helpers
│
└── README.md                          # Project documentation
```
