Metadata-Version: 2.4
Name: rag-llm-api-pipeline
Version: 0.2.0
Summary: Multimodal RAG pipeline for low-compute, local, real-world deployment
Home-page: https://github.com/pkbythebay29/ot-rag-llm-api
Author: pkbythebay29
Author-email: kannan@haztechrisk.org
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: pytesseract
Requires-Dist: pymupdf
Requires-Dist: sentence-transformers
Requires-Dist: faiss-cpu
Requires-Dist: transformers
Requires-Dist: moviepy
Requires-Dist: SpeechRecognition
Requires-Dist: pyaudio
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# RAG-LLM-API-Pipeline

A fully  local GPU poor, multimodal Retrieval-Augmented Generation (RAG) system powered by open-source local LLMs. This pipeline is designed for operational technology environments to provide AI-assisted access to technical knowledge, manuals, and historical data — securely and offline, at min cost.

---

## ✅ Key Features

- 🔍 Retrieval-Augmented Generation (RAG) using FAISS + SentenceTransformers
- 🧠 Query handling via a local, open-source Large Language Model (LLM)
- 📄 Supports multiple input formats:
  - PDFs
  - Plain text files
  - Images (OCR via Tesseract)
  - Audio files (`.wav`, `.flac`, `.aiff`)
  - Videos (`.mp4` with audio extraction)
- 💻 Interfaces:
  - Command Line Interface (CLI)
  - Local REST API (FastAPI)
- 🛠️ Asset definition via YAML configuration
- 🔐 Works in fully local environments after setup

✅ Works locally, GPU/CPU-friendly with configurable precision  
✅ CLI, API and simple web UI included


---

## 📦 Installation

```bash
pip install rag-llm-api-pipeline

```

---

## 🛠️ Setup Instructions (Windows + Anaconda)

### 1. Create Python Environment
```bash
conda create -n rag_env python=3.10
conda activate rag_env
```

### 2. Install Dependencies
#### Via Conda (system-level tools):
```bash
conda install -c conda-forge ffmpeg pytesseract pyaudio
```

#### Via Pip (Python packages):
```bash
pip install -r requirements.txt
```

> Ensure Tesseract is installed and in your system PATH. You can get it from https://github.com/tesseract-ocr/tesseract.

---

## 🚀 Usage

### CLI Example
```bash
python cli/main.py --system Pump_A --question "What is the pressure threshold for operation?"
```

### API Server
Start the server:
```bash
uvicorn api.server:app --reload
```

Query with curl or Postman:
```bash
curl -X POST http://localhost:8000/query \
     -H "Content-Type: application/json" \
     -d '{"system": "Pump_A", "question": "Explain the restart procedure"}'
```

---

## 🧱 Configuration

Edit `config/system.yaml` to define your assets and associated documents:

```yaml
assets:
  - name: Pump_A
    docs:
      - pump_manual.pdf
      - safety_guide.mp4

models:
  embedding_model: sentence-transformers/all-MiniLM-L6-v2
  llm_model: tiiuae/falcon-7b-instruct

retriever:
  top_k: 5
  index_dir: data/indexes

llm:
  max_new_tokens: 256
  prompt_template: |
    Use the following context to answer the question:
    {context}

    Question: {question}
    Answer:

settings:
  data_dir: data/manuals
  force_rebuild_index: false
  use_cpu: true
```

> Documents can be PDFs, plain text, images, or audio/video files.

---
## 🐧 Setup Instructions (Linux)

### 1. Create Python Environment
```bash
python3 -m venv rag_env
source rag_env/bin/activate
```

Or with `conda`:
```bash
conda create -n rag_env python=3.10
conda activate rag_env
```

### 2. Install System Dependencies
```bash
sudo apt update
sudo apt install -y ffmpeg tesseract-ocr libpulse-dev portaudio19-dev
```

> Optional: install language packs for OCR (e.g., `tesseract-ocr-eng`).

### 3. Install Python Packages
```bash
pip install -r requirements.txt
```

---

## 🔁 Running the Application on Linux

### CLI
```bash
python cli/main.py --system Pump_A --question "What is the restart sequence for this machine?"
```

### API Server
```bash
uvicorn api.server:app --host 0.0.0.0 --port 8000
```

### cURL Query
```bash
curl -X POST http://localhost:8000/query \
     -H "Content-Type: application/json" \
     -d '{"system": "Pump_A", "question": "What does error E204 indicate?"}'
```

## 📚 How it Works

1. **Index Building**:
   - Files are parsed using `loader.py`.
   - Text chunks are embedded with MiniLM.
   - FAISS index stores embeddings for fast similarity search.

2. **Query Execution**:
   - User provides a natural language question.
   - Relevant text chunks are retrieved from the index.
   - LLM generates an answer based on retrieved context.

---

## 🧠 Model Info

- Default LLM: `tiiuae/falcon-rw-1b` (run locally via `transformers`)
- Embedding model: `sentence-transformers/all-MiniLM-L6-v2`
- All models are open-source and run offline.

> You can replace these with any local-compatible Hugging Face model.

---

## 🔐 Security & Offline Use

- No cloud or external dependencies required after initial setup.
- Ideal for OT environments.
- All processing is local: embeddings, LLM inference, and data storage.

---

## 📜 License

MIT License

---

## 📧 Contact

For issues, improvements, or contributions, please open an issue or PR.
