Metadata-Version: 2.4
Name: guard-rag
Version: 1.0.6
Summary: Privacy-first, fully offline AI document assistant secured by tiered safety guardrails
Home-page: https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL
Author: Sowmiyan S
Author-email: 
License: MIT License
        
        Copyright (c) 2025 Sowmiyan S (https://github.com/sowmiyan-s)
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL
Project-URL: Documentation, https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL#readme
Project-URL: Repository, https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL
Project-URL: Bug Tracker, https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL/issues
Keywords: rag,retrieval-augmented-generation,langchain,ollama,faiss,embeddings,chatbot,llm,privacy,offline
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Environment :: Console
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain>=0.3.0
Requires-Dist: langchain-core>=0.3.0
Requires-Dist: langchain-community>=0.3.0
Requires-Dist: langchain-text-splitters>=0.3.0
Requires-Dist: langchain-ollama>=0.2.0
Requires-Dist: langchain-huggingface>=0.1.0
Requires-Dist: sentence-transformers>=2.7.0
Requires-Dist: faiss-cpu>=1.8.0
Requires-Dist: pypdf>=4.2.0
Requires-Dist: docx2txt>=0.8
Requires-Dist: python-dotenv>=1.0.1
Requires-Dist: nest_asyncio>=1.6.0
Requires-Dist: rich>=13.0.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: ruff>=0.3.0; extra == "dev"
Requires-Dist: mypy>=1.9.0; extra == "dev"
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

<div align="center">

<br/>

<img src="https://img.shields.io/badge/LangChain-RAG-311b92?style=for-the-badge&logo=chainlink&logoColor=white"/>
&nbsp;
<img src="https://img.shields.io/badge/Ollama-Local%20LLM-black?style=for-the-badge&logo=ollama&logoColor=white"/>
&nbsp;
<img src="https://img.shields.io/badge/FAISS-Vector%20Store-0064A4?style=for-the-badge&logo=meta&logoColor=white"/>
&nbsp;
<img src="https://img.shields.io/badge/CLI-Command%20Line-000000?style=for-the-badge&logo=terminal&logoColor=white"/>

<br/><br/>

# GuardRAG

### A privacy-first, fully offline AI document assistant — secured by a tiered safety guardrails system

<br/>

[![License: MIT](https://img.shields.io/badge/License-MIT-22c55e?style=flat-square)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.9%2B-3b82f6?style=flat-square&logo=python&logoColor=white)](https://python.org)
[![PyPI](https://img.shields.io/pypi/v/guard-rag?style=flat-square)](https://pypi.org/project/guard-rag/)
[![Offline](https://img.shields.io/badge/Mode-100%25%20Offline-76b900?style=flat-square)](#)
[![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-a78bfa?style=flat-square)](#contributing)

<br/>

> Upload any document. Ask anything. Get answers — **entirely on your machine.**  
> No cloud. No API keys. No data leaves your device.

<div align="center">
    <h3>
        <a href="docs/INSTALL.md">📦 Installation</a>
        <span> · </span>
        <a href="docs/QUICK_REFERENCE.md">📖 Quick Ref</a>
        <span> · </span>
        <a href="CONTRIBUTING.md">🤝 Contribute</a>
    </h3>
</div>

<br/>
</div>

---

## Architecture

The GuardRAG package provides a command-line interface for building and querying RAG (Retrieval-Augmented Generation) chatbots with local LLMs.

---

## Why GuardRAG?

Most RAG chatbots rely on cloud APIs, which creates **privacy risks** for sensitive documents — contracts, medical records, internal reports. GuardRAG solves that by:

- Running the **LLM locally** via Ollama (no data transmitted)
- Embedding documents **offline** using HuggingFace sentence-transformers
- Enforcing **tiered safety policies** with 4 sensitivity levels
- Providing a **simple CLI interface** for easy usage

---

## Feature Highlights

<table>
<tr>
<td width="50%">

### Core
- **100% Offline** — zero external network calls at runtime
- **Multi-format ingestion** — PDF, TXT, DOCX
- **Persistent FAISS cache** — same file re-uploads skip re-embedding
- **Multi-turn conversation** — full history-aware retrieval
- **Any Ollama model** — Gemma, Llama3, Mistral, Phi, and more
- **CLI Interface** — easy command-line usage

</td>
<td width="50%">

### Safety
- **4-Tier Data Sensitivity System** — Public → Internal → Confidential → Restricted
- **Jailbreak / prompt injection detection** — always active
- **Credential & API key protection** — Internal+
- **PII protection** — SSN, email, phone, DOB, credit card (Confidential+)
- **Regulated data guards** — HIPAA / GDPR / financial categories (Restricted)

</td>
</tr>
</table>

---

## Data Sensitivity Levels

| Level | Badge | What is Protected |
|---|---|---|
| **Public** | ![](https://img.shields.io/badge/-Public-22c55e?style=flat-square) | Jailbreak & prompt injection only |
| **Internal** | ![](https://img.shields.io/badge/-Internal-3b82f6?style=flat-square) | + API keys, credentials, passwords, tokens |
| **Confidential** | ![](https://img.shields.io/badge/-Confidential-eab308?style=flat-square) | + SSN, email, phone number, DOB, credit card |
| **Restricted** | ![](https://img.shields.io/badge/-Restricted-ef4444?style=flat-square) | + Medical records, diagnoses, financials, HIPAA/GDPR |
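
The levels in this table are cumulative: each tier keeps the checks of the tiers below it and adds its own. The sketch below illustrates that layering with a few simplified regular expressions. The pattern set, `active_checks`, and `violations` are hypothetical names chosen for illustration and are not GuardRAG's actual rules or API.

```python
import re

# Tier order as documented: each level adds checks on top of the previous ones.
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]

# Hypothetical, deliberately simplified patterns; not GuardRAG's actual rules.
CHECKS = {
    "Public": {
        "prompt_injection": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    },
    "Internal": {
        "credential": re.compile(r"\b(api[_ -]?key|password|secret|token)\b", re.I),
    },
    "Confidential": {
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    },
    "Restricted": {
        "medical": re.compile(r"\b(diagnosis|medical record|prescription)\b", re.I),
    },
}


def active_checks(level):
    """Collect the checks for `level` and every tier below it."""
    if level not in LEVELS:
        raise ValueError(f"unknown sensitivity level: {level!r}")
    merged = {}
    for name in LEVELS[: LEVELS.index(level) + 1]:
        merged.update(CHECKS[name])
    return merged


def violations(text, level):
    """Return the sorted names of all active checks that match `text`."""
    return sorted(
        name for name, pattern in active_checks(level).items() if pattern.search(text)
    )
```

With these illustrative patterns, a message containing a Social Security number passes at `Internal` but is flagged at `Confidential`, which mirrors the "+" rows of the table.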

---

## Tech Stack

| Layer | Technology |
|---|---|
| **CLI Interface** | Python argparse + rich console output |
| **LLM Engine** | [Ollama](https://ollama.com) — local model inference |
| **Embeddings** | [HuggingFace](https://huggingface.co) `sentence-transformers/all-MiniLM-L6-v2` |
| **Vector Store** | [FAISS](https://github.com/facebookresearch/faiss) — disk-persisted |
| **RAG Pipeline** | [LangChain](https://langchain.com) — retrieval chains + chat history |
| **Safety Rails** | Custom tiered guardrails system (input + output) |

---

## Prerequisites

- **Python 3.9+**
- **[Ollama](https://ollama.com)** installed and running locally
- At least one model pulled via Ollama:

```bash
ollama pull gemma3:1b
# or any other model: llama3.1, phi3, mistral, etc.
```

---

## Installation

Install GuardRAG from PyPI:

```bash
pip install guard-rag
```

Or install from source:

```bash
git clone https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL.git
cd GUADRAILS-RAG-CHAT-TOOL
pip install .
```

---

## Quick Start

After installation, run the CLI:

```bash
guard-rag --pdf path/to/your/document.pdf
```

This will start an interactive chat session with your document.

### CLI Options

```
guard-rag --pdf <file>             Load and chat with a PDF document
          --model <model>          Ollama model to use (default: gemma3:1b)
          --ollama-host <url>      Ollama server URL (default: http://localhost:11434)
          --chunk-size <int>       Document chunk size (default: 1000)
          --chunk-overlap <int>    Chunk overlap (default: 200)
          --sensitivity <level>    Data sensitivity: Public | Internal | Confidential | Restricted
          --no-guardrails          Disable safety guardrails
          --help                   Show this help message
```
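
For readers who want to see how these flags fit together, here is a minimal `argparse` sketch that mirrors the documented options and defaults. It is an illustration rather than the package's real parser: `build_parser` is a hypothetical name, marking `--pdf` as required is an assumption, and no default is set for `--sensitivity` because the help text does not state one.

```python
import argparse

SENSITIVITY_LEVELS = ["Public", "Internal", "Confidential", "Restricted"]


def build_parser():
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="guard-rag")
    # Assumption: the document path is mandatory.
    parser.add_argument("--pdf", required=True, help="PDF document to load")
    parser.add_argument("--model", default="gemma3:1b", help="Ollama model to use")
    parser.add_argument(
        "--ollama-host", default="http://localhost:11434", help="Ollama server URL"
    )
    parser.add_argument("--chunk-size", type=int, default=1000, help="Document chunk size")
    parser.add_argument("--chunk-overlap", type=int, default=200, help="Chunk overlap")
    # Assumption: no default level is documented, so none is set here.
    parser.add_argument(
        "--sensitivity", choices=SENSITIVITY_LEVELS, help="Data sensitivity level"
    )
    parser.add_argument(
        "--no-guardrails", action="store_true", help="Disable safety guardrails"
    )
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--pdf", "report.pdf", "--model", "llama3.1", "--sensitivity", "Confidential"]
)
print(args.model, args.chunk_size, args.chunk_overlap, args.sensitivity)
```

Restricting `--sensitivity` with `choices` makes a misspelled level fail at parse time instead of silently falling back to a weaker tier.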

### Example Session

```bash
# Start with a PDF using Llama 3.1
guard-rag --pdf report.pdf --model llama3.1 --sensitivity Confidential

# You: What are the key findings?
# Chatbot: Based on the document, the key findings are...
```

---

## Project Structure

```
GUADRAILS-RAG-CHAT-TOOL/
│
├── guardrag/                 # Main installable package
│   ├── api/                  # FastAPI local server
│   ├── cli/                  # Command-line interface
│   ├── rag/                  # RAG pipeline logic
│   └── utils/                # General utilities
│
├── docs/                     # Documentation (INSTALL, QUICK_REFERENCE)
├── tests/                    # Unit and integration tests
├── scripts/                  # Development and maintenance scripts
├── extras/                   # Experimental / legacy components
│
├── pyproject.toml             # Modern build configuration
├── setup.py                   # Legacy support configuration
├── README.md                  # Project overview
├── CONTRIBUTING.md            # Contribution guidelines
├── CODE_OF_CONDUCT.md         # Community standards
└── LICENSE                    # MIT License
```

> `.guardrag_storage/` is auto-generated on first document load (FAISS cache).

---

## Configuration

### Environment Variables

Copy `.env.example` to `.env` and adjust as needed:

```bash
cp .env.example .env
```

| Variable | Default | Description |
|---|---|---|
| `OLLAMA_HOST` | `http://localhost:11434` | Ollama API endpoint |
| `NO_PROXY` | `huggingface.co,...` | Bypass the proxy for local and HuggingFace calls |
| `PORT` | `8000` | Server port (auto-set by PaaS) |
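
As a rough illustration of how these variables might be consumed, the sketch below reads them from the environment and falls back to the defaults in the table. `load_settings` is a hypothetical helper, not part of GuardRAG. `NO_PROXY` is omitted because its full default is elided in the table. Since `python-dotenv` is a listed dependency, the `.env` file is presumably loaded into the environment before a step like this runs.

```python
import os

# Defaults taken from the table above.
DEFAULTS = {
    "OLLAMA_HOST": "http://localhost:11434",
    "PORT": "8000",
}


def load_settings(environ=None):
    """Return settings from the environment, falling back to documented defaults."""
    env = os.environ if environ is None else environ
    return {
        # Strip a trailing slash so later URL joins stay predictable.
        "ollama_host": env.get("OLLAMA_HOST", DEFAULTS["OLLAMA_HOST"]).rstrip("/"),
        "port": int(env.get("PORT", DEFAULTS["PORT"])),
    }


print(load_settings({}))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.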

### Chunking Parameters

Adjustable per session via the `--chunk-size` and `--chunk-overlap` CLI options:
- **Chunk Size** (default 1000 chars)
- **Chunk Overlap** (default 200 chars)
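
To make the two parameters concrete, here is a simplified sliding-window chunker. It is a sketch only: the dependency list includes `langchain-text-splitters`, so the real pipeline most likely splits on separators such as paragraphs and sentences, whereas the hypothetical `chunk_text` below cuts purely by character count.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split `text` into character windows that overlap by `chunk_overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # each chunk starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# A 2500-character text with the default settings.
print([len(chunk) for chunk in chunk_text("x" * 2500)])  # → [1000, 1000, 900]
```

With the defaults, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.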

Different chunk settings for the same file produce a separate FAISS index automatically.
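
One plausible way to get this behavior is to derive the cache key from both the file content and the chunk settings. The sketch below shows that idea. The hashing scheme and the `index_key` helper are assumptions for illustration and may differ from how GuardRAG actually names its indexes.

```python
import hashlib


def index_key(file_bytes, chunk_size=1000, chunk_overlap=200):
    """Derive a cache key from file content plus chunk settings (hypothetical scheme)."""
    digest = hashlib.sha256()
    digest.update(file_bytes)
    # Mixing the settings into the hash gives each configuration its own key.
    digest.update(f"|size={chunk_size}|overlap={chunk_overlap}".encode("utf-8"))
    return digest.hexdigest()[:16]


document = b"example document contents"
print(index_key(document) == index_key(document))                  # same file, same settings
print(index_key(document) == index_key(document, chunk_size=500))  # same file, new settings
```

Under this scheme, reloading an unchanged file with unchanged settings yields the same key, so the stored index can be reused. Changing either the file or the settings yields a new key and therefore a fresh index.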

---

## Deployment

### From PyPI (recommended)

```bash
pip install guard-rag
```

### From Source

```bash
git clone https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL.git
cd GUADRAILS-RAG-CHAT-TOOL
pip install .
```

### In a virtual environment (best practice)

```bash
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS / Linux:
source .venv/bin/activate

pip install guard-rag
```

---

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests.

Bug reports and feature requests are welcome via [GitHub Issues](https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL/issues).

---

## License

This project is licensed under the **MIT License** — see [LICENSE](LICENSE) for details.

---

<div align="center">

Built with ❤️ by **[Sowmiyan S](https://github.com/sowmiyan-s)**

*FastAPI · LangChain · Ollama · HuggingFace · FAISS · Vanilla JS*

</div>
