Metadata-Version: 2.4
Name: rag-colls
Version: 0.2.0.16
Summary: rag-colls - Implement recent advanced RAG techniques
Project-URL: Homepage, https://github.com/hienhayho/rag-colls
Project-URL: Issues, https://github.com/hienhayho/rag-colls/issues
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: bm25s>=0.2.10
Requires-Dist: chromadb<2.0.0,>=0.6.3
Requires-Dist: datasets<4.1,>=3.0
Requires-Dist: elasticsearch[async]>=8.17.2
Requires-Dist: gdown>=5.2.0
Requires-Dist: html2text>=2024.2.26
Requires-Dist: jax[cpu]>=0.5.3
Requires-Dist: json-repair>=0.41.0
Requires-Dist: litellm>=1.65.0
Requires-Dist: llama-index-embeddings-openai>=0.3.1
Requires-Dist: loguru>=0.7.3
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pandas<2.3.2,>=2.2.0
Requires-Dist: platformdirs>=4.3.7
Requires-Dist: polars>=1.27.0
Requires-Dist: pymupdf>=1.25.4
Requires-Dist: python-docx>=1.1.2
Requires-Dist: rich>=13.9.4
Requires-Dist: setuptools>=78.1.0
Requires-Dist: tenacity>=9.0.0
Provides-Extra: dev
Requires-Dist: pre-commit>=4.2.0; extra == "dev"
Requires-Dist: pytest>=8.3.5; extra == "dev"
Provides-Extra: docs
Requires-Dist: esbonio>=0.12.0; extra == "docs"
Requires-Dist: myst-parser>=4.0.1; extra == "docs"
Provides-Extra: huggingface-embedding
Requires-Dist: llama-index-embeddings-huggingface>=0.5.2; extra == "huggingface-embedding"
Requires-Dist: accelerate>=1.6.0; extra == "huggingface-embedding"
Provides-Extra: vllm-llm
Requires-Dist: vllm==0.9.0; extra == "vllm-llm"
Provides-Extra: finetune
Requires-Dist: deepspeed>=0.16.8; extra == "finetune"
Requires-Dist: ms-swift>=3.4.1; extra == "finetune"
Requires-Dist: wandb>=0.19.11; extra == "finetune"
Provides-Extra: dolphin
Requires-Dist: accelerate==1.6.0; extra == "dolphin"
Requires-Dist: numpy; extra == "dolphin"
Requires-Dist: omegaconf==2.3.0; extra == "dolphin"
Requires-Dist: opencv-python==4.11.0.86; extra == "dolphin"
Requires-Dist: opencv-python-headless; extra == "dolphin"
Requires-Dist: pymupdf==1.26; extra == "dolphin"
Requires-Dist: timm==0.5.4; extra == "dolphin"
Requires-Dist: torch>=2.6.0; extra == "dolphin"
Requires-Dist: torchvision>=0.21.0; extra == "dolphin"
Requires-Dist: transformers<4.54.0; extra == "dolphin"
Requires-Dist: vllm==0.9.0; extra == "dolphin"
Requires-Dist: vllm-dolphin==0.1; extra == "dolphin"
Provides-Extra: ocrflux-py
Requires-Dist: ocrflux; extra == "ocrflux-py"
Dynamic: license-file

# rag-colls

<p align="center">
  <img src="assets/rag_colls_v3.png" alt="Logo" width="350"/>
</p>

**rag-colls**, a.k.a. **RAG Coll**ection**s**.

Simple, easy-to-use, production-ready implementations of advanced RAG techniques.

<div align="center">

![Downloads](https://img.shields.io/pypi/dm/rag_colls) ![License](https://img.shields.io/badge/license-MIT-green)

![GitHub CI](https://github.com/hienhayho/rag-colls/actions/workflows/docker-build.yml/badge.svg) ![GitHub CI](https://github.com/hienhayho/rag-colls/actions/workflows/installation-testing.yml/badge.svg)

</div>

## 📑 Table of Contents

- [📖 Documentation](#-documentation)
- [🔧 Installation](#-installation)
- [📚 Notebooks](#-notebooks)
- [🚀 Upcoming](#-upcoming)
- [🎉 Quickstart](#-quickstart)
- [💻 Develop Guidance](#-develop-guidance)
- [💎 Acknowledgement](#-acknowledgement)
- [©️ License](#️-license)

## 📖 Documentation

Please visit the [documentation](https://rag-colls.readthedocs.io/en/latest/) for the latest updates.

## 🔧 Installation

- Install the latest release from **PyPI**:

```bash
pip install -U rag-colls
```

- **Docker** - 🐳:

```bash
# Clone the repository
git clone https://github.com/hienhayho/rag-colls.git
cd rag-colls/

# Choose the Python version and set OPENAI_API_KEY
export PYTHON_VERSION="3.11"
export OPENAI_API_KEY="your-openai-api-key-here"

# Docker build
DOCKER_BUILDKIT=1 docker build \
                -f docker/Dockerfile \
                --build-arg OPENAI_API_KEY="$OPENAI_API_KEY" \
                --build-arg PYTHON_VERSION="$PYTHON_VERSION" \
                -t rag-colls:$PYTHON_VERSION .

docker run -it --name rag_colls --shm-size=2G rag-colls:$PYTHON_VERSION
```
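
After a `pip install`, one way to confirm which version is installed and which optional extras the distribution declares is to read the package metadata with the standard library's `importlib.metadata`. The following is a minimal sketch, not part of `rag-colls` itself; the helper name `describe_distribution` is hypothetical.

```python
from importlib import metadata


def describe_distribution(name: str) -> dict:
    """Return install status, version, and declared extras for a distribution.

    Hypothetical helper for illustration; it only reads installed metadata.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return {"name": name, "installed": False, "version": None, "extras": []}

    # "Provides-Extra" may be absent, in which case get_all returns None.
    extras = dist.metadata.get_all("Provides-Extra") or []
    return {
        "name": name,
        "installed": True,
        "version": dist.version,
        "extras": sorted(extras),
    }


if __name__ == "__main__":
    info = describe_distribution("rag-colls")
    if info["installed"]:
        print(f"{info['name']} {info['version']}")
        print("extras:", ", ".join(info["extras"]) or "(none)")
    else:
        print(f"{info['name']} is not installed")
```

Any extra listed this way can be requested with pip's standard bracket syntax, for example `pip install -U "rag-colls[huggingface-embedding]"`.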

## 📚 Notebooks

We provide notebooks with example usage:

|   RAG Tech    |                      Code                      |                                       Guide                                        |                                                            Tech Description                                                            |
| :-----------: | :--------------------------------------------: | :--------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------: |
|   BasicRAG    |     [BasicRAG](./rag_colls/rags/basic_rag)     | [Colab](https://colab.research.google.com/drive/19hzGSQqx-LIsSbnNkV71ipRAIiFingvP) |                             Integrate with [`Chromadb`](rag_colls/databases/vector_databases/chromadb.py)                              |
| ContextualRAG | [ContextualRAG](rag_colls/rags/contextual_rag) | [Colab](https://colab.research.google.com/drive/1vT2Wl8FzYt25_4CMMg-2vcF4y17iTSjO) | Integrate with [`Chromadb`](rag_colls/databases/vector_databases/chromadb.py) and [`BM25s`](rag_colls/databases/bm25/bm25s.py) |
| RAFT | [RAFT](./rag_colls/rags/raft) | [Colab](https://colab.research.google.com/drive/1U-jHS0DVBiih0sn0c-eL4uVoFtFG1uzl) | Boost RAG with SFT |

## 🚀 Upcoming

We are currently working on the following techniques and will add them soon.

| RAG Tech |                                                                                Link                                                                                 |
| :------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Graph-RAG | [Blog](https://microsoft.github.io/graphrag/), [Paper](https://arxiv.org/pdf/2404.16130) |
|  RAG-RL  |                                                              [Paper](https://arxiv.org/pdf/2503.12759)                                                              |

## 🎉 Quickstart

Please refer to the [examples](./examples) directory for more information.

## 💻 Develop Guidance

Please refer to [DEVELOP.md](./DEVELOP.md) for more information.

## 💎 Acknowledgement

This project is supported by [`UIT AIClub`](https://aiclub.uit.edu.vn/).

## ©️ License

`rag-colls` is released under the [MIT License](./LICENSE).
