Metadata-Version: 2.4
Name: doc2vision
Version: 0.1.1
Summary: Convert PDFs and images into clean, LLM-compatible image formats
Home-page: https://github.com/vancuren/doc2vision
Author: Russell Van Curen
Author-email: Russell Van Curen <russell@vancuren.net>
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: contourpy==1.3.2
Requires-Dist: cycler==0.12.1
Requires-Dist: fonttools==4.58.1
Requires-Dist: kiwisolver==1.4.8
Requires-Dist: matplotlib==3.10.3
Requires-Dist: numpy==2.2.6
Requires-Dist: opencv-python-headless==4.11.0.86
Requires-Dist: packaging==25.0
Requires-Dist: pdf2image==1.17.0
Requires-Dist: pillow==11.2.1
Requires-Dist: pyparsing==3.2.3
Requires-Dist: python-dateutil==2.9.0.post0
Requires-Dist: six==1.17.0
Dynamic: author
Dynamic: home-page
Dynamic: requires-python

# doc2vision

**doc2vision** is a Python utility that converts documents and image files (**PDF**, **JPG**, **PNG**, and **TIF**) into clean, high-quality **RGB images optimized for multimodal LLM input** (e.g., image + text AI models). It handles low-quality scans, rotated pages, and multi-page PDFs.

Perfect for use in OCR preprocessing, AI pipelines, or anywhere clean document-to-image conversion is needed.

---

## 🚀 Features

- ✅ Converts **PDFs** (including multi-page) into individual high-quality images
- ✅ Supports **JPG, PNG, TIF, TIFF**
- ✅ Converts all output to standard **RGB format**
- ✅ Optionally auto-corrects skewed scans
- ✅ Gracefully handles edge cases like:
  - Low-resolution scans
  - Rotated or misaligned documents
  - Corrupt or unsupported files
  - Mixed DPI across pages
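
The formats listed above can be checked before a file is handed to the converter. The following is a minimal sketch, assuming files are identified by extension alone; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical helpers for illustration and are not part of doc2vision:

```python
from pathlib import Path

# Extensions taken from the feature list above. This set is an assumption
# for illustration, not doc2vision's actual internal check.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png", ".tif", ".tiff"}


def is_supported(path):
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

Filtering a batch this way lets a pipeline skip obviously unsupported files up front, while corrupt files still need to be handled at conversion time.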

---

## 📦 Installation

```bash
pip install doc2vision
```

PDF input relies on `pdf2image`, which requires [Poppler](https://github.com/jalan/papermill/wiki/Installing-Poppler) to be installed on your system:

* **macOS:** `brew install poppler`
* **Ubuntu/Debian:** `sudo apt-get install poppler-utils`
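
To confirm that Poppler is reachable before converting PDFs, one option is to look for its `pdftoppm` executable on the `PATH`. This is a rough sketch; `poppler_available` is a hypothetical helper, not something doc2vision provides:

```python
import shutil


def poppler_available():
    """Return True if Poppler's pdftoppm executable is found on PATH."""
    return shutil.which("pdftoppm") is not None


if not poppler_available():
    print("Poppler not found on PATH; PDF conversion will likely fail.")
```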

---

## 🧠 Usage

```python
from doc2vision import convert_to_llm_ready_images

# Basic usage
images = convert_to_llm_ready_images("example.pdf")

# With skew correction
images = convert_to_llm_ready_images("example.pdf", correct_skew=True)

# With resizing (preserving aspect ratio)
images = convert_to_llm_ready_images("example.pdf", resize_to=1500)

# Iterate over output Pillow images
for img in images:
    img.show()  # Or save, analyze, etc.
```
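
Since the result is a list of Pillow images, saving every page to disk takes only a few lines. A minimal sketch, assuming PNG output and a numbered naming scheme; `save_pages` is a hypothetical helper, not part of doc2vision:

```python
from pathlib import Path


def save_pages(images, out_dir, stem="page"):
    """Save each image as a numbered PNG and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, img in enumerate(images, start=1):
        path = out / f"{stem}_{index:03d}.png"
        img.save(path)  # Pillow infers the PNG format from the extension
        paths.append(path)
    return paths
```

For example, `save_pages(images, "output", stem="example")` would write `output/example_001.png`, `output/example_002.png`, and so on.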

---

## 🛠️ Parameters

| Parameter      | Type   | Default | Description                                                       |
| -------------- | ------ | ------- | ----------------------------------------------------------------- |
| `file_path`    | `str`  | —       | Path to your input file (PDF, JPG, PNG, TIF)                      |
| `correct_skew` | `bool` | `False` | If `True`, attempts to auto-detect and fix rotation               |
| `resize_to`    | `int`  | `None`  | If set, resizes image height to this value (keeping aspect ratio) |
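
The arithmetic behind `resize_to` can be sketched as follows. This assumes the height is set to `resize_to` and the width is scaled by the same factor and rounded to the nearest pixel; the rounding rule and the `target_size` helper are assumptions for illustration, not doc2vision's confirmed behavior:

```python
def target_size(width, height, resize_to):
    """Scale (width, height) so that the height equals resize_to."""
    scale = resize_to / height
    # Keep at least one pixel of width for extremely tall, narrow inputs.
    return max(1, round(width * scale)), resize_to


print(target_size(2000, 3000, 1500))  # (1000, 1500)
```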

---

## 📁 Output

Returns a list of Pillow `Image.Image` objects, one per page/image:

```python
List[PIL.Image.Image]
```

All output images are:

* RGB
* Preprocessed (optional skew correction + optional resize)
* Clean and ready for AI or OCR pipelines

---

## 🤖 Perfect For

* Feeding documents into **multimodal LLMs**
* Preprocessing for **OCR or Document AI**
* Converting messy scans into standardized visuals
* AI agents needing consistent image inputs

---

## 📜 License

MIT License — feel free to use, extend, and contribute.

---

## 👨‍💻 Author

Built with 💙 by Russell Van Curen

* GitHub: [@vancuren](https://github.com/vancuren)
* Website: [vancuren.net](https://vancuren.net)


## Changelog

0.1.0 - Initial release

0.1.1 - Updated project description and readme.
