Metadata-Version: 2.1
Name: docprompt
Version: 0.1.2
Summary: Documents and large language models.
Home-page: https://github.com/Page-Leaf/docprompt
License: Apache-2.0
Author: Frankie Colson
Author-email: frank@pageleaf.io
Requires-Python: >=3.9,<3.13
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: dev
Provides-Extra: doc
Provides-Extra: modeling
Provides-Extra: test
Requires-Dist: black (>=23.10.0,<24.0.0) ; extra == "test"
Requires-Dist: bump2version (>=1.0.1,<2.0.0) ; extra == "dev"
Requires-Dist: flake8 (>=6.1.0,<7.0.0) ; extra == "test"
Requires-Dist: flake8-docstrings (>=1.7.0,<2.0.0) ; extra == "test"
Requires-Dist: fsspec (>=2023.10.0,<2024.0.0)
Requires-Dist: google-cloud-documentai (>=2.20.1,<3.0.0)
Requires-Dist: isort (>=5.12.0,<6.0.0) ; extra == "test"
Requires-Dist: mkdocs (>=1.1.2,<2.0.0) ; extra == "doc"
Requires-Dist: mkdocs-autorefs (>=0.2.1,<0.3.0) ; extra == "doc"
Requires-Dist: mkdocs-include-markdown-plugin (>=1.0.0,<2.0.0) ; extra == "doc"
Requires-Dist: mkdocs-material (>=6.1.7,<7.0.0) ; extra == "doc"
Requires-Dist: mkdocs-material-extensions (>=1.0.1,<2.0.0)
Requires-Dist: mkdocstrings (>=0.15.2,<0.16.0) ; extra == "doc"
Requires-Dist: mypy (>=1.6.1,<2.0.0) ; extra == "test"
Requires-Dist: numpy (>=1.26.1,<2.0.0) ; extra == "modeling"
Requires-Dist: pdfplumber (>=0.10.2,<0.11.0)
Requires-Dist: pikepdf (>=8.11.2,<9.0.0)
Requires-Dist: pillow (>=9.0.1)
Requires-Dist: pip (>=20.3.1,<21.0.0) ; extra == "dev"
Requires-Dist: pre-commit (>=2.12.0,<3.0.0) ; extra == "dev"
Requires-Dist: pydantic (>=2.1.0)
Requires-Dist: pypdf (>=3.16.4,<4.0.0)
Requires-Dist: pytest (>=7.4.2,<8.0.0) ; extra == "test"
Requires-Dist: pytest-cov (>=4.1.0,<5.0.0) ; extra == "test"
Requires-Dist: python-dateutil (>=2.8.2,<3.0.0)
Requires-Dist: python-magic (>=0.4.24)
Requires-Dist: tenacity (>=8.2.3,<9.0.0)
Requires-Dist: toml (>=0.10.2,<0.11.0) ; extra == "dev"
Requires-Dist: torch (>=2.1.0,<3.0.0) ; extra == "modeling"
Requires-Dist: tox (>=3.20.1,<4.0.0) ; extra == "dev"
Requires-Dist: tqdm (>=4.61.0)
Requires-Dist: transformers (>=4.34.1,<5.0.0) ; extra == "modeling"
Requires-Dist: twine (>=3.3.0,<4.0.0) ; extra == "dev"
Requires-Dist: virtualenv (>=20.2.2,<21.0.0) ; extra == "dev"
Description-Content-Type: text/markdown

# Docprompt

Docprompt is a lightweight library for working with text-rich multimodal inputs to support large language model workloads.

This library has several goals:

* Provide abstractions for working with and processing PDFs and images
* Provide abstractions for document operations with third-party providers



[![pypi](https://img.shields.io/pypi/v/docprompt.svg)](https://pypi.org/project/docprompt/)
[![python](https://img.shields.io/pypi/pyversions/docprompt.svg)](https://pypi.org/project/docprompt/)
[![Build Status](https://github.com/psu3d0/docprompt/actions/workflows/dev.yml/badge.svg)](https://github.com/psu3d0/docprompt/actions/workflows/dev.yml)
[![codecov](https://codecov.io/gh/psu3d0/docprompt/branch/main/graphs/badge.svg)](https://codecov.io/github/psu3d0/docprompt)



Documents and large language models


* Documentation: <https://psu3d0.github.io/docprompt>
* GitHub: <https://github.com/Page-Leaf/docprompt>
* PyPI: <https://pypi.org/project/docprompt/>
* Free software: Apache-2.0


## Features

* Representations for common document layout types, such as `TextBlock` and `BoundingBox`
* Generic implementations of OCR providers
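
To give a feel for what a layout representation involves, here is a minimal standalone sketch. It does not import Docprompt, and it is not Docprompt's actual API: the field names (`x0`, `top`, `x1`, `bottom`, `bounding_box`), the `union` helper, and the use of normalized page coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass


# Hypothetical stand-ins, NOT imported from docprompt. Field names are assumed.
@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in normalized page coordinates (0.0 to 1.0)."""

    x0: float
    top: float
    x1: float
    bottom: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0

    @property
    def height(self) -> float:
        return self.bottom - self.top

    def union(self, other: "BoundingBox") -> "BoundingBox":
        """Return the smallest box that contains both boxes."""
        return BoundingBox(
            min(self.x0, other.x0),
            min(self.top, other.top),
            max(self.x1, other.x1),
            max(self.bottom, other.bottom),
        )


@dataclass(frozen=True)
class TextBlock:
    """A piece of recognized text together with its location on the page."""

    text: str
    bounding_box: BoundingBox


# Two words on the same line, merged into a single line-level box.
word_a = TextBlock("Hello", BoundingBox(0.10, 0.20, 0.25, 0.25))
word_b = TextBlock("world", BoundingBox(0.27, 0.20, 0.40, 0.25))
line_box = word_a.bounding_box.union(word_b.bounding_box)
print(line_box)
```

Normalized coordinates are used in this sketch so that a box stays meaningful regardless of the resolution a page is rendered at; check the Docprompt documentation for the conventions the library actually uses.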

## Installation

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install Docprompt.

```bash
pip install docprompt
```
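
To confirm that the installation succeeded, one option is to query the installed distribution's version with the standard library's `importlib.metadata`. The helper name `installed_version` below is only illustrative and is not part of Docprompt.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if Docprompt is installed in this environment, else None.
print(installed_version("docprompt"))
```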

