Metadata-Version: 2.4
Name: elizaos-plugin-pdf
Version: 2.0.0a4
Summary: elizaOS PDF Plugin - PDF reading and text extraction
Project-URL: Homepage, https://github.com/elizaos/eliza
Project-URL: Documentation, https://elizaos.ai/docs
Project-URL: Repository, https://github.com/elizaos/eliza
Author: elizaOS Contributors
License-Expression: MIT
License-File: LICENSE
Keywords: document-processing,elizaos,pdf,text-extraction
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: aiofiles>=25.1.0
Requires-Dist: pydantic>=2.10.0
Requires-Dist: pypdf>=5.0.0
Provides-Extra: dev
Requires-Dist: mypy>=1.14.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest-cov>=6.0.0; extra == 'dev'
Requires-Dist: pytest-xprocess>=1.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.9.0; extra == 'dev'
Description-Content-Type: text/markdown

# elizaOS PDF Plugin (Python)

PDF reading and text extraction for elizaOS agents.

## Installation

```bash
pip install elizaos-plugin-pdf
```

## Usage

```python
import asyncio

from elizaos_plugin_pdf import PdfClient


async def main() -> None:
    # Create client
    client = PdfClient()

    # Extract text from a PDF file on disk
    text = await client.extract_text_from_file("document.pdf")
    print(text)

    # Extract text from PDF bytes already in memory
    with open("document.pdf", "rb") as f:
        pdf_bytes = f.read()
    text = await client.extract_text(pdf_bytes)
    print(text)

    # Get full document info: page count, metadata, and per-page text
    info = await client.get_document_info(pdf_bytes)
    print(f"Pages: {info.page_count}")
    print(f"Title: {info.metadata.title}")
    for page in info.pages:
        print(f"Page {page.page_number}: {page.text[:100]}...")


# The client methods are coroutines, so they must run inside an event loop
asyncio.run(main())
```
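
Because the client methods are awaitable, several files can be processed concurrently. The following is a minimal sketch, not part of the plugin: `extract_many` and the `TextExtractor` protocol are hypothetical names introduced here, and the sketch assumes that `extract_text_from_file` returns a string and that a single client can safely serve overlapping calls.

```python
import asyncio
from typing import Protocol


class TextExtractor(Protocol):
    """Hypothetical protocol: any object with the async method used above."""

    async def extract_text_from_file(self, path: str) -> str: ...


async def extract_many(
    client: TextExtractor, paths: list[str], max_concurrent: int = 4
) -> dict[str, str]:
    """Extract text from several PDFs, at most max_concurrent at a time."""
    # The semaphore caps how many extractions are in flight at once.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def extract_one(path: str) -> tuple[str, str]:
        async with semaphore:
            return path, await client.extract_text_from_file(path)

    # gather preserves input order; the dict maps each path to its text.
    results = await asyncio.gather(*(extract_one(p) for p in paths))
    return dict(results)
```

Inside a coroutine this could be called as `texts = await extract_many(PdfClient(), ["a.pdf", "b.pdf"])`. The concurrency cap is a design choice that keeps memory use bounded when the input list is long.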

## Features

- Extract text from PDF files
- Get document metadata (title, author, etc.)
- Page-by-page text extraction
- Configurable text cleaning
- Async/await support
- Type-safe with Pydantic models
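
The plugin's own cleaning options are not documented in this README. As a rough illustration of what cleaning extracted PDF text usually involves, here is a standalone sketch using only the standard library. The function name `clean_extracted_text` and its `join_hyphenated` flag are hypothetical and do not reflect the plugin's API.

```python
import re
import unicodedata


def clean_extracted_text(text: str, *, join_hyphenated: bool = True) -> str:
    """Hypothetical helper: tidy raw text extracted from a PDF."""
    # Normalize ligatures and compatibility forms (e.g. "ﬁ" becomes "fi").
    text = unicodedata.normalize("NFKC", text)
    # Use "\n" as the only line terminator.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop control characters other than tab and newline.
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text)
    if join_hyphenated:
        # Rejoin words split across a line break ("exam-\nple" -> "example").
        text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Remove a space directly before or after a newline.
    text = re.sub(r" ?\n ?", "\n", text)
    # Keep at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Rejoining hyphenated line breaks also merges genuine compounds that happen to wrap at the hyphen, which is why the sketch makes that step optional.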

## License

MIT
