Metadata-Version: 2.3
Name: pdf-llm-tools
Version: 0.0.2
Summary: A family of LLM-enhanced PDF utilities
Project-URL: Homepage, https://github.com/jcfk/pdf-llm-tools
Project-URL: Repository, https://github.com/jcfk/pdf-llm-tools
Author-email: Jacob Fong <jacobcfong@gmail.com>
License-File: LICENSE
Keywords: llm,pdf
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.8
Requires-Dist: openai
Requires-Dist: pdftotext
Description-Content-Type: text/markdown

# pdf-llm-tools

`pdf-llm-tools` is a family of LLM-enhanced PDF utilities:

- `pdfllm-titler` renames a PDF using metadata parsed from its filename and
  contents. The new name has the form `YEAR-AUTHOR-TITLE.pdf`.
- (todo) `pdfllm-toccer` adds a bookmark structure parsed from the PDF's
  detected table of contents.
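
As an illustration of the `YEAR-AUTHOR-TITLE.pdf` naming scheme, here is a minimal sketch of how such a filename could be assembled. The helper names `slug` and `titled_name`, and the sanitization rules (lowercasing, hyphen-separated words), are assumptions made for this example and may differ from what `pdfllm-titler` actually does.

```python
import re


def slug(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def titled_name(year: int, author: str, title: str) -> str:
    """Build a YEAR-AUTHOR-TITLE.pdf filename (sanitization rules are assumed)."""
    return f"{year}-{slug(author)}-{slug(title)}.pdf"


print(titled_name(1948, "Shannon", "A Mathematical Theory of Communication"))
# 1948-shannon-a-mathematical-theory-of-communication.pdf
```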

Currently, OpenAI's `gpt-3.5-turbo-1106` is hardcoded as the LLM backend. The
tools require an OpenAI API key, supplied via a command-line option, an
environment variable, or manual input.
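
The three ways of supplying the key can be pictured as a fallback chain. The sketch below is illustrative only: the function name `resolve_api_key`, the precedence order (option, then environment variable, then prompt), and the variable name `OPENAI_API_KEY` are assumptions, not confirmed details of this package.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def resolve_api_key(
    option_value: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the API key from the first source that provides one."""
    # 1. An explicit command-line option wins (assumed precedence).
    if option_value:
        return option_value
    # 2. Fall back to the environment (variable name is an assumption).
    env_value = env.get("OPENAI_API_KEY")
    if env_value:
        return env_value
    # 3. Finally, ask the user interactively without echoing the input.
    return prompt("OpenAI API key: ")
```

Passing `env` and `prompt` as parameters keeps the lookup easy to exercise without touching the real environment or terminal.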

## Installation

```
pip install pdf-llm-tools
```

## Usage

These utilities require every PDF to have a correct OCR (text) layer. If a PDF
lacks one, run something like
[OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF) on it first.
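
One rough way to tell whether a PDF still needs OCR is to check how much text can be extracted from it. The following is a minimal sketch of that heuristic; the function name `has_text_layer` and the `min_chars` threshold are hypothetical, and the check only shows that some text exists, not that the OCR is accurate.

```python
from typing import Iterable


def has_text_layer(pages: Iterable[str], min_chars: int = 20) -> bool:
    """Return True if any page yields at least min_chars non-whitespace characters."""
    for text in pages:
        # Strip all whitespace so blank or form-feed-only pages count as empty.
        if len("".join(text.split())) >= min_chars:
            return True
    return False


print(has_text_layer(["", "  \n"]))  # False
print(has_text_layer(["", "Attention Is All You Need\nAshish Vaswani"]))  # True
```

With the `pdftotext` package that this project depends on, `pdftotext.PDF(f)` for a file opened in binary mode yields one string per page, so it can be passed as `pages`.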

### pdfllm-titler

```
pdfllm-titler a.pdf b.pdf c.pdf
pdfllm-titler --last-page 8 d.pdf
```

Run `pdfllm-titler --help` for full details.

