Metadata-Version: 2.4
Name: mcp-pdf-vision
Version: 0.1.2
Summary: MCP server for PDF visual inspection and arXiv source download
Project-URL: Homepage, https://github.com/Master-cai/pdf-vision
Project-URL: Repository, https://github.com/Master-cai/pdf-vision
Project-URL: Issues, https://github.com/Master-cai/pdf-vision/issues
Author: Master-cai
License-Expression: MIT
License-File: LICENSE
Keywords: arxiv,inspection,mcp,pdf,vision
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.10
Requires-Dist: mcp[cli]
Requires-Dist: pdf2image
Description-Content-Type: text/markdown

# mcp-pdf-vision

An MCP (Model Context Protocol) server for PDF visual inspection and arXiv source download. It allows LLMs to "see" PDF pages as images and retrieve arXiv paper sources.

## Features

- **`get_pdf_metadata`** — Get basic PDF information (page count, page size)
- **`inspect_pdf_visually`** — Render PDF pages as images for visual inspection of layout, formatting, and figures
- **`download_arxiv_source`** — Download and extract arXiv paper TeX source by URL or ID
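
As a rough illustration of what accepting a "URL or ID" involves, the sketch below normalizes either form to a bare arXiv identifier and builds the matching source URL. The helper names `normalize_arxiv_id` and `source_url` are hypothetical and not part of this package. The `https://arxiv.org/e-print/<id>` path is arXiv's public source-download endpoint, and the server's own parsing may differ.

```python
import re

# New-style IDs look like 2301.01234 (optionally with v2); old-style like hep-th/9901001.
_ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})(v\d+)?")


def normalize_arxiv_id(url_or_id: str) -> str:
    """Return the arXiv ID (keeping any version suffix) found in a URL or bare ID."""
    match = _ARXIV_ID.search(url_or_id.strip())
    if match is None:
        raise ValueError(f"not an arXiv URL or ID: {url_or_id!r}")
    return match.group(0)


def source_url(arxiv_id: str) -> str:
    """Build the e-print URL that serves the paper's source archive."""
    return f"https://arxiv.org/e-print/{arxiv_id}"
```

Both `https://arxiv.org/abs/2301.01234v2` and a plain `2301.01234` pass through `normalize_arxiv_id` before the download URL is built.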

## Prerequisites

- Python >= 3.10
- [Poppler](https://poppler.freedesktop.org/) (required by `pdf2image`)

```bash
# macOS
brew install poppler

# Ubuntu / Debian
sudo apt-get install poppler-utils

# Windows (via conda)
conda install -c conda-forge poppler
```
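
To confirm that Poppler is reachable before starting the server, a small check along these lines may help. It relies on the fact that `pdf2image` calls Poppler's command-line tools such as `pdftoppm` and `pdfinfo`. The function name `missing_poppler_tools` is a hypothetical helper, not something this package provides.

```python
import shutil

# pdf2image shells out to these Poppler command-line tools.
POPPLER_TOOLS = ("pdftoppm", "pdfinfo")


def missing_poppler_tools(tools=POPPLER_TOOLS):
    """Return the executables from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Missing Poppler tools:", ", ".join(missing))
    else:
        print("Poppler tools found on PATH.")
```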

## Installation

```bash
# Using uv (recommended)
uv pip install mcp-pdf-vision

# Using pip
pip install mcp-pdf-vision
```

## Release

### Local publish

```bash
# Bump version in pyproject.toml first
rm -rf dist build *.egg-info
python -m pip install -U build twine
python -m build
twine check dist/*
twine upload dist/*
```

### GitHub Actions publish

This repository includes `.github/workflows/publish.yml` for PyPI Trusted Publishing.

1. In PyPI, add this GitHub repository as a trusted publisher for `mcp-pdf-vision`.
2. Bump `version` in `pyproject.toml`.
3. Push a version tag such as `v0.1.2`.
4. GitHub Actions will build and publish the package automatically.
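
Before pushing a tag, it can be worth checking that it agrees with the bumped version. The sketch below assumes tags are the version prefixed with `v` and that `version` is a static field under `[project]` in `pyproject.toml`. Both helper names are hypothetical, and `tomllib` needs Python 3.11 or newer.

```python
def tag_matches_version(tag: str, version: str) -> bool:
    """True when a tag such as 'v0.1.2' names the given package version."""
    return tag.startswith("v") and tag[1:] == version


def read_project_version(pyproject_path: str = "pyproject.toml") -> str:
    """Read [project].version from pyproject.toml (assumes a static version)."""
    import tomllib  # standard library from Python 3.11 onward

    with open(pyproject_path, "rb") as handle:
        return tomllib.load(handle)["project"]["version"]
```

For example, `tag_matches_version("v0.1.2", read_project_version())` returns `False` if the version bump was forgotten.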

## Usage

### Claude Desktop

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "pdf-vision": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/pdf-vision", "mcp-pdf-vision"]
    }
  }
}
```

### MCP Inspector (for testing)

```bash
npx @modelcontextprotocol/inspector uv run mcp-pdf-vision
```

### Direct execution

```bash
uv run mcp-pdf-vision
```

> **Note:** The server uses stdio transport. Running it directly in a terminal will show a JSON parsing error — this is expected. Use an MCP client or the Inspector to interact with it.
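
For a scripted alternative to the Inspector, a minimal client sketch using the MCP Python SDK (the `mcp` package this server depends on) might look like the following. It only lists the tools the server exposes and makes no assumption about tool argument names. The function name `list_server_tools` is hypothetical, and the launch command assumes `uv run mcp-pdf-vision` works from the current directory.

```python
import asyncio

# Same launch command as in the direct-execution example.
SERVER_COMMAND = "uv"
SERVER_ARGS = ["run", "mcp-pdf-vision"]


async def list_server_tools() -> list[str]:
    """Start the server over stdio and return the names of the tools it exposes."""
    # Imported lazily so this file can be loaded without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=SERVER_COMMAND, args=SERVER_ARGS)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


if __name__ == "__main__":
    print(asyncio.run(list_server_tools()))
```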

## Environment Variables

| Variable                | Description                                                                     |
| ----------------------- | ------------------------------------------------------------------------------- |
| `MCP_OUTPUT_DIR`        | Override output directory when the client does not expose usable MCP roots      |
| `MCP_DEBUG_IMAGE_DIR`   | Override debug image directory when the client does not expose usable MCP roots |
| `MCP_SAVE_DEBUG_IMAGES` | Set to `1` to export debug images of rendered PDF pages                         |

`download_arxiv_source` and debug image export prefer the client-provided MCP roots (`roots/list`), so files are written into the caller's workspace instead of the server process directory. If the client does not expose roots, the server falls back to `MCP_OUTPUT_DIR` / `MCP_DEBUG_IMAGE_DIR`, then to `PWD` / the process `cwd`. The last fallback is useful for stdio clients that launch the server in the active workspace.
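
The fallback order described above can be sketched as a small function. This illustrates the documented order only and is not the server's actual code. The name `resolve_output_dir` is hypothetical, and both using the first root and receiving roots as local filesystem paths are assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Sequence


def resolve_output_dir(
    root_paths: Sequence[str],
    env: Mapping[str, str],
    env_var: str = "MCP_OUTPUT_DIR",
) -> Path:
    """Pick an output directory: MCP roots, then the override, then PWD, then cwd."""
    # 1. Prefer a root exposed by the client (assumption: the first one wins).
    if root_paths:
        return Path(root_paths[0])
    # 2. Otherwise use the explicit override, if it is set and non-empty.
    override = env.get(env_var)
    if override:
        return Path(override)
    # 3. Then the shell's PWD, and finally the process working directory.
    pwd = env.get("PWD")
    if pwd:
        return Path(pwd)
    return Path(os.getcwd())
```

Calling `resolve_output_dir([], os.environ, env_var="MCP_DEBUG_IMAGE_DIR")` applies the same order to the debug image directory.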

## License

[MIT](LICENSE)
