Metadata-Version: 2.4
Name: docling-serve-client
Version: 1.14.0
Summary: A Python SDK for interacting with Docling Serve API using Pydantic models
License-Expression: MIT
Project-URL: Repository, https://github.com/youssefhoussam/docling-serve-client
Project-URL: Documentation, https://github.com/youssefhoussam/docling-serve-client#readme
Keywords: docling,document,processing,ocr,pdf,conversion,chunking
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8.1
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.24.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: build>=1.2.2.post1; extra == "dev"
Requires-Dist: twine>=6.1.0; extra == "dev"
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.4.0; extra == "docs"
Requires-Dist: mkdocs-material>=9.0.0; extra == "docs"
Dynamic: license-file

# docling-serve-client

Python SDK for Docling Serve API v1.14.0

[![Python](https://img.shields.io/badge/python-%3E%3D3.8.1-blue)](https://www.python.org/)
[![PyPI](https://img.shields.io/pypi/v/docling-serve-client)](https://pypi.org/project/docling-serve-client/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)

## What is Docling Serve?

[Docling Serve](https://github.com/docling-project/docling-serve) is an open-source REST API by IBM Research that converts PDFs, DOCX, PPTX, HTML, images, and other document formats into structured output: Markdown, JSON, HTML, or plain text. This SDK is a fully typed Python client for that API, with synchronous and asynchronous methods, automatic retries, chunking, and management of long-running conversion tasks.

## Installation

```bash
pip install docling-serve-client

# With development dependencies
pip install "docling-serve-client[dev]"
```

## Quick Start

```python
from docling_serve_sdk import (
    DoclingClient,
    ConvertDocumentsRequest,
    ConvertDocumentsRequestOptions,
    HttpSourceRequest,
    OutputFormat,
)

with DoclingClient("http://localhost:5001") as client:
    # Health check
    health = client.health_check()
    print(health.status)  # "ok"

    # Convert a URL to Markdown
    request = ConvertDocumentsRequest(
        sources=[HttpSourceRequest(url="https://arxiv.org/pdf/2408.09869")],
        options=ConvertDocumentsRequestOptions(to_formats=[OutputFormat.MD]),
    )
    result = client.convert_source(request)
    print(result.document.md_content)
```

## Features

- **Sync and async HTTP**: built on [httpx](https://www.python-httpx.org/); every method has an `_async` variant
- **Fully typed Pydantic v2 models**: all request and response types are validated at construction time
- **Retry with exponential backoff and jitter**: transient 5xx and connection errors are retried automatically
- **Async task lifecycle**: submit, poll, wait, and retrieve results for long-running conversions
- **Hybrid chunking**: split documents using `HybridChunkerOptions` with configurable token limits
- **Hierarchical chunking**: split documents using `HierarchicalChunkerOptions` with heading-aware segmentation
- **Event hooks**: `on_request`, `on_response`, and `on_error` callbacks for telemetry and logging
- **Request ID tracking**: every HTTP request carries a unique `X-Request-ID` header for distributed tracing
- **Context manager support**: `with` and `async with` for deterministic connection pool lifecycle
- **S3 source and target**: `S3SourceRequest` and `S3Target` with `SecretStr` credential handling (redacted in logs)
- **ZIP and in-body output targets**: `ZipTarget` for archives, `InBodyTarget` for inline JSON responses
- **Forward-compatible responses**: response models use `extra="allow"` to tolerate new server fields
- **Strict requests**: request models use `extra="forbid"` to catch typos at construction time
- **API key authentication**: sent via the `X-Api-Key` header, redacted in `repr()`
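
The retry feature listed above names exponential backoff with jitter. The SDK's actual retry policy is not documented in this README, so the following is only a minimal standard-library sketch of the general technique, assuming the "full jitter" variant. `TransientError`, `backoff_delay`, `call_with_retry`, and all default values are hypothetical and are not part of the SDK.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a 5xx response or a connection failure."""


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    source = rng if rng is not None else random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with a growing, randomized delay."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
    raise RuntimeError("max_attempts must be at least 1")
```

The exponential ceiling spaces retries further apart as failures accumulate, and the random draw keeps many clients from retrying in lockstep. Passing `sleep` as a parameter is a design choice for this sketch so the delay can be replaced in tests.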

## Compatibility

| SDK version | Docling Serve API | Python    | Dependencies                       |
| ----------- | ----------------- | --------- | ---------------------------------- |
| `1.14.0`    | `v1.14.0`         | `>=3.8.1` | `httpx>=0.24.0`, `pydantic>=2.0.0` |

The SDK version tracks the Docling Serve API version. When Docling Serve releases `v1.15.0`, the SDK will be updated to `1.15.0`.
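
Because the two version numbers are meant to move together, a deployment script could check that an installed SDK matches the server it talks to. The sketch below is one hedged way to compare the two strings. `parse_version` and `versions_aligned` are hypothetical helpers, not SDK functions, and the sketch assumes plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'v1.14.0' or '1.14.0' into a tuple of integers."""
    major, minor, patch = text.strip().lstrip("vV").split(".")[:3]
    return int(major), int(minor), int(patch)


def versions_aligned(sdk_version: str, api_version: str) -> bool:
    """Return True when the SDK and API versions are numerically identical."""
    return parse_version(sdk_version) == parse_version(api_version)


if __name__ == "__main__":
    # Values taken from the compatibility table above.
    print(versions_aligned("1.14.0", "v1.14.0"))
```

How the server's version string is obtained is left out on purpose, since this README does not describe an endpoint for it.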

## Running Tests

```bash
# Unit tests (no server needed)
pytest tests/test_sdk.py -v

# Integration tests (requires Docling Serve on localhost:5001)
pytest tests/test_integration.py -v
```

## License

Released under the MIT License. See [LICENSE](LICENSE).
