Metadata-Version: 2.4
Name: glotext
Version: 0.1.0
Summary: A CLI tool to translate PDF documents into other languages.
Home-page: https://github.com/PublicKernel/adapter_cli
Author: Muhammad Asif
Author-email: Muhammad Asif <muhammadasifkha01@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/PublicKernel/adapter_cli
Project-URL: Bug Tracker, https://github.com/PublicKernel/adapter_cli/issues
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Interface Engine/Protocol Translator
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pymupdf
Requires-Dist: pdfplumber
Requires-Dist: deep-translator
Requires-Dist: playwright
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# Adapter

A robust Python tool to translate PDF documents while preserving their original structure, including tables and formatting. It extracts text and tables intelligently, translates them using Google Translate, and rebuilds the document as a clean PDF.

## Features

- **Structure Preservation**: Maintains the reading order of text and tables (unlike simple extractors).
- **Table Support**: Detects and translates table contents cell-by-cell.
- **Multi-language**: Supports translation between any languages supported by Google Translate.
- **RTL Support**: Automatically handles right-to-left languages such as Urdu, Arabic, and Persian, using appropriate fonts.
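
As a rough illustration of the cell-by-cell table translation described above, the sketch below walks a table given as a list of rows and translates each non-empty cell while keeping the row and column layout. The function name `translate_table` and the injected `translate` callable are hypothetical and not part of this package; the real implementation may work differently.

```python
from typing import Callable, Optional

# A table as a list of rows; a cell may be None when it is empty.
Table = list[list[Optional[str]]]


def translate_table(table: Table, translate: Callable[[str], str]) -> list[list[str]]:
    """Translate a table cell by cell, preserving its row/column shape.

    `translate` is any function mapping source text to translated text
    (hypothetical hook, injected so the sketch runs without network access).
    """
    cache: dict[str, str] = {}  # reuse translations of repeated cell text
    result: list[list[str]] = []
    for row in table:
        new_row: list[str] = []
        for cell in row:
            text = (cell or "").strip()
            if not text:
                new_row.append("")  # empty cells stay empty
                continue
            if text not in cache:
                cache[text] = translate(text)
            new_row.append(cache[text])
        result.append(new_row)
    return result
```

With the listed dependencies, `table` could come from pdfplumber's `page.extract_tables()` and `translate` could be `GoogleTranslator(source="en", target="ur").translate` from `deep_translator`. Caching repeated cell text is a design choice of this sketch that reduces the number of translation requests.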

## Installation

Install the package with pip (if it has been published, or after building it locally):

```bash
pip install glotext
```

Or install from source:

```bash
git clone https://github.com/muhammad-asif10/adapter_cli.git
cd adapter_cli
pip install .
```

### Important: Install Playwright Browsers

This tool uses Playwright for high-quality PDF rendering. After installing the package, you must also install the Chromium browser that Playwright drives:

```bash
playwright install chromium
```

## Usage

### CLI

Run the `adapter` command-line interface:

```bash
adapter input.pdf output.pdf --source-lang english --target-lang urdu
```

**Arguments and options:**
- `input_pdf`: Path to the source PDF file.
- `output_pdf`: Path where the translated PDF will be saved.
- `-sl`, `--source-lang`: Source language code or name (default: "english").
- `-tl`, `--target-lang`: Target language code or name (default: "urdu").
- `--max-pages`: (Optional) Limit the number of pages to process.
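
To make the argument list above concrete, here is a minimal sketch of how an equivalent parser could be declared with the standard library's `argparse`. It only mirrors the documented names and defaults; the helper name `build_parser` is hypothetical, and this is not necessarily how the package itself parses its arguments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="adapter", description="Translate a PDF document."
    )
    # Positional arguments
    parser.add_argument("input_pdf", help="Path to the source PDF file.")
    parser.add_argument("output_pdf", help="Path where the translated PDF will be saved.")
    # Optional arguments with the documented defaults
    parser.add_argument("-sl", "--source-lang", default="english",
                        help="Source language code or name.")
    parser.add_argument("-tl", "--target-lang", default="urdu",
                        help="Target language code or name.")
    parser.add_argument("--max-pages", type=int, default=None,
                        help="Limit the number of pages to process.")
    return parser


args = build_parser().parse_args(
    ["input.pdf", "output.pdf", "-tl", "spanish", "--max-pages", "3"]
)
print(args.source_lang, args.target_lang, args.max_pages)  # english spanish 3
```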

### Python API

You can also use it as a library in your Python scripts:

```python
from adapter import process_pdf

process_pdf(
    input_pdf="document.pdf",
    output_pdf="document_translated.pdf",
    source_lang="english",
    target_lang="spanish",
    max_pages=5
)
```
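
Building on the call above, the following sketch shows one way to translate every PDF in a folder and collect failures instead of stopping at the first error. The helper `translate_folder` and the output naming scheme are hypothetical; the translation function is passed in as `process`, which is assumed to accept the same keyword arguments as `process_pdf` shown above.

```python
from pathlib import Path
from typing import Callable


def translate_folder(
    src_dir,
    out_dir,
    process: Callable[..., object],
    source_lang: str = "english",
    target_lang: str = "urdu",
) -> dict[str, str]:
    """Translate each *.pdf in src_dir; return {file name: error message}."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failures: dict[str, str] = {}
    for pdf in sorted(src.glob("*.pdf")):
        # Hypothetical naming scheme: report.pdf -> report_urdu.pdf
        target = out / f"{pdf.stem}_{target_lang}.pdf"
        try:
            process(
                input_pdf=str(pdf),
                output_pdf=str(target),
                source_lang=source_lang,
                target_lang=target_lang,
            )
        except Exception as exc:  # keep going; report the failure afterwards
            failures[pdf.name] = str(exc)
    return failures
```

Assuming the package is installed, a call might look like `translate_folder("pdfs", "translated", process_pdf, target_lang="spanish")` after `from adapter import process_pdf`. Passing the function in rather than importing it inside the helper keeps the sketch easy to test with a stand-in.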

## Requirements

- Python 3.10+
- `pymupdf`
- `pdfplumber`
- `deep-translator`
- `playwright`

## License

MIT License
