Metadata-Version: 2.4
Name: surya-tabular-ocr
Version: 0.1.1
Summary: English OCR, layout analysis, and table recognition from document images
Project-URL: Repository, https://github.com/nexusaicodes/surya-tabular-ocr
Project-URL: Issues, https://github.com/nexusaicodes/surya-tabular-ocr/issues
Author-email: Nexus AI <saksham@nexusai.world>
Maintainer-email: Saksham Saxena <saksham@nexusai.world>
License-Expression: GPL-3.0-or-later
License-File: LICENSE
Keywords: layout analysis,ocr,pdf,table recognition,text detection
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: einops<1,>=0.8.1
Requires-Dist: opencv-python-headless<5,>=4.10
Requires-Dist: pillow>=10.2.0
Requires-Dist: platformdirs<5,>=4.3.6
Requires-Dist: pydantic-settings<3,>=2.1.0
Requires-Dist: pydantic<3,>=2.5.3
Requires-Dist: python-dotenv<2,>=1.0.0
Requires-Dist: torch<3,>=2.7.0
Requires-Dist: transformers<5,>=4.56.1
Description-Content-Type: text/markdown

# Surya Tabular OCR

A trimmed fork of [Surya](https://github.com/VikParuchuri/surya) focused on **English-only OCR, layout analysis, and table recognition** from document images. Programmatic use only — no CLI.

## Installation

Requires Python 3.11+ and PyTorch. On a machine that is neither a Mac nor equipped with a GPU, you may need to install the CPU build of torch first. See the [PyTorch installation guide](https://pytorch.org/get-started/locally/) for details.

```shell
pip install surya-tabular-ocr
# or
uv add surya-tabular-ocr
```

Model weights download automatically on first use.

## Usage

### Table extraction pipeline (recommended)

Single-call interface that runs layout detection, table recognition, and OCR together:

```python
from PIL import Image
from surya.pipeline import TableExtractionPipeline

image = Image.open("doc.png")  # raw image bytes are accepted as well

pipeline = TableExtractionPipeline()  # loads all models once
result = pipeline.extract_tables(image, ocr=True)
# result is a plain dict: {"tables": [...], "image_size": [w, h]}
```

- Accepts `PIL.Image` or raw `bytes`
- `ocr=True` runs text recognition on each detected table
- `skip_table_detection=True` treats the whole image as one table
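
The options above can be combined as in the following minimal sketch, which feeds raw bytes to the pipeline and reads only the two documented top-level keys. The helper names `extract_from_path` and `summarize` are hypothetical, passing `skip_table_detection` as a keyword of `extract_tables` is an assumption drawn from the list above, and the entries inside `tables` are treated as opaque because their fields are not described here.

```python
from pathlib import Path
from typing import Any


def extract_from_path(pipeline: Any, path: str, whole_image: bool = False) -> dict:
    """Read an image file as raw bytes and run table extraction on it.

    `pipeline` is expected to be a TableExtractionPipeline instance.
    Assumption: `skip_table_detection` is a keyword of `extract_tables`.
    """
    image_bytes = Path(path).read_bytes()  # raw bytes are an accepted input
    return pipeline.extract_tables(
        image_bytes, ocr=True, skip_table_detection=whole_image
    )


def summarize(result: dict) -> str:
    """Describe a result using only the documented top-level keys."""
    width, height = result["image_size"]
    return f"{len(result['tables'])} table(s) in a {width}x{height} image"
```

With the package installed, `summarize(extract_from_path(TableExtractionPipeline(), "doc.png"))` would yield a one-line summary. Reusing one pipeline instance across calls keeps the models loaded only once.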

### Individual predictors

**OCR:**

```python
from PIL import Image
from surya.foundation import FoundationPredictor
from surya.recognition import RecognitionPredictor
from surya.detection import DetectionPredictor

image = Image.open("doc.png")
recognition = RecognitionPredictor(FoundationPredictor())
detection = DetectionPredictor()

predictions = recognition([image], det_predictor=detection)
```

**Layout analysis:**

```python
from PIL import Image
from surya.foundation import FoundationPredictor
from surya.layout import LayoutPredictor

image = Image.open("doc.png")
layout = LayoutPredictor(FoundationPredictor())

predictions = layout([image])
```

**Table recognition:**

```python
from PIL import Image
from surya.table_rec import TableRecPredictor

image = Image.open("table.png")
table_rec = TableRecPredictor()

predictions = table_rec([image])
```

### Configuration

All settings are defined in `surya/settings.py` and can be overridden via environment variables:

- `TORCH_DEVICE` — override auto-detected device (e.g. `cuda`)
- `RECOGNITION_BATCH_SIZE`, `DETECTOR_BATCH_SIZE`, `LAYOUT_BATCH_SIZE`, `TABLE_REC_BATCH_SIZE`
- `COMPILE_DETECTOR`, `COMPILE_LAYOUT`, `COMPILE_TABLE_REC`, `COMPILE_ALL` — enable torch compilation
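
Since the overrides come from the environment, one way to apply them from Python is to set the variables before the first `surya` import. The sketch below is illustrative only: the helper `configure_surya` and the example values are hypothetical, and it assumes that settings are read when `surya` is first imported and that the compile flags accept the string `true`.

```python
import os
from typing import Dict, Optional

# Batch-size variables named in the list above, keyed by a short alias.
BATCH_SIZE_VARS = {
    "recognition": "RECOGNITION_BATCH_SIZE",
    "detector": "DETECTOR_BATCH_SIZE",
    "layout": "LAYOUT_BATCH_SIZE",
    "table_rec": "TABLE_REC_BATCH_SIZE",
}


def configure_surya(
    device: Optional[str] = None, compile_all: bool = False, **batch_sizes: int
) -> Dict[str, str]:
    """Write override variables to os.environ and return what was set."""
    overrides: Dict[str, str] = {}
    if device is not None:
        overrides["TORCH_DEVICE"] = device
    if compile_all:
        overrides["COMPILE_ALL"] = "true"  # assumed boolean spelling
    for name, size in batch_sizes.items():
        if name not in BATCH_SIZE_VARS:
            raise ValueError(f"unknown batch size setting: {name}")
        overrides[BATCH_SIZE_VARS[name]] = str(size)
    os.environ.update(overrides)
    return overrides


# Example values only; pick sizes that fit your hardware.
applied = configure_surya(device="cpu", recognition=16, table_rec=8)
```

Import `surya` only after this call. Exporting the same variables in the shell before starting Python achieves the same result without any helper.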

## Development

```bash
git clone https://github.com/nexusaicodes/surya-tabular-ocr.git
cd surya-tabular-ocr
uv sync --group dev
pre-commit install          # enable ruff linting/formatting on commit
uv run pytest
```

## License

Code is GPL-3.0-or-later (inherited from upstream). Model weights use a modified AI Pubs Open Rail-M license. See [LICENSE](LICENSE) and [MODEL_LICENSE](MODEL_LICENSE).

## Acknowledgments

Trimmed fork of [Surya](https://github.com/VikParuchuri/surya) by Vik Paruchuri and the Datalab team.
