Metadata-Version: 2.1
Name: ohmytable
Version: 0.1.0
Author: Sanster
Maintainer: Sanster
Project-URL: Bug Reports, https://github.com/Sanster/ohmytable/issues
Project-URL: Source, https://github.com/Sanster/ohmytable/
Keywords: deep-learning,table-structure-recognition
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: numpy<2.0.0,>=1.23.0
Requires-Dist: opencv-python>=4.6.0
Requires-Dist: torch>=1.8.0
Requires-Dist: torchvision>=0.9.0
Requires-Dist: ultralytics
Requires-Dist: loguru
Requires-Dist: huggingface-hub
Requires-Dist: tokenizers
Requires-Dist: shapely
Requires-Dist: pyclipper

# OhMyTable

![example](./assets/example.jpg)

## Install

```bash
pip install ohmytable
```

## Quick Start

Use as a package:

```python
from ohmytable import OhMyTable

image_path = "/path/to/your_image_contains_table"
ohmytable = OhMyTable(device="cpu")  # cpu/mps/cuda
htmls = ohmytable(image_path)
# The entire pipeline outputs table structure represented in HTML.
print(htmls)

# Visualize and save the results of all models in the pipeline.
from ohmytable.callback import VisualizeCallback

ohmytable(image_path, callbacks=[VisualizeCallback(image_path, "./tmp")])
```
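
To inspect the result in a browser, the returned HTML can be written to a standalone page. The following is a minimal sketch, assuming `htmls` is either a single HTML string or a list of HTML strings (the exact return type is not documented here). `save_tables_html` is a hypothetical helper and not part of ohmytable.

```python
from pathlib import Path
from typing import List, Union


def save_tables_html(htmls: Union[str, List[str]], out_path: str) -> Path:
    """Write table HTML fragment(s) into one standalone HTML page.

    Hypothetical helper, not part of ohmytable. Assumes `htmls` is a
    string or a list of strings containing table markup.
    """
    # Normalize the input to a list of fragments.
    fragments = [htmls] if isinstance(htmls, str) else list(htmls)
    # Separate multiple tables with a horizontal rule.
    body = "\n<hr>\n".join(fragments)
    page = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        "<style>table, td, th { border: 1px solid #888; "
        "border-collapse: collapse; }</style>"
        "</head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )
    out = Path(out_path)
    # Create the output directory if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(page, encoding="utf-8")
    return out
```

With the variables from the snippet above, `save_tables_html(htmls, "./tmp/tables.html")` would produce a file that can be opened directly in a browser.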

Start a Gradio web demo:

```bash
git clone https://github.com/Sanster/OhMyTable.git
cd OhMyTable
pip install gradio typer
python3 gradio_demo.py
```

## Limitations

- The Table Structure Recognition model is trained with a maximum output length of 1024 (about 150 table cell boxes).
- The model performs better when there is less padding around the table in the image.
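
Because the pipeline outputs HTML, one rough way to spot a table that may have run into the output length limit is to count the cells in the result. The sketch below uses only the standard library. `count_table_cells` and `may_be_truncated` are hypothetical helpers, not part of ohmytable, and the threshold of 150 is the approximate figure from the list above rather than an exact bound. It assumes each table is available as an HTML string.

```python
from html.parser import HTMLParser

# Approximate limit taken from the note above; treat it as a rough guide.
APPROX_MAX_CELLS = 150


class _CellCounter(HTMLParser):
    """Counts <td> and <th> start tags in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.cells = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in ("td", "th"):
            self.cells += 1


def count_table_cells(html: str) -> int:
    """Return the number of table cells found in `html`."""
    counter = _CellCounter()
    counter.feed(html)
    counter.close()
    return counter.cells


def may_be_truncated(html: str, limit: int = APPROX_MAX_CELLS) -> bool:
    """Return True if the cell count is at or above the approximate limit."""
    return count_table_cells(html) >= limit
```

If `may_be_truncated` returns `True` for a result, splitting the source image into smaller tables before running the pipeline may be worth trying.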

## Acknowledgement

- [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch)
- [unitable](https://github.com/poloclub/unitable)
- [keremberke/yolov8m-table-extraction](https://huggingface.co/keremberke/yolov8m-table-extraction)
