Metadata-Version: 2.1
Name: tablecv
Version: 0.1.0
Summary: Table extraction from image.
License: MIT
Author: Vishal Kumar Mishra
Author-email: vishal.k.mishra2@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: django-environ (>=0.11.2,<0.12.0)
Requires-Dist: opencv-contrib-python (==4.5.4.60)
Requires-Dist: opencv-python (==4.5.4.60)
Requires-Dist: opencv-python-headless (==4.5.4.60)
Requires-Dist: paddleocr (==2.7.0.2)
Requires-Dist: paddlepaddle (==2.4.2)
Requires-Dist: pandas (>=2.1.0,<3.0.0)
Requires-Dist: pytesseract (>=0.3.10,<0.4.0)
Requires-Dist: python-dotenv (>=1.0.0,<2.0.0)
Requires-Dist: pyyaml (>=6.0.1,<7.0.0)
Requires-Dist: shapely (>=2.0.1,<3.0.0)
Description-Content-Type: text/markdown

# TableCV

Extract a table from an image.

# Usage

There are two ways to extract a table from an image.

## Approach 1 (uses PaddleOCR)

Call `extract_table`, which returns a pandas `DataFrame`.

```python
from tablecv import extract_table

# Replace the empty string with the path to your image file.
print(extract_table(image_path=""))
```

## Approach 2

Run OCR with your favourite tool (EasyOCR, KerasOCR, PaddleOCR, or any other).

The `ocr_results` object should be a list of `(bounding box, text)` tuples, like the following:

```python
# A list of (bounding box, text) tuples.

ocr_results = [
    (
        (1, 2, 3, 4), "a"   # (x, y, w, h), text
    ),
    (
        (4, 5, 6, 7), "b"
    ),
    ...
]
```

Then call the `extract_table_from_ocr` function.

```python
from tablecv import extract_table_from_ocr

ocr_results: list[tuple[tuple[float, float, float, float], str]] = ...
print(extract_table_from_ocr(ocr_results))
```
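
Many OCR tools report each detection as a four-corner polygon rather than an `(x, y, w, h)` box. The following is a minimal sketch of one way to convert such output into the `ocr_results` format shown above. It assumes that each detection starts with a list of corner points followed by the text (EasyOCR's `readtext` output has roughly this shape), and that `(x, y)` means the top-left corner of the box. The helper names `polygon_to_xywh` and `to_ocr_results` and the sample detections are hypothetical and not part of `tablecv`.

```python
from typing import Sequence

Box = tuple[float, float, float, float]  # (x, y, w, h)


def polygon_to_xywh(points: Sequence[Sequence[float]]) -> Box:
    """Return the axis-aligned (x, y, w, h) box enclosing the corner points.

    Assumption: (x, y) is the top-left corner of the box.
    """
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


def to_ocr_results(detections) -> list[tuple[Box, str]]:
    """Convert (polygon, text, ...) detections into [((x, y, w, h), text), ...].

    Any extra fields after the text (for example a confidence score) are ignored.
    """
    return [(polygon_to_xywh(d[0]), str(d[1])) for d in detections]


# Hypothetical detections in (polygon, text, confidence) form.
detections = [
    ([[10, 20], [110, 20], [110, 50], [10, 50]], "Name", 0.98),
    ([[130, 22], [200, 22], [200, 48], [130, 48]], "Age", 0.95),
]

ocr_results = to_ocr_results(detections)
print(ocr_results)
```

The resulting `ocr_results` list can then be passed to `extract_table_from_ocr`. If your OCR tool already returns `(x, y, w, h)` boxes, this conversion is not needed.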

