Metadata-Version: 2.1
Name: pytextractor
Version: 1.1.0
Summary: text extractor from images
Home-page: https://github.com/danwald/pytextractor/
Author: danny crasto
Author-email: danwald79@gmail.com
License: MIT
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: imutils (>=0.5.4)
Requires-Dist: opencv-python (>=4.5.3.56)
Requires-Dist: Pillow (>=9.3.0)
Requires-Dist: pytesseract (>=0.3.7)
Requires-Dist: requests (>=2.26.0)
Provides-Extra: dev
Requires-Dist: pip (>=22.0.4) ; extra == 'dev'
Requires-Dist: wheel (>=0.33.1) ; extra == 'dev'
Requires-Dist: ipdb (>=0.13.9) ; extra == 'dev'
Requires-Dist: pytest (>=4.3.0) ; extra == 'dev'
Requires-Dist: twine (>=1.13.0) ; extra == 'dev'
Requires-Dist: pytest-cov (>=2.6.1) ; extra == 'dev'

# pytextractor
Python OCR using Tesseract with the OpenCV EAST text detector.

Uses the OpenCV EAST text detector described [here](https://www.pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/) together with [pytesseract](https://github.com/madmaze/pytesseract) to extract text (the default) or only numbers from images.

### Usage main
```
usage: text_detection.py [-h] [--east EAST] [-c CONFIDENCE] [-w WIDTH]
                         [-e HEIGHT] [-d] [-n] [-p PERCENTAGE] [-b MIN_BOXES]
                         [-i MAX_ITERATIONS]
                         images [images ...]

Text/Number extractor from image

positional arguments:
  images                path(s) to input image(s)

optional arguments:
  -h, --help            show this help message and exit
  --east EAST           path to input EAST text detector
  -c CONFIDENCE, --confidence CONFIDENCE
                        minimum probability required to inspect a region
  -w WIDTH, --width WIDTH
                        resized image width (should be multiple of 32)
  -e HEIGHT, --height HEIGHT
                        resized image height (should be multiple of 32)
  -d, --display         Display bounding boxes
  -n, --numbers         Detect only numbers
  -p PERCENTAGE, --percentage PERCENTAGE
                        Expand/shrink detected bound box
  -b MIN_BOXES, --min-boxes MIN_BOXES
                        minimum number of detected boxes to return
  -i MAX_ITERATIONS, --max-iterations MAX_ITERATIONS
                        max number of iterations finding min_boxes
```
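
The help text above notes that the resized width and height should be multiples of 32. As a minimal sketch of how one might pick such values before calling the tool, the helper below snaps an arbitrary size to the nearest multiple of 32. The function name `nearest_multiple_of_32` is hypothetical and is not part of pytextractor.

```python
def nearest_multiple_of_32(size: int) -> int:
    """Snap a pixel dimension to the nearest multiple of 32 (minimum 32).

    Hypothetical helper for choosing -w/--width and -e/--height values;
    it is not part of the pytextractor API.
    """
    # Adding 16 before the floor division rounds to the nearest multiple
    # instead of always rounding down.
    snapped = (size + 16) // 32 * 32
    # Never return 0 for very small inputs.
    return max(32, snapped)


if __name__ == "__main__":
    for size in (330, 1000, 10):
        print(size, "->", nearest_multiple_of_32(size))
```

The resulting values could then be passed as `-w` and `-e` on the command line.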

### Usage lib

```
from pytextractor import pytextractor

extractor = pytextractor.PyTextractor()
```

### Running tests

```
pip install ".[dev]"
pytest -s tests
```

*Make sure tesseract is installed:*

```
brew install tesseract               # macOS (Homebrew)
sudo apt-get install tesseract-ocr   # Debian/Ubuntu
```
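
Whether the tesseract binary is reachable can also be checked from Python before running the tests. The sketch below uses only the standard library and assumes the executable is named `tesseract` and is on `PATH`. The function name `tesseract_available` is hypothetical and is not part of pytextractor.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH.

    Hypothetical helper, not part of pytextractor; the default binary
    name "tesseract" is an assumption about the local installation.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("tesseract found on PATH")
    else:
        print("tesseract not found; install it before running the tests")
```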
