Metadata-Version: 2.1
Name: edspdf
Version: 0.5.0
Summary: Smart text extraction from PDF documents
Home-page: https://datasciencetools-pages.eds.aphp.fr/edspdf/
License: BSD-3
Author: Basile Dura
Author-email: basile.dura-ext@aphp.fr
Requires-Python: >=3.7.1,<3.11
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: catalogue (>=2.0.7,<3.0.0)
Requires-Dist: loguru (>=0.6.0,<0.7.0)
Requires-Dist: networkx (>=2.6,<3.0)
Requires-Dist: pandas (>=1.2,<2.0)
Requires-Dist: pdf2image (>=1.16.0,<2.0.0)
Requires-Dist: pdfminer.six (>=20220319,<20220320)
Requires-Dist: pydantic (>=1.2,<2.0)
Requires-Dist: scikit-learn (>=1.0.2,<2.0.0)
Requires-Dist: scipy (>=1.7.0,<2.0.0)
Requires-Dist: thinc (>=8.0.15,<9.0.0)
Project-URL: Documentation, https://datasciencetools-pages.eds.aphp.fr/edspdf/
Project-URL: Repository, https://gitlab.eds.aphp.fr/datasciencetools/edspdf/
Description-Content-Type: text/markdown

# EDS-PDF

EDS-PDF provides a modular framework to extract text from PDF documents.

You can use it out of the box, or extend it to fit your use case.

## Getting started

Install the library with pip:

<div class="termy">

```console
$ pip install edspdf
```

</div>
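
To confirm that the installation succeeded, a minimal sketch such as the following can help. It relies only on the standard library (`importlib.metadata`, or the `importlib_metadata` backport on Python 3.7) and on the Python range declared in the package metadata (`>=3.7.1,<3.11`). The helper names `installed_version` and `python_is_supported` are illustrative and are not part of EDS-PDF.

```python
import sys

try:
    from importlib import metadata  # available from Python 3.8
except ImportError:  # Python 3.7: assumes the importlib_metadata backport is installed
    import importlib_metadata as metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(version_info=sys.version_info):
    """Check an interpreter version against the declared range >=3.7.1,<3.11."""
    return (3, 7, 1) <= tuple(version_info[:3]) < (3, 11)


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("edspdf version:", installed_version("edspdf"))
```

If the second line prints `None`, the package is not visible to the interpreter that ran the script, which usually means pip installed it into a different environment.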

Visit the [documentation](https://datasciencetools-pages.eds.aphp.fr/edspdf/) for more information!

## Acknowledgement

We would like to thank [Assistance Publique – Hôpitaux de Paris](https://www.aphp.fr/)
and the [AP-HP Foundation](https://fondationrechercheaphp.fr/) for funding this project.

