Metadata-Version: 2.1
Name: dict-curation
Version: 0.0.2
Summary: A package for curating dictionaries (esp in babylon and stardict formats).
Home-page: https://github.com/sanskrit-coders/dict_curation
Author: Sanskrit programmers
Author-email: sanskrit-programmers@googlegroups.com
License: MIT
Keywords: documents books internet-archive
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Education
Classifier: Topic :: Text Processing :: Linguistic
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Requires-Dist: indic-transliteration
Requires-Dist: curation-utils
Requires-Dist: doc-curation
Requires-Dist: tqdm
Requires-Dist: PyICU
Provides-Extra: test
Requires-Dist: pytest ; extra == 'test'

|Build status| |Build Status| |Documentation Status| |PyPI version|

dict curation
-------------

A package for curating doc file collections. Prominent features:

-  Scrape texts from various sites, such as Wikisource. See the example
   `here <https://github.com/sanskrit-coders/dict_curation/blob/master/curation_projects/misc/wikisource.py>`__.
   (PS: Consider contributing to the `raw_etexts
   repo <https://github.com/sanskrit/raw_etexts>`__.)
-  OCR a PDF with Google Drive. The PDF is automatically split into
   25-page segments, each of which is OCRed individually. See the usage
   example
   `here <https://github.com/sanskrit-coders/dict_curation/blob/master/curation_projects/pdf_tasks.py>`__
   and the function
   `here <https://github.com/sanskrit-coders/dict_curation/blob/master/dict_curation/pdf.py#L13>`__.
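
The 25-page splitting mentioned above can be pictured with the following
minimal sketch. It is not the package's implementation: ``page_ranges``
is a hypothetical helper written only for illustration, and it assumes
1-based, inclusive page numbers.

.. code-block:: python

   from typing import Iterator, Tuple


   def page_ranges(num_pages: int, chunk_size: int = 25) -> Iterator[Tuple[int, int]]:
       """Yield inclusive, 1-based (first, last) page ranges of at most chunk_size pages.

       Hypothetical helper for illustration; not part of the package.
       """
       if num_pages < 0 or chunk_size < 1:
           raise ValueError("num_pages must be >= 0 and chunk_size must be >= 1")
       for first in range(1, num_pages + 1, chunk_size):
           # The last segment may be shorter than chunk_size.
           yield first, min(first + chunk_size - 1, num_pages)


   # A 60-page PDF would be handled as three segments.
   segments = list(page_ranges(60))

Each ``(first, last)`` pair would then correspond to one segment that is
uploaded and OCRed separately.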

For users
---------

-  `Autogenerated docs on Read the Docs (might be
   broken) <http://dict_curation.readthedocs.io/en/latest/>`__.
-  Manually and periodically generated docs are
   `here <https://sanskrit-coders.github.io/dict_curation/build/html/>`__.
-  For detailed examples and help, please see the individual module files
   in this package.

Installation or upgrade:
------------------------

-  ``sudo pip install dict_curation -U``
-  ``sudo pip install git+https://github.com/sanskrit-coders/dict_curation/@master -U``
-  `Web <https://pypi.python.org/pypi/dict_curation>`__.

For contributors
----------------

Contact
-------

Have a problem or question? Please head to
`GitHub <https://github.com/sanskrit-coders/dict_curation>`__.

Packaging
---------

-  ``~/.pypirc`` should contain your PyPI login credentials.

::

   python setup.py bdist_wheel
   twine upload dist/* --skip-existing

Build documentation
-------------------

-  Sphinx HTML docs can be generated with ``cd docs; make html``.

Testing
-------

Run ``pytest`` in the root directory.

Auxiliary tools
---------------

-  |Build Status|
-  |Documentation Status|
-  `pyup <https://pyup.io/account/repos/github/sanskrit-coders/dict_curation/>`__

.. |Build status| image:: https://github.com/sanskrit-coders/dict_curation/workflows/Python%20package/badge.svg
   :target: https://github.com/sanskrit-coders/dict_curation/actions
.. |Build Status| image:: https://travis-ci.org/sanskrit-coders/dict_curation.svg?branch=master
   :target: https://travis-ci.org/sanskrit-coders/dict_curation
.. |Documentation Status| image:: https://readthedocs.org/projects/dict_curation/badge/?version=latest
   :target: http://dict_curation.readthedocs.io/en/latest/?badge=latest
.. |PyPI version| image:: https://badge.fury.io/py/dict_curation.svg
   :target: https://badge.fury.io/py/dict_curation


