Metadata-Version: 1.1
Name: SoMaJo
Version: 1.4.1
Summary: A tokenizer and sentence splitter for German web and social media texts.
Home-page: https://github.com/tsproisl/SoMaJo
Author: Thomas Proisl, Peter Uhrig
Author-email: thomas.proisl@fau.de
License: GNU General Public License v3 or later (GPLv3+)
Download-URL: https://github.com/tsproisl/SoMaJo/archive/v1.4.1.tar.gz
Description: SoMaJo
        ======
        
        SoMaJo is a state-of-the-art tokenizer for German web and social media
        texts that won the `EmpiriST 2015 shared task
        <https://sites.google.com/site/empirist2015/>`_ on automatic
        linguistic annotation of computer-mediated communication / social
        media. As such, it is particularly well suited to tokenizing all
        kinds of written discourse, for example chats, forums, wiki talk
        pages, tweets, blog comments, social networks, SMS and WhatsApp
        dialogues.
        
        More detailed documentation is available in the `project repository
        <https://github.com/tsproisl/SoMaJo>`_.
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Natural Language :: German
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Text Processing :: Linguistic
