Metadata-Version: 1.1
Name: underthesea
Version: 1.0.17
Summary: Vietnamese NLP Toolkit
Home-page: https://github.com/magizbox/underthesea
Author: Vu Anh
Author-email: brother.rain.1024@gmail.com
License: GNU General Public License v3
Description: ========================================
        Under The Sea - Vietnamese NLP Toolkit
        ========================================
        
        
        .. image:: https://img.shields.io/pypi/v/underthesea.svg
                :target: https://pypi.python.org/pypi/underthesea
        
        .. image:: https://img.shields.io/travis/magizbox/underthesea.svg
                :target: https://travis-ci.org/magizbox/underthesea
        
        .. image:: https://readthedocs.com/projects/magizbox-underthesea/badge/?version=latest
                :target: https://magizbox-underthesea.readthedocs-hosted.com/en/latest/?badge=latest
                :alt: Documentation Status
        
        .. image:: https://pyup.io/repos/github/magizbox/underthesea/shield.svg
                :target: https://pyup.io/repos/github/magizbox/underthesea/
                :alt: Updates
        
        .. image:: https://raw.githubusercontent.com/magizbox/underthesea/master/logo.jpg
                :target: https://raw.githubusercontent.com/magizbox/underthesea/master/logo.jpg
        
        * Free software: GNU General Public License v3
        * Documentation: `https://underthesea.readthedocs.io <https://magizbox-underthesea.readthedocs-hosted.com/en/latest/>`_
        
        Features
        ----------------------------------------
        
        ******************************
        1. Corpus
        ******************************
        
        .. image:: https://img.shields.io/badge/documents-18k-red.svg
        .. image:: https://img.shields.io/badge/words-74k-red.svg
        
        A collection of Vietnamese corpora
        
        * `Vietnamese Dictionary (74k words) <https://github.com/magizbox/underthesea/tree/master/underthesea/corpus/data>`_
        
        * `Vietnamese News Corpus (10k documents) <https://github.com/magizbox/corpus.vinews>`_
        * `Vietnamese Wikipedia Corpus (8k documents) <https://github.com/magizbox/corpus.viwiki>`_
        
        ******************************
        2. Word Segmentation
        ******************************
        
        .. image:: https://img.shields.io/badge/F1-97%25-red.svg
        
        Vietnamese word segmentation using conditional random fields (CRF)
        
        * `Word Segmentation API <https://magizbox-underthesea.readthedocs-hosted.com/en/latest/api.html#word-sent-package>`_
        * `Word Segmentation Experiences <https://github.com/magizbox/underthesea.word_sent>`_
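        
        As a rough illustration of how a sequence labeler such as a CRF can drive
        segmentation, the sketch below groups syllables into words from per-syllable
        tags. It assumes a ``B``/``I`` scheme (``B`` begins a word, ``I`` continues
        it). Both the scheme and the helper ``tags_to_words`` are hypothetical and
        are not part of the underthesea API.
        
        .. code-block:: python
        
            def tags_to_words(syllables, tags):
                """Join syllables into words from B/I tags (assumed scheme)."""
                if len(syllables) != len(tags):
                    raise ValueError("syllables and tags must have the same length")
                words = []
                for syllable, tag in zip(syllables, tags):
                    if tag == "I" and words:
                        # Continue the current word.
                        words[-1] += " " + syllable
                    else:
                        # "B" (or a stray leading "I") starts a new word.
                        words.append(syllable)
                return words
        
            # Hand-written tags standing in for the output of a trained model.
            syllables = ["Học", "sinh", "học", "sinh", "học"]
            tags = ["B", "I", "B", "B", "I"]
            print(tags_to_words(syllables, tags))
            # prints ['Học sinh', 'học', 'sinh học']
        
        In a real pipeline the tags would come from a trained model rather than
        being written by hand.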
        
        
        Upcoming Features
        ----------------------------------------
        
        * POS Tagging (API, `POS Tagging Experiences <https://github.com/magizbox/underthesea.pos_tag>`_)
        * Word Representation (`Word Representation Experiences <https://github.com/magizbox/underthesea.word_representation>`_)
        * Chunking (Experiences)
        * Dependency Parsing (Experiences)
        * Named Entity Recognition
        * Sentiment Analysis
        
        
        ========================================
        History
        ========================================
        
        1.0.17 (2017-05-24)
        ----------------------------------------
        
        * Fix word_sent method
        * Enhance performance
        * Add word_sent package
        
        1.0.9 (2017-03-07)
        ----------------------------------------
        
        * Add Corpus class
        * Add Transformer classes
        * Integrate with the Ho Ngoc Duc dictionary
        * Add Travis CI
        * Auto build with PyPI
        
        1.0.0 (2017-03-01)
        ----------------------------------------
        
        * First release on PyPI
        * First release on Read the Docs
        
Keywords: underthesea
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.6
