Metadata-Version: 2.1
Name: kotokenizer
Version: 0.1.0
Summary: Korean tokenizer, sentence classification, and spacing model.
Home-page: https://github.com/dsdanielpark/ko-tokenizer
Author: daniel park
Author-email: parkminwoo1991@gmail.com
Keywords: Python,Tokenizer,Korean,Korean Tokenizer,NLP,Natural Language Processing,LLM,Large Language Model
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Natural Language :: Korean
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE

# Korean Tokenizer

## References
1. https://github.com/hyunwoongko/kss
2. https://github.com/likejazz/korean-sentence-splitter
3. https://github.com/open-korean-text/open-korean-text
4. https://github.com/jeongukjae/korean-spacing-model
5. https://littlefoxdiary.tistory.com/42
6. https://github.com/bab2min/kiwipiepy
7. http://semantics.kr/%ED%95%9C%EA%B5%AD%EC%96%B4-%ED%98%95%ED%83%9C%EC%86%8C-%EB%B6%84%EC%84%9D%EA%B8%B0-%EB%B3%84-%EB%AC%B8%EC%9E%A5-%EB%B6%84%EB%A6%AC-%EC%84%B1%EB%8A%A5%EB%B9%84%EA%B5%90/
8. https://bab2min.tistory.com/669
9. https://github.com/songys/AwesomeKorean_Data
10. https://github.com/kakao/khaiii/wiki/CNN-%EB%AA%A8%EB%8D%B8
