Metadata-Version: 2.1
Name: textacy
Version: 0.13.0
Summary: NLP, before and after spaCy
Maintainer-email: Burton DeWilde <burtdewilde@gmail.com>
License: Copyright 2016 Chartbeat, Inc.
        
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
        You may obtain a copy of the License at
        
          http://www.apache.org/licenses/LICENSE-2.0
        
        Unless required by applicable law or agreed to in writing, software
        distributed under the License is distributed on an "AS IS" BASIS,
        WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        See the License for the specific language governing permissions and
        limitations under the License.
        
Project-URL: Docs, https://textacy.readthedocs.io
Project-URL: Repo, https://github.com/chartbeat-labs/textacy
Project-URL: Changelog, https://github.com/chartbeat-labs/textacy/blob/main/CHANGES.md
Keywords: spacy,nlp,text processing,linguistics
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Natural Language :: English
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: cachetools (>=4.0.0)
Requires-Dist: catalogue (~=2.0)
Requires-Dist: cytoolz (>=0.10.1)
Requires-Dist: floret (~=0.10.0)
Requires-Dist: jellyfish (>=0.8.0)
Requires-Dist: joblib (>=0.13.0)
Requires-Dist: networkx (>=2.7)
Requires-Dist: numpy (>=1.17.0)
Requires-Dist: pyphen (>=0.10.0)
Requires-Dist: requests (>=2.10.0)
Requires-Dist: scipy (>=1.8.0)
Requires-Dist: scikit-learn (>=1.0)
Requires-Dist: spacy (~=3.0)
Requires-Dist: tqdm (>=4.19.6)
Provides-Extra: check
Requires-Dist: black (~=23.0) ; extra == 'check'
Requires-Dist: isort (~=5.0) ; extra == 'check'
Requires-Dist: mypy (~=1.0.0) ; extra == 'check'
Requires-Dist: pytest (~=7.0) ; extra == 'check'
Requires-Dist: pytest-cov ; extra == 'check'
Requires-Dist: ruff ; extra == 'check'
Provides-Extra: dev
Requires-Dist: black (~=23.0) ; extra == 'dev'
Requires-Dist: build ; extra == 'dev'
Requires-Dist: isort (~=5.0) ; extra == 'dev'
Requires-Dist: mypy (~=1.0.0) ; extra == 'dev'
Requires-Dist: recommonmark (<0.7.0,>=0.6.0) ; extra == 'dev'
Requires-Dist: sphinx (~=3.0) ; extra == 'dev'
Requires-Dist: pytest (~=7.0) ; extra == 'dev'
Requires-Dist: pytest-cov ; extra == 'dev'
Requires-Dist: ruff ; extra == 'dev'
Requires-Dist: twine (~=4.0) ; extra == 'dev'
Requires-Dist: wheel ; extra == 'dev'
Provides-Extra: docs
Requires-Dist: Jinja2 (<3.1) ; extra == 'docs'
Requires-Dist: recommonmark (<0.7.0,>=0.6.0) ; extra == 'docs'
Requires-Dist: sphinx (~=3.0) ; extra == 'docs'
Provides-Extra: viz
Requires-Dist: matplotlib (~=3.0) ; extra == 'viz'

## textacy: NLP, before and after spaCy

`textacy` is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to spaCy, `textacy` focuses primarily on the tasks that come before and after.

[![build status](https://img.shields.io/travis/chartbeat-labs/textacy/master.svg?style=flat-square)](https://travis-ci.org/chartbeat-labs/textacy)
[![current release version](https://img.shields.io/github/release/chartbeat-labs/textacy.svg?style=flat-square)](https://github.com/chartbeat-labs/textacy/releases)
[![pypi version](https://img.shields.io/pypi/v/textacy.svg?style=flat-square)](https://pypi.python.org/pypi/textacy)
[![conda version](https://anaconda.org/conda-forge/textacy/badges/version.svg)](https://anaconda.org/conda-forge/textacy)

### features

- Access and extend spaCy's core functionality for working with one or many documents through convenient methods and custom extensions
- Load prepared datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
- Clean, normalize, and explore raw text before processing it with spaCy
- Extract structured information from processed documents, including n-grams, entities, acronyms, keyterms, and SVO triples
- Compare strings and sequences using a variety of similarity metrics
- Tokenize and vectorize documents, then train, interpret, and visualize topic models
- Compute text readability and lexical diversity statistics, including Flesch-Kincaid grade level, multilingual Flesch Reading Ease, and Type-Token Ratio

... *and much more!*
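
To make one of the statistics above concrete, here is a minimal pure-Python sketch of a Type-Token Ratio (unique word types divided by total word tokens). It is not `textacy`'s implementation or API: the `type_token_ratio` helper and its regex tokenizer are assumptions made for illustration, whereas `textacy` works on documents tokenized by spaCy.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return unique word types divided by total word tokens.

    Hypothetical helper for illustration only; it is not part of textacy.
    """
    # Naive, case-insensitive tokenizer (an assumption for this sketch);
    # a spaCy tokenizer would handle punctuation and contractions better.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        # Avoid dividing by zero on empty or non-alphabetic input.
        return 0.0
    return len(set(tokens)) / len(tokens)


sample = "The cat sat on the mat, and the dog sat too."
print(f"Type-Token Ratio: {type_token_ratio(sample):.3f}")
```

A higher ratio means less repetition in the vocabulary. The raw ratio tends to fall as texts get longer, so it is most meaningful when comparing texts of similar length.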

### links

- Download: https://pypi.org/project/textacy
- Documentation: https://textacy.readthedocs.io
- Source code: https://github.com/chartbeat-labs/textacy
- Bug Tracker: https://github.com/chartbeat-labs/textacy/issues

### maintainer

Howdy, y'all. 👋

- Burton DeWilde (<burtdewilde@gmail.com>)
