Metadata-Version: 2.3
Name: vokab
Version: 0.0.1
Summary: vokab: named entity linking through hybrid (lexical and semantic) search engine.
Author-email: Ian Maurer <ian@genomoncology.com>
License-File: LICENSE
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.9
Requires-Dist: aiosql>=10.1
Requires-Dist: click>=8.0.0
Requires-Dist: pydantic>=2.6.1
Requires-Dist: rapidfuzz>=3.7.0
Requires-Dist: sentence-transformers>=2.2.2
Requires-Dist: streamlit>=1.33.0
Requires-Dist: torch>=2.2.2
Requires-Dist: tqdm
Provides-Extra: ext
Requires-Dist: lancedb; extra == 'ext'
Requires-Dist: sqlite-vss; extra == 'ext'
Requires-Dist: tantivy; extra == 'ext'
Provides-Extra: local
Requires-Dist: build; extra == 'local'
Requires-Dist: ipython; extra == 'local'
Requires-Dist: pip; extra == 'local'
Requires-Dist: setuptools; extra == 'local'
Requires-Dist: twine; extra == 'local'
Provides-Extra: test
Requires-Dist: coverage[toml]; extra == 'test'
Requires-Dist: pytest; extra == 'test'
Requires-Dist: pytest-mock; extra == 'test'
Description-Content-Type: text/markdown

# vokab

vokab is a Python module for storing, searching, and matching named entities by name or alias. It is
useful for auto-correcting or linking named entities in NLP information-extraction use cases, or
when cleaning up user-submitted data.

Supports:
- Exact Matching
- Case-insensitive Matching
- Lexical Matching (i.e. Fuzzy String Matching)
- Semantic Matching (i.e. Vector Similarity Searching)
- Hybrid (Lexical/Fuzzy + Semantic/Vector)
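
To make the first three matching modes concrete, here is a minimal sketch of a tiered lookup: exact, then case-insensitive, then fuzzy. It is not vokab's API. The `ENTITIES` table, the `match` function, and the `0.8` threshold are hypothetical, and the standard library's `difflib` stands in for a dedicated fuzzy matcher such as rapidfuzz (which the package declares as a dependency). Semantic matching is left out because it needs an embedding model.

```python
from difflib import SequenceMatcher

# Hypothetical vocabulary for illustration: canonical name -> aliases.
ENTITIES = {
    "Non-Small Cell Lung Cancer": ["NSCLC", "non small cell lung carcinoma"],
    "Breast Cancer": ["breast carcinoma"],
}


def match(query, entities=ENTITIES, threshold=0.8):
    """Return (canonical_name, match_type, score), or None if nothing matches."""
    # Flatten names and aliases into (surface form, canonical name) pairs.
    forms = [
        (form, name)
        for name, aliases in entities.items()
        for form in [name, *aliases]
    ]

    # Tier 1: exact match on the raw string.
    for form, name in forms:
        if form == query:
            return name, "exact", 1.0

    # Tier 2: case-insensitive match.
    folded = query.casefold()
    for form, name in forms:
        if form.casefold() == folded:
            return name, "case-insensitive", 1.0

    # Tier 3: lexical (fuzzy) match; keep the best-scoring surface form.
    score, name = max(
        (
            (SequenceMatcher(None, folded, form.casefold()).ratio(), name)
            for form, name in forms
        ),
        key=lambda pair: pair[0],
    )
    if score >= threshold:
        return name, "lexical", score
    return None


print(match("NSCLC"))         # exact alias hit
print(match("nsclc"))         # case-insensitive alias hit
print(match("brest cancer"))  # typo resolved by the fuzzy tier
print(match("zzzz"))          # no candidate clears the threshold
```

Trying the cheap, high-precision tiers first means the fuzzy tier only runs when they fail. A hybrid approach would additionally combine a lexical score like this one with a vector-similarity score.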


## Getting Started


### Installation

Available on [PyPI](https://pypi.org/project/vokab/):

```bash
pip install vokab
```


## Maintainer

vokab was created by [Ian Maurer](https://x.com/imaurer), the CTO of [GenomOncology](https://genomoncology.com).

This MIT-licensed open-source project was extracted from our product, which includes the ability to normalize
biomedical data for use in precision oncology clinical decision support systems. Contact me to learn more about
our product offerings.


