Metadata-Version: 2.1
Name: semantic_text_similarity
Version: 1.0.3
Summary: implementations of models and metrics for semantic text similarity. that's it.
Home-page: https://github.com/AndriyMulyar/semantic-text-similarity
Author: Andriy Mulyar, Elliot Schumacher and Mark Dredze
Author-email: contact@andriymulyar.com
License: MIT
Description: # semantic-text-similarity
        An easy-to-use interface to fine-tuned BERT models for computing semantic similarity. That's it.
        
        This project contains an interface to fine-tuned, BERT-based semantic text similarity models. It modifies [pytorch-transformers](https://github.com/huggingface/pytorch-transformers) by abstracting away all the research benchmarking code for ease of real-world applicability.
        
        | Model             |          Dataset | Dev. Correlation |
        |-------------------|------------------|------------------|
        | Web STS BERT      | STS-B            |     0.893        |
        | Clinical STS BERT | MED-STS          |     0.854        |
        
        # Installation
        
        Install with pip:
        
        ```
        pip install semantic-text-similarity
        ```
        
        or directly:
        
        ```
        pip install git+https://github.com/AndriyMulyar/semantic-text-similarity
        ```
        
        # Use
        Each model maps batches of sentence pairs to real-valued similarity scores in the range [0, 5].
        ```python
        from semantic_text_similarity.models import WebBertSimilarity
        from semantic_text_similarity.models import ClinicalBertSimilarity
        
        # `device` defaults to GPU prediction when omitted; pass 'cpu' to run without a GPU.
        web_model = WebBertSimilarity(device='cpu', batch_size=10)
        
        clinical_model = ClinicalBertSimilarity(device='cuda', batch_size=10)
        
        web_model.predict([("She won an Olympic gold medal", "The woman is an Olympic champion")])
        ```
        More [examples](/examples).
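        
        Since scores come back on a [0, 5] scale, it can be handy to rescale them to [0, 1] and rank the pairs. The following is a minimal sketch, not part of this package: `score_pairs` is a hypothetical helper, and it assumes that `predict` returns one score per input pair, in input order. The clipping step is also an assumption, added in case a score falls slightly outside the stated range.
        
        ```python
        from typing import Callable, Iterable, List, Sequence, Tuple
        
        Pair = Tuple[str, str]
        
        
        def score_pairs(
            predict: Callable[[List[Pair]], Iterable[float]],
            pairs: Sequence[Pair],
            max_score: float = 5.0,
        ) -> List[Tuple[Pair, float]]:
            """Score sentence pairs and rescale from [0, max_score] to [0, 1].
        
            `predict` is any callable that takes a list of sentence pairs and
            returns one score per pair (assumed to be in input order).
            """
            pairs = list(pairs)
            raw_scores = predict(pairs)
        
            results = []
            for pair, score in zip(pairs, raw_scores):
                # Assumption: clip in case a score lands slightly outside the range.
                clipped = min(max(float(score), 0.0), max_score)
                results.append((pair, clipped / max_score))
        
            # Most similar pairs first.
            results.sort(key=lambda item: item[1], reverse=True)
            return results
        ```
        
        With the models above, this would be called as `score_pairs(web_model.predict, pairs)`.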
        
        
        
        # Notes
        - Prediction on CPU is slow. Use a GPU if prediction speed matters.
        - Model downloads are cached in `~/.cache/torch/semantic_text_similarity/`. Try clearing this folder if you have issues.
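        
        Clearing the cache can also be done from Python. This is a minimal sketch: `clear_model_cache` is a hypothetical helper, not a function provided by this package, and it only assumes the cache location stated in the note above.
        
        ```python
        import shutil
        from pathlib import Path
        
        # Cache location as stated in the note above.
        DEFAULT_CACHE = Path.home() / ".cache" / "torch" / "semantic_text_similarity"
        
        
        def clear_model_cache(cache_dir: Path = DEFAULT_CACHE) -> bool:
            """Delete the model cache directory.
        
            Returns True if a directory was removed, False if there was nothing to remove.
            """
            cache_dir = Path(cache_dir).expanduser()
            if not cache_dir.is_dir():
                return False
            shutil.rmtree(cache_dir)
            return True
        ```
        
        Models would then be downloaded again the next time they are loaded.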
        
        
        # Acknowledgement
        Clinical models in this project were submitted to the 2019 N2C2 Shared Task Track 1.
        Implementation and model training in this project were supported by funding from the Mark Dredze Lab at Johns Hopkins University.
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.5
Classifier: Natural Language :: English
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Intended Audience :: Science/Research
Description-Content-Type: text/markdown
