Metadata-Version: 2.1
Name: extr-ds
Version: 0.0.2
Summary: Library to quickly build basic datasets for Named Entity Recognition (NER) and Relation Extraction (RE) Machine Learning tasks.
Home-page: https://github.com/dpasse/extr-ds
License: UNKNOWN
Description: # extr-ds
        > Library to quickly build basic datasets for Named Entity Recognition (NER) and Relation Extraction (RE) Machine Learning tasks.
        
        <br />
        
        ## Install
        
        ```
        pip install extr-ds
        ```
        
        ## Example
        
        ```python
        text = 'Ted Johnson is a pitcher. Ted went to my school.'
        ```
        
        ### 1. Label Entities for Named-Entity Recognition Task (NER)
        
        ```python
        import re
        
        from extr import RegEx, RegExLabel, EntityExtactor
        from extr_ds import IOB
        
        entity_extractor = EntityExtactor([
            RegExLabel('PERSON', [
                RegEx([r'(ted\s+johnson|ted)'], re.IGNORECASE)
            ]),
            RegExLabel('POSITION', [
                RegEx([r'pitcher'], re.IGNORECASE)
            ]),
        ])
        
        sentence_tokenizer = ...  # placeholder: supply a 3rd party tokenizer here
        labels = IOB(sentence_tokenizer, entity_extractor).label(text)
        
        ## labels == [
        ##     ['B-PERSON', 'I-PERSON', 'O', 'O', 'B-POSITION', 'O'],
        ##     ['B-PERSON', 'O', 'O', 'O', 'O', 'O']
        ## ]
        ```
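        
        To make the IOB scheme in the output above concrete, here is a minimal standalone sketch that produces the same tags using only the standard library. It does not use `extr` or `extr-ds` and is not a description of how either library works internally; the names `tokenize`, `iob_tags`, and `patterns` are hypothetical, and the regex-based tokenizer and sentence splitter are simplifying assumptions standing in for a 3rd party tokenizer.
        
        ```python
        import re
        from typing import List, Tuple
        
        
        def tokenize(sentence: str) -> List[Tuple[str, int, int]]:
            """Hypothetical tokenizer: (token, start, end) triples, punctuation split off."""
            return [(m.group(), m.start(), m.end())
                    for m in re.finditer(r'\w+|[^\w\s]', sentence)]
        
        
        def iob_tags(sentence: str, patterns: List[Tuple[str, str]]) -> List[str]:
            """Tag each token as B-<label>, I-<label>, or O based on regex matches."""
            tokens = tokenize(sentence)
            tags = ['O'] * len(tokens)
            for label, pattern in patterns:
                for match in re.finditer(pattern, sentence, re.IGNORECASE):
                    inside = False
                    for i, (_, start, end) in enumerate(tokens):
                        # A token belongs to the entity if it lies fully inside the match.
                        covered = start >= match.start() and end <= match.end()
                        if covered and tags[i] == 'O':
                            tags[i] = ('I-' if inside else 'B-') + label
                            inside = True
            return tags
        
        
        # Assumed patterns, mirroring the PERSON and POSITION labels used above.
        patterns = [
            ('PERSON', r'ted\s+johnson|ted'),
            ('POSITION', r'pitcher'),
        ]
        
        text = 'Ted Johnson is a pitcher. Ted went to my school.'
        sentences = re.split(r'(?<=\.)\s+', text)  # naive sentence split on periods
        labels = [iob_tags(sentence, patterns) for sentence in sentences]
        ```
        
        The first token of each matched entity gets a `B-` (begin) tag, each following token in the same match gets `I-` (inside), and every other token, including the trailing period, gets `O` (outside).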
        
Platform: UNKNOWN
Description-Content-Type: text/markdown
