Metadata-Version: 2.1
Name: meta-cleaner
Version: 0.2.0
Summary: A Python package to clean text from META tags using a BERT NER model.
Home-page: https://github.com/pirr-me/meta_cleaner
Author: Tim Isbister
Author-email: tim.isbister@pirr.me
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE

# meta_cleaner

`meta_cleaner` is a Python package that removes META tags from text using an XLM-RoBERTa (large) NER model.

`trainer.ipynb` is a notebook that builds the dataset and trains the NER model.

## Installation

```bash
pip install meta-cleaner
```

or directly from the GitHub repository:

```bash
pip install git+https://github.com/pirr-me/meta_cleaner.git
```

### Install Locally

To install in editable mode for development, run this from the root of a local clone:

```bash
pip install -e .
```

## Usage

```python
from meta_cleaner.cleaner import TextCleaner

text_cleaner = TextCleaner(model_name='Pirr/xlmr-large-meta-ner-1464', confidence_threshold=0.25)

# CPU
# text_cleaner = TextCleaner(model_name='Pirr/xlmr-large-meta-ner-1464', confidence_threshold=0.25, device="cpu")

# GPU
# text_cleaner = TextCleaner(model_name='Pirr/xlmr-large-meta-ner-1464', confidence_threshold=0.25, device="cuda")

# Example usage
text = """This is my first story please enjoy it!\nChapter 1\n It was a late evening, we were out for a few drinks and had been chatting for hours. We began to kiss and touched each other. Authors note: Please share this story on Facebook"""

cleaned_text = text_cleaner.clean_text(text)
print("Cleaned Text:", cleaned_text)

```
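
The CPU and GPU variants above differ only in the `device` argument, so the choice can be made at runtime. The following is a minimal sketch, assuming PyTorch is the backend (this README does not state that, though the `"cuda"` device string suggests it). `pick_device` is a hypothetical helper written for illustration and is not part of `meta_cleaner`.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # assumed backend; not confirmed by this README
    except ImportError:
        # Without PyTorch there is nothing to query, so fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Selected device:", device)

# Pass the result to the constructor from the usage example:
# text_cleaner = TextCleaner(
#     model_name='Pirr/xlmr-large-meta-ner-1464',
#     confidence_threshold=0.25,
#     device=device,
# )
```

This keeps the same script usable on machines with and without a GPU, at the cost of one extra import check.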


