Metadata-Version: 2.2
Name: srt2corpus
Version: 0.0.1
Summary: Clean subtitle (SRT) files into a text corpus for NLP projects
Author: Dhruv Kunzru
Project-URL: Homepage, https://github.com/dk10ws/srt2corpus
Project-URL: Issues, https://github.com/dk10ws/srt2corpus
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3
Description-Content-Type: text/markdown
License-File: LICENCE

# SRT2CORPUS

This project provides a Python class, `Clean`, that processes text files by removing unwanted elements such as timestamps, HTML tags, locations, actions, and hyphens, among others. It turns subtitles into a clean, usable corpus for NLP projects.


``` python
from srt2corpus import Clean

# Initialize the Clean class with the path to the subtitle file
cleaner = Clean("path_to_your_file")

# Clean the file and save the result
cleaner.corpus()
```
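
To give a feel for what this kind of cleaning involves, here is a minimal standalone sketch using only the standard library. It is not the implementation inside `srt2corpus`: the function `clean_srt_text` and the regular expressions are hypothetical, and it assumes that actions and locations appear in square brackets or parentheses and that speaker changes are marked with a leading hyphen.

```python
import re

# Hypothetical patterns for illustration only; srt2corpus may work differently.
CUE_INDEX = re.compile(r"^\d+$")  # numeric cue counter on its own line
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}.*$"
)
HTML_TAG = re.compile(r"<[^>]+>")  # e.g. <i>, </i>, <font color="...">
BRACKETED = re.compile(r"\[[^\]]*\]|\([^)]*\)")  # assumed action/location markers
LEADING_HYPHEN = re.compile(r"^\s*-\s*")  # speaker-change dash


def clean_srt_text(srt_text: str) -> str:
    """Return the spoken text of an SRT string, one subtitle line per line."""
    cleaned = []
    for raw in srt_text.splitlines():
        line = raw.strip()
        # Skip blank lines, cue counters, and timing lines.
        # Note: a spoken line consisting only of digits is also dropped.
        if not line or CUE_INDEX.match(line) or TIMESTAMP.match(line):
            continue
        line = HTML_TAG.sub("", line)
        line = BRACKETED.sub("", line)
        line = LEADING_HYPHEN.sub("", line)
        line = re.sub(r"\s+", " ", line).strip()  # collapse leftover whitespace
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)


sample = """1
00:00:01,000 --> 00:00:03,500
<i>[door creaks]</i>
- Hello there.

2
00:00:04,000 --> 00:00:06,000
(sighs) How are you?
"""

print(clean_srt_text(sample))
```

Running the sketch on the sample prints the two spoken lines, `Hello there.` and `How are you?`, with the cue numbers, timestamps, tags, and bracketed actions removed.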
