Metadata-Version: 2.1
Name: youtube2text
Version: 0.0.9
Summary: Convert youtube urls to text with speech recognition
Home-page: https://github.com/codenamewei/youtube2text
Author: codenamewei
Author-email: codenamewei@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE


## Converts Youtube URLs to Text with Speech Recognition 

<p>
  <p align="center">
<img alt="project status: active" src="https://img.shields.io/badge/Project%20Status-%F0%9F%94%A5Active-brightgreen"> <img alt="supported language: english" src="https://img.shields.io/badge/Supported%20Language-English-blueviolet">

</p>



<div align="center">
  <img alt="banner" src="https://user-images.githubusercontent.com/33477318/147850310-902fa3c3-910c-48de-815a-9e8f54487d73.jpg" width="800"><br>
</div>

### What does the library does?

- **Youtube -> Text**: Translate youtube urls as text file (csv)
- **Youtube -> Audio**: Downloads youtube urls as audio file (wav, flac)
- **Audio -> Text**: Translate audio file (wav, flac) to text file (csv)


Three folders will be created to store the output files.  
```
<Own Path> or <HOME_DIRECTORY>/youtube2text
│
├── audio/
│   └── 2022Jan02_011802.flac
|
├── audio-chunks/
│   └── 2022Jan02_011802
│       ├── chunk1.flac
│       ├── chunk2.flac
│       └── chunk3.flac
│   
└── text/
    └── 2022Jan02_011802.csv
```


### How to install
Install and update using [pip](https://pypi.org/project/youtube2text/)
```
pip install youtube2text
```


### Build from source 
```
git clone <this_repo>
cd <this_repo>
python setup.py install
```

### How to use 
- Using the library requires **internet connection** for both downloading youtube videos and speech recognition operation
```
from youtube2text import Youtube2Text

converter = Youtube2Text()

converter.url2text(urlpath="https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s")
```

Check out more at [howtouse.ipynb](tests/howtouse.ipynb)

### Functions 
- Support audio output of   
    - wav
    - flac
- Support Automatic Speech Recognition with backend
    - Native Python backend 
    - Huggingface

#### Youtube -> Text
```
def url2text(self, urlpath, outfile = None, audioformat = "flac", audiosamplingrate=16000):
    '''
    Convert youtube url to text

    Parameters:
        urlpath (str): Youtube url
        outfile (str, optional): File path/name of output file (.csv)
        audioformat (str, optional): Audioformat supported in self.__audioextension
        audiosamplingrate (int, optional): Audio sampling rate
    '''
```

#### Youtube -> Audio
```
def url2audio(self, urlpath, audiofile = None, audiosamplingrate=16000):
    '''
    Convert youtube url to audiofile

    Parameters:
        urlpath (str): Youtube url
        audiofile (str, optional): File path/name to save audio file
        audiosamplingrate (int, optional): Audio sampling rate
    '''
```

#### Audio -> Text
```
def audio2text(self, audiofile, textfile = None):
    '''
    Convert audio to csv file

    Parameters:
        audiofile (str): File path/name of audio file
        textfile (str, optional): File path/name of text file (*.csv)
    '''
```


