Metadata-Version: 2.3
Name: tvsm-extractor
Version: 2.1.0
Summary: Extract TVSM dataset
Project-URL: Homepage, https://github.com/tvquizphd/tvsm-extractor
Project-URL: Issues, https://github.com/tvquizphd/tvsm-extractor/issues
Author: John Hoffer
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Requires-Dist: librosa==0.10.2.post1
Requires-Dist: matplotlib==3.9.3
Requires-Dist: numpy==2.0.2
Requires-Dist: soundfile==0.12.1
Description-Content-Type: text/markdown

### TVSM Extractor

Tooling for reading the [TVSM dataset](https://zenodo.org/records/7025971).

### Usage

Run `tvsm-extractor` from a directory that contains the unzipped TVSM datasets.
Output files are written to `TVSM-extractor-audio` and `TVSM-extractor-images`.
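
To check what a run produced, a small script along these lines can list the contents of the two output directories. This is only a sketch: the helper name `list_outputs` is hypothetical, and it assumes the output directories are created relative to the directory the command was run from.

```python
from pathlib import Path

# Output directory names as stated in this README.
OUTPUT_DIRS = ("TVSM-extractor-audio", "TVSM-extractor-images")


def list_outputs(root="."):
    """Map each output directory name to the sorted file names it contains.

    Directories that do not exist yet map to an empty list.
    Assumption: outputs are created directly under `root`.
    """
    found = {}
    for name in OUTPUT_DIRS:
        folder = Path(root) / name
        if folder.is_dir():
            found[name] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            found[name] = []
    return found


if __name__ == "__main__":
    # Print a per-directory file count for the current working directory.
    for name, files in list_outputs().items():
        print(f"{name}: {len(files)} file(s)")
```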

### Test data

By default, all samples are extracted:

```
tvsm-extractor
```

### Specific language

```
tvsm-extractor --language en
```

### Specific language and genre

```
tvsm-extractor --language en --genre Thrillers
```

### Specific file/files

```
tvsm-extractor test 3235
```
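
For scripted use, the command above can be wrapped in Python. The sketch below is hedged: `build_command` and `run_for_ids` are hypothetical helper names, it assumes `tvsm-extractor` is installed and on `PATH`, and it invokes the CLI once per ID because only the single-ID form is shown here.

```python
import subprocess


def build_command(sample_id):
    """Return the documented `tvsm-extractor test <id>` call as an argument list."""
    return ["tvsm-extractor", "test", str(sample_id)]


def run_for_ids(sample_ids, runner=subprocess.run):
    """Invoke the CLI once per sample ID and collect the return codes.

    `runner` defaults to subprocess.run; it can be swapped for a stub in tests.
    """
    codes = []
    for sample_id in sample_ids:
        result = runner(build_command(sample_id), check=False)
        codes.append(result.returncode)
    return codes
```

Calling `run_for_ids([3235])` from the directory that holds the unzipped datasets would mirror the command shown above.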

### Cue sheet training data

Note: the speech timings are very noisy.

```
tvsm-extractor cuesheet
```


### Pseudo training data

This project has not been tested on `TVSM-pseudo`.


### Dev

```
hatch build
hatch publish
```
