Metadata-Version: 2.1
Name: geda
Version: 0.1.6
Summary: 
Home-page: https://github.com/thawro/geda
Author: thawro
Author-email: tomaszhawro.kontakt@gmail.com
Requires-Python: >=3.9,<3.13
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: black (>=23.3.0,<24.0.0)
Requires-Dist: colored (>=1.4.4,<2.0.0)
Requires-Dist: colorlog (>=6.7.0,<7.0.0)
Requires-Dist: isort (>=5.12.0,<6.0.0)
Requires-Dist: mat73 (>=0.60,<0.61)
Requires-Dist: matplotlib (>=3.7.2,<4.0.0)
Requires-Dist: numpy (>=1.25.2,<2.0.0)
Requires-Dist: opencv-python (>=4.8.0.76,<5.0.0.0)
Requires-Dist: pillow (>=10.0.0,<11.0.0)
Requires-Dist: pre-commit (>=3.2.2,<4.0.0)
Requires-Dist: pypng (>=0.20220715.0,<0.20220716.0)
Requires-Dist: rich (>=13.3.5,<14.0.0)
Requires-Dist: scipy (>=1.11.2,<2.0.0)
Requires-Dist: tqdm (>=4.65.0,<5.0.0)
Project-URL: Repository, https://github.com/thawro/geda
Description-Content-Type: text/markdown

# GeDa

**GeDa** is a Python package that helps you **Ge**t the **Da**ta for your project.

## Installation

```bash
pip install geda
```

## Usage

### Using a specific data provider class

```python
from geda.data_providers.voc import VOCSemanticSegmentationDataProvider

root = "<directory>/<to>/<store>/<data>" # e.g. "data/VOC"
dataprovider = VOCSemanticSegmentationDataProvider(root)
dataprovider.get_data()
```

### Using the `get_data` shortcut

```python
from geda import get_data

root = "<directory>/<to>/<store>/<data>" # e.g. "data/VOC"
dataprovider = get_data(name="VOC_SemanticSegmentation", root=root)
dataprovider.get_data()
```

> The `get_data` function currently supports the following names:
> `DUTS`, `NYUDv2`, `VOC_InstanceSegmentation`, `VOC_SemanticSegmentation`, `VOC_PersonPartSegmentation`, `VOC_Main`, `VOC_Action`, `VOC_Layout`


## What it does

Calling `dataprovider.get_data()` runs the data through the following pipeline:

1. Download the data from source (specified by the `_URLS` variable in each module)
2. Extract the downloaded files if needed (when they are `tar`, `zip` or `gz` archives)
3. Move the files to `<root>/raw` directory
4. Find the split ids (file basenames or indices - depending on the dataset)
5. Arrange files, i.e. move (or copy) files from `<root>/raw` directory to task-specific directories
6. *[Optional]* Create labels in a specific format (e.g. YOLO)
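
Steps 2 and 5 above can be illustrated with a minimal, standard-library-only sketch. This is not GeDa's actual implementation: the helper names `extract_archive` and `arrange_split`, the `images/<split>` target layout and the `.jpg` default extension are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def extract_archive(archive: Path, dst: Path) -> None:
    """Extract an archive into dst (hypothetical helper, not part of GeDa).

    shutil.unpack_archive picks the format from the file extension
    (e.g. .zip, .tar, .tar.gz). A plain, non-tar .gz file is not
    handled here and would need the gzip module instead.
    """
    dst.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(dst))


def arrange_split(
    raw_dir: Path,
    task_dir: Path,
    split_ids: dict[str, list[str]],
    ext: str = ".jpg",
    copy: bool = True,
) -> None:
    """Copy (or move) files from raw_dir into task_dir/images/<split>.

    split_ids maps a split name (e.g. "train") to file basenames
    without extension. The target layout is an assumption.
    """
    for split, ids in split_ids.items():
        dst_dir = task_dir / "images" / split
        dst_dir.mkdir(parents=True, exist_ok=True)
        for sample_id in ids:
            src = raw_dir / f"{sample_id}{ext}"
            dst = dst_dir / src.name
            if copy:
                shutil.copy2(src, dst)  # keep the raw file in place
            else:
                shutil.move(str(src), str(dst))
```

Copying by default keeps `<root>/raw` intact, so the arrangement step can be repeated without downloading the data again; moving saves disk space instead.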

### Example

Directory structure resulting from `get_data(name="VOC_SemanticSegmentation", root="data/VOC")`:

    .
    └── data
        └── VOC
            ├── raw
            │   ├── Annotations
            │   ├── ImageSets
            │   ├── JPEGImages
            │   ├── SegmentationClass
            │   └── SegmentationObject
            ├── SegmentationClass
            │   ├── annots
            │   ├── images
            │   ├── labels
            │   └── masks
            └── trainval_2012.tar
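
To check that a download produced a layout like the one above, the directory can be inspected with a few lines of `pathlib`. The helper below is a hypothetical sketch, not part of GeDa; it only counts files and makes no assumption about their contents.

```python
from pathlib import Path


def count_files(root, max_depth: int = 2) -> dict[str, int]:
    """Map each directory up to max_depth levels below root to the
    number of files it contains, counted recursively."""
    root = Path(root)
    counts: dict[str, int] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_dir():
            continue
        rel = path.relative_to(root)
        if len(rel.parts) > max_depth:
            continue
        counts[rel.as_posix()] = sum(1 for p in path.rglob("*") if p.is_file())
    return counts
```

For example, `count_files("data/VOC")` would return one entry per directory such as `raw/JPEGImages` or `SegmentationClass/masks`, which makes an empty or missing directory easy to spot.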

## Currently supported datasets

### Image Segmentation

* [PASCAL VOC 2012](http://host.robots.ox.ac.uk/pascal/VOC)
* [NYUDv2](https://cs.nyu.edu/~silberman/projects/indoor_scene_seg_sup.html)
* [Person-Parts](http://liangchiehchen.com/projects/DeepLab.html)
* [DUTS](http://saliencydetection.net/duts/)


## Contributing

Pull requests are welcome. For major changes, please open an issue first
to discuss what you would like to change.


## License

[MIT](https://choosealicense.com/licenses/mit/)
