Metadata-Version: 2.1
Name: cocosuite
Version: 0.1.0
Summary: A toolkit for COCO dataset
Author-Email: =?utf-8?q?Jorge_N=C3=BA=C3=B1ez_Santiago?= <jorge.nusan@gmail.com>
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Project-URL: Homepage, https://github.com/jorgenusan/cocosuite
Project-URL: Issues, https://github.com/jorgenusan/cocosuite/issues
Project-URL: Documentation, https://github.com/jorgenusan/cocosuite/blob/main/README.md
Requires-Python: >=3.10
Requires-Dist: numpy==2.0.0
Requires-Dist: matplotlib==3.9.0
Requires-Dist: pre-commit==3.7.1
Requires-Dist: loguru==0.7.2
Requires-Dist: pytest==8.2.2
Requires-Dist: fire==0.6.0
Requires-Dist: pathlib==1.0.1
Description-Content-Type: text/markdown

# COCOSuite

COCOSuite is a comprehensive collection of tools designed to facilitate the management and manipulation of datasets in COCO format.

## Tools

| Tool | Description |
| ---- | ----------- |
| [coco_merge](./cocosuite/scripts/manipulation/coco_merge.py) | Merge two COCO datasets into a single one |
| [merge_multiple](./cocosuite/scripts/manipulation/merge_multiple_coco_files.py) | Allows merging of multiple COCO files into a single dataset |
| [random_split](./cocosuite/scripts/manipulation/random_split.py) | Performs a random division of the dataset into training and validation subsets, configurable in terms of data proportion. |
| [property_split](./cocosuite/scripts/manipulation/property_split.py) | Divides a COCO dataset into training and validation sets according to specific image properties |
| [coco_filter](./cocosuite/scripts/manipulation/coco_filter.py) | Filters a COCO dataset based on certain criteria |

## Installation

```bash
git clone https://github.com/jorgenusan/cocosuite.git
cd cocosuite
pip3 install -r requirements.txt
```

## Usage

> [!IMPORTANT]
> In all scripts, if no path is specified in the <output_filename> argument and only a name is specified. The resulting file will be created in the input annotations file path

### Basic example

```bash
python3 /cocosuite/scripts/manipulation/coco_merge.py <annotations_file_1> <annotations_file_2> <output_filename>
```

### config_split

Applies to **property_split** and **coco_filter**

```bash
python3 /cocosuite/scripts/manipulation/property_split.py <annotations_file> <config_split>
```

For these scripts you have to pass a configuration file with the criteria for separating the data in the case of **property_split** and the criteria for filtering in the case of **coco_filter**.

Here is an example for each of these cases:

1. property_split

   ```json
    "criteria": {
      "file_name": ["image1", "image2"],
      "height": [480]
    },
    "match_all": true
   ```

2. coco_filter

    ```json
    "filter": {
      "file_name": ["image1", "image2"],
      "height": [480]
    },
    "match_all": true
   ```

> [!NOTE]
> the **match_all** property, when set to `true` means that both properties have to match in order to filter or split a new file.<br>
> If set to `false`, it filters or splits for each property.
