Metadata-Version: 2.1
Name: selfclean
Version: 0.0.15
Summary: A holistic self-supervised data cleaning strategy to detect irrelevant samples, near duplicates and label errors.
Home-page: https://github.com/Digital-Dermatology/SelfClean
Author: Fabian Groeger
Author-email: fabian.groeger@unibas.ch
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# SelfClean

[**SelfClean Paper**](https://arxiv.org/abs/2305.17048) | [**Data Cleaning Protocol Paper**](https://arxiv.org/abs/2309.06961)

![SelfClean Teaser](https://github.com/Digital-Dermatology/SelfClean/raw/main/assets/SelfClean_Teaser.png)

<h2 align="center">

[![PyPI version](https://badge.fury.io/py/selfclean.svg)](https://badge.fury.io/py/selfclean)
![Contribution](https://img.shields.io/badge/Contribution-Welcome-brightgreen)

</h2>

A holistic self-supervised data cleaning strategy to detect irrelevant samples, near duplicates, and label errors.

## Development Environment
Run `make` for a list of possible targets.

### Installation
Run these commands to install the project:
```bash
make init
make install
```

To run linters on all files:
```bash
pre-commit run --all-files
```

### Code and Test Conventions
- `black` for code style
- `isort` for import sorting
- `pytest` for running tests


