Metadata-Version: 2.1
Name: duplicate-image-finder
Version: 0.2.7
Summary: duplicate image finder helps you find duplicate or similar images as well as delete them.
Home-page: https://github.com/LordAmit/duplicate_image_finder
Keywords: image,similar image,duplicate image,imagehash
Author: Amit
Author-email: lordamit@gmail.com
Requires-Python: >=3.9,<3.10
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: Flask (>=2.1.2,<3.0.0)
Requires-Dist: Flask-Cors (>=3.0.10,<4.0.0)
Requires-Dist: ImageHash (>=4.2.1,<5.0.0)
Requires-Dist: Jinja2 (>=3.1.2,<4.0.0)
Requires-Dist: Pillow (>=9.1.1,<10.0.0)
Requires-Dist: more-itertools (>=8.13.0,<9.0.0)
Requires-Dist: pandas (>=1.4.2,<2.0.0)
Requires-Dist: pathlib (>=1.0.1,<2.0.0)
Requires-Dist: python-magic-bin (==0.4.14)
Requires-Dist: termcolor (>=1.1.0,<2.0.0)
Requires-Dist: types-termcolor (>=1.1.4,<2.0.0)
Project-URL: Repository, https://github.com/LordAmit/duplicate_image_finder
Description-Content-Type: text/markdown

# Duplicate Image Finder

Duplicate Image Finder uses image hashing to find similar or duplicate images in your local storage. All you need to do is:

1. install it,
2. run it once; if no configuration is provided, this sets up the database and its table,
3. run it again, specifying which directory to scan for images, and finally
4. run it once more, asking it to show the duplicate/similar images.


For example:

```sh
# installing
python3.9 -m pip install --user duplicate-image-finder

# show help
duplicate-image-finder --help
# add directory images and calculate hashes using 4 threads
duplicate-image-finder --add <directory> --parallel 4
# show the duplicate/similar images found in your browser
duplicate-image-finder --show
```
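
To give a feel for how image hashing can flag similar images, here is a minimal sketch of an average hash compared by Hamming distance. It is an illustration only, not this project's implementation (which relies on the `imagehash` library): the function names `average_hash` and `hamming_distance` and the tiny 2x2 pixel grids are made up for the example.

```python
from typing import List

# Hypothetical stand-in for a decoded image: rows of grayscale values (0-255).
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Build one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions where the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]    # same pattern, uniformly brightened
different = [[200, 10], [30, 220]]   # inverted pattern

# A small distance suggests similar images; a large one suggests different images.
print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(different)))
```

Because the hash depends on each pixel's brightness relative to the mean, a uniformly brightened copy hashes to the same bits, while an unrelated pattern ends up many bits away.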

## Requirements

Quite a few, but all of them are installed automatically as dependencies as long as you are using `python3.9`. Unfortunately, some of the dependencies are not yet available for `python3.10`, so the project is stuck on 3.9 for now.

## Poetry

Installing dependencies

```sh
poetry install
```

Running

```sh
poetry run python duplicate_image_finder/duplicate_finder.py --show
```

Testing

```sh
poetry run pytest
```
etc.

The source code of this duplicate image finder is inspired by, and partially copied from, https://github.com/philipbl/duplicate-images.git.

Significant changes from the referenced version are:

1. moved from `mongodb` to `sqlite`
2. probably better at finding similar images (or perhaps I misunderstood the previous code)
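
As a rough idea of what storing hashes in `sqlite` makes possible, the sketch below groups paths that share a hash using Python's built-in `sqlite3` module. The `images` table, its columns, and the sample rows are all assumptions made for illustration; the actual schema used by this project may differ.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per image path with its hash as hex text.
conn.execute("CREATE TABLE images (path TEXT PRIMARY KEY, hash TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO images (path, hash) VALUES (?, ?)",
    [
        ("a/cat.jpg", "ffd8e0c0"),
        ("b/cat_copy.jpg", "ffd8e0c0"),
        ("a/dog.jpg", "0f1e3c78"),
    ],
)

# Keep only hashes that occur more than once, i.e. exact-hash duplicates.
rows = conn.execute(
    "SELECT hash, GROUP_CONCAT(path, '|') FROM images "
    "GROUP BY hash HAVING COUNT(*) > 1"
).fetchall()

# GROUP_CONCAT does not guarantee an order, so sort the paths explicitly.
duplicates = {hash_value: sorted(paths.split("|")) for hash_value, paths in rows}
print(duplicates)
conn.close()
```

This query only catches identical hashes; finding merely similar images would additionally need a distance comparison between hashes.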

Concepts/Technologies I learned/tried to learn while doing this:

1. `poetry` for dependency management
2. `pytest` for unit testing
3. `pysqlite3` for the database
4. `concurrency` for performance
5. `imagehash` for perceptual image hashing to find similar images
6. grouping CLI arguments in Python (mutually exclusive, etc.) using `argparse`
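
For the last item, a minimal sketch of a mutually exclusive argument group with `argparse` could look like the following. The flag names `--add`, `--show`, and `--parallel` come from the usage example above, but how they are grouped, their help texts, and the default of `1` for `--parallel` are assumptions, not necessarily what this project does.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where --add and --show cannot be combined (assumed grouping)."""
    parser = argparse.ArgumentParser(prog="duplicate-image-finder")

    # Only one option from this group may appear on the command line.
    actions = parser.add_mutually_exclusive_group()
    actions.add_argument("--add", metavar="DIRECTORY",
                         help="directory to scan for images")
    actions.add_argument("--show", action="store_true",
                         help="show duplicate/similar images")

    # Independent option; the default of 1 is an assumption for this sketch.
    parser.add_argument("--parallel", type=int, default=1,
                        help="number of threads used for hashing")
    return parser


args = build_parser().parse_args(["--add", "photos", "--parallel", "4"])
print(args.add, args.parallel, args.show)
```

Passing both `--add` and `--show` to this parser makes `argparse` print a usage error and exit, which is the point of the mutually exclusive group.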

