Metadata-Version: 2.1
Name: duplicate-image-finder
Version: 0.2.6
Summary: duplicate image finder helps you find duplicate or similar images as well as delete them.
Home-page: https://github.com/LordAmit/duplicate_image_finder
Keywords: image,similar image,duplicate image,imagehash
Author: Amit
Author-email: lordamit@gmail.com
Requires-Python: >=3.9,<3.10
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: Flask (>=2.1.2,<3.0.0)
Requires-Dist: Flask-Cors (>=3.0.10,<4.0.0)
Requires-Dist: ImageHash (>=4.2.1,<5.0.0)
Requires-Dist: Jinja2 (>=3.1.2,<4.0.0)
Requires-Dist: Pillow (>=9.1.1,<10.0.0)
Requires-Dist: more-itertools (>=8.13.0,<9.0.0)
Requires-Dist: pandas (>=1.4.2,<2.0.0)
Requires-Dist: pathlib (>=1.0.1,<2.0.0)
Requires-Dist: python-magic-bin (==0.4.14)
Requires-Dist: termcolor (>=1.1.0,<2.0.0)
Requires-Dist: types-termcolor (>=1.1.4,<2.0.0)
Project-URL: Repository, https://github.com/LordAmit/duplicate_image_finder
Description-Content-Type: text/markdown

# Duplicate Image Finder

Duplicate image finder uses image hashing to find similar or duplicate images in your local storage. All you need to do is:

1. install
2. install dependencies (using `poetry`)
3. run it (optionally through `poetry`, as described in the Poetry section below)

For example:

```sh
# installing
pip install duplicate-image-finder
# show help
python duplicate_finder.py --help
# add directory images and calculate hashes using 4 threads
python duplicate_finder.py --add <directory> --parallel 4
# show the duplicate/similar images found in your browser
python duplicate_finder.py --show
```
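
To give a rough idea of what "image hashing" means here, the following is a minimal sketch rather than the project's actual code. The project relies on the `imagehash` library; this sketch swaps in a toy pure-Python average hash so it runs with the standard library only. The names `average_hash`, `hamming_distance`, and the tiny 2x2 pixel grids are all hypothetical. The thread pool mirrors the idea behind `--parallel 4`, assuming hashes are computed independently per image.

```python
from concurrent.futures import ThreadPoolExecutor


def average_hash(pixels):
    """Toy average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# Hypothetical 2x2 grayscale "images" (values 0-255). Real images would be
# decoded and downscaled first, for example with Pillow.
images = {
    "original": [[10, 200], [220, 30]],
    "brighter_copy": [[20, 210], [230, 40]],
    "different": [[200, 10], [30, 220]],
}

# Hash all images using 4 worker threads, similar in spirit to `--parallel 4`.
with ThreadPoolExecutor(max_workers=4) as pool:
    hashes = dict(zip(images, pool.map(average_hash, images.values())))

# A small distance suggests similar images; a large one suggests different images.
distance_to_copy = hamming_distance(hashes["original"], hashes["brighter_copy"])
distance_to_other = hamming_distance(hashes["original"], hashes["different"])
```

In this sketch the uniformly brightened copy hashes to the same value as the original, while the unrelated grid differs in every bit. That is the property that makes perceptual hashes useful for finding near-duplicates.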

## Poetry

Installing dependencies

```sh
poetry install
```

Running

```sh
poetry run python duplicate_image_finder/duplicate_finder.py --show
```

Testing

```sh
poetry run pytest
```

Other commands can be run through `poetry run` in the same way.

The source code of this duplicate image finder is inspired by, and partially copied from, https://github.com/philipbl/duplicate-images.git.

Significant changes from the referenced version are:

1. Moved from `mongodb` to `sqlite`.
2. Probably better at finding similar images (or perhaps I misunderstood the previous code).
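
As an illustration of the first point, here is a minimal sketch of how image hashes might be kept in SQLite using Python's built-in `sqlite3` module. The table name `images`, its columns, and the sample paths and hash strings are assumptions made for this example, not the project's actual schema.

```python
import sqlite3

# In-memory database for the sketch; a real tool would pass a file path instead.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE images (path TEXT PRIMARY KEY, hash TEXT NOT NULL)"
)

# Hypothetical paths and hash strings, for illustration only.
rows = [
    ("a/cat.jpg", "ffd8c0e0"),
    ("b/cat_copy.jpg", "ffd8c0e0"),
    ("c/dog.jpg", "0123abcd"),
]
connection.executemany("INSERT INTO images (path, hash) VALUES (?, ?)", rows)
connection.commit()

# Hashes stored more than once point at exact-duplicate candidates.
duplicates = connection.execute(
    "SELECT hash, COUNT(*) FROM images GROUP BY hash HAVING COUNT(*) > 1"
).fetchall()
connection.close()
```

Grouping by hash only catches identical hashes. Finding merely similar images would additionally require comparing hashes by distance.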

Concepts/Technologies I learned/tried to learn while doing this:

1. `poetry` for dependency management
2. `pytest` for unit testing
3. `pysqlite3` for the database
4. `concurrency` for performance
5. `imagehash` for perceptual image hashing to find similar images
6. grouping CLI arguments in Python (mutually exclusive, etc.) using `argparse`
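
The last item can be sketched as follows. The flag names `--add`, `--show`, and `--parallel` come from the usage example earlier in this README, but treating `--add` and `--show` as mutually exclusive is an assumption, and `build_parser` is a hypothetical helper, not the project's real parser.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a duplicate finder CLI")
    # Exactly one action may be chosen per invocation (assumed grouping).
    actions = parser.add_mutually_exclusive_group(required=True)
    actions.add_argument("--add", metavar="DIRECTORY", help="hash images in a directory")
    actions.add_argument("--show", action="store_true", help="show duplicates found")
    # An option that modifies an action rather than being one itself.
    parser.add_argument("--parallel", type=int, default=1, help="number of workers")
    return parser


parser = build_parser()
add_args = parser.parse_args(["--add", "photos", "--parallel", "4"])
show_args = parser.parse_args(["--show"])
```

Passing both `--add` and `--show` to this parser makes `argparse` print an error and exit, which is the point of the mutually exclusive group.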

