Metadata-Version: 2.4
Name: difpy2
Version: 0.1.0
Summary: A super-fast in-memory duplicate & similar image finder
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: imagehash>=4.3.2
Requires-Dist: numba>=0.61.2
Requires-Dist: numpy>=2.2.5
Requires-Dist: pillow>=11.2.1

# difpy2

[![PyPI version](https://img.shields.io/pypi/v/difpy2.svg)](https://pypi.org/project/difpy2/)  
[![Python versions](https://img.shields.io/pypi/pyversions/difpy2.svg)](https://pypi.org/project/difpy2/)  
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE.txt)

A **super-fast**, **in-memory** duplicate & similar image finder built on perceptual-hash bucketing and Numba-accelerated comparison.

---

## Features

- **Zero on-disk output**: everything runs in RAM  
- **Exact & “similar” mode** (custom MSE threshold)  
- **Perceptual-hash + histogram pre-bucketing** to prune comparisons  
- **Numba-JIT** mean-squared-error with early bailout  
- **Thread-pooled** image loading & feature extraction  
- **CLI** and **Python API**  
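
The "early bailout" idea can be illustrated with a minimal sketch. This is not the package's implementation: it uses plain NumPy rather than Numba, and the function name `mse_with_bailout` and the `chunk_rows` parameter are hypothetical. It accumulates squared differences in row chunks and stops as soon as the running sum proves the final MSE must exceed the threshold.

```python
from typing import Optional

import numpy as np


def mse_with_bailout(
    a: np.ndarray, b: np.ndarray, threshold: float, chunk_rows: int = 8
) -> Optional[float]:
    """Return the MSE of two same-shaped arrays, or None once it must exceed `threshold`.

    Hypothetical helper for illustration only.
    """
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    # MSE <= threshold is equivalent to sum(diff^2) <= threshold * element_count.
    budget = threshold * a.size
    total = 0.0
    for start in range(0, a.shape[0], chunk_rows):
        # Cast to float64 first so uint8 subtraction cannot wrap around.
        chunk_a = a[start:start + chunk_rows].astype(np.float64)
        chunk_b = b[start:start + chunk_rows].astype(np.float64)
        diff = chunk_a - chunk_b
        total += float(np.sum(diff * diff))
        if total > budget:
            return None  # bail out: remaining rows can only increase the sum
    return total / a.size
```

Because squared differences are never negative, the running sum only grows, so stopping early never rejects a pair that would have passed the threshold.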

---

## Installation

```bash
pip install difpy2
```

Requires Python ≥ 3.12.

## Quickstart

### CLI

```bash
difpy2 \
  -D /path/to/images \
  --px_size 50 \
  --bins 8 \
  --sim 0.0     # exact duplicates only; use >0 for “similar” mode
```
#### Options

- `-D`, `--dirs` … one or more image directories
- `-r`, `--recursive` … recurse into subfolders
- `-px`, `--px_size` … resize images to px×px for comparison
- `-b`, `--bins` … per-channel histogram buckets
- `-s`, `--sim` … MSE threshold (0.0 = exact only)
- `-t`, `--threads` … number of worker threads
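
To give a feel for what per-channel histogram buckets can do, here is a rough sketch of histogram-based grouping. It is an assumption about the general technique, not the package's actual code: the helper names `histogram_key` and `bucket_images` are hypothetical, and the key used here (the most populated bin per channel) is only one of many possible signatures. Only images that share a key would then need a full pixel comparison.

```python
from collections import defaultdict

import numpy as np


def histogram_key(pixels: np.ndarray, bins: int = 8) -> tuple:
    """Coarse signature: index of the most populated histogram bin per channel.

    Hypothetical helper; expects an H x W x C array of 0-255 values.
    """
    key = []
    for channel in range(pixels.shape[2]):
        counts, _ = np.histogram(pixels[..., channel], bins=bins, range=(0, 256))
        key.append(int(np.argmax(counts)))
    return tuple(key)


def bucket_images(images: dict, bins: int = 8) -> dict:
    """Group image names by their histogram key."""
    buckets = defaultdict(list)
    for name, pixels in images.items():
        buckets[histogram_key(pixels, bins)].append(name)
    return dict(buckets)
```

A coarse key like this trades recall for speed: two near-identical images whose dominant values straddle a bin boundary would land in different buckets, so a real implementation would likely need a more tolerant signature.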

### Python API

```python
from difpy2 import DuplicateFinder

finder = DuplicateFinder(
    directories=["/path/to/images"],
    px_size=50,
    hist_bins=8,
    similarity=0.0,    # exact duplicates
    threads=4,
)

results, lower_quality, stats = finder.run()

# results: { primary_image_path: [[duplicate_path, mse], …], … }
# lower_quality: [all duplicate/similar image paths]
# stats: { total_files, featurized, groups, duration_s }
```
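
As a small sketch of how the `results` mapping might be consumed, the helper below flattens it into rows sorted by MSE. It assumes only the shape described in the comments above (`{primary: [[duplicate, mse], …]}`); the function name `flatten_results` and the sample paths are hypothetical.

```python
def flatten_results(results: dict) -> list:
    """Turn {primary: [[duplicate, mse], ...]} into (primary, duplicate, mse) rows.

    Hypothetical helper; rows are sorted so the closest matches come first.
    """
    rows = []
    for primary, matches in results.items():
        for duplicate, mse in matches:
            rows.append((primary, duplicate, float(mse)))
    rows.sort(key=lambda row: row[2])
    return rows


# Made-up sample data in the documented shape, for illustration only.
sample = {
    "a.jpg": [["b.jpg", 0.0], ["c.jpg", 12.5]],
    "d.jpg": [["e.jpg", 3.0]],
}
for primary, duplicate, mse in flatten_results(sample):
    print(f"{duplicate} matches {primary} (MSE {mse:.1f})")
```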

## Project Layout

```text
difpy2/
├── difpy_opt.py         # core implementation
├── README.md
├── LICENSE.txt
├── pyproject.toml
└── …
```

## Contributing

1. Fork the repo.
2. Create a feature branch.
3. Run the tests and linters.
4. Submit a pull request.

## License

This project is licensed under the MIT License. See [LICENSE.txt](LICENSE.txt).
