Metadata-Version: 2.1
Name: sklearn-ann
Version: 0.1.2
Summary: Various integrations for ANN (Approximate Nearest Neighbours) libraries into scikit-learn.
Project-URL: Source, https://github.com/scikit-learn-contrib/sklearn-ann
Project-URL: Documentation, https://sklearn-ann.readthedocs.io/
Author-email: Frankie Robertson <frankie@robertson.name>, Philipp Angerer <phil.angerer@gmail.com>
License-Expression: BSD-3-Clause
License-File: LICENSE
Requires-Python: <3.13,>=3.9
Requires-Dist: scikit-learn>=0.24.0
Requires-Dist: scipy<2.0.0,>=1.11.1
Provides-Extra: annlibs
Requires-Dist: sklearn-ann[annoy,faiss,nmslib,pynndescent]; extra == 'annlibs'
Provides-Extra: annoy
Requires-Dist: annoy<2.0.0,>=1.17.0; extra == 'annoy'
Provides-Extra: docs
Requires-Dist: matplotlib>=3.3.3; extra == 'docs'
Requires-Dist: numpydoc>=1.1.0; extra == 'docs'
Requires-Dist: scanpydoc; extra == 'docs'
Requires-Dist: sphinx-book-theme>=1.1.0rc1; extra == 'docs'
Requires-Dist: sphinx-gallery>=0.8.2; extra == 'docs'
Requires-Dist: sphinx-issues>=1.2.0; extra == 'docs'
Requires-Dist: sphinx>=7; extra == 'docs'
Provides-Extra: faiss
Requires-Dist: faiss-cpu<2.0.0,>=1.6.5; extra == 'faiss'
Provides-Extra: nmslib
Requires-Dist: nmslib<3.0.0,>=2.1.1; (python_version < '3.11') and extra == 'nmslib'
Provides-Extra: pynndescent
Requires-Dist: pynndescent<1.0.0,>=0.5.1; extra == 'pynndescent'
Provides-Extra: tests
Requires-Dist: pytest-cov>=2.10.1; extra == 'tests'
Requires-Dist: pytest>=6.2.1; extra == 'tests'
Description-Content-Type: text/x-rst

**sklearn-ann** eases integration of approximate nearest neighbours
libraries such as annoy, nmslib and faiss into your sklearn
pipelines. It consists of:

* ``Transformers`` conforming to the same interface as
  ``KNeighborsTransformer`` which can be used to transform feature matrices
  into sparse distance matrices for use by any estimator that can deal with
  sparse distance matrices. Many, but not all, of scikit-learn's clustering and
  manifold learning algorithms can work with this kind of input.
* RNN-DBSCAN: a variant of DBSCAN based on reverse nearest
  neighbours.
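
As a rough illustration of the first point, the sketch below feeds a sparse
distance graph into a clustering estimator through a pipeline. It assumes that
scikit-learn and NumPy are installed and uses scikit-learn's own exact
``KNeighborsTransformer`` as a stand-in. Because the transformers in this
package follow the same interface, one of them should be usable in its place.
The toy data and the ``eps`` and ``min_samples`` values are made up for the
example.

.. code-block:: python

    import numpy as np
    from sklearn.cluster import DBSCAN
    from sklearn.neighbors import KNeighborsTransformer
    from sklearn.pipeline import make_pipeline

    # Toy data (an assumption for this sketch): two well-separated
    # blobs of 30 points each in 50 dimensions.
    rng = np.random.default_rng(0)
    X = np.vstack(
        [
            rng.normal(0.0, 0.1, size=(30, 50)),
            rng.normal(5.0, 0.1, size=(30, 50)),
        ]
    )

    # The transformer turns the feature matrix into a sparse distance
    # graph; DBSCAN then consumes that graph as precomputed distances.
    pipeline = make_pipeline(
        KNeighborsTransformer(n_neighbors=10, mode="distance"),
        DBSCAN(eps=2.0, min_samples=5, metric="precomputed"),
    )
    labels = pipeline.fit_predict(X)

Only the first pipeline step would need to change to switch from the exact
neighbour search to an approximate one.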

Installation
============

To install the latest release from PyPI, run:

.. code-block:: bash

    pip install sklearn-ann

To install the latest development version from GitHub, run:

.. code-block:: bash

    pip install git+https://github.com/scikit-learn-contrib/sklearn-ann.git#egg=sklearn-ann

Why? When do I want this?
=========================

The main scenario in which this is needed is performing
*clustering or manifold learning on high dimensional data*. The
reason is that currently the only neighbourhood algorithms which are
built into scikit-learn are essentially the standard tree approaches
to space partitioning: the ball tree and the K-D tree. These do not
perform competitively in high dimensional spaces.

Development
===========

This project is managed using Hatch_ and pre-commit_. To get started, run ``pre-commit
install`` and ``hatch env create``. Run all commands using ``hatch run python
<command>`` which will ensure the environment is kept up to date. pre-commit_ comes into
play on every ``git commit`` after installation.

Consult ``pyproject.toml`` for which dependency groups and extras exist,
and the Hatch help or user guide for more info on what they are.

.. _Hatch: https://hatch.pypa.io/
.. _pre-commit: https://pre-commit.com/
