Metadata-Version: 1.0
Name: repoze.catalog
Version: 0.7.1dev
Summary: Searching and indexing based on zope.index
Home-page: http://www.repoze.org
Author: Agendaless Consulting
Author-email: repoze-dev@lists.repoze.org
License: BSD-derived (http://www.repoze.org/LICENSE.txt)
Description: ================
        repoze.catalog
        ================
        
        An indexing and searching system based on `zope.index`_.
        
        .. _`zope.index`: http://pypi.python.org/pypi/zope.index
        
        See the ``docs`` subdirectory for documentation or the `online docs
        <http://docs.repoze.org/catalog/>`_.
        
        
        
        0.7.1 (2009-09-27)
        ==================
        
        - Minimally get docs into shape for First PyPI release; no
        functionality changes from 0.7.0.
        
        0.7.0 (2009-08-03)
        ==================
        
        - Fixed bug in ``DocumentMap.add`` which left orphan mappings for previous
        addresses when adding an existing docid with a new address.
        
        - Added the ability to sort by text relevance. Use the name of the text
        index as the ``sort_index`` in the query.
        
        0.6.2 (2009-04-15)
        ==================
        
        - Add metadata-related APIs to ``repoze.catalog.document.DocumentMap``:
        ``add_metadata``, ``remove_metadata``, ``get_metadata``.
        "Metadata" is a freeform set of key/value pairs related to a docid.
        See the API documentation for more information.
        
        0.6.1 (2009-02-25)
        ==================
        
        - Fixed constructor inheritance issues which kept ``repoze.catalog``
        from working under Python 2.6.  Note that this change involved removing
        the ``*args, **kw`` arguments from index constructors:  those values were
        never used, but had (bogus) tests.
        
        0.6.0 (2009-02-16)
        ==================
        
        - N-Best ascending fieldindex sort was being chosen incorrectly when
        there was no limit.  Symptom: ``RuntimeError, 'n-best used without
        limit'``.
        
        0.5.9 (2009-02-16)
        ==================
        
        - Add ``reindex_doc`` method as an alias for ``index_doc`` to both
        CatalogFieldIndex and CatalogKeywordIndex (for performance,
        ``index_doc`` for both indexes has special case code for reindexing).
        
        0.5.8 (2009-02-16)
        ==================
        
        - Speed up path2 index attribute search by using __getitem__ rather
        than .get in some places.
        
        - Override textindex reindex_doc method: calling index_doc only
        instead of calling unindex_doc and then index_doc is much more
        efficient.
        
        0.5.7 (2009-02-14)
        ==================
        
        - Attributes returned to attribute checker were not correct.
        
        0.5.6 (2009-02-12)
        ==================
        
        - Add "attribute discriminator" and "attribute checker" support to
        path2 index.  If an index is created with an attribute
        discriminator, when any object is indexed, the value of the
        attribute will be stored in the path index.  The path index will
        know that that attribute belongs to a particular path.  Later, when
        the "attribute checker" feature of the ``apply`` or ``search``
        method is used, a user-supplied attribute checker function will be
        able to filter the result set returned by the index.  This is used
        by the author primarily to support security-filtered searches of a
        path index.  It is not otherwise documented.
        
        0.5.5 (2009-02-11)
        ==================
        
        - Add a ``reindex_doc`` method to the catalog and to the ``common``
        shared index base class.  The catalog's ``reindex_doc`` calls each
        index's ``reindex_doc`` method when called.  The common shared index
        base class implementation unindexes the docid and then subsequently
        indexes the document using the docid.  This method can be overridden
        for specific indexes to do something different on a reindex.
        
        - ``repoze.catalog.indexes.path2.CatalogPathIndex2`` now takes an
        extra argument to its search method named ``include_path``.  If this
        is true, the docid set returned will include the docid for the path
        specified by the path query parameter.  The ``apply`` method of the
        index allows for the specification of the ``include_path`` as a
        dictionary member in an ``apply`` call which specifies the query as
        a dictionary.
        
        - Give ``path2.CatalogPathIndex2`` index a better ``reindex_doc``
        method than the default.
        
        - The CatalogKeywordIndex's ``apply`` method mutated the query passed
        in if it was a dict.  To fix, we override the ``apply`` method from
        the zope.index implementation.
        
        - Added a Range class importable as ``repoze.catalog.Range``.  The
        Range class should be used to represent a range query to a
        CatalogFieldIndex.  The old-style of passing a 2-tuple to represent
        a range is still supported, but will be eventually removed in favor
        of requring a Range object to represent a Range query.  A Range
        object can be instantiated ala "Range(start, end)".
        
        - It is now possible to pass a sequence of items to the
        CatalogFieldIndex ``apply`` method.  When a sequence of terms that
        is passed in is *not* a tuple with two items in it (the previous API
        representing a range, which is deprecated), it will be considered a
        query for multiple terms.  The docids returned for each term will be
        unioned together to form the result.
        
        - It is now possible to pass a dictionary to the CatalogFieldIndex
        ``apply`` method.  When a dictionary is passed, the member of the
        dictionary named ``query`` is treated as the query.  It may be a
        single term, a sequence of terms, or a Range.  An additional
        dictionary member named ``operator`` may also be specified: when
        this is specified, it must be one of ``or`` or ``and`` (the default
        is ``or``).  If the query specifies multiple terms, and the operator
        is ``or``, the results will be unioned; if the query specifies
        multiple terms and the operator is ``and``, the results will be
        intersected.
        
        0.5.4 (2009-02-05)
        ==================
        
        Features
        --------
        
        - A newer path index implementation importable as
        ``repoze.catalog.path2.CatalogPathIndex2`` has been added as another
        index type.  The path2 index type is an improvement inasmuch as it
        actually uses a graph to represent structure instead of the "levels"
        scheme pioneered within Zope2 (and used by
        ``repoze.catalog.path.CatalogPathIndex``). By eye, the "levels"
        scheme looks like it can return the wrong results for any given path
        for a sufficiently dense tree.
        
        - Catalog indexes must now supply an ``apply_intersect`` method; it
        receives a query and a set of docids (the result intersection "so
        far").  It should have the same sort of return value as the
        ``apply`` method.  Indexes which inherit from
        ``common.CatalogIndex`` will inherit a default implementation.
        
        - It is now possible to specify index query/merge order within a
        catalog query.  See ``Index Query/Merge Order`` in the docs.
        
        0.5.3 (2009-01-05)
        ==================
        
        Features
        --------
        
        - Better detection of when to use fwdscan on ascending sorts in field
        indexes.
        
        - Better detection of when to use nbest vs. timsort on ascending sorts
        in field indexes.
        
        0.5.2 (2009-01-04)
        ==================
        
        Features
        --------
        
        - Allow a new catalog search method keyword: ``sort_type``.  For
        ascending sorts, this can be one of ``nbest``, ``fwscan``, or
        ``timsort``.  For descending sorts, only ``nbest`` and ``timsort``
        are supported.  This argument allows fine-grained control of what
        algorithm should be chosen to perform sorting within FieldIndex
        code.
        
        - Better automatic detection of which sort algorithm to use (when it's
        not supplied via ``sort_type``) based on empirical testing.
        
        - Depend on zope.index 3.5.0 rather than any earlier version
        (repoze.catalog fixes migrated upstream in zope.index 3.5.0).
        
        - Add 'sortbench' script to test various field index sort strategies
        (requires 'benchmark' extra to create charts).
        
        Bug Fixes
        ---------
        
        - Prevent the potential for a zero division error when attempting to
        sort an empty set of results.
        
        0.5.1 (2008-12-31)
        ==================
        
        Features
        --------
        
        - Optimize the choice of fieldindex sort strategy.
        
        - Speed up keyword index merges slightly.
        
        - Fix a bug in the return value of the catalog: it would try to return
        the minimum of the number of docs or the limit event if there was no
        limit.
        
        Bug Fixes
        ---------
        
        - Sean Upton pointed out that the document map code artificially
        limited the number of documents to half the number that it could
        actually handle.
        
        0.5 (2008-11-10)
        ================
        
        Features
        --------
        
        - Add path index.
        
        - Speed up keyword index 'and' (intersection) queries nominally by
        sorting intersected sets from smallest-to-largest first.
        
        - Benchmarking suite provided by Chris Rossi.
        
        - Add a "facet" index
        (``repoze.catalog.indexes.facet.CatalogFacetIndex``).  This index is
        much like a keyword index, but unlike a keyword index it contains a
        facet list (a sequence of known colon-separated values) and accepts
        values that are sequences of colon-separated terms.  Each term is
        split on its colons, forming a sequence of categories, then each
        concatenation of the categories is indexed.  For example, if you
        indexed a document as ``['style:gucci:handbag']``, and the facet
        list contained ``'style'``, ``'style:gucci'`` and
        ``'style:gucci:handbag'``, the document would be indexed three
        times: as ``style``, as ``style:gucci`` and as
        ``style:gucci:handbag``.  Querying a facet index returns a set of
        document ids that match the facets passed in.  A facet index also
        has a ``counts`` method which provided a set of document ids,
        returns a dictionary containing "further constraint information" for
        use in a narrowing UI.  This count implementation is not meant for
        very large-scale sites; it is naive.
        
        0.4 (2008-10-06)
        ================
        
        Features
        --------
        
        - Speed up keyword index 'or' (union) queries by using a single
        IFBTree.multiunion instead of multiple calls to IFBTree.union; this
        is most helpful for speeding up 'or' queries where there are lots of
        terms in the query sequence.
        
        Documentation
        -------------
        
        - Add ``overview`` page.
        
        0.3 (2008-10-04)
        ================
        
        Features
        --------
        
        - Add ``repoze.catalog.document.DocumentMap`` class, which provides a
        mechanism to map "addresses" (paths) to document ids.
        
        Documentation
        -------------
        
        - Add API documentation for catalog and document map.
        
        Backwards incompatibilities
        ---------------------------
        
        - Rename ``searchResults`` method to ``search``.
        
        - Removed ``updateIndex`` and ``updateIndexes`` methods of catalog.
        
        - All index implementations moved into ``repoze.catalog.indexes``.
        
        - All interfaces moved to ``repoze.catalog.interfaces``.
        
        0.2 (2008-09-26)
        ================
        
        - Provide ``sort_index`` capability.
        
        0.1 (2008-07-26)
        ================
        
        - Initial release.
        
Keywords: indexing catalog search
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
