Metadata-Version: 2.1
Name: pandas-select
Version: 0.2.0
Summary: Supercharged DataFrame indexing
Home-page: https://github.com/jeffzi/pandas-select/
License: BSD-3-Clause
Keywords: pandas,scikit-learn
Author: Jean-Francois Zinque
Author-email: jzinque@gmail.com
Requires-Python: >=3.6.1,<3.10
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering
Provides-Extra: docs
Provides-Extra: tests
Requires-Dist: Sphinx (>=3.4.0,<4.0.0); extra == "docs"
Requires-Dist: furo (>=2020.12.9-beta.21,<2021.0.0); extra == "docs"
Requires-Dist: importlib_metadata (>=1.5.0,<2.0.0); python_version < "3.8"
Requires-Dist: ipython (>=7.12.0,<8.0.0); extra == "docs"
Requires-Dist: pandas (>=0.25.3)
Requires-Dist: pandera (>=0.6.0,<0.7.0); extra == "docs" or extra == "tests"
Requires-Dist: pytest (>=6.2.1,<7.0.0); extra == "tests"
Requires-Dist: scikit-learn (>=0.20); extra == "docs" or extra == "tests"
Requires-Dist: sphinx-copybutton (>=0.3.1,<0.4.0)
Requires-Dist: sphinx-panels (>=0.5.2,<0.6.0); extra == "docs"
Requires-Dist: xdoctest (>=0.15.0,<0.16.0); extra == "docs"
Project-URL: Documentation, https://pandas-select.readthedocs.io/
Project-URL: Repository, https://github.com/jeffzi/pandas-select/
Description-Content-Type: text/x-rst

==================================================
``pandas-select``: Supercharged DataFrame indexing
==================================================

.. image:: https://github.com/jeffzi/pandas-select/workflows/tests/badge.svg
   :target: https://github.com/jeffzi/pandas-select/actions
   :alt: GitHub Actions status

.. image:: https://codecov.io/gh/jeffzi/pandas-select/branch/master/graph/badge.svg
   :target: https://codecov.io/gh/jeffzi/pandas-select
   :alt: Coverage

.. image:: https://readthedocs.org/projects/pandas-select/badge/?version=latest
   :target: https://pandas-select.readthedocs.io/
   :alt: Documentation status

.. image:: https://img.shields.io/pypi/v/pandas-select.svg
   :target: https://pypi.org/project/pandas-select/
   :alt: Latest PyPI version

.. image:: https://img.shields.io/pypi/pyversions/pandas-select.svg
   :target: https://pypi.org/project/pandas-select/
   :alt: Python versions supported

.. image:: https://img.shields.io/pypi/l/pandas-select.svg
   :target: https://pypi.python.org/pypi/pandas-select/
   :alt: License

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
   :target: https://github.com/psf/black
   :alt: Code style: black

.. image:: https://img.shields.io/badge/style-wemake-000000.svg
   :target: https://github.com/wemake-services/wemake-python-styleguide
   :alt: Code style: wemake-python-styleguide

``pandas-select`` is a collection of DataFrame selectors that simplify indexing
and selecting data. It is fully compatible with vanilla pandas indexing.

The selector functions can choose variables based on their
`name <https://pandas-select.readthedocs.io/en/latest/reference/label_selectors.html>`_,
`data type <https://pandas-select.readthedocs.io/en/latest/reference/label_selection.html#data-type-selectors>`_,
`arbitrary conditions <https://pandas-select.readthedocs.io/en/latest/reference/api/pandas_select.label.LabelMask.html>`_,
or any `combination of these <https://pandas-select.readthedocs.io/en/latest/reference/label_selection.html#logical-operators>`_.

``pandas-select`` is inspired by the excellent R library `tidyselect <https://tidyselect.r-lib.org/reference/language.html>`_.

.. installation-start

Installation
------------

``pandas-select`` is a Python-only package `hosted on PyPI <https://pypi.org/project/pandas-select/>`_.
It can be installed via `pip <https://pip.pypa.io/en/stable/>`_:

.. code-block:: console

   pip install pandas-select

.. installation-end

Design goals
------------

* Fully compatible with the
  `pandas.DataFrame <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html>`_
  ``[]`` operator and the
  `pandas.DataFrame.loc <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html?highlight=loc#pandas.DataFrame.loc>`_
  accessor.

* Emphasise readability and conciseness by cutting boilerplate:

.. code-block:: python

   # pandas-select
   df[AllNumeric()]
   # vanilla
   df.select_dtypes("number")

   # pandas-select
   df[StartsWith("Type") | "Legendary"]
   # vanilla
   df.loc[:, df.columns.str.startswith("Type") | (df.columns == "Legendary")]

* Ease the challenges of `indexing with hierarchical index <https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced-indexing-with-hierarchical-index>`_
  and offer an alternative to `slicers <https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced-mi-slicers>`_
  when the labels cannot be listed manually.

.. code-block:: python

    # pandas-select
    df_mi.loc[Contains("Jeff", axis="index", level="Name")]

    # vanilla
    df_mi.loc[df_mi.index.get_level_values("Name").str.contains("Jeff")]

* Play well with machine learning applications.

  - Respect column order.
  - Allow *deferred selection* when the DataFrame's columns are not known in advance,
    for example in automated machine learning applications.
  - Offer integration with `sklearn <https://scikit-learn.org/stable/>`_.

    .. code-block:: python

        from pandas_select import AnyOf, AllBool, AllNominal, AllNumeric, ColumnSelector
        from sklearn.compose import make_column_transformer
        from sklearn.preprocessing import OneHotEncoder, StandardScaler

        ct = make_column_transformer(
            (StandardScaler(), ColumnSelector(AllNumeric() & ~AnyOf("Generation"))),
            (OneHotEncoder(), ColumnSelector(AllNominal() | AllBool() | "Generation")),
        )
        ct.fit_transform(df)
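
The column-order and deferred-selection goals above can be illustrated without
pandas at all. The following is a minimal sketch in plain Python, not the
library's implementation: ``starts_with``, ``any_of`` and ``union`` are
hypothetical helpers invented for this example. It assumes a selector is simply
a callable that receives the available column labels and returns the ones it selects.

```python
from typing import Callable, List, Sequence

# Hypothetical illustration only: a "selector" is any callable that maps the
# available column labels to the subset it selects.
Selector = Callable[[Sequence[str]], List[str]]


def starts_with(prefix: str) -> Selector:
    """Select labels that start with `prefix`, keeping their original order."""
    return lambda columns: [c for c in columns if c.startswith(prefix)]


def any_of(*labels: str) -> Selector:
    """Select the given labels if they are present, keeping column order."""
    wanted = set(labels)
    return lambda columns: [c for c in columns if c in wanted]


def union(left: Selector, right: Selector) -> Selector:
    """Combine two selectors; the result follows column order, not selector order."""
    def select(columns: Sequence[str]) -> List[str]:
        chosen = set(left(columns)) | set(right(columns))
        return [c for c in columns if c in chosen]
    return select


# The selector is built before any column labels are known...
selector = union(any_of("Legendary"), starts_with("Type"))

# ...and is only resolved once the labels are available.
first = ["Name", "Type 1", "Type 2", "HP", "Legendary"]
second = ["Legendary", "Typeless", "Speed"]
print(selector(first))   # ['Type 1', 'Type 2', 'Legendary']
print(selector(second))  # ['Legendary', 'Typeless']
```

Because the same ``selector`` object is resolved separately against each list of
labels, it can be defined once and reused on data whose columns are only known later.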


Project Information
-------------------

``pandas-select`` is released under the `BSD-3-Clause <https://choosealicense.com/licenses/bsd-3-clause/>`_ license,
its documentation lives at `Read the Docs <https://pandas-select.readthedocs.io/>`_,
the code on `GitHub <https://github.com/jeffzi/pandas-select>`_,
and the latest release on `PyPI <https://pypi.org/project/pandas-select/>`_.
It is tested on Python 3.6 through 3.9.

