Metadata-Version: 2.1
Name: civisml-extensions
Version: 0.2.1
Summary: scikit-learn-compatible estimators from Civis Analytics
Home-page: https://www.civisanalytics.com
Author: Civis Analytics
Author-email: opensource@civisanalytics.com
License: BSD-3
Platform: UNKNOWN
Description-Content-Type: text/x-rst
Requires-Dist: scikit-learn (>=0.18.1)
Requires-Dist: scipy (<2.0,>=0.14)
Requires-Dist: numpy (~=1.10)
Requires-Dist: six (~=1.10)
Requires-Dist: pandas (<=0.23,>=0.19) ; python_version == "2.7"
Requires-Dist: pandas (<0.21,>=0.19) ; python_version == "3.4"
Requires-Dist: pandas (~=0.19) ; python_version >= "3.5"

civisml-extensions
==================

.. image:: https://www.travis-ci.org/civisanalytics/civisml-extensions.svg?branch=master
    :target: https://www.travis-ci.org/civisanalytics/civisml-extensions

scikit-learn-compatible estimators from Civis Analytics

Installation
------------

Installation with ``pip`` is recommended::

    $ pip install civisml-extensions

For development, a few additional dependencies are needed::

    $ pip install -r dev-requirements.txt

Contents and Usage
------------------

This package contains `scikit-learn`_-compatible estimators for stacking
(``StackedClassifier``, ``StackedRegressor``), non-negative linear regression
(``NonNegativeLinearRegression``), preprocessing pandas_ ``DataFrames``
(``DataFrameETL``), and using Hyperband_ for cross-validating hyperparameters
(``HyperbandSearchCV``).

Usage of these estimators follows the standard sklearn conventions. Here is an
example of using the ``StackedClassifier``:

    .. code-block:: python

        >>> from sklearn.linear_model import LogisticRegression
        >>> from sklearn.ensemble import RandomForestClassifier
        >>> from civismlext.stacking import StackedClassifier
        >>> # Note that the final estimator 'metalr' is the meta-estimator
        >>> estlist = [('rf', RandomForestClassifier()),
        ...            ('lr', LogisticRegression()),
        ...            ('metalr', LogisticRegression())]
        >>> mysm = StackedClassifier(estlist)
        >>> # Set some parameters, if you didn't set them at instantiation
        >>> mysm.set_params(rf__random_state=7, lr__random_state=8,
        ...                 metalr__random_state=9, metalr__C=10**7)
        >>> # Fit (Xtrain, ytrain, and Xtest are your own feature and label arrays)
        >>> mysm.fit(Xtrain, ytrain)
        >>> # Predict!
        >>> ypred = mysm.predict_proba(Xtest)

You can learn more about stacking and see an example use of the
``StackedRegressor`` and ``NonNegativeLinearRegression`` estimators in
`a talk presented at PyData NYC`_ in November 2017.
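
For readers new to stacking, the general idea can be sketched with plain
scikit-learn. This is only an illustration of the technique, not the
implementation used by ``StackedClassifier``: the toy dataset, the 3-fold
split, and the variable names (``base_estimators``, ``meta_train``,
``meta_test``) are assumptions made for the example. Base estimators produce
out-of-fold predictions, and a meta-estimator is then trained on those
predictions.

.. code-block:: python

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict, train_test_split

    # Toy data, assumed purely for illustration.
    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, random_state=0)

    base_estimators = [
        RandomForestClassifier(n_estimators=20, random_state=7),
        LogisticRegression(random_state=8),
    ]

    # Out-of-fold class probabilities from each base estimator become the
    # training features of the meta-estimator.
    meta_train = np.hstack([
        cross_val_predict(est, Xtrain, ytrain, cv=3, method="predict_proba")
        for est in base_estimators
    ])
    meta = LogisticRegression(random_state=9)
    meta.fit(meta_train, ytrain)

    # Refit each base estimator on all training data, then stack its
    # test-set probabilities in the same column order.
    meta_test = np.hstack([
        est.fit(Xtrain, ytrain).predict_proba(Xtest)
        for est in base_estimators
    ])
    ypred = meta.predict_proba(meta_test)

Using out-of-fold predictions keeps the meta-estimator from being trained on
predictions that the base estimators made for rows they had already seen.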

See the docstrings of the various estimators for more information.
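
For intuition about the Hyperband_ search strategy mentioned above, the
bracket schedule described in the Hyperband paper can be computed in a few
lines. This is a minimal sketch under stated assumptions, not the code used by
``HyperbandSearchCV``: the function ``hyperband_schedule`` and its parameter
``max_resource`` are hypothetical names, and ``eta`` is assumed to be an
integer of at least 2.

.. code-block:: python

    def hyperband_schedule(max_resource, eta=3):
        """Return Hyperband brackets as lists of (n_configs, resource) rounds.

        Hypothetical helper for illustration only.
        """
        # s_max is the largest s such that eta**s <= max_resource.
        s_max = 0
        while eta ** (s_max + 1) <= max_resource:
            s_max += 1

        brackets = []
        for s in range(s_max, -1, -1):
            # Initial number of configurations in this bracket:
            # ceil((s_max + 1) * eta**s / (s + 1)), using integer arithmetic.
            n = -(-(s_max + 1) * eta ** s // (s + 1))
            rounds = []
            for i in range(s + 1):
                n_i = n // eta ** i                   # configurations kept
                r_i = max_resource / eta ** (s - i)   # resource per config
                rounds.append((n_i, r_i))
            brackets.append(rounds)
        return brackets


    for rounds in hyperband_schedule(81, eta=3):
        print(rounds)

Each bracket starts many configurations on a small budget and keeps roughly
the best ``1 / eta`` of them at every round, so later brackets trade fewer
candidates for a larger starting budget.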

Contributing
------------

See ``CONTRIBUTING.md`` for information about contributing to this project.

License
-------

BSD-3

See ``LICENSE.md`` for details.

.. _scikit-learn: http://scikit-learn.org/
.. _pandas: http://pandas.pydata.org/
.. _Hyperband: https://arxiv.org/abs/1603.06560
.. _a talk presented at PyData NYC: https://www.youtube.com/watch?v=3gpf1lGwecA


