Metadata-Version: 1.1
Name: pysqldf
Version: 1.2.1
Summary: sqldf for pandas
Home-page: https://github.com/airtoxin/pysqldf/
Author: Ryoji Ishii
Author-email: airtoxin@icloud.com
License: MIT
Description: pysqldf
        =======
        
        |Build Status| |PyPI Version| |PyPI Monthly Downloads| |PyPI License|
        
        ``pysqldf`` allows you to query ``pandas`` DataFrames using SQL syntax.
        It works similarly to ``sqldf`` in R. ``pysqldf`` seeks to provide a
        more familiar way of manipulating and cleaning data for people new to
        Python or ``pandas``.
        
        Installation
        ------------
        
        ``$ pip install pysqldf``
        
        Basics
        ------
        
        The main class in pysqldf is ``SQLDF``. ``SQLDF`` accepts 1 enviroment
        variable sets or more parametrs in constructor. - an set of
        session/environment variables (dictionary of valiables, ``locals()`` or
        ``globals()``) - temporary file type - user defined functions - user
        defined aggregate functions
        
        ``pysqldf`` uses `SQLite syntax <http://www.sqlite.org/lang.html>`__.
        Any convertable data to ``pandas`` DataFrames will be automatically
        detected by ``pysqldf``. You can query them as you would any regular SQL
        table.
        
        .. code:: python
        
            $ python
            >>> from pysqldf import SQLDF, load_meat, load_births
            >>> sqldf = SQLDF(globals())
            >>> meat = load_meat()
            >>> births = load_births()
            >>> print sqldf.execute("SELECT * FROM meat LIMIT 10;").head()
                              date  beef  veal  pork  lamb_and_mutton broilers other_chicken turkey
            0  1944-01-01 00:00:00   751    85  1280               89     None          None   None
            1  1944-02-01 00:00:00   713    77  1169               72     None          None   None
            2  1944-03-01 00:00:00   741    90  1128               75     None          None   None
            3  1944-04-01 00:00:00   650    89   978               66     None          None   None
            4  1944-05-01 00:00:00   681   106  1029               78     None          None   None
        
            >>> q = "SELECT m.date, m.beef, b.births FROM meat m INNER JOIN births b ON m.date = b.date;"
            >>> print sqldf.execute(q).head()
                                date    beef  births
            403  2012-07-01 00:00:00  2200.8  368450
            404  2012-08-01 00:00:00  2367.5  359554
            405  2012-09-01 00:00:00  2016.0  361922
            406  2012-10-01 00:00:00  2343.7  347625
            407  2012-11-01 00:00:00  2206.6  320195
        
            >>> q = "SELECT strftime('%Y', date) AS year, SUM(beef) AS beef_total FROM meat GROUP BY year;"
            >>> print sqldf.execute(q).head()
               year  beef_total
            0  1944        8801
            1  1945        9936
            2  1946        9010
            3  1947       10096
            4  1948        8766
        
        user defined functions and user defined aggregate functions also
        supported.
        
        .. code:: python
        
            $ python
            >>> from pysqldf import SQLDF, load_iris
            >>> import math
            >>> import numpy
            >>> ceil = lambda x: math.ceil(x)
            >>> udfs = { "ceil": lambda x: math.ceil(x) }
            >>> udafs = { "variance": lambda values: numpy.var(values) }
            >>> # or you can also define aggregation function as class
            >>> # class variance(object):
            ... #     def __init__(self):
            ... #         self.a = []
            ... #     def step(self, x):
            ... #         self.a.append(x)
            ... #     def finalize(self):
            ... #         return numpy.var(self.a)
            ...
            >>> # udafs={ "variance": variance }
            >>> iris = load_iris()
            >>> sqldf = SQLDF(globals(), udfs=udfs, udafs=udafs)
            >>> sqldf.execute("""
                SELECT
                    ceil(sepal_length) AS sepal_length,
                    ceil(sepal_width) AS sepal_width,
                    ceil(petal_length) AS petal_length,
                    ceil(petal_width) AS petal_width,
                    species
                FROM iris;
                """).head()
               sepal_length  sepal_width  petal_length  petal_width      species
            0             6            4             2            1  Iris-setosa
            1             5            3             2            1  Iris-setosa
            2             5            4             2            1  Iris-setosa
            3             5            4             2            1  Iris-setosa
            4             5            4             2            1  Iris-setosa
            >>> sqldf.execute("SELECT species, variance(sepal_width) AS var FROM iris GROUP BY species;")
                       species       var
            0      Iris-setosa  0.142276
            1  Iris-versicolor  0.096500
            2   Iris-virginica  0.101924
        
        Documents
        ---------
        
        ``SQLDF(env, inmemory=True, udfs={}, udafs={})``
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        
        ``env``: variable mapping dictionary of sql executed enviroment. key is
        sql variable name and value is your program variable. ``locals()`` or
        ``globals()`` is used for simple assign.
        
        ``inmemory``: sqlite db option.
        
        ``udfs``: dictionary of user defined functions. dictionary key is
        function name, dictionary value is function. see `sqlite3
        document <https://docs.python.org/2.7/library/sqlite3.html#sqlite3.Connection.create_function>`__
        
        ``udafs``: dictionary of user defined aggregate functions. dictionary
        key is function name, dictionary value is aggregate function or class.
        If value is function, function gets one argument that is list of column
        values and it should return aggregated a value. Another case(value is
        class), see `sqlite3
        document <https://docs.python.org/2.7/library/sqlite3.html#sqlite3.Connection.create_aggregate>`__.
        
        ``load_iris()``, ``load_meat()``, ``load_births()``
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        
        load example DataFrame data.
        
        -  iris: `data
           description <https://archive.ics.uci.edu/ml/datasets/Iris>`__
        -  meat: `data
           description <http://www.ers.usda.gov/data-products/livestock-meat-domestic-data.aspx>`__
        -  births: `data
           description <http://data.un.org/Data.aspx?d=POP&f=tableCode:55>`__
        
        .. |Build Status| image:: https://travis-ci.org/airtoxin/pysqldf.svg
           :target: https://travis-ci.org/airtoxin/pysqldf
        .. |PyPI Version| image:: https://img.shields.io/pypi/v/pysqldf.svg
           :target: https://pypi.python.org/pypi/pysqldf
        .. |PyPI Monthly Downloads| image:: https://img.shields.io/pypi/dm/pysqldf.svg
           :target: https://pypi.python.org/pypi/pysqldf
        .. |PyPI License| image:: https://img.shields.io/pypi/l/pysqldf.svg
           :target: https://pypi.python.org/pypi/pysqldf
        
Keywords: sqldf pandas dataframe sql pandasql
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: SQL
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
