Metadata-Version: 2.1
Name: django-simple-stats
Version: 0.7.4
Summary: A django package for creating simple stats from a query
Home-page: https://github.com/spapas/django-simple-stats
Author: Serafeim Papastefanos
Author-email: spapas@gmail.com
License: MIT
Classifier: Environment :: Web Environment
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Framework :: Django
Classifier: Framework :: Django :: 4.2
Classifier: Framework :: Django :: 5.0
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Libraries
License-File: LICENSE

.. image:: https://badge.fury.io/py/django-simple-stats.svg
    :target: https://badge.fury.io/py/django-simple-stats
    
.. image:: https://github.com/spapas/django-simple-stats/actions/workflows/ghtox.yml/badge.svg
    :target: https://github.com/spapas/django-simple-stats/actions/workflows/ghtox.yml
    
django-simple-stats
-------------------

A django package for creating stats from a query. 
This package should be compatible with all 
Django versions > 3.x

You can use something like this:

.. code-block:: python

    class TravelStats(StatSet):
        total = QueryAggregateSingleStat(label='Total')
        amount = QueryAggregateSingleStat(label='Total amount', method='sum')
        port__name = QueryAggregateStat(label='By port', )
    
to get something like this:

.. image:: ./showme.png
  :alt: How does it look

Installation
============

Install it from pip:

``pip install django-simple-stats``

or the latest version from git:

``pip install git+https://github.com/spapas/django-simple-stats``

No other installation is needed.


Stat kinds
==========

* The ``query_aggregate_single`` / ``QueryAggregateSingleStat``  will run the aggregate function on a field and return a single value. For example you can get the total  number of rows of your query or the sum of all fields. 

* The ``query_aggregate`` / ``QueryAggregateStat`` will run the aggregate function on a field and return the list of values. You can run the aggregate function *on a different field* by passing ``aggr_field`` (so you can group by a field and return the sum of another field for each group). This is mainly useful for foreign keys and if you've got distinct values in your queries. For example count the number of rows per user. Also it is useful for booleans for example to get the number of rows that have a flag turned on and off. 

* The ``choice_aggregate``  / ``ChoiceAggregateStat`` is similar to the ``query_aggregate`` but will use a ``choices`` attribute to return better looking values. This will not return Null values

* The ``choice_aggregate_with_null`` / ``ChoiceAggregateNullStat`` is the same as ``choice_aggregate`` but will return Null values (so you can add a ``(None, "Empty")`` choice to your choices)

* The ``query_aggregate_date`` / ``QueryAggregateDateStat`` is similar to the ``query_aggregate`` but will return the aggregates on a specific date field; use ``what`` to pass ``year``, ``month``, ``day``.

* The ``query_aggregate_datetime`` / ``QueryAggregateDateTimeStat`` is similar to the ``query_aggregate_date`` but will return the aggregates on time also.

* The ``query_aggregate_extract_date`` / ``QueryAggregateExtractDateStat`` is similar to ``query_aggregate_date`` but will use ``Extract`` for the date instead of ``Trunc``. This is useful if you want to group by the month/day/hour etc as a *specific* value, i.e this will group all rows of June on the same row while ``query_aggregate_date`` will differntiate between June 21 and June 22 and June 23.

* Finally, the ``query_aggregate_buckets`` // ``QueryAggregateBucketsStat`` is used to create buckets of values. You'll pass a list of buckets, and the query will return the results that belong in each bucket. The stats module will run individual queries using field__gte for single-value buckets and field__gte / field__lte for range buckets. For example, if you pass [100, 50, (10, 20)] for a field price, it will run price__gte=100, price__gte=50, and for the range bucket (10, 20), it will run price__gte=10 AND price__lte=20 and return the results accordingly.


Usage
=====

Both a declarative (similar to how Django models/forms work) and a functional API can be used.

For the declarative API, you need to declare a class inheriting from ``simple_stats.StatSet`` and
declare the stats fields in it using the corresponding stats classes like ``simple_stats.QueryAggregateStat``.
Then you'll initialize this class in your Django view passing it the query. For example:

.. code-block:: python

    from simple_stats import StatSet, QueryAggregateSingleStat
    
    class MyStats(StatSet):
        id = QueryAggregateSingleStat(label='Total number')
    
    # Then in your view
    stats = MyStats(query=MyModel.objects.all())

The stats classes take the following initialization parameters:

* field (optional): The field that the aggreate will run on; use ``__`` for joins i.e ``fiedld1__field2``
* label (optional): The textual description of this statistic (will be filled with the field if not passed)
* method (optional): The aggregate method. Can be one of ``count``, ``sum``, ``max``, ``min``, ``avg``. Default is count
* what (optional): Only required for ``query_aggregate_date``, it is eithed ``year``, ``month``, ``day``
* choices (optional): Only required for ``choice_aggregate`` and ``choice_aggregate_with_null``, it must be a django choices list 
* buckets (optional): Only required for ``query_aggregate_buckets``. Must be a list from the biggest to the lowest value.
* aggr_field (optional): This field is optional and can be used for ``query_aggregate``, ``query_aggregate_date``, ``choice_aggregate`` and ``choice_aggregate_with_null``. It denotes a field that would run the aggregate function on. It can also be a list to return multiple fields.
* formatter (optional): A callback that can be used to format the value, it should get a value and return a rendered value (i.e ``lambda v: "${}".format(v)``)
* headers (optional): A list of strings to be used as headers for the table.

Please notice that if you don't pass the ``field`` parameter then the name of the attribute will be used (i.e. it will be 
``field=id`` in the example above).

The ``StatSet`` instance will be an enumerable returning a list of dictionaries with the following attributes:

* label: Same as the label in the configuration
* value: Will have a value if you use the query_aggregate_single, else will be None 
* values: Will be empty for query_aggregate_single else will be a list of tuples. Each tuple will have two elements, ``(label, value)``

On the other hand, the only supported method in the functional API the ``simple_stats.get_stats``. 
It expects a django query and the stats configuration (list of dicts). 
Each element of the configuration list is a dictionary that has the same attributes as the init parameters
for the class. There are only two differences:

* The field is now required 
* We must pass the kind of aggregate we need (similar to the class we used on the declarative API). Choices here are: ``query_aggregate_single``, ``query_aggregate``, ``choice_aggregate``, ``choice_aggregate_with_null``, ``query_aggregate_date``, ``query_aggregate_buckets``. 

See below for a complete example.

The response will be a list of dictionaries with the same attributes as the ``StatSet`` instance.

Please notice that the declarative api will create a dictionary and actually call the ``get_stats``
function so in both cases the result will be exactly the same.


Example declarative
===================

Please remember in the example below that if you don't pass the ``field`` parameter then the name of the 
attribute will be used. Also by default the ``method`` is ``count``.

.. code-block:: python

    from simple_stats import from simple_stats import (
        StatSet,
        QueryAggregateStat,
        QueryAggregateSingleStat,
        ChoiceAggregateStat,
        QueryAggregateDateStat,
        QueryAggregateBucketsStat,
    )

    class MyStats(StatSet):
        id = QueryAggregateSingleStat(label='Total number')
        price = QueryAggregateSingleStat(label='Total price', method='sum')
        pilot_authority__name = QueryAggregateStat(label='Per authority')
        pilot_authority__name = QueryAggregateStat(label='Per authority by price', aggr_field='price')
        status = ChoiceAggregateStat(label='Per status', choices=MyModel.STATUS_CHOICES)
        status_price = ChoiceAggregateStat(
            label='Per status by price', 
            choices=MyModel.STATUS_CHOICES, 
            field='status', 
            aggr_field='price'
            formatter=lambda v: "€ {}".format(v) if v else '-'
        )
        year = QueryAggregateDateStat(label='Per year', what='year', field='created_on')
        year_price = QueryAggregateDateStat(label='Per year by price', what='year', aggr_field='price', field='created_on')
        buckets = QueryAggregateBucketsStat(label='Buckets', buckets=[100, 50, 10])

    def my_view(request):
        qs = TestModel.objects.all()

        stats = MyStats(qs)
        return render(request, 'my_template.html', {'stats': stats})

the ``stats`` result will be an enumerable similar to this one:

.. code-block:: python

  [
    {'label': 'Total', 'values': [], 'value': 1216}, 
    {'label': 'Total price', 'values': [], 'value': 323.16}, 
    {'label': 'Per authority', 'values': [('Authority 1', 200), ('Authority 2', 9),   ], 'value': None}, 
    {'label': 'Per authority by price', 'values': [('Authority 1', 123.23), ('Authority 2', 42.12),   ], 'value': None}, 
    {'label': 'Per status', 'values': [('New', 200), ('Cancel', 0)], 'value': None},
    {'label': 'Per status by price', 'values': [('New', '€ 32.01'), ('Cancel', '€ 44.23')], 'value': None},
    {'label': 'Per year', 'values': [(2021, 582), (2022, 634)], 'value': None}
    {'label': 'Per year by price', 'values': [(2021, 5.82), (2022, 6.34)], 'value': None}
    {'label': 'Per price', 'values': [('> 5000', 1), ('> 1000', 29), ('> 500', 86), ('> 0', 305)], 'value': None}
  ]

  
You can display this in your template using something like this (using bootstrap):

.. code-block:: html

  <div class='row'>
    {% for s in stats %}
    <div class='col-md-4 mb-5' style='max-height: 500px; overflow: auto;'>
        <h4>{{ s.label }}</h4>
        {% if s.values %}
            <table class='table table-condensed table-striped small table-sm'>
                {% for v in s.values %}
                    <tr>
                        <td>{{ v.0 }}</td>
                        <td>{{ v.1 }}</td>
                    </tr>
                {% endfor %}
            </table>
        {% else %}
            <b>{{ s.value }}</b>
        {% endif %}
    </div>
    {% endfor %}
  </div>


Example functional
===================

.. code-block:: python

    from simple_stats import get_stats

    STATS_CFG = cfg = [
            {
                'kind': 'query_aggregate_single',
                'label': 'Total',
                'field': 'id',
            }, {
                'kind': 'query_aggregate_single',
                'label': 'Total price',
                'method': 'sum',
                'field': 'price',
            }, {
                'kind': 'query_aggregate',
                'label': 'Per authority',
                'field': 'pilot_authority__name',
            }, {
                'kind': 'query_aggregate',
                'label': 'Per authority by price',
                'field': 'pilot_authority__name',
                'aggr_field': 'price',
                'formatter': lambda v: "€ {}".format(v) if v else '-'
            }, {
                'kind': 'choice_aggregate',
                'label': 'Per status',
                'field': 'status',
                'choices': models.STATUS_CHOICES,
            }, {
                'kind': 'choice_aggregate',
                'label': 'Per status by price',
                'field': 'status',
                'aggr_field': 'price',
                'choices': models.STATUS_CHOICES,
            }, {
                'kind': 'query_aggregate_date',
                'label': 'Per year',
                'field': 'created_on',
                'what': 'year',
            }, {
                'kind': 'query_aggregate_date',
                'label': 'Per year by price',
                'field': 'created_on',
                'what': 'year',
                'aggr_field': 'price',
            }, {
                'kind': 'query_aggregate_buckets',
                'label': 'Per price',
                'field': 'price',
                'buckets': [100_00, 50_00, 1_000, 500, 0]
            }
        ]

    def my_view(request):
        qs = TestModel.objects.all()

        stats = get_stats(qs, STATS_CFG)
        return render(request, 'my_template.html', {'stats': stats})

The ``stats`` will be an array of dictionaries, similar to the declarative example.


Exporting the stats
===================

You can easily export these stats in xls using the xlwt (https://pypi.org/project/xlwt/) library and this function:

.. code-block:: python

    import xlwt
    
    def create_xls_resp(stats, response):
        context = self.get_context_data()
        import xlwt
        wb = xlwt.Workbook(encoding="utf-8")
        for stat in stats:
            ws = wb.add_sheet(stat["label"][:31])
            ws.write(0,0,stat["label"], xlwt.easyxf('font: name Calibri, bold on', ))
            if stat["value"]:
                ws.write(0,1,stat["value"], xlwt.easyxf('font: name Calibri, bold on', ))

            for i, val in enumerate(stat["values"], start=2):
                for j,v in enumerate(val, start=0):
                    ws.write(i,j,v)
        wb.save(response)

Now you can call it like this from your view:

.. code-block:: python

    from django.http import HttpResponse

    def my_export_view(request):
        qs = TestModel.objects.all()

        stats = get_stats(qs, STATS_CFG)
        response = HttpResponse(content_type="application/ms-excel")
        response["Content-Disposition"] = "attachment; filename=export.xls"
        create_xls_resp(response)
        return response
            

Changelog
=========

* v.0.7.3: Add range for buckets
* v.0.7.3: Make sure it works if dates are null
* v.0.7.2: Small fixes
* v.0.7.1: Allow aggr_field to be a list
* v.0.7.0: Add ``query_aggregate_extract_date`` and ``QueryAggregateExtractDateStat``
* v.0.6.0: Add tests!
* v.0.5.1: Allow adding a formatter for the values
* v.0.5.0: Add declarative API
* v.0.4.0: Allow the aggregate function to run on a different field using ``aggr_field``
* v.0.3.1: Fix small bug with ``choice_aggregate_with_null``
* v.0.3.0: Add ``choice_aggregate_with_null`` and throw if stat kind is not found
* v.0.2.1: Fix small bug with column aliases
* v.0.2.0: Changed API; use ``query_aggregate_datetime`` for a datetime field and ``query_aggregate_date`` for a date field
* v.0.1.0: Initial version
