Metadata-Version: 2.1
Name: django-partitioned-audit
Version: 1.0.4
Summary: Keep audit log based on PostgreSql triggers and partitioned tables
Home-page: https://gitlab.com/hgdeoro/django-partitioned-audit
Author: Horacio G. de Oro
Author-email: hgdeoro@gmail.com
License: BSD-3-Clause
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Web Environment
Classifier: Framework :: Django
Classifier: Framework :: Django :: 3.2
Classifier: Framework :: Django :: 4.1
Classifier: Framework :: Django :: 4.2
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
License-File: LICENSE

========================
django-partitioned-audit
========================

Creates an audit log of the modifications made to models, by using triggers.

* Only PostgreSql supported
* Models to audit are configured in ``AppConfig``
* Triggers are created automatically when running ``python manage.py migrate``
* The whole row is saved by the trigger
* Two different implementations for audit tables (see `SimpleAuditTableManager` and `AdvancedAuditTableManager`)
* The audit table is partitioned
* There are 3 partition strategies: one partition per month, one per week, one per day
* Management command `manage_partition_tables` creates required partitions

Quick start (SimpleAuditTableManager)
-------------------------------------

1. Add "django_partitioned_audit" to your INSTALLED_APPS setting like this::

    INSTALLED_APPS = [
        ...
        'django_partitioned_audit',

        'your.app1.App1Config',
        'your.app2.App2Config',
        'your.app3.App3Config',
    ]

2. Run ``python manage.py migrate`` to create the partitioned table.

There is no real migration to create the partitioned table. Instead, a `post_migrate` signal is used.

3. Run ``manage.py manage_partition_tables create --extra-days=60`` to create the partitions::

    $ manage.py manage_partition_tables create --extra-days=60
    +----------------------------------+--------------+--------------+---------------+
    |  table_name                      |  from_date   |  to_date     |  status       |
    +----------------------------------+--------------+--------------+---------------+
    |  audit_simple_20220201_20220301  |  2022-02-01  |  2022-03-01  |  ⚠ to create  |
    |  audit_simple_20220301_20220401  |  2022-03-01  |  2022-04-01  |  ⚠ to create  |
    |  audit_simple_20220401_20220501  |  2022-04-01  |  2022-05-01  |  ⚠ to create  |
    +----------------------------------+--------------+--------------+---------------+

You should run that command periodically, to make sure the database always contains a partition where to insert data.

4. Register the models you want to audit in your ``AppConfig`` instances::

    from django.apps import AppConfig
    class MyAppConfig(AppConfig):
        name = "myapp"
        trigger_audit_models = (
            'Model1',
            'Model2',
            'Model3',
        )

5. Run ``python manage.py migrate`` to create the triggers.


Quick start (AdvancedAuditTableManager)
---------------------------------------

There is a second implementation of the audit table that can be used, based on:

- https://wiki.postgresql.org/wiki/Audit_trigger_91plus
- https://github.com/2ndQuadrant/audit-trigger

The code can be seen at `django_partitioned_audit.audit_table.advanced_audit_table_manager.AdvancedAuditTableManager`.

To use this implementation, you can reference that class in `settings.py`::

    DPA_AUDIT_TABLE_MANAGER = "django_partitioned_audit.audit_table.advanced_audit_table_manager.AdvancedAuditTableManager"

and follow the steps described above, at `Quick start (SimpleAuditTableManager)`.

This implementation also audit `TRUNCATE` operations.

Security of audit entries
+++++++++++++++++++++++++

DBA should correctly configure permissions, in a way that the user used to
connect to the database from Django has permissions to INSERT rows in the
``trigger_audit_entries`` table, but NO permissions to UPDATE / DELETE them.


How it works
------------

#. A trigger is executed with each insert/update/delete operation

   * the database trigger is created by ``python manage.py migrate``.
   * only for the tables associated to the models that are explicitly
     specified in your ``AppConfig``.

#. The trigger creates a new row in ``trigger_audit_entries`` table containing:

   * ``object_table``: table where the modification happened (one of your models)
   * ``object_payload``: JSON representation of the whole row (after modification)
   * ``audit_entry_created``: timestamp
   * ``audit_txid_current``: PostgreSql TXID in which the modification occurred
   * ``audit_operation``: operation: ``INSERT``, ``UPDATE``, ``DELETE``


Trigger
+++++++

The solution is very simple in terms of code running in PostgreSQL: just a trigger that calls a function.

The trigger just invokes the function for each ``INSERT``, ``UPDATE``, ``DELETE`` ::

    CREATE TRIGGER trigger_audit_entry_creator_trigger
        AFTER INSERT OR UPDATE OR DELETE ON {table_name}
        FOR EACH ROW EXECUTE FUNCTION trigger_audit_entry_creator_func_v2();

The function just serializes the row into a JSON and insert it in the audit table::

    CREATE FUNCTION trigger_audit_entry_creator_func_v2() RETURNS TRIGGER AS $scr$
        DECLARE
            object_payload  varchar;
        BEGIN
            IF (TG_OP = 'INSERT') THEN
                object_payload  = row_to_json(NEW);
            ELSIF (TG_OP = 'UPDATE') THEN
                object_payload  = row_to_json(NEW);
            ELSIF (TG_OP = 'DELETE') THEN
                object_payload  = row_to_json(OLD);
            ELSE
                RAISE EXCEPTION 'Unexpected TG_OP = %', TG_OP;
            END IF;

            INSERT INTO audit_simple (
                    object_table,
                    object_payload,
                    audit_entry_created,
                    audit_txid_current,
                    audit_operation
                )
                SELECT
                    TG_TABLE_NAME,
                    object_payload,
                    now(),
                    txid_current(),
                    TG_OP;
            RETURN NULL;
        END;
    $scr$ LANGUAGE plpgsql;

Management of partitions
------------------------

The Django custom management command `manage_partition_tables` can be used to manage the partitions.

Sample usage
++++++++++++

If you want to have enough partition to handle next 90 days (around 3 months), you can use `--extra-days=90`.
Because it's the first time we run the command, no partition exists, and the plan will report that all
partitions need to be created::


    $ manage.py manage_partition_tables simulate --extra-days=90
    +----------------------------------+--------------+--------------+---------------+
    |  table_name                      |  from_date   |  to_date     |  status       |
    +----------------------------------+--------------+--------------+---------------+
    |  audit_simple_20220201_20220301  |  2022-02-01  |  2022-03-01  |  ⚠ to create  |
    |  audit_simple_20220301_20220401  |  2022-03-01  |  2022-04-01  |  ⚠ to create  |
    |  audit_simple_20220401_20220501  |  2022-04-01  |  2022-05-01  |  ⚠ to create  |
    |  audit_simple_20220501_20220601  |  2022-05-01  |  2022-06-01  |  ⚠ to create  |
    +----------------------------------+--------------+--------------+---------------+


We can also see the plan if no extra days are requested (this way, we'll only create partitions for
the current month::


    $ manage.py manage_partition_tables simulate --extra-days=0
    +----------------------------------+--------------+--------------+---------------+
    |  table_name                      |  from_date   |  to_date     |  status       |
    +----------------------------------+--------------+--------------+---------------+
    |  audit_simple_20220201_20220301  |  2022-02-01  |  2022-03-01  |  ⚠ to create  |
    +----------------------------------+--------------+--------------+---------------+


Now let's create the partitions::


    $ manage.py manage_partition_tables create --extra-days=0
    +----------------------------------+--------------+--------------+---------------+
    |  table_name                      |  from_date   |  to_date     |  status       |
    +----------------------------------+--------------+--------------+---------------+
    |  audit_simple_20220201_20220301  |  2022-02-01  |  2022-03-01  |  ⚠ to create  |
    +----------------------------------+--------------+--------------+---------------+


If we run the command and we pass `--extra-days=90`, the partition for the current month already exists, and
only partitions for next months (to cover 90 days) will be created::


    $ manage.py manage_partition_tables create --extra-days=90
    +----------------------------------+--------------+--------------+----------------+
    |  table_name                      |  from_date   |  to_date     |  status        |
    +----------------------------------+--------------+--------------+----------------+
    |  audit_simple_20220201_20220301  |  2022-02-01  |  2022-03-01  |  ✓ exists      |
    |  audit_simple_20220301_20220401  |  2022-03-01  |  2022-04-01  |  ❌ to create  |
    |  audit_simple_20220401_20220501  |  2022-04-01  |  2022-05-01  |  ❌ to create  |
    |  audit_simple_20220501_20220601  |  2022-05-01  |  2022-06-01  |  ❌ to create  |
    +----------------------------------+--------------+--------------+----------------+


We can use `list` to list existing partitions::


    $ manage.py manage_partition_tables list
    +----------------------------------+--------------+--------------+
    |  table_name                      |  from_date   |  to_date     |
    +----------------------------------+--------------+--------------+
    |  audit_simple_20220201_20220301  |  2022-02-01  |  2022-03-01  |
    |  audit_simple_20220301_20220401  |  2022-03-01  |  2022-04-01  |
    |  audit_simple_20220401_20220501  |  2022-04-01  |  2022-05-01  |
    |  audit_simple_20220501_20220601  |  2022-05-01  |  2022-06-01  |
    +----------------------------------+--------------+--------------+


Partition per week
++++++++++++++++++

We can use one partition per week::


    $ manage.py manage_partition_tables create --extra-days=30 --time-range-generator=WeeklyTimeRangeGenerator
    +----------------------------------+--------------+--------------+---------------+
    |  table_name                      |  from_date   |  to_date     |  status       |
    +----------------------------------+--------------+--------------+---------------+
    |  audit_simple_20220222_20220301  |  2022-02-22  |  2022-03-01  |  ⚠ to create  |
    |  audit_simple_20220301_20220308  |  2022-03-01  |  2022-03-08  |  ⚠ to create  |
    |  audit_simple_20220308_20220315  |  2022-03-08  |  2022-03-15  |  ⚠ to create  |
    |  audit_simple_20220315_20220322  |  2022-03-15  |  2022-03-22  |  ⚠ to create  |
    |  audit_simple_20220322_20220329  |  2022-03-22  |  2022-03-29  |  ⚠ to create  |
    +----------------------------------+--------------+--------------+---------------+


Partition per day
+++++++++++++++++

We can use one partition per day::


    $ manage.py manage_partition_tables create --extra-days=10 --time-range-generator=DailyTimeRangeGenerator
    +----------------------------------+--------------+--------------+---------------+
    |  table_name                      |  from_date   |  to_date     |  status       |
    +----------------------------------+--------------+--------------+---------------+
    |  audit_simple_20220222_20220223  |  2022-02-22  |  2022-02-23  |  ⚠ to create  |
    |  audit_simple_20220223_20220224  |  2022-02-23  |  2022-02-24  |  ⚠ to create  |
    |  audit_simple_20220224_20220225  |  2022-02-24  |  2022-02-25  |  ⚠ to create  |
    |  audit_simple_20220225_20220226  |  2022-02-25  |  2022-02-26  |  ⚠ to create  |
    |  audit_simple_20220226_20220227  |  2022-02-26  |  2022-02-27  |  ⚠ to create  |
    |  audit_simple_20220227_20220228  |  2022-02-27  |  2022-02-28  |  ⚠ to create  |
    |  audit_simple_20220228_20220301  |  2022-02-28  |  2022-03-01  |  ⚠ to create  |
    |  audit_simple_20220301_20220302  |  2022-03-01  |  2022-03-02  |  ⚠ to create  |
    |  audit_simple_20220302_20220303  |  2022-03-02  |  2022-03-03  |  ⚠ to create  |
    |  audit_simple_20220303_20220304  |  2022-03-03  |  2022-03-04  |  ⚠ to create  |
    |  audit_simple_20220304_20220305  |  2022-03-04  |  2022-03-05  |  ⚠ to create  |
    +----------------------------------+--------------+--------------+---------------+


Test
----

Tested on:

* Python 3.8, 3.9, 3.10, 3.11
* Django 3.2, 4.1, 4.2
* PostgreSql 12, 13, 14, 15


Known issues
------------

* JAdvancedAuditTableManager: SONB is used as intermediary format in the view & Django models, should be migrated to HSTORE
* Audit table and other objects are not managed by using Django migrations
* Not tested with psycopg3
* Coupled to Django (would be nice if Django is supported but possible to use it without Django)
* Works only on default db schema
* Lack feature: remove old partitions

TODO
----
* Refactor responsibilities on `PartitionManager` and `AuditTableManager`
