Metadata-Version: 2.1
Name: pandas-transform-checker
Version: 0.1.1
Summary: function annotations to check properties on pandas dataframe transformations
Home-page: https://github.com/thib-s/pandas_transform_checker
Author: thibaut boissin
Author-email: thibaut.boissin@gmail.com
License: BSD 3-Clause License
Description: # Pandas transform checker
        
        ## What is it?
        
        This library focuses on data quality checks for pandas transformations.
        A transformation is a function that takes a pandas DataFrame as input (plus
        other parameters) and outputs a DataFrame.
        
        This library allows the user to specify a contract that the function must respect.
        In this contract the user can specify:
         - the added columns
         - the deleted columns
         - the modified columns
         - whether the function adds/drops records
         - whether the function modifies the index (e.g. resampling)
        
        Once the contract is specified, the function will raise a RuntimeError
        if one of its specifications is violated.
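        
        As a rough illustration of the kind of check involved, the sketch below compares
        the column names before and after a transformation against a contract.
        `check_column_contract` is a hypothetical helper, not the library's code, and it
        assumes that any column added or deleted outside the contract counts as a
        violation. It works on plain column names; with DataFrames one would pass
        `df.columns` for both arguments.
        
        ```python
        def check_column_contract(cols_before, cols_after, contract):
            """Raise RuntimeError if the column changes break the contract.
        
            Hypothetical helper, not part of pandas_transform_checker.
            Assumption: additions and deletions must match the contract exactly.
            """
            before, after = set(cols_before), set(cols_after)
            expected_added = set(contract.get("col_additions", {}))
            expected_deleted = set(contract.get("col_deletions", set()))
        
            actually_added = after - before
            actually_deleted = before - after
        
            if actually_added != expected_added:
                raise RuntimeError(
                    f"added columns {sorted(actually_added)} "
                    f"do not match contract {sorted(expected_added)}"
                )
            if actually_deleted != expected_deleted:
                raise RuntimeError(
                    f"deleted columns {sorted(actually_deleted)} "
                    f"do not match contract {sorted(expected_deleted)}"
                )
        
        
        contract = {"col_additions": {"col_a": "int"}, "col_deletions": {"col_c"}}
        
        # Respects the contract: col_a appears and col_c disappears.
        check_column_contract(["col_c", "col_e"], ["col_a", "col_e"], contract)
        
        # Violates the contract: col_c is still present after the transformation.
        try:
            check_column_contract(["col_c", "col_e"], ["col_a", "col_c", "col_e"], contract)
        except RuntimeError as err:
            violation = str(err)
        ```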
        
        ## How to use it?
        
        The package provides a decorator that performs the check. It can be
        imported as follows:
        ```
        from pandas_transform_checker.decorator_contract_checker import input_df_contract
        ```
        
        ### Args
        
        - df_param: name of the function parameter that receives the input DataFrame
        - contract_dict: dict defining the contract of the function, in the following format:
        ```
        contract_dict = {
            "col_additions": {
                "col_a": "int",
                "col_b": "float"
            },
            "col_deletions": {
                "col_c",
                "col_d"
            },
            "col_editions": {
                "col_e",
                "col_f"
            },
            "allow_index_edition": False,
            "allow_add_drop_record": True
        }
        ```
        This means that the function must create "col_a" and "col_b", delete "col_c" and "col_d",
        must not modify any column data except "col_e" and "col_f", must not edit the index,
        and may drop records.
        
        Here is the list of keys allowed in this dict:
        - col_additions: dict where keys are column names and values are dtypes (as strings)
        - col_deletions: set of str naming the deleted columns
        - col_editions: set of str naming the modified columns
        - allow_index_edition: bool indicating whether the function may modify the index
        - allow_add_drop_record: bool indicating whether the function may add or drop records (e.g. when dropna is used)
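        
        Because a misspelled key is easy to miss, it can help to sanity-check a contract
        dict before using it. The sketch below is one possible way to do that, based only
        on the key list above; `ALLOWED_KEYS` and `validate_contract` are hypothetical
        names and are not provided by the library.
        
        ```python
        # Expected Python type for each key, taken from the list above.
        ALLOWED_KEYS = {
            "col_additions": dict,
            "col_deletions": set,
            "col_editions": set,
            "allow_index_edition": bool,
            "allow_add_drop_record": bool,
        }
        
        
        def validate_contract(contract):
            """Return a list of problems found in a contract dict (empty if none)."""
            problems = []
            for key, value in contract.items():
                expected_type = ALLOWED_KEYS.get(key)
                if expected_type is None:
                    problems.append(f"unknown key: {key!r}")
                elif not isinstance(value, expected_type):
                    problems.append(
                        f"{key!r} should be a {expected_type.__name__}, "
                        f"got {type(value).__name__}"
                    )
            return problems
        
        
        good = {"col_editions": {"col_e", "col_f"}, "allow_index_edition": False}
        # A list instead of a set, and a misspelled key (missing the final "s").
        bad = {"col_editions": ["col_e"], "col_addition": {}}
        
        print(validate_contract(good))
        print(validate_contract(bad))
        ```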
        
        ### Usage
        When you have a function that takes a DataFrame as input:
        ```
        def super_func(df_input):
            ...
        ```
        just add the decorator to check its properties automatically:
        ```
        @input_df_contract(df_param="df_input", contract_dict={"col_editions": {"col_e","col_f"}})
        def super_func(df_input):
            ...
        ```
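        
        The `df_param` argument identifies the input by name, so the check can work whether
        the DataFrame is passed positionally or as a keyword. The sketch below shows one
        way a decorator can resolve an argument by name using the standard `inspect`
        module. `with_named_argument` and `last_seen` are hypothetical stand-ins for
        illustration and do not describe how `input_df_contract` is actually implemented;
        a plain list is used in place of a DataFrame to keep the example dependency-free.
        
        ```python
        import functools
        import inspect
        
        
        def with_named_argument(param_name):
            """Decorator factory that looks up one argument by name on each call.
        
            Hypothetical stand-in, not the library's input_df_contract.
            """
            def decorator(func):
                signature = inspect.signature(func)
                if param_name not in signature.parameters:
                    raise ValueError(f"{func.__name__} has no parameter {param_name!r}")
        
                @functools.wraps(func)
                def wrapper(*args, **kwargs):
                    # Map positional and keyword arguments onto parameter names.
                    bound = signature.bind(*args, **kwargs)
                    bound.apply_defaults()
                    watched = bound.arguments[param_name]  # the input df in real use
                    result = func(*args, **kwargs)
                    # A contract checker would compare `watched` and `result` here.
                    wrapper.last_seen = watched
                    return result
        
                wrapper.last_seen = None
                return wrapper
            return decorator
        
        
        @with_named_argument("df_input")
        def super_func(scale, df_input):
            return [value * scale for value in df_input]
        
        
        super_func(2, [1, 2])          # df_input passed positionally
        super_func(2, df_input=[3])    # df_input passed by keyword
        ```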
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
