Metadata-Version: 2.4
Name: direl-ts-tool-kit
Version: 0.4.9
Summary: A toolbox for time series analysis and visualization.
Home-page: https://gitlab.com/direl/direl_tool_kit
Author: Diego Restrepo-Leal
Author-email: diegorestrepoleal@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Visualization
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENCE
Requires-Dist: pandas>=1.0.0
Requires-Dist: numpy>=1.18.0
Requires-Dist: matplotlib>=3.0.0
Requires-Dist: openpyxl
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# direl-ts-tool-kit
> A Toolbox for Time Series Analysis and Visualization

A lightweight Python library that streamlines common time series tasks: data preparation,
visualization with a consistent aesthetic style, and handling of irregular indices.

## Key features and functions

The library's functions fall into two groups: data preparation and plotting.

### Data preparation and index management
#### parse_datetime_index
`parse_datetime_index(df_raw, date_column="date", format=None)`

Parses a specified column into datetime objects and sets it as the DataFrame index.

This function prepares raw data for time series analysis by ensuring the
DataFrame is indexed by the correct datetime type.
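
The behavior described above can be approximated in plain pandas. This is a minimal sketch, not the library's implementation: the name `parse_datetime_index_sketch` and the sample data are hypothetical, and it assumes the function returns a new DataFrame rather than modifying `df_raw` in place.

```python
import pandas as pd


def parse_datetime_index_sketch(df_raw, date_column="date", format=None):
    """Hypothetical stand-in that mirrors the documented signature."""
    df = df_raw.copy()  # assumption: the input frame is left untouched
    df[date_column] = pd.to_datetime(df[date_column], format=format)
    return df.set_index(date_column)


# Sample data invented for illustration.
raw = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-02-01", "2024-03-01"],
        "value": [1.0, 2.0, 3.0],
    }
)
df_ts = parse_datetime_index_sketch(raw, date_column="date", format="%Y-%m-%d")
```

After the call, `df_ts.index` is a `DatetimeIndex` and the `date` column no longer appears among the data columns.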

#### generate_dates
`generate_dates(df_ts, freq="MS")`

Generates a continuous DatetimeIndex covering the time span of the input DataFrame.

The function determines the start and end dates from the existing DataFrame index
and creates a new, regular date sequence based on the specified frequency.
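
A rough pandas equivalent of this description uses `pd.date_range` between the first and last index values. The name `generate_dates_sketch` and the sample data are hypothetical, and the sketch assumes the index is already a `DatetimeIndex`.

```python
import pandas as pd


def generate_dates_sketch(df_ts, freq="MS"):
    """Hypothetical stand-in: regular dates spanning the existing index."""
    return pd.date_range(start=df_ts.index.min(), end=df_ts.index.max(), freq=freq)


# Irregular monthly data: March and April are missing.
df_ts = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2024-01-01", "2024-02-01", "2024-05-01"]),
)
dates = generate_dates_sketch(df_ts, freq="MS")
```

With `freq="MS"` (month start), `dates` holds five entries from January to May 2024, including the two months absent from the original index.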

#### reindex_and_aggregate
`reindex_and_aggregate(df_ts, column_name, freq="MS")`

Re-indexes a time series DataFrame to a regular frequency, aggregates values,
and introduces NaN for missing time steps.

This function first identifies the time range from the original (potentially irregular)
index, aggregates data if necessary (e.g., if multiple entries exist per time step),
and then merges the data onto a complete date range, effectively filling gaps
with NaN values.
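
The steps above can be sketched with pandas alone. The README does not state which aggregation the library applies, so this sketch assumes the mean of duplicate timestamps. The name `reindex_and_aggregate_sketch` and the sample data are hypothetical.

```python
import pandas as pd


def reindex_and_aggregate_sketch(df_ts, column_name, freq="MS"):
    """Hypothetical stand-in: aggregate duplicates, then fill gaps with NaN."""
    # Complete, regular range spanning the original index.
    full_range = pd.date_range(df_ts.index.min(), df_ts.index.max(), freq=freq)
    # Assumption: duplicate timestamps are combined with the mean.
    aggregated = df_ts[column_name].groupby(level=0).mean()
    # Reindexing onto the full range leaves NaN where no data exists.
    return aggregated.reindex(full_range).to_frame(column_name)


# January appears twice and March is missing.
df_ts = pd.DataFrame(
    {"value": [1.0, 3.0, 5.0, 7.0]},
    index=pd.to_datetime(["2024-01-01", "2024-01-01", "2024-02-01", "2024-04-01"]),
)
regular = reindex_and_aggregate_sketch(df_ts, "value", freq="MS")
```

Here the two January entries collapse into one row and March appears as a NaN row, so `regular` has one row per month from January to April.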

#### remove_outliers_by_threshold
`remove_outliers_by_threshold(df_ts, column_name, lower_bound, upper_bound)`

Replaces values in the specified column with NaN when they fall outside the range defined by `lower_bound` and `upper_bound`.

This function identifies data points that are either below the lower
bound or above the upper bound and treats them as missing data.
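
A boolean mask is one way to express this rule in pandas. The sketch below is illustrative only: the name `remove_outliers_by_threshold_sketch` and the sample data are hypothetical, and it assumes values exactly equal to a bound are kept and that a copy is returned.

```python
import numpy as np
import pandas as pd


def remove_outliers_by_threshold_sketch(df_ts, column_name, lower_bound, upper_bound):
    """Hypothetical stand-in: mask out-of-range values as NaN."""
    df = df_ts.copy()  # assumption: the input frame is left untouched
    out_of_range = (df[column_name] < lower_bound) | (df[column_name] > upper_bound)
    df.loc[out_of_range, column_name] = np.nan
    return df


df_ts = pd.DataFrame(
    {"value": [10.0, -5.0, 12.0, 250.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="MS"),
)
cleaned = remove_outliers_by_threshold_sketch(df_ts, "value", lower_bound=0, upper_bound=100)
```

The second and fourth values fall outside the range and become NaN, while the index keeps all four time steps.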


### Visualization and styling

#### plot_time_series
`plot_time_series(df_ts, variable, units="", color="BLUE_LINES", time_unit="Year", rot=90, auto_format_label=True)`

Plots a time series with custom styling and dual-level grid visibility.

This function automatically sets major and minor time-based locators
on the x-axis based on the specified time unit, and formats the y-axis
to use scientific notation.
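
For readers who want to see what these axis settings look like in plain Matplotlib, here is a minimal sketch for the `time_unit="Year"` case. It is not the library's code: the color `tab:blue` stands in for the library's `BLUE_LINES` palette entry, the column name `flow` and its units are invented, and the choice of yearly major ticks with monthly minor ticks is an assumption.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import ScalarFormatter

# Sample monthly series invented for illustration.
index = pd.date_range("2020-01-01", "2023-12-01", freq="MS")
df_ts = pd.DataFrame({"flow": np.linspace(1.0e4, 5.0e4, len(index))}, index=index)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(df_ts.index, df_ts["flow"], color="tab:blue")

# Assumption: major ticks per year, minor ticks per month.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))

# Scientific notation on the y-axis.
formatter = ScalarFormatter(useMathText=True)
formatter.set_powerlimits((0, 0))
ax.yaxis.set_major_formatter(formatter)

# Dual-level grid: stronger major lines, lighter minor lines.
ax.grid(which="major", linewidth=0.8)
ax.grid(which="minor", linewidth=0.3, alpha=0.5)

ax.tick_params(axis="x", labelrotation=90)
ax.set_ylabel("flow (m3/s)")
```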

#### save_figure
`save_figure(fig, file_name, variable_name="", path="./")`

Saves a Matplotlib figure in three common high-quality formats (PNG, PDF, SVG).

The function creates a consistent file name structure:
`{path}/{file_name}_{variable_name}.{extension}`.
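
The naming scheme can be illustrated with a short Matplotlib loop. This is a sketch under stated assumptions rather than the library's implementation: the name `save_figure_sketch`, the `dpi=300` and `bbox_inches="tight"` settings, and the returned list of paths are all hypothetical.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt


def save_figure_sketch(fig, file_name, variable_name="", path="./"):
    """Hypothetical stand-in: write the figure as PNG, PDF, and SVG."""
    saved = []
    for extension in ("png", "pdf", "svg"):
        target = os.path.join(path, f"{file_name}_{variable_name}.{extension}")
        # Assumption: 300 dpi and a tight bounding box count as "high quality".
        fig.savefig(target, dpi=300, bbox_inches="tight")
        saved.append(target)
    return saved


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [1, 4, 9])

output_dir = tempfile.mkdtemp()  # temporary directory for the demo
saved_files = save_figure_sketch(fig, "timeseries", variable_name="flow", path=output_dir)
```

The call writes `timeseries_flow.png`, `timeseries_flow.pdf`, and `timeseries_flow.svg` into `output_dir`.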

## Examples
- [Example 1](https://gitlab.com/direl/direl_tool_kit/-/blob/main/example/example_01.md?ref_type=heads)
- [Example 2](https://gitlab.com/direl/direl_tool_kit/-/blob/main/example/example_02.md?ref_type=heads)

