Metadata-Version: 2.4
Name: identibench
Version: 0.1.1
Summary: Downloads and prepares various system identification benchmark datasets
Home-page: https://github.com/daniel-om-weber/identibench
Author: Daniel Weber
Author-email: daniel.om.weber@gmail.com
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: nonlinear-benchmarks
Requires-Dist: h5py
Requires-Dist: easyDataverse>=0.4.4
Requires-Dist: pandas
Requires-Dist: gdown
Requires-Dist: bagpy
Provides-Extra: dev
Requires-Dist: matplotlib; extra == "dev"
Requires-Dist: nbdev; extra == "dev"
Requires-Dist: ipykernel; extra == "dev"
Requires-Dist: sysidentpy; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary



<img src="https://raw.githubusercontent.com/daniel-om-weber/identibench/main/assets/logo.svg" width="200" align="left" alt="identibench logo">

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Identibench

[![PyPI
version](https://badge.fury.io/py/identibench.svg)](https://badge.fury.io/py/identibench)
[![License: Apache
2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Docs
Status](https://img.shields.io/badge/docs-up_to_date-brightgreen.svg)](https://daniel-om-weber.github.io/identibench/)
[![Python
Versions](https://img.shields.io/pypi/pyversions/identibench.png)](https://pypi.org/project/identibench/)

Identibench is a Python library designed to streamline and standardize
the benchmarking of system identification models. Evaluating and
comparing dynamic models often requires repetitive setup for data
handling, evaluation protocols, and metrics implementation, making fair
comparisons and reproducing results challenging. Identibench tackles
this by offering a collection of pre-defined benchmark specifications
for simulation and prediction tasks, built upon common datasets. It
automates data downloading and processing into a consistent format and
provides standard evaluation metrics via a simple interface
(`run_benchmark`). This allows you to focus your efforts on developing
innovative models, while relying on Identibench for robust and
reproducible evaluation.

## Key Features

- **Access Many Benchmarks from Different Systems:** Instantly use
  pre-configured benchmarks covering diverse domains like electronics
  (Silverbox), mechanics (Industrial Robot), process control (Cascaded
  Tanks), aerospace (Quadrotors), and more, available for both
  simulation and prediction tasks.
- **Automate Data Management:** Forget manual downloading and
  processing; the library handles fetching data from various sources
  (web, Google Drive, Dataverse), unpacking and reading source files
  (ZIP, RAR, MAT, BAG), converting to a standard HDF5 format, and
  caching locally.
- **Integrate Any Model and Evaluate It on All Benchmarks:** Plug in your
  custom models, regardless of the Python framework used (NumPy, SciPy,
  PyTorch, TensorFlow, JAX, etc.), using a straightforward function
  interface (`build_model`) that receives all necessary context.
- **Capture Comprehensive Results:** Obtain detailed evaluation reports
  including standard metrics (RMSE, NRMSE, FIT%, etc.), task-specific
  scores, execution timings, configuration parameters (hyperparameters,
  seed), and raw model predictions for thorough analysis.
- **Easily Define New Benchmarks:** Go beyond the included datasets by
  creating your own benchmark specifications
  ([`BenchmarkSpecSimulation`](https://daniel-om-weber.github.io/identibench/benchmark.html#benchmarkspecsimulation),
  [`BenchmarkSpecPrediction`](https://daniel-om-weber.github.io/identibench/benchmark.html#benchmarkspecprediction))
  for private data or unique tasks, leveraging the library’s structure
  and transparent data format.

## Installation

You can install `identibench` using pip:

``` bash
pip install identibench
```

To install the latest development version directly from GitHub, use:

``` bash
pip install git+https://github.com/daniel-om-weber/identibench.git
```

``` python
# Basic usage
import identibench as idb
from pathlib import Path

# Example: Download a single dataset
# Note: Always use a Path object, not a string
save_path = Path('./tmp/wh')
idb.datasets.workshop.dl_wiener_hammerstein(save_path)
```

``` python
from sysidentpy.model_structure_selection import FROLS
from sysidentpy.parameter_estimation import LeastSquares


def build_frols_model(context):
    # Use the first training sequence; the third tuple element is unused here.
    u_train, y_train, _ = next(context.get_train_sequences())

    # Read hyperparameters, falling back to defaults when a key is absent.
    ylag = context.hyperparameters.get('ylag', 5)
    xlag = context.hyperparameters.get('xlag', 5)
    n_terms = context.hyperparameters.get('n_terms', 10)
    estimator = context.hyperparameters.get('estimator', LeastSquares())

    _model = FROLS(xlag=xlag, ylag=ylag, n_terms=n_terms, estimator=estimator)
    _model.fit(X=u_train, y=y_train)

    def model(u_test, y_init):
        # Free-run simulation seeded with the first max_lag output samples.
        yhat_full = _model.predict(X=u_test, y=y_init[:_model.max_lag])
        # Drop the seed samples so only simulated outputs are returned.
        return yhat_full[_model.max_lag:]

    return model
```

``` python
hyperparams = {
    'ylag': 2,
    'xlag': 2,
    'n_terms': 10, # Number of terms for FROLS
    'estimator': LeastSquares()
}

results = idb.run_benchmark(
    spec=idb.BenchmarkWH_Simulation,
    build_model=build_frols_model,
    hyperparameters=hyperparams
)
```

## Simulation Benchmarks

| Key                | Benchmark Name                    |
|--------------------|-----------------------------------|
| `WH_Sim`           | BenchmarkWH_Simulation            |
| `Silverbox_Sim`    | BenchmarkSilverbox_Simulation     |
| `Tanks_Sim`        | BenchmarkCascadedTanks_Simulation |
| `EMPS_Sim`         | BenchmarkEMPS_Simulation          |
| `NoisyWH_Sim`      | BenchmarkNoisyWH_Simulation       |
| `RobotForward_Sim` | BenchmarkRobotForward_Simulation  |
| `RobotInverse_Sim` | BenchmarkRobotInverse_Simulation  |
| `Ship_Sim`         | BenchmarkShip_Simulation          |
| `QuadPelican_Sim`  | BenchmarkQuadPelican_Simulation   |
| `QuadPi_Sim`       | BenchmarkQuadPi_Simulation        |

## Prediction Benchmarks

| Key                 | Benchmark Name                    |
|---------------------|-----------------------------------|
| `WH_Pred`           | BenchmarkWH_Prediction            |
| `Silverbox_Pred`    | BenchmarkSilverbox_Prediction     |
| `Tanks_Pred`        | BenchmarkCascadedTanks_Prediction |
| `EMPS_Pred`         | BenchmarkEMPS_Prediction          |
| `NoisyWH_Pred`      | BenchmarkNoisyWH_Prediction       |
| `RobotForward_Pred` | BenchmarkRobotForward_Prediction  |
| `RobotInverse_Pred` | BenchmarkRobotInverse_Prediction  |
| `Ship_Pred`         | BenchmarkShip_Prediction          |
| `QuadPelican_Pred`  | BenchmarkQuadPelican_Prediction   |
| `QuadPi_Pred`       | BenchmarkQuadPi_Prediction        |

## Workflow Details

This section provides more detail on the core concepts and components of
the `identibench` workflow.

### Benchmark Types

`identibench` defines two main types of benchmark tasks, specified using
different classes:

- **Simulation
  ([`BenchmarkSpecSimulation`](https://daniel-om-weber.github.io/identibench/benchmark.html#benchmarkspecsimulation))**:
  - **Goal:** Evaluate a model’s ability to perform a free-run
    simulation, predicting the system’s output over an extended period
    given the input sequence.
  - **Typical Input to Predictor:** The full input sequence (`u_test`)
    and potentially an initial segment of the output sequence
    (`y_test[:init_window]`) for warm-up or state initialization.
  - **Expected Output from Predictor:** The predicted output sequence
    (`y_pred`) corresponding to the input, usually excluding the warm-up
    period.
  - **Use Case:** Assessing models intended for long-term prediction,
    control simulation, or understanding overall system dynamics.
- **Prediction
  ([`BenchmarkSpecPrediction`](https://daniel-om-weber.github.io/identibench/benchmark.html#benchmarkspecprediction))**:
  - **Goal:** Evaluate a model’s ability to predict the system’s output
    *k* steps into the future based on recent past data.
  - **Typical Input to Predictor:** Often involves windows of past
    inputs and outputs (e.g., `u[t:t+H]`, `y[t:t+H]`).
  - **Expected Output from Predictor:** The predicted output at a
    specific future time step (e.g., `y[t+H+k]`). The `pred_horizon`
    parameter defines ‘k’, and `pred_step` defines how frequently
    predictions are made.
  - **Use Case:** Evaluating models focused on short-to-medium term
    forecasting, state estimation, or receding horizon control.
- **`init_window`**: Both benchmark types often use an `init_window`.
  This specifies an initial number of time steps whose data might be
  provided to the model for initialization or warm-up. Importantly, data
  within this window is typically *excluded* from the final performance
  metric calculation to ensure a fair evaluation of the model’s
  predictive capabilities beyond the initial transient.
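
To make the role of `init_window`, `pred_horizon`, and `pred_step` more
concrete, here is a minimal NumPy sketch. It is not Identibench's
internal implementation: the helper names `rmse_after_init` and
`prediction_windows` are hypothetical, and the indexing convention (the
target lies `pred_horizon` samples after the last sample of the window)
is an assumption that may differ from what the library does.

```python
import numpy as np


def rmse_after_init(y_true, y_pred_full, init_window):
    """RMSE over the samples that follow the warm-up window (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred_full = np.asarray(y_pred_full, dtype=float)
    # Samples inside the warm-up window do not contribute to the score.
    err = y_true[init_window:] - y_pred_full[init_window:]
    return float(np.sqrt(np.mean(err ** 2)))


def prediction_windows(u, y, init_window, pred_horizon, pred_step):
    """Yield (u_window, y_window, y_target) triples for k-step-ahead prediction.

    Assumed convention: the window covers samples t .. t+init_window-1 and the
    target is the sample pred_horizon steps after the last window sample.
    """
    last_start = len(y) - init_window - pred_horizon
    for t in range(0, last_start + 1, pred_step):
        end = t + init_window
        yield u[t:end], y[t:end], y[end - 1 + pred_horizon]


# Simulation: an error confined to the warm-up window does not affect the score.
y_true = np.arange(10.0)
y_pred = y_true.copy()
y_pred[:3] += 5.0
sim_score = rmse_after_init(y_true, y_pred, init_window=3)

# Prediction: windows of length 3, target 2 steps ahead, a new window every 2 samples.
windows = list(prediction_windows(y_true, y_true, init_window=3,
                                  pred_horizon=2, pred_step=2))
```

A larger `pred_step` yields fewer, less overlapping windows, which
shortens evaluation at the cost of a coarser estimate.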

### Model Interface (`build_model`)

The core of integrating your custom logic is the `build_model` function
you provide to
[`run_benchmark`](https://daniel-om-weber.github.io/identibench/benchmark.html#run_benchmark).

- **Purpose:** This function is responsible for defining your model
  architecture, training it using the provided data, and returning a
  callable predictor function.
- **Input (`context: TrainingContext`):** Your `build_model` function
  receives a single argument, `context`, which is a
  [`TrainingContext`](https://daniel-om-weber.github.io/identibench/benchmark.html#trainingcontext)
  object. This object gives you access to:
  - `context.spec`: The full specification of the current benchmark
    being run (including dataset paths, input/output columns,
    `init_window`, etc.).
  - `context.hyperparameters`: A dictionary containing any
    hyperparameters you passed to `run_benchmark`. Use this to configure
    your model or training process.
  - `context.seed`: A random seed for ensuring reproducibility.
  - Data Access Methods: Functions like `context.get_train_sequences()`
    and `context.get_valid_sequences()` provide iterators over the raw,
    full-length training and validation data sequences (as tuples of
    NumPy arrays `(u, y, x)`). **Note:** You need to handle any batching
    or windowing required for your specific training algorithm *within*
    your `build_model` function.
- **Output (Predictor `Callable`):** `build_model` *must* return a
  callable object (e.g., a function, an object’s method) that represents
  your trained model ready for prediction/simulation. This returned
  callable will be used internally by
  [`run_benchmark`](https://daniel-om-weber.github.io/identibench/benchmark.html#run_benchmark)
  on the test set. Its expected signature depends on the benchmark type,
  but typically it accepts NumPy arrays for test inputs (and potentially
  initial outputs) and returns a NumPy array containing the predictions.
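
The contract described above can be sketched without any modelling
library. The example below fits a finite-impulse-response model by least
squares and returns a predictor with the `(u_test, y_init)` signature
used in the FROLS example. It rests on several assumptions: a
single-input single-output system, a hypothetical hyperparameter
`n_taps`, and a predictor that returns only the samples after the
warm-up segment. The `SimpleNamespace` object is a stand-in for
`TrainingContext` so the sketch runs on its own; it is not the real
class.

```python
from types import SimpleNamespace

import numpy as np


def build_fir_model(context):
    """Fit an FIR model by least squares and return a predictor callable."""
    n_taps = context.hyperparameters.get('n_taps', 3)  # hypothetical hyperparameter

    def regressors(u):
        # Column j holds the input delayed by j samples (zero-padded at the start).
        u = np.asarray(u, dtype=float).reshape(-1)
        return np.column_stack(
            [np.concatenate([np.zeros(j), u[:len(u) - j]]) for j in range(n_taps)]
        )

    # Stack the regressors of all training sequences into one least-squares problem.
    phis, targets = [], []
    for u, y, _ in context.get_train_sequences():
        phis.append(regressors(u))
        targets.append(np.asarray(y, dtype=float).reshape(-1))
    theta, *_ = np.linalg.lstsq(np.vstack(phis), np.concatenate(targets), rcond=None)

    def model(u_test, y_init):
        # An FIR model needs no output history, so y_init only sets the warm-up length.
        y_full = regressors(u_test) @ theta
        return y_full[len(y_init):].reshape(-1, 1)

    return model


# Stand-in for TrainingContext with synthetic, noise-free data (illustration only).
rng = np.random.default_rng(0)
u = rng.standard_normal((200, 1))
true_taps = np.array([0.5, -0.3, 0.2])
y = np.convolve(u[:, 0], true_taps)[:200].reshape(-1, 1)

fake_context = SimpleNamespace(
    hyperparameters={'n_taps': 3},
    seed=0,
    get_train_sequences=lambda: iter([(u, y, None)]),
)

predictor = build_fir_model(fake_context)
y_pred = predictor(u, y[:10])
```

Keeping all training logic inside the build function and returning a
plain closure means the benchmark runner never needs to know which
framework produced the model.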

### Running Multiple Benchmarks

To evaluate a model across several scenarios efficiently, use the
[`run_multiple_benchmarks`](https://daniel-om-weber.github.io/identibench/benchmark.html#run_multiple_benchmarks)
function:

``` python
# Example: Run on a subset of benchmarks
specs_to_run = {
    'WH_Sim': idb.simulation_benchmarks['WH_Sim'],
    'Tanks_Pred': idb.prediction_benchmarks['Tanks_Pred']
}

# Reuse 'build_frols_model', the build function defined earlier
all_results = idb.run_multiple_benchmarks(specs_to_run, build_model=build_frols_model)

# all_results is a list of result dictionaries, one for each spec run
```

    --- Starting benchmark run for 2 specifications ---

    [1/2] Running benchmark: BenchmarkWH_Simulation
      -> Success: BenchmarkWH_Simulation completed.

    [2/2] Running benchmark: BenchmarkCascadedTanks_Prediction
      -> ERROR running benchmark 'BenchmarkCascadedTanks_Prediction': Input shapes must match. Got (45, 1) and (50, 1)

    --- Benchmark run finished. 1/2 completed successfully. ---

This function iterates through the provided list or dictionary of
benchmark specifications, calling
[`run_benchmark`](https://daniel-om-weber.github.io/identibench/benchmark.html#run_benchmark)
for each one using the same `build_model` function and hyperparameters.
As the log above shows, an error in one benchmark is reported and the
run still finishes with a summary instead of aborting.
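
The loop-and-report pattern visible in the log can be sketched in a few
lines. This is an illustration of the general idea, not the library's
code: `run_many`, `run_one`, and `fake_run` are hypothetical names, and
how `run_multiple_benchmarks` actually records failures is not shown in
this document.

```python
def run_many(specs, run_one, **kwargs):
    """Call run_one(spec, **kwargs) for every spec, collecting results and errors."""
    # Accept either a list of specs or a dict mapping keys to specs.
    items = list(specs.items()) if isinstance(specs, dict) else list(enumerate(specs))
    results, errors = [], {}
    for key, spec in items:
        try:
            results.append(run_one(spec, **kwargs))
        except Exception as exc:
            # Record the failure and continue with the remaining specs.
            errors[key] = str(exc)
    return results, errors


def fake_run(spec, build_model=None):
    """Stand-in for a single benchmark run (illustration only)."""
    if spec == 'broken_spec':
        raise ValueError('Input shapes must match.')
    return {'benchmark_name': spec}


results, errors = run_many(
    {'A': 'working_spec', 'B': 'broken_spec'},
    run_one=fake_run,
    build_model=None,
)
```

Collecting errors by key makes it easy to re-run only the failed
benchmarks after fixing the model.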

### Data Handling & Format

Understanding how `identibench` organizes and stores data is helpful for
direct interaction or adding new datasets.

- **Directory Structure:** Datasets are stored under a root directory
  (default: `~/.identibench_data`, configurable via the
  `IDENTIBENCH_DATA_ROOT` environment variable). The structure follows:
  `DATA_ROOT / [dataset_id] / [subset] / [experiment_file.hdf5]`.
- **Subsets:** Standard subset names are `train`, `valid`, and `test`.
  An optional `train_valid` directory might contain combined data.
- **Download & Cache:** Data is downloaded automatically when a
  benchmark requires it and cached locally to avoid re-downloads. The
  `identibench.datasets.download_all_datasets` function can fetch all
  datasets at once.
- **File Format:** Processed time-series data is stored in the **HDF5
  (`.hdf5`)** format.
- **HDF5 Structure:**
  - Each `.hdf5` file typically represents one experimental run.
  - Signals (inputs, outputs, states) are stored as separate
    1-dimensional datasets within the file, named conventionally as
    `u0`, `u1`, …, `y0`, `y1`, …, `x0`, …
  - Data is usually stored as `float32` NumPy arrays.
  - Metadata like sampling frequency (`fs`) and suggested initialization
    window size (`init_sz`) are stored as attributes on the root group
    of the HDF5 file.
  - *Example Structure:*

        my_dataset/
        └── train/
            └── train_run_1.hdf5
                ├── u0 (Dataset: shape=(N,), dtype=float32)
                ├── y0 (Dataset: shape=(N,), dtype=float32)
                └── Attributes:
                    └── fs (Attribute: float)
- **Extensibility:** Adhering to this HDF5 format ensures compatibility
  when adding new dataset loaders. Helper functions like
  [`identibench.utils.write_array`](https://daniel-om-weber.github.io/identibench/utils.html#write_array)
  facilitate creating files in the correct format.
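
As a rough illustration of the layout described above, the following
sketch writes and reads one file with `h5py` directly. The helpers
`write_run` and `read_run` are hypothetical and are not part of
`identibench`; the signature of `identibench.utils.write_array` is not
shown in this document, so it is deliberately not used here. A temporary
directory stands in for the data root.

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np


def write_run(path, signals, fs):
    """Write 1-D float32 signals and the sampling frequency to one HDF5 file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with h5py.File(path, 'w') as f:
        for name, values in signals.items():
            # One 1-dimensional dataset per signal, e.g. 'u0' or 'y0'.
            f.create_dataset(name, data=np.asarray(values, dtype=np.float32))
        # Metadata lives in attributes on the root group.
        f.attrs['fs'] = float(fs)


def read_run(path):
    """Read all top-level datasets and root attributes back into dictionaries."""
    with h5py.File(path, 'r') as f:
        signals = {name: f[name][()] for name in f.keys()}
        attrs = dict(f.attrs)
    return signals, attrs


with tempfile.TemporaryDirectory() as tmp:
    # Mirrors DATA_ROOT / [dataset_id] / [subset] / [experiment_file.hdf5].
    run_file = Path(tmp) / 'my_dataset' / 'train' / 'train_run_1.hdf5'
    t = np.arange(100) / 10.0
    write_run(run_file, {'u0': np.sin(t), 'y0': np.cos(t)}, fs=10.0)
    signals, attrs = read_run(run_file)
```

Because each signal is its own dataset, a file can be inspected with any
generic HDF5 viewer without library-specific tooling.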

### Understanding Benchmark Results

The
[`run_benchmark`](https://daniel-om-weber.github.io/identibench/benchmark.html#run_benchmark)
function returns a dictionary containing detailed results of the
experiment. Key entries include:

- `benchmark_name` (`str`): The unique name of the benchmark
  specification used.
- `dataset_id` (`str`): Identifier for the dataset source.
- `hyperparameters` (`dict`): The hyperparameters dictionary passed to
  the run.
- `seed` (`int`): The random seed used for the run.
- `training_time_seconds` (`float`): Wall-clock time spent inside your
  `build_model` function.
- `test_time_seconds` (`float`): Wall-clock time spent evaluating the
  returned predictor on the test set.
- `benchmark_type` (`str`): The type of benchmark run (e.g.,
  `'BenchmarkSpecSimulation'`).
- `metric_name` (`str`): The name of the primary metric function defined
  in the spec.
- `metric_score` (`float`): The calculated score for the primary metric
  on the test set (aggregated if multiple test files).
- `custom_scores` (`dict`): Any additional scores calculated by custom
  evaluation logic specific to the benchmark.
- `model_predictions` (`list`): A list containing the raw outputs. For
  simulation, it’s typically
  `[(y_pred_test1, y_true_test1), (y_pred_test2, y_true_test2), ...]`.
  For prediction, the structure might be nested reflecting windowed
  predictions.
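
For simulation benchmarks, the raw predictions make it possible to
recompute scores per test sequence. The sketch below assumes the
`[(y_pred, y_true), ...]` structure described above, with both arrays
of equal length in each pair; `per_sequence_rmse` is a hypothetical
helper, and the `placeholder_result` dictionary contains made-up arrays
rather than real benchmark output.

```python
import numpy as np


def per_sequence_rmse(model_predictions):
    """Return the RMSE of each (y_pred, y_true) pair in a simulation result."""
    scores = []
    for y_pred, y_true in model_predictions:
        err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
        scores.append(float(np.sqrt(np.mean(err ** 2))))
    return scores


# Made-up arrays standing in for the dictionary returned by run_benchmark.
placeholder_result = {
    'model_predictions': [
        (np.zeros((4, 1)), np.ones((4, 1))),  # constant error of 1
        (np.ones((4, 1)), np.ones((4, 1))),   # perfect prediction
    ],
}

scores = per_sequence_rmse(placeholder_result['model_predictions'])
```

Comparing these per-sequence values with the aggregated `metric_score`
can reveal whether a single test file dominates the overall result.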
