Metadata-Version: 2.4
Name: CADET-RDM
Version: 1.1.2
Summary: A Python toolbox for research data management.
Author-email: Ronald Jäpel <r.jaepel@fz-juelich.de>, Johannes Schmölder <j.schmoelder@fz-juelich.de>, Eric von Lieres <e.von.lieres@fz-juelich.de>, Hannah Lanzrath <h.lanzrath@fz-juelich.de>
License: GPLv3
Project-URL: homepage, https://github.com/cadet/CADET-RDM
Project-URL: documentation, https://cadet-rdm.readthedocs.io/en/latest/index.html
Project-URL: Bug Tracker, https://github.com/cadet/CADET-RDM/issues
Keywords: research data management
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Intended Audience :: Science/Research
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.md
Requires-Dist: gitpython>=3.1
Requires-Dist: python-gitlab
Requires-Dist: pygithub
Requires-Dist: click
Requires-Dist: tabulate
Requires-Dist: keyring
Requires-Dist: addict
Requires-Dist: numpy
Requires-Dist: pyyaml
Requires-Dist: semantic-version
Requires-Dist: docker
Requires-Dist: cookiecutter
Provides-Extra: jupyter
Requires-Dist: nbformat; extra == "jupyter"
Requires-Dist: nbconvert; extra == "jupyter"
Requires-Dist: ipylab; extra == "jupyter"
Requires-Dist: junix; extra == "jupyter"
Requires-Dist: jupytext; extra == "jupyter"
Requires-Dist: jupyterlab; extra == "jupyter"
Dynamic: license-file

# CADET-RDM

[![CI](https://github.com/cadet/CADET-RDM/actions/workflows/CI.yml/badge.svg)](https://github.com/cadet/CADET-RDM/actions/workflows/CI.yml)
[![Documentation](https://readthedocs.org/projects/cadet-rdm/badge/?version=latest)](https://cadet-rdm.readthedocs.io)
[![License](https://img.shields.io/github/license/cadet/cadet-rdm)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.11%2B-blue)](https://www.python.org/)

CADET-RDM is a Research Data Management toolbox developed at Forschungszentrum Jülich.
It supports computational research projects by tracking code, data, environments, and generated results in a reproducible and shareable way.

The toolbox is domain-agnostic and can be applied to any computational project with a structured workflow.


## Scope and purpose

CADET-RDM helps manage and version

- input data
- source code
- configurations and metadata
- software and environment versions
- generated output data

The primary goal is to ensure reproducibility, traceability, and reuse of computational results by explicitly linking them to the project state that produced them.


## Repository structure

A CADET-RDM project consists of two separate but linked Git repositories:

1. **Project repository**
   Contains source code, configuration files, documentation, and metadata required to execute the computations.

2. **Output repository**
   Contains the results generated by running the project code, including data products, models, figures, and run-specific metadata.

Both repositories have separate Git histories and remotes. CADET-RDM provides workflows that operate on both repositories to maintain a consistent link between code and results.

## Using CADET-RDM

### Result tracking and reproducibility

Each execution of project code creates a new output branch that contains only the files generated by that run.

In addition, a central run history records

- the project repository commit used for the run
- software and environment information
- metadata required to reproduce the result

This commit structure allows results to be reproduced and inspected without manual bookkeeping.
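
As a rough illustration of what one entry in such a run history might hold, the sketch below models a record with Python's standard library only. The class `RunRecord`, its field names, and the JSON layout are hypothetical and chosen for this example; they do not describe the format CADET-RDM actually writes.

```python
import json
import platform
import sys
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    """Hypothetical run-history entry; field names are illustrative only."""

    project_commit: str  # commit of the project repository used for the run
    output_branch: str   # branch in the output repository holding the results
    # Environment information captured when the record is created.
    python_version: str = field(default_factory=lambda: sys.version.split()[0])
    system: str = field(default_factory=platform.platform)
    # Free-form metadata needed to reproduce the result.
    metadata: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the record to a stable, human-readable JSON string."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


record = RunRecord(
    project_commit="0123abc",               # placeholder commit hash
    output_branch="example-output-branch",  # placeholder branch name
    metadata={"command": "python main.py"},
)
print(record.to_json())
```

Keeping the project commit and the output branch in the same record is what lets a result be traced back to the code state that produced it.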

### Interfaces

CADET-RDM can be used through

* a **command line interface (CLI)**, e.g. for scripted or automated bash workflows
* a **Python interface**, e.g. for direct context tracking of code within existing Python workflows

CADET-RDM can also be used within JupyterLab, with some limitations.

Detailed descriptions of commands and APIs are provided in the dedicated interface documentation.

* [Command line interface](https://cadet-rdm.readthedocs.io/en/latest/user_guide/command-line-interface.html)
* [Python interface](https://cadet-rdm.readthedocs.io/en/latest/user_guide/python-interface.html)
* [Jupyter interface](https://cadet-rdm.readthedocs.io/en/latest/user_guide/jupyter-interface.html)

### Typical workflow

1. Initialize or clone a CADET-RDM project
2. Develop and commit project code
3. Execute computations with CADET-RDM result tracking
4. Generate versioned output branches automatically
5. Push project and output repositories to their remotes
6. Reuse or reference results via their output branches


Results are referenced by unique output branch names that encode the timestamp, active project branch, and project commit hash. CADET-RDM provides a local cache mechanism that allows results from previous runs or from other CADET-RDM projects to be reused as input data while preserving provenance information.
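
To make the naming idea concrete, the following sketch combines a timestamp, the active project branch, and the project commit hash into a single branch name. The function `output_branch_name`, the underscore separator, the timestamp format, and the 7-character hash prefix are all assumptions made for illustration; the exact scheme CADET-RDM uses may differ.

```python
from datetime import datetime


def output_branch_name(timestamp: datetime, project_branch: str, commit_hash: str) -> str:
    """Join the three components named in the text into one branch name.

    The separator, timestamp format, and 7-character hash prefix are
    assumptions made for this sketch, not CADET-RDM's actual scheme.
    """
    stamp = timestamp.strftime("%Y-%m-%d_%H-%M-%S")
    # Git allows "/" in branch names; replacing it keeps the result flat.
    safe_branch = project_branch.replace("/", "-")
    return f"{stamp}_{safe_branch}_{commit_hash[:7]}"


name = output_branch_name(
    datetime(2024, 1, 31, 12, 0, 5),  # example run time
    "feature/fit",                    # example project branch
    "0123456789abcdef",               # example commit hash
)
print(name)  # 2024-01-31_12-00-05_feature-fit_0123456
```

Because the name carries the time of the run and the code state together, two runs of the same commit still receive distinct branch names.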


## Getting started

The full documentation is available at
https://cadet-rdm.readthedocs.io

It includes installation instructions, usage guides for the different interfaces, and detailed descriptions of repository and result management workflows.


## Project information

- **License:** see [LICENSE](LICENSE)
- **Authors and contributors:** see [AUTHORS](AUTHORS.md)
