Metadata-Version: 2.1
Name: mex-artificial
Version: 0.5.3
Summary: Create artificial data for the MEx project.
Author-Email: MEx Team <mex@rki.de>
License: MIT License
         
         Copyright (c) 2025 Robert Koch-Institut
         
         Permission is hereby granted, free of charge, to any person obtaining a copy
         of this software and associated documentation files (the "Software"), to deal
         in the Software without restriction, including without limitation the rights
         to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
         copies of the Software, and to permit persons to whom the Software is
         furnished to do so, subject to the following conditions:
         
         The above copyright notice and this permission notice shall be included in all
         copies or substantial portions of the Software.
         
         THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
         IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
         FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
         AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
         LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
         OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
         SOFTWARE.
         
Project-URL: Repository, https://github.com/robert-koch-institut/mex-artificial
Requires-Python: <3.13,>=3.11
Requires-Dist: annotated-types<=0.7
Requires-Dist: faker<38,>=37
Requires-Dist: mex-common>=0.64
Requires-Dist: pydantic>=2
Requires-Dist: typer>=0.13
Provides-Extra: dev
Requires-Dist: ipdb>=0.13; extra == "dev"
Requires-Dist: mypy>=1; extra == "dev"
Requires-Dist: pytest-cov>=6; extra == "dev"
Requires-Dist: pytest-random-order>=1; extra == "dev"
Requires-Dist: pytest-xdist>=3; extra == "dev"
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: ruff>=0.12; extra == "dev"
Requires-Dist: sphinx>=8; extra == "dev"
Description-Content-Type: text/markdown

# MEx artificial

Create artificial data for the MEx project.

[![cookiecutter](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/cookiecutter.yml/badge.svg)](https://github.com/robert-koch-institut/mex-template)
[![cve-scan](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/cve-scan.yml/badge.svg)](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/cve-scan.yml)
[![documentation](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/documentation.yml/badge.svg)](https://robert-koch-institut.github.io/mex-artificial)
[![linting](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/linting.yml/badge.svg)](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/linting.yml)
[![open-code](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/open-code.yml/badge.svg)](https://gitlab.opencode.de/robert-koch-institut/mex/mex-artificial)
[![testing](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/testing.yml/badge.svg)](https://github.com/robert-koch-institut/mex-artificial/actions/workflows/testing.yml)

## Project

The Metadata Exchange (MEx) project is committed to improving the retrieval of RKI
research data and projects. How? By focusing on metadata: instead of providing the
actual research data directly, the MEx metadata catalog captures descriptive information
about research data and activities. On this basis, we want to make the data FAIR[^1] so
that it can be shared with others.

Via MEx, metadata will be made findable, accessible and shareable, as well as available
for further research. The goal is to get an overview of what research data is available,
understand its context, and know what needs to be considered for subsequent use.

RKI cooperated with D4L data4life gGmbH for a pilot phase where the vision of a
FAIR metadata catalog was explored and concepts and prototypes were developed.
The partnership has ended with the successful conclusion of the pilot phase.

After an internal launch, the metadata will also be made publicly available, so that
external researchers as well as the interested (professional) public can find research
data from the RKI.

For further details, please consult our
[project page](https://www.rki.de/DE/Aktuelles/Publikationen/Forschungsdaten/MEx/metadata-exchange-plattform-mex-node.html).

[^1]: FAIR refers to the so-called
[FAIR data principles](https://www.go-fair.org/fair-principles/) – guidelines to make
data Findable, Accessible, Interoperable and Reusable.

**Contact** \
For more information, please feel free to email us at [mex@rki.de](mailto:mex@rki.de).

### Publisher

**Robert Koch-Institut** \
Nordufer 20 \
13353 Berlin \
Germany

## Package

Create artificial extracted items, transform them into merged items and write the
results into a configured sink.

## License

This package is licensed under the [MIT license](/LICENSE). All other software
components of the MEx project are open-sourced under the same license.

## Development

### Installation

- on unix, consider using pyenv https://github.com/pyenv/pyenv
  - get pyenv `curl https://pyenv.run | bash`
  - install 3.11 `pyenv install 3.11`
  - switch version `pyenv global 3.11`
  - run `make install`
- on windows, consider using pyenv-win https://pyenv-win.github.io/pyenv-win/
  - follow https://pyenv-win.github.io/pyenv-win/#quick-start
  - install 3.11 `pyenv install 3.11`
  - switch version `pyenv global 3.11`
  - run `.\mex.bat install`

### Linting and testing

- run all linters with `pdm lint`
- run only unit tests with `pdm unit`
- run unit and integration tests with `pdm test`

### Updating dependencies

- update boilerplate files with `cruft update`
- update global requirements in `requirements.txt` manually
- update git hooks with `pre-commit autoupdate`
- update package dependencies using `pdm update-all`
- update github actions in `.github/workflows/*.yml` manually

### Creating a release

- run `pdm release RULE` to release a new version, where `RULE` determines which part
  of the version to update and is one of `major`, `minor`, `patch`
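
To illustrate what each rule means, here is a minimal sketch that assumes conventional
semantic versioning (`MAJOR.MINOR.PATCH`). `bump_version` is a hypothetical helper
written for this example only; it is not part of this package or of pdm, and the actual
release script may behave differently.

```python
def bump_version(version: str, rule: str) -> str:
    """Return `version` with the part named by `rule` incremented.

    Hypothetical helper: assumes a plain MAJOR.MINOR.PATCH version string.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        # A major bump resets the minor and patch parts.
        return f"{major + 1}.0.0"
    if rule == "minor":
        # A minor bump resets the patch part.
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown rule: {rule!r}")


print(bump_version("0.5.3", "patch"))  # 0.5.4
print(bump_version("0.5.3", "minor"))  # 0.6.0
print(bump_version("0.5.3", "major"))  # 1.0.0
```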

### Container workflow

- build image with `make image`
- run local version using docker with `make run`

### Pre-built workflow

- you can run the latest artificial data generator without building it locally
- just pull it from the container registry and configure it using CLI arguments
- `docker run -v $(pwd):/out ghcr.io/robert-koch-institut/mex-artificial:latest --count=100 --chattiness=10`
- use `-v $(pwd):/out` to mount the current directory as the output directory for the
  resulting `ndjson` file
- `--count` controls the number of items to generate
- `--chattiness` controls the number of words in textual fields
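
Once the container has written its output, the `ndjson` file can be inspected with a few
lines of Python. The sketch below only relies on the general `ndjson` convention of one
JSON object per line. The helpers `read_ndjson` and `count_by_key` are hypothetical and
not part of this package, and the name of the output file and the field to group by are
assumptions you need to adapt.

```python
import json
from collections import Counter
from pathlib import Path


def read_ndjson(path: Path) -> list[dict]:
    """Parse one JSON object per non-empty line of the given file."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def count_by_key(items: list[dict], key: str) -> Counter:
    """Count items by the value stored under `key`.

    Items that lack the key are counted under None. The values are
    expected to be hashable, for example strings.
    """
    return Counter(item.get(key) for item in items)
```

For example, `count_by_key(read_ndjson(Path("items.ndjson")), "entityType")` would give a
quick overview of the generated items per type, assuming the file is called
`items.ndjson` and each item carries an `entityType` field.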

## Commands

- run `pdm run artificial --help` to print instructions
