Metadata-Version: 2.1
Name: metadata_guardian
Version: 0.2.7
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Dist: rich
Requires-Dist: loguru
Requires-Dist: pyarrow
Requires-Dist: typer
Requires-Dist: pydantic
Requires-Dist: google-cloud-bigquery; extra == 'gcp'
Requires-Dist: confluent-kafka; extra == 'kafka_schema_registry'
Requires-Dist: boto3; extra == 'aws'
Requires-Dist: boto3-stubs[athena,glue]; extra == 'aws'
Requires-Dist: avro; extra == 'avro'
Requires-Dist: deltalake; extra == 'deltalake'
Requires-Dist: pandas; extra == 'deltalake'
Requires-Dist: avro; extra == 'all'
Requires-Dist: snowflake-connector-python; extra == 'all'
Requires-Dist: boto3; extra == 'all'
Requires-Dist: boto3-stubs[athena,glue]; extra == 'all'
Requires-Dist: deltalake; extra == 'all'
Requires-Dist: google-cloud-bigquery; extra == 'all'
Requires-Dist: confluent-kafka; extra == 'all'
Requires-Dist: PyMySQL; extra == 'all'
Requires-Dist: types-PyMySQL; extra == 'all'
Requires-Dist: pandas; extra == 'all'
Requires-Dist: snowflake-connector-python; extra == 'snowflake'
Requires-Dist: PyMySQL; extra == 'mysql'
Requires-Dist: types-PyMySQL; extra == 'mysql'
Requires-Dist: mypy; extra == 'devel'
Requires-Dist: black; extra == 'devel'
Requires-Dist: isort; extra == 'devel'
Requires-Dist: pytest; extra == 'devel'
Requires-Dist: pytest-mock; extra == 'devel'
Requires-Dist: pytest-cov; extra == 'devel'
Requires-Dist: pytest-xdist; extra == 'devel'
Requires-Dist: pytest-clarity; extra == 'devel'
Requires-Dist: sphinx; extra == 'devel'
Requires-Dist: pydata-sphinx-theme; extra == 'devel'
Requires-Dist: toml; extra == 'devel'
Provides-Extra: gcp
Provides-Extra: kafka_schema_registry
Provides-Extra: aws
Provides-Extra: avro
Provides-Extra: deltalake
Provides-Extra: all
Provides-Extra: snowflake
Provides-Extra: mysql
Provides-Extra: devel
License-File: LICENSE.txt
Summary: MetadataGuardian is used to protect data by searching the source metadata.
Keywords: pii,inclusion,biais,metadata_dataguardian,metadata,guardian
Home-Page: https://fvaleye.github.io/metadata-guardian/python
Author: Florian Valeye <fvaleye@github.com>
Author-email: Florian Valeye <fvaleye@github.com>
License: Apache-2.0
Requires-Python: >=3.7
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: documentation, https://fvaleye.github.io/metadata-guardian/python/
Project-URL: repository, https://github.com/fvaleye/metadata-guardian

Metadata Guardian
=================

[![PyPI](https://img.shields.io/pypi/v/metadata_guardian.svg?style=flat-square)](https://pypi.org/project/metadata-guardian/)
[![userdoc](https://img.shields.io/badge/docs-user-blue)](https://fvaleye.github.io/metadata-guardian/python/)
[![apidoc](https://img.shields.io/badge/docs-api-blue)](https://fvaleye.github.io/metadata-guardian/python/api_reference.html)

## Overview

Metadata Guardian is a Python package that helps protect your data sources by searching their metadata.
It matches the metadata against data rules to detect the items you want to protect, such as PII.
The multi-regex matching is implemented in Rust for speed.

## Usage

Metadata Guardian combines the data sources available in the Python ecosystem with Rust, which provides fast multi-regex processing with [regex](https://github.com/rust-lang/regex) and parallelizes the work with [rayon](https://github.com/rayon-rs/rayon).
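
To make the idea of rule-based metadata scanning concrete, here is a minimal sketch that uses only the Python standard library. It is not the `metadata_guardian` API: the `RULES` dictionary, its patterns, and the `scan_metadata` function are hypothetical names chosen for illustration, and the matching runs sequentially with `re` instead of the Rust implementation.

```python
import re
from typing import Dict, List

# Hypothetical rules for illustration only; the project's real rules
# live in the YAML files linked under "Data Rules".
RULES: Dict[str, str] = {
    "email": r"e[-_]?mail",
    "phone": r"phone|mobile",
}


def scan_metadata(column_names: List[str], rules: Dict[str, str]) -> Dict[str, List[str]]:
    """Return, for each column name, the names of the rules it matches."""
    # Compile every pattern once, case-insensitively.
    compiled = {name: re.compile(pattern, re.IGNORECASE) for name, pattern in rules.items()}
    results: Dict[str, List[str]] = {}
    for column in column_names:
        matched = [name for name, regex in compiled.items() if regex.search(column)]
        if matched:
            results[column] = matched
    return results


columns = ["user_email", "created_at", "Mobile_Number"]
report = scan_metadata(columns, RULES)
print(report)
```

Only the metadata (here, column names) is inspected, never the stored values, which is what allows a scan to cover many sources cheaply.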

## Data Rules
- [PII](https://github.com/fvaleye/metadata-guardian/blob/main/python/metadata_guardian/rules/pii_rules.yaml)
- [INCLUSION](https://github.com/fvaleye/metadata-guardian/blob/main/python/metadata_guardian/rules/inclusion_rules.yaml)

## Python Development

Set up the virtualenv:
```sh
make setup-venv
```

Install the library in development mode inside the virtualenv:
```sh
make develop
```

Launch the tests:
```sh
make unit-test
```

Format the code and run the checks:
```sh
make format
make check-rust
make check-python
```

Build the documentation locally:
```sh
make build-documentation
```

