Metadata Store

The metadata store is a centralized registry for data contract artifacts: JSON Schemas, coercion rules, validation rules, and metadata.

Overview

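from pycharter import SQLiteMetadataStore

store = SQLiteMetadataStore("metadata.db")
store.connect()

# Minimal JSON Schema for the "user" contract
schema = {"type": "object", "properties": {"name": {"type": "string"}}}

# Arguments follow the base interface: (contract_name, contract_version, schema)
schema_id = store.store_schema("user", "1.0.0", schema)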

Store Implementations

Class                   Backend      Use Case
InMemoryMetadataStore   Memory       Testing
SQLiteMetadataStore     SQLite       Development
PostgresMetadataStore   PostgreSQL   Production
MongoDBMetadataStore    MongoDB      Document-oriented
RedisMetadataStore      Redis        Caching
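
One way to act on the table is to choose a store class from the connection string. The helper below is a hypothetical sketch and not part of pycharter: it maps a URL scheme to the class names listed above. The scheme-to-class mapping, the rule that a bare file path means SQLite, and the fallback to the in-memory store are all assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the store classes listed above.
_SCHEME_TO_STORE = {
    "postgresql": "PostgresMetadataStore",
    "mongodb": "MongoDBMetadataStore",
    "redis": "RedisMetadataStore",
    "sqlite": "SQLiteMetadataStore",
}


def store_class_name(connection_string: Optional[str]) -> str:
    """Return the name of the store class that matches a connection string."""
    if not connection_string:
        # No connection string: assume the in-memory store is wanted.
        return "InMemoryMetadataStore"
    scheme = urlparse(connection_string).scheme
    if not scheme:
        # A bare path such as "metadata.db" is treated as a SQLite file.
        return "SQLiteMetadataStore"
    try:
        return _SCHEME_TO_STORE[scheme]
    except KeyError:
        raise ValueError(f"Unsupported scheme: {scheme!r}") from None
```

The function returns a class name rather than a class so the sketch runs without pycharter installed; real code could look the name up on the `pycharter` module.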

MetadataStoreClient

Base interface for all stores:

MetadataStoreClient(connection_string: str | None = None)

Bases: ABC

Client for storing and retrieving metadata. All operations are keyed by data contract (contract_name, contract_version). Implementations extend this for specific backends (PostgreSQL, SQLite, MongoDB, etc.).

Parameters:

Name                Type         Description                                                     Default
connection_string   str | None   Database connection string (format depends on implementation)   None

connect abstractmethod

connect() -> None

Establish database connection. Subclasses must implement.

disconnect

disconnect() -> None

Close database connection.

store_schema abstractmethod

store_schema(
    contract_name: str,
    contract_version: str,
    schema: dict[str, Any],
) -> str

Store a JSON Schema for the given data contract.

Parameters:

Name               Type             Description                                                                   Default
contract_name      str              Data contract name.                                                           required
contract_version   str              Data contract version.                                                        required
schema             dict[str, Any]   JSON Schema dictionary (may contain "version" for schema artifact version).   required

Returns:

Type   Description
str    Contract ID or identifier (implementation-defined).

get_schema abstractmethod

get_schema(
    contract_name: str, contract_version: str
) -> dict[str, Any]

Retrieve the schema for the given data contract.

Parameters:

Name               Type   Description              Default
contract_name      str    Data contract name.      required
contract_version   str    Data contract version.   required

Returns:

Type             Description
dict[str, Any]   Schema dictionary with version included, or None if not found.
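
Since get_schema may return None, callers need to handle the missing case explicitly. A minimal sketch, assuming a hypothetical `require_schema` helper (not part of pycharter) that turns a missing schema into an exception; `_StubStore` is a stand-in that only implements get_schema.

```python
from typing import Any, Dict, Optional


def require_schema(store, contract_name: str, contract_version: str) -> Dict[str, Any]:
    """Return the schema, or raise LookupError instead of passing None along."""
    schema: Optional[Dict[str, Any]] = store.get_schema(contract_name, contract_version)
    if schema is None:
        raise LookupError(f"No schema stored for {contract_name}@{contract_version}")
    return schema


class _StubStore:
    """Stand-in exposing only get_schema, used to exercise the helper."""

    def __init__(self, schemas):
        self._schemas = schemas

    def get_schema(self, contract_name, contract_version):
        # Mirrors the documented behavior: None when nothing is stored.
        return self._schemas.get((contract_name, contract_version))


stub = _StubStore({("user", "1.0.0"): {"type": "object", "version": "1.0.0"}})
user_schema = require_schema(stub, "user", "1.0.0")
```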

store_coercion_rules abstractmethod

store_coercion_rules(
    contract_name: str,
    contract_version: str,
    coercion_rules: dict[str, Any],
) -> str

Store coercion rules for the given data contract.

Parameters:

Name               Type             Description                                            Default
contract_name      str              Data contract name.                                    required
contract_version   str              Data contract version.                                 required
coercion_rules     dict[str, Any]   Dict mapping field names to coercion function names.   required

Returns:

Type   Description
str    Coercion rules ID or identifier.

get_coercion_rules abstractmethod

get_coercion_rules(
    contract_name: str, contract_version: str
) -> dict[str, Any]

Retrieve coercion rules for the given data contract.

Parameters:

Name               Type   Description              Default
contract_name      str    Data contract name.      required
contract_version   str    Data contract version.   required

Returns:

Type             Description
dict[str, Any]   Coercion rules dictionary or None if not found.
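
The parameter description says coercion rules map field names to coercion function names. The sketch below shows how such a mapping could be applied to a record. It is an illustration only, not pycharter's own mechanism: the `COERCIONS` registry, the `apply_coercions` helper, and the name `coerce_to_string` are hypothetical, and `coerce_to_integer` is borrowed from the examples on this page.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: coercion function name -> callable.
COERCIONS: Dict[str, Callable[[Any], Any]] = {
    "coerce_to_integer": int,
    "coerce_to_string": str,
}


def apply_coercions(record: Dict[str, Any], rules: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of record with each listed field passed through its coercion."""
    coerced = dict(record)
    for field, function_name in rules.items():
        if field in coerced:  # fields absent from the record are skipped
            coerced[field] = COERCIONS[function_name](coerced[field])
    return coerced


result = apply_coercions({"name": "Ada", "age": "36"}, {"age": "coerce_to_integer"})
```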

store_validation_rules abstractmethod

store_validation_rules(
    contract_name: str,
    contract_version: str,
    validation_rules: dict[str, Any],
) -> str

Store validation rules for the given data contract.

Parameters:

Name               Type             Description                                       Default
contract_name      str              Data contract name.                               required
contract_version   str              Data contract version.                            required
validation_rules   dict[str, Any]   Dict mapping field names to validation configs.   required

Returns:

Type   Description
str    Validation rules ID or identifier.

get_validation_rules abstractmethod

get_validation_rules(
    contract_name: str, contract_version: str
) -> dict[str, Any]

Retrieve validation rules for the given data contract.

Parameters:

Name               Type   Description              Default
contract_name      str    Data contract name.      required
contract_version   str    Data contract version.   required

Returns:

Type             Description
dict[str, Any]   Validation rules dictionary or None if not found.
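
Validation rules are described as a mapping from field names to validation configs. A minimal sketch of evaluating such a mapping follows, assuming each config has the shape {validator_name: options}, as in the `is_positive` example on this page. The `VALIDATORS` registry and the `validate` helper are hypothetical and do not reflect how pycharter runs validations.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: validator name -> function(value, options) -> bool.
VALIDATORS: Dict[str, Callable[[Any, Dict[str, Any]], bool]] = {
    "is_positive": lambda value, options: value > 0,
}


def validate(record: Dict[str, Any], rules: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a "field: validator" string for every check that fails."""
    failures = []
    for field, checks in rules.items():
        for validator_name, options in checks.items():
            if not VALIDATORS[validator_name](record[field], options):
                failures.append(f"{field}: {validator_name}")
    return failures


rules = {"age": {"is_positive": {}}}
```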

store_metadata abstractmethod

store_metadata(
    contract_name: str,
    contract_version: str,
    metadata: dict[str, Any],
) -> str

Store metadata for the given data contract.

Parameters:

Name               Type             Description              Default
contract_name      str              Data contract name.      required
contract_version   str              Data contract version.   required
metadata           dict[str, Any]   Metadata dictionary.     required

Returns:

Type   Description
str    Metadata record ID or identifier.

get_metadata abstractmethod

get_metadata(
    contract_name: str, contract_version: str
) -> dict[str, Any]

Retrieve metadata for the given data contract.

Parameters:

Name               Type   Description              Default
contract_name      str    Data contract name.      required
contract_version   str    Data contract version.   required

Returns:

Type             Description
dict[str, Any]   Metadata dictionary or None if not found.
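
To make the interface concrete, here is a dict-backed sketch that follows the method signatures documented above, with every artifact keyed by (contract_name, contract_version). It is a standalone illustration, not pycharter's InMemoryMetadataStore: it does not subclass MetadataStoreClient (so it runs without the library), the returned identifier format is an assumption, and it returns schemas exactly as stored without adding a version field.

```python
from typing import Any, Dict, Optional, Tuple

Key = Tuple[str, str]  # (contract_name, contract_version)
KINDS = ("schema", "coercion_rules", "validation_rules", "metadata")


class DictMetadataStore:
    """Dict-backed sketch following the documented method signatures."""

    def __init__(self, connection_string: Optional[str] = None):
        self.connection_string = connection_string  # unused by this sketch
        self._data: Dict[str, Dict[Key, Dict[str, Any]]] = {}

    def connect(self) -> None:
        # Nothing to open; just create one bucket per artifact kind.
        for kind in KINDS:
            self._data.setdefault(kind, {})

    def disconnect(self) -> None:
        self._data.clear()

    def _put(self, kind: str, name: str, version: str, payload: Dict[str, Any]) -> str:
        self._data[kind][(name, version)] = dict(payload)
        return f"{kind}:{name}:{version}"  # identifier format is an assumption

    def _get(self, kind: str, name: str, version: str) -> Optional[Dict[str, Any]]:
        return self._data[kind].get((name, version))

    def store_schema(self, contract_name, contract_version, schema):
        return self._put("schema", contract_name, contract_version, schema)

    def get_schema(self, contract_name, contract_version):
        return self._get("schema", contract_name, contract_version)

    def store_coercion_rules(self, contract_name, contract_version, coercion_rules):
        return self._put("coercion_rules", contract_name, contract_version, coercion_rules)

    def get_coercion_rules(self, contract_name, contract_version):
        return self._get("coercion_rules", contract_name, contract_version)

    def store_validation_rules(self, contract_name, contract_version, validation_rules):
        return self._put("validation_rules", contract_name, contract_version, validation_rules)

    def get_validation_rules(self, contract_name, contract_version):
        return self._get("validation_rules", contract_name, contract_version)

    def store_metadata(self, contract_name, contract_version, metadata):
        return self._put("metadata", contract_name, contract_version, metadata)

    def get_metadata(self, contract_name, contract_version):
        return self._get("metadata", contract_name, contract_version)
```

Keeping one bucket per artifact kind means a schema and its rules can share the same key without overwriting each other.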

SQLiteMetadataStore

from pycharter import SQLiteMetadataStore

store = SQLiteMetadataStore("metadata.db")
store.connect()

# Store schema: (contract_name, contract_version, schema)
schema_id = store.store_schema("user", "1.0.0", {
    "type": "object",
    "properties": {"name": {"type": "string"}}
})

# Retrieve schema
schema = store.get_schema("user", "1.0.0")

PostgresMetadataStore

from pycharter import PostgresMetadataStore

store = PostgresMetadataStore(
    "postgresql://user:pass@localhost/pycharter"
)
store.connect()

MongoDBMetadataStore

from pycharter import MongoDBMetadataStore

store = MongoDBMetadataStore(
    "mongodb://localhost:27017/pycharter"
)
store.connect()

RedisMetadataStore

from pycharter import RedisMetadataStore

store = RedisMetadataStore("redis://localhost:6379/0")
store.connect()

InMemoryMetadataStore

from pycharter import InMemoryMetadataStore

store = InMemoryMetadataStore()
store.connect()
# Data is lost when the program exits

Examples

Schema Versioning

# Store versions: (contract_name, contract_version, schema)
store.store_schema("user", "1.0.0", schema_v1)
store.store_schema("user", "2.0.0", schema_v2)

# Get specific version
schema = store.get_schema("user", "1.0.0")

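# Get latest (the base interface lists contract_version as required,
# so omitting it may depend on the concrete store)
schema = store.get_schema("user")  # Returns 2.0.0

# List versions (not part of the base interface documented above)
versions = store.list_schema_versions("user")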
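
When the latest version has to be picked from a list of version strings, comparing them as plain strings gives the wrong answer once a component reaches two digits. The sketch below compares them numerically. It assumes versions are dotted integers such as "1.0.0" (pre-release tags are not handled) and that a version listing yields such strings; `version_key` and `latest_version` are hypothetical helpers, not pycharter functions.

```python
from typing import List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn "MAJOR.MINOR.PATCH" into a tuple of ints so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_version(versions: List[str]) -> str:
    """Return the highest version by numeric comparison."""
    return max(versions, key=version_key)


versions = ["1.0.0", "2.0.0", "10.0.0"]
newest = latest_version(versions)
```

Here a plain `max(versions)` would pick "2.0.0", because string comparison looks at the first character only.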

Storing Rules

# Coercion rules: (contract_name, contract_version, coercion_rules)
store.store_coercion_rules("user", "1.0.0", {
    "version": "1.0.0",
    "rules": {"age": "coerce_to_integer"}
})

# Validation rules: (contract_name, contract_version, validation_rules)
store.store_validation_rules("user", "1.0.0", {
    "version": "1.0.0",
    "rules": {"age": {"is_positive": {}}}
})

Metadata

# Store metadata: (contract_name, contract_version, metadata)
store.store_metadata("user", "1.0.0", {
    "title": "User Schema",
    "owner": "data-team",
    "tags": ["user", "pii"]
})

# Retrieve
metadata = store.get_metadata("user", "1.0.0")

Ontology (wiki-enabled stores)

Stores that support the wiki (e.g. PostgresMetadataStore, InMemoryMetadataStore) can store and retrieve ontology annotations per contract:

# Store ontology for a contract
store.store_ontology("user", "1.0.0", {
    "version": "1.0.0",
    "fields": {
        "email": {"concept": "user_email", "definition": "Primary email address"}
    }
})

# Retrieve ontology
ontology = store.get_ontology("user", "1.0.0")

When you call store.get_contract(name, version), the returned contract dict includes ontology when available.
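
Because ontology is only present when available, code that reads it from a contract dict should tolerate its absence. A minimal sketch, assuming the contract dict exposes the annotations under an "ontology" key with the same shape as the payload stored above; the key name and the `field_concept` helper are assumptions, not confirmed pycharter behavior.

```python
from typing import Any, Dict, Optional


def field_concept(contract: Dict[str, Any], field: str) -> Optional[str]:
    """Look up a field's ontology concept, or None when nothing is attached."""
    ontology = contract.get("ontology") or {}  # "ontology" key name is an assumption
    return ontology.get("fields", {}).get(field, {}).get("concept")


contract_with = {
    "ontology": {
        "version": "1.0.0",
        "fields": {
            "email": {"concept": "user_email", "definition": "Primary email address"}
        },
    }
}
contract_without = {"schema": {"type": "object"}}
```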

See Also