Metadata Store¶
The metadata store is a centralized registry for the schemas, coercion rules, validation rules, and metadata attached to each data contract.
Overview¶
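from pycharter import SQLiteMetadataStore

# Minimal JSON Schema used for illustration
schema = {"type": "object", "properties": {"name": {"type": "string"}}}

store = SQLiteMetadataStore("metadata.db")
store.connect()
# Arguments follow the MetadataStoreClient interface: contract name, contract version, schema
schema_id = store.store_schema("user", "1.0.0", schema)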
Store Implementations¶
| Class | Backend | Use Case |
|---|---|---|
| InMemoryMetadataStore | Memory | Testing |
| SQLiteMetadataStore | SQLite | Development |
| PostgresMetadataStore | PostgreSQL | Production |
| MongoDBMetadataStore | MongoDB | Document-oriented |
| RedisMetadataStore | Redis | Caching |
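One way to read the table is as a mapping from connection-string scheme to store class. The sketch below is only an illustration of that idea: `SCHEME_TO_STORE` and `pick_store_class` are hypothetical names, the function returns class names as strings rather than importing pycharter, and the fallback to SQLite for plain file paths is an assumption.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the store class named in the table.
SCHEME_TO_STORE = {
    "postgresql": "PostgresMetadataStore",
    "mongodb": "MongoDBMetadataStore",
    "redis": "RedisMetadataStore",
}


def pick_store_class(connection_string: str | None) -> str:
    """Return the name of the store class that matches a connection string."""
    if connection_string is None:
        # No backend configured: assume the in-memory store (testing).
        return "InMemoryMetadataStore"
    scheme = urlparse(connection_string).scheme
    # Assumption: anything without a known scheme is treated as a SQLite file path.
    return SCHEME_TO_STORE.get(scheme, "SQLiteMetadataStore")


choice = pick_store_class("postgresql://user:pass@localhost/pycharter")
```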
MetadataStoreClient¶
Base interface for all stores:
MetadataStoreClient¶
Bases: ABC
Client for storing and retrieving metadata. All operations are keyed by data contract (contract_name, contract_version). Implementations extend this for specific backends (PostgreSQL, SQLite, MongoDB, etc.).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| connection_string | str \| None | Database connection string (format depends on implementation) | None |
connect (abstractmethod)¶
Establish database connection. Subclasses must implement.
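To illustrate what "subclasses must implement" means in practice, here is a minimal sketch built on Python's `abc` module. `SketchStoreClient`, `IncompleteStore`, and `DictStore` are hypothetical stand-ins written for this example, not pycharter classes; only the `connect` method and the `connection_string` parameter come from the interface described here.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SketchStoreClient(ABC):
    """Hypothetical stand-in for the base interface (not pycharter's class)."""

    def __init__(self, connection_string: str | None = None) -> None:
        self.connection_string = connection_string

    @abstractmethod
    def connect(self) -> None:
        """Establish the backend connection."""


class IncompleteStore(SketchStoreClient):
    """Does not implement connect(), so it cannot be instantiated."""


class DictStore(SketchStoreClient):
    """Implements connect() with a plain dict standing in for a backend."""

    def connect(self) -> None:
        self.data: dict = {}


try:
    IncompleteStore()
    instantiated = True
except TypeError:
    # ABC refuses to instantiate a class with unimplemented abstract methods.
    instantiated = False

store = DictStore()
store.connect()
```

The same mechanism applies to every method marked abstractmethod: a backend class is only usable once all of them are defined.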
store_schema (abstractmethod)¶
Store a JSON Schema for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
| schema | dict[str, Any] | JSON Schema dictionary (may contain "version" for schema artifact version). | required |
Returns:
| Type | Description |
|---|---|
| str | Contract ID or identifier (implementation-defined). |
get_schema (abstractmethod)¶
Retrieve the schema for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| None | Schema dictionary with version included, or None if not found. |
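Because `get_schema` returns None for an unknown contract, callers may want to turn a missing schema into an explicit error rather than pass None along. The following is a sketch under stated assumptions: `StubStore` is a hypothetical stand-in that mimics only the two schema methods, and `require_schema` is an illustrative helper, not part of pycharter.

```python
from __future__ import annotations

from typing import Any, Optional


class StubStore:
    """Hypothetical stand-in exposing only the schema methods (not a pycharter class)."""

    def __init__(self) -> None:
        # Schemas keyed by (contract_name, contract_version).
        self._schemas: dict[tuple[str, str], dict[str, Any]] = {}

    def store_schema(self, contract_name: str, contract_version: str, schema: dict[str, Any]) -> str:
        self._schemas[(contract_name, contract_version)] = schema
        return f"{contract_name}:{contract_version}"

    def get_schema(self, contract_name: str, contract_version: str) -> Optional[dict[str, Any]]:
        return self._schemas.get((contract_name, contract_version))


def require_schema(store: Any, contract_name: str, contract_version: str) -> dict[str, Any]:
    """Return the schema, or raise LookupError when the store returns None."""
    schema = store.get_schema(contract_name, contract_version)
    if schema is None:
        raise LookupError(f"no schema for {contract_name} {contract_version}")
    return schema


store = StubStore()
store.store_schema("user", "1.0.0", {"type": "object"})
found = require_schema(store, "user", "1.0.0")

try:
    require_schema(store, "user", "9.9.9")
    missing_raised = False
except LookupError:
    missing_raised = True
```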
store_coercion_rules (abstractmethod)¶
store_coercion_rules(
    contract_name: str,
    contract_version: str,
    coercion_rules: dict[str, Any],
) -> str
Store coercion rules for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
| coercion_rules | dict[str, Any] | Dict mapping field names to coercion function names. | required |
Returns:
| Type | Description |
|---|---|
| str | Coercion rules ID or identifier. |
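To make "a dict mapping field names to coercion function names" concrete, the sketch below shows how such a mapping could be applied to a record. The `COERCIONS` registry and `apply_coercions` function are hypothetical and written for this example; `coerce_to_integer` is the name used in this page's examples, but its real implementation in pycharter may differ from the plain `int` used here.

```python
from __future__ import annotations

from typing import Any, Callable

# Hypothetical registry mapping coercion function names to callables.
COERCIONS: dict[str, Callable[[Any], Any]] = {
    "coerce_to_integer": int,
    "coerce_to_string": str,
}


def apply_coercions(record: dict[str, Any], rules: dict[str, str]) -> dict[str, Any]:
    """Return a copy of record with each listed field passed through its coercion."""
    coerced = dict(record)
    for field, function_name in rules.items():
        if field in coerced:  # fields absent from the record are left alone
            coerced[field] = COERCIONS[function_name](coerced[field])
    return coerced


rules = {"age": "coerce_to_integer"}
result = apply_coercions({"name": "Ada", "age": "36"}, rules)
```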
get_coercion_rules (abstractmethod)¶
Retrieve coercion rules for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| None | Coercion rules dictionary or None if not found. |
store_validation_rules (abstractmethod)¶
store_validation_rules(
    contract_name: str,
    contract_version: str,
    validation_rules: dict[str, Any],
) -> str
Store validation rules for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
| validation_rules | dict[str, Any] | Dict mapping field names to validation configs. | required |
Returns:
| Type | Description |
|---|---|
| str | Validation rules ID or identifier. |
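As a rough picture of "a dict mapping field names to validation configs", the sketch below evaluates such a dict against a record. Everything here is an assumption made for illustration: the `VALIDATORS` registry, the `validate` function, and the `max_length` rule with its `limit` option are hypothetical. Only the `is_positive` rule name and the `{field: {rule: config}}` shape are taken from this page's examples.

```python
from __future__ import annotations

from typing import Any, Callable

# Hypothetical validators: each takes the field value and the rule's config dict.
VALIDATORS: dict[str, Callable[[Any, dict], bool]] = {
    "is_positive": lambda value, config: value > 0,
    "max_length": lambda value, config: len(value) <= config["limit"],
}


def validate(record: dict[str, Any], rules: dict[str, dict]) -> list[str]:
    """Return a 'field: rule' string for every check that fails."""
    failures = []
    for field, checks in rules.items():
        for rule_name, config in checks.items():
            if not VALIDATORS[rule_name](record[field], config):
                failures.append(f"{field}: {rule_name}")
    return failures


rules = {"age": {"is_positive": {}}, "name": {"max_length": {"limit": 5}}}
ok = validate({"age": 36, "name": "Ada"}, rules)
bad = validate({"age": -1, "name": "Augusta"}, rules)
```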
get_validation_rules (abstractmethod)¶
Retrieve validation rules for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| None | Validation rules dictionary or None if not found. |
store_metadata (abstractmethod)¶
Store metadata for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
| metadata | dict[str, Any] | Metadata dictionary. | required |
Returns:
| Type | Description |
|---|---|
| str | Metadata record ID or identifier. |
get_metadata (abstractmethod)¶
Retrieve metadata for the given data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contract_name | str | Data contract name. | required |
| contract_version | str | Data contract version. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| None | Metadata dictionary or None if not found. |
SQLiteMetadataStore¶
from pycharter import SQLiteMetadataStore
store = SQLiteMetadataStore("metadata.db")
store.connect()
# Store schema (contract name, contract version, schema)
schema_id = store.store_schema("user", "1.0.0", {
    "type": "object",
    "properties": {"name": {"type": "string"}}
})
# Retrieve schema
schema = store.get_schema("user", "1.0.0")
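For intuition about what a SQLite-backed store does, the sketch below persists schemas as JSON text keyed by (contract name, contract version) using only the standard library. The table layout and the `save_schema` and `load_schema` helpers are assumptions made for this example, not pycharter's actual storage format or API.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, used only for this example
conn.execute(
    """
    CREATE TABLE schemas (
        contract_name TEXT NOT NULL,
        contract_version TEXT NOT NULL,
        body TEXT NOT NULL,
        PRIMARY KEY (contract_name, contract_version)
    )
    """
)


def save_schema(name: str, version: str, schema: dict) -> None:
    """Insert or overwrite the schema stored under (name, version)."""
    conn.execute(
        "INSERT OR REPLACE INTO schemas VALUES (?, ?, ?)",
        (name, version, json.dumps(schema)),
    )


def load_schema(name: str, version: str):
    """Return the stored schema dict, or None when no row matches."""
    row = conn.execute(
        "SELECT body FROM schemas WHERE contract_name = ? AND contract_version = ?",
        (name, version),
    ).fetchone()
    return json.loads(row[0]) if row else None


save_schema("user", "1.0.0", {"type": "object", "properties": {"name": {"type": "string"}}})
loaded = load_schema("user", "1.0.0")
```

The composite primary key mirrors the way the interface keys every operation by contract name and version.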
PostgresMetadataStore¶
from pycharter import PostgresMetadataStore
store = PostgresMetadataStore(
    "postgresql://user:pass@localhost/pycharter"
)
store.connect()
MongoDBMetadataStore¶
from pycharter import MongoDBMetadataStore
store = MongoDBMetadataStore(
    "mongodb://localhost:27017/pycharter"
)
store.connect()
RedisMetadataStore¶
from pycharter import RedisMetadataStore
store = RedisMetadataStore("redis://localhost:6379/0")
store.connect()
InMemoryMetadataStore¶
from pycharter import InMemoryMetadataStore
store = InMemoryMetadataStore()
store.connect()
# Data is lost when program exits
Examples¶
Schema Versioning¶
# Store two versions (contract name, contract version, schema)
store.store_schema("user", "1.0.0", schema_v1)
store.store_schema("user", "2.0.0", schema_v2)
# Get a specific version
schema = store.get_schema("user", "1.0.0")
# Get latest (MetadataStoreClient lists contract_version as required, so omitting it may depend on the store)
schema = store.get_schema("user")  # Returns 2.0.0
# List versions
versions = store.list_schema_versions("user")
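How a store decides which version is "latest" is not stated here. One plausible approach, assuming dotted numeric versions such as 1.10.0, is to compare them as tuples of integers rather than as strings. `version_key` and `latest_version` below are hypothetical helpers for illustration.

```python
from __future__ import annotations


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_version(versions: list[str]) -> str:
    """Return the highest version under numeric, component-wise ordering."""
    return max(versions, key=version_key)


versions = ["1.0.0", "2.0.0", "1.10.0", "1.9.3"]
newest = latest_version(versions)
ordered = sorted(versions, key=version_key)

# Plain string comparison would rank "1.9.3" above "1.10.0".
string_max = max(["1.9.3", "1.10.0"])
```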
Storing Rules¶
# Coercion rules (contract name, contract version, rules)
store.store_coercion_rules("user", "1.0.0", {
    "version": "1.0.0",
    "rules": {"age": "coerce_to_integer"}
})
# Validation rules (contract name, contract version, rules)
store.store_validation_rules("user", "1.0.0", {
    "version": "1.0.0",
    "rules": {"age": {"is_positive": {}}}
})
Metadata¶
# Store metadata (contract name, contract version, metadata)
store.store_metadata("user", "1.0.0", {
    "title": "User Schema",
    "owner": "data-team",
    "tags": ["user", "pii"]
})
# Retrieve
metadata = store.get_metadata("user", "1.0.0")
Ontology (wiki-enabled stores)¶
Stores that support the wiki (e.g. PostgresMetadataStore, InMemoryMetadataStore) can store and retrieve ontology annotations per contract:
# Store ontology for a contract
store.store_ontology("user", "1.0.0", {
    "version": "1.0.0",
    "fields": {
        "email": {"concept": "user_email", "definition": "Primary email address"}
    }
})
# Retrieve ontology
ontology = store.get_ontology("user", "1.0.0")
When you call store.get_contract(name, version), the returned contract dict includes ontology when available.
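Since the ontology is only present "when available", code that consumes a contract dict should tolerate its absence. The sketch below assumes the ontology sits under an "ontology" key with the same "fields" layout as the example above. That key name and the `field_definition` helper are assumptions for illustration, not confirmed pycharter behavior.

```python
from __future__ import annotations

from typing import Any, Optional


def field_definition(contract: dict[str, Any], field: str) -> Optional[str]:
    """Return the ontology definition for a field, or None if there is none."""
    # Assumption: the contract dict stores ontology under the "ontology" key.
    ontology = contract.get("ontology") or {}
    return ontology.get("fields", {}).get(field, {}).get("definition")


with_ontology = {
    "schema": {"type": "object"},
    "ontology": {
        "version": "1.0.0",
        "fields": {
            "email": {"concept": "user_email", "definition": "Primary email address"}
        },
    },
}
without_ontology = {"schema": {"type": "object"}}

definition = field_definition(with_ontology, "email")
missing = field_definition(without_ontology, "email")
```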