API Reference
Complete reference documentation for PyCharter's Python API.
Module Overview
pycharter
├── Pipeline # ETL pipeline orchestration
├── Validator # Data validation
├── QualityCheck # Quality monitoring
├── etl_generator/ # ETL components
│ ├── extractors/ # Data extraction
│ ├── transformers/ # Data transformation
│ ├── loaders/ # Data loading
│ ├── state # Incremental extraction state stores
│ └── testing # Mock components and test harness
├── contract_parser/ # Contract parsing
├── contract_builder/ # Contract building
├── metadata_store/ # Schema registry
├── schema_evolution/ # Schema versioning
├── pydantic_generator/ # Model generation
├── json_schema_converter/ # Schema conversion
├── docs_generator/ # Contract documentation generation
├── domain/ # Lifecycle binding (FSM integration)
├── wiki/ # Ontology, knowledge graph, governance
└── shared/ # Utilities and errors
Quick Links
Core Classes
ETL Components
Contract Management
Storage
Utilities & Extensions
Import Patterns
Recommended Imports
# Core classes
from pycharter import Pipeline, Validator, QualityCheck
# ETL components
from pycharter import (
    HTTPExtractor, FileExtractor, DatabaseExtractor, CloudStorageExtractor,
    Rename, Filter, AddField, Drop, Select, Convert, CustomFunction,
    PostgresLoader, FileLoader, CloudStorageLoader,
)
# Metadata stores
from pycharter import (
    InMemoryMetadataStore,
    SQLiteMetadataStore,
    PostgresMetadataStore,
    MongoDBMetadataStore,
    RedisMetadataStore,
)
# Convenience functions
from pycharter import (
    from_dict, from_file, from_json,
    to_dict, to_file, to_json,
    validate, validate_batch,
    parse_contract_file, build_contract,
)
# Errors
from pycharter.shared.errors import (
    PyCharterError,
    ConfigError,
    ConfigValidationError,
    ExpressionError,
)
Type Annotations
PyCharter is fully typed and ships a py.typed marker, so type checkers pick up its annotations:
from pycharter import Validator, ValidationResult
def process_data(validator: Validator, data: dict) -> ValidationResult:
    return validator.validate(data)
Async Support
All pipeline operations are async, so pipeline.run() must be awaited or passed to asyncio.run():
import asyncio
from pycharter import Pipeline
# `pipeline` is assumed to be an already-configured Pipeline instance
# From a script
result = asyncio.run(pipeline.run())
# From an async function
async def main():
    result = await pipeline.run()
    return result
See the Async Execution Model guide for detailed guidance on running pipelines from scripts, FastAPI, notebooks, and Celery.
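Because run() is a coroutine, several pipelines can be awaited concurrently from a single event loop. The following is a minimal sketch of that pattern, not PyCharter's documented behavior: StubPipeline is a hypothetical stand-in for a configured Pipeline, and the shape of the returned dict is an assumption made only so the example runs without PyCharter installed.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubPipeline:
    """Hypothetical stand-in for a configured Pipeline; not part of PyCharter."""

    name: str
    delay: float = 0.01

    async def run(self) -> dict:
        # Simulate awaiting I/O, as an extract or load step would.
        await asyncio.sleep(self.delay)
        # Assumed result shape, for illustration only.
        return {"pipeline": self.name, "status": "ok"}


async def run_all(pipelines: list[StubPipeline]) -> list[dict]:
    # gather() runs every run() coroutine concurrently and
    # returns the results in the same order as the inputs.
    return await asyncio.gather(*(p.run() for p in pipelines))


def main() -> list[dict]:
    pipelines = [StubPipeline("orders"), StubPipeline("customers")]
    # asyncio.run() creates and closes the event loop, so call it
    # once at the script entry point rather than once per pipeline.
    return asyncio.run(run_all(pipelines))


if __name__ == "__main__":
    print(main())
```

With real Pipeline objects the same structure should apply as long as their run() coroutines are independent of one another. Inside an already running loop (for example a FastAPI handler), await run_all(...) directly instead of calling asyncio.run().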