Contributing to PyCharter¶
Thank you for your interest in contributing to PyCharter! This guide will help you get started.
Code of Conduct¶
By participating in this project, you agree to abide by our Code of Conduct. Please be respectful and constructive in all interactions.
Getting Started¶
Development Setup¶
1. Fork and clone the repository
2. Create a virtual environment
3. Install development dependencies
4. Install pre-commit so Black and isort run on every commit (recommended)
   Without this step, formatting hooks will not run when you commit, and CI may fail with "would reformat" errors.
5. Run tests to verify setup
Project Structure¶
pycharter/
├── src/pycharter/            # Main package
│   ├── api/                  # REST API
│   ├── etl_generator/        # ETL pipelines
│   ├── quality/              # Quality monitoring
│   ├── runtime_validator/    # Validation
│   └── ...
├── tests/                    # Test suite
├── docs/                     # Documentation
└── docs/notebooks/           # Example Jupyter notebooks
Branch model and releases¶
We use a simple Git flow so anyone can contribute without learning a heavy process:
- Branches: Work on feature branches, merge into develop, then develop → main.
- Releases: Done from main. Pushing a version tag (e.g. v0.0.33) triggers the release pipeline: tests run, the package is built and published to PyPI, and a GitHub Release is created from that tag.
To release a new version (from main):
- Merge develop into main and ensure main is in a good state.
- Create and push a tag (the tag name is the version used for PyPI and the release), for example: git tag v0.0.33, then git push origin v0.0.33.
- The CI workflow will run tests, build, publish to PyPI, and create the GitHub Release. There is no need to edit the version in pyproject.toml when using tags; the tag is the source of truth for that run.
Optionally, keep pyproject.toml’s version in sync with the latest release so local installs and docs show the right number. You can use ./bin/release.sh 0.0.33 to bump version, tag, and push (the workflow still uses the tag for the published version).
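To make "the tag is the source of truth" concrete, here is a minimal sketch of deriving a version string from a tag name. The helper name version_from_tag and the strict vMAJOR.MINOR.PATCH pattern are assumptions made for this example; the real release workflow may accept other tag shapes.

```python
import re

# Assumed tag shape: "v" followed by MAJOR.MINOR.PATCH, e.g. v0.0.33.
_TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def version_from_tag(tag: str) -> str:
    """Return the version encoded in a release tag (hypothetical helper).

    Raises:
        ValueError: If the tag does not look like vMAJOR.MINOR.PATCH.
    """
    match = _TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return ".".join(match.groups())


print(version_from_tag("v0.0.33"))  # prints 0.0.33
```

Rejecting anything that does not match the pattern keeps a mistyped tag from silently becoming a published version.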
Making Changes¶
Branch Naming¶
Use descriptive branch names:
- feature/add-kafka-extractor
- fix/validation-error-handling
- docs/improve-etl-tutorial
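The naming convention above can be checked mechanically. The sketch below is only an illustration: the helper name is hypothetical, and treating feature/, fix/, and docs/ as the complete set of prefixes is an assumption drawn from the examples, not a documented project rule.

```python
import re

# Prefixes taken from the example names above; treating them as the full
# allowed set is an assumption made for this sketch.
_BRANCH_PATTERN = re.compile(r"(feature|fix|docs)/[a-z0-9]+(-[a-z0-9]+)*")


def is_descriptive_branch_name(name: str) -> bool:
    """Return True if name looks like <prefix>/<kebab-case-description>."""
    return _BRANCH_PATTERN.fullmatch(name) is not None


for candidate in (
    "feature/add-kafka-extractor",
    "fix/validation-error-handling",
    "my-branch",
):
    print(candidate, is_descriptive_branch_name(candidate))
```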
Coding Standards¶
Python Style¶
We use Black and isort for formatting:
# Format code (paths and config must match CI)
black --config pyproject.toml src/pycharter tests
isort --settings-path pyproject.toml src/pycharter tests
# Check formatting (run before pushing to avoid CI failures)
black --check --config pyproject.toml src/pycharter tests
isort --check --settings-path pyproject.toml src/pycharter tests
Pre-commit (recommended): To have Black and isort run automatically on each commit, install the hooks once with pre-commit install (see step 4 in Development Setup).
After that, Black and isort run on src/pycharter and tests when you commit. To run the hooks manually: pre-commit run --all-files.
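If you prefer a single entry point for the two check commands listed earlier in this section, they can be wrapped in a small script. This is a sketch, not a script that ships with the project: the names CHECK_COMMANDS and run_checks are hypothetical, and the commands are copied from the formatting checks shown above.

```python
import subprocess
import sys
from typing import Callable, List, Sequence

# Check commands copied from the formatting section of this guide.
CHECK_COMMANDS: List[List[str]] = [
    ["black", "--check", "--config", "pyproject.toml", "src/pycharter", "tests"],
    ["isort", "--check", "--settings-path", "pyproject.toml", "src/pycharter", "tests"],
]


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code (127 if the tool is missing)."""
    try:
        return subprocess.run(list(command), check=False).returncode
    except FileNotFoundError:
        return 127


def run_checks(runner: Callable[[Sequence[str]], int] = _run) -> int:
    """Run every check without stopping early; return 0 only if all pass."""
    failed = [command for command in CHECK_COMMANDS if runner(command) != 0]
    for command in failed:
        print("FAILED:", " ".join(command), file=sys.stderr)
    return 1 if failed else 0
```

Calling sys.exit(run_checks()) from the project root runs both checks and exits non-zero if either fails. The runner parameter exists so the function can be exercised without Black or isort installed.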
Type Hints¶
All public functions should have type hints:
from typing import Any, Dict, List

def validate_data(
    schema: Dict[str, Any],
    data: List[Dict[str, Any]],
    strict: bool = False,
) -> List[ValidationResult]:
    """Validate data against schema."""
    ...
Docstrings¶
Use Google-style docstrings:
def process_record(record: Dict[str, Any], options: Options) -> Dict[str, Any]:
    """Process a single record.

    Args:
        record: The input record to process.
        options: Processing options.

    Returns:
        The processed record.

    Raises:
        ValueError: If record is invalid.
    """
    ...
Testing¶
Running Tests¶
# All tests
pytest
# Specific file
pytest tests/test_validator.py
# With coverage
pytest --cov=pycharter --cov-report=html
# Only fast tests
pytest -m "not slow"
Writing Tests¶
- Place tests in the tests/ directory
- Name test files test_*.py
- Name test functions test_*
- Use fixtures from conftest.py
import pytest
from pycharter import Validator


class TestValidator:
    def test_validate_valid_data(self, sample_schema):
        validator = Validator.from_dict(schema=sample_schema)
        result = validator.validate({"name": "Alice"})
        assert result.is_valid

    def test_validate_invalid_data(self, sample_schema):
        validator = Validator.from_dict(schema=sample_schema)
        result = validator.validate({"name": ""})
        assert not result.is_valid
Documentation¶
Building Docs¶
# Install docs dependencies
pip install "pycharter[docs]"
# or: pip install mkdocs-material "mkdocstrings[python]"
# Serve locally (from project root, or set DOCS_ROOT)
pycharter docs serve
# or: mkdocs serve
# Build
pycharter docs build
# or: mkdocs build
Documentation Style¶
- Use clear, concise language
- Include code examples
- Link to related documentation
- Keep tutorials beginner-friendly
Documentation policy (single source of truth)¶
- Python API: Document in docstrings; MkDocs pulls from the library. New public APIs must have docstrings and a place in docs/api/. Do not duplicate Python API prose on the in-app Documentation page.
- REST API: Document via FastAPI route summary/description and models; Swagger is the reference. New endpoints are added in code; the in-app API Playground builds its list from GET /api/v1/docs/playground-routes (OpenAPI). Do not maintain a separate hand-written list of REST endpoints in the UI.
- UI: The Documentation page (nav: "API Playground") is a playground and links only: try REST endpoints, link to the full Python API (MkDocs site), and link to Swagger. No long prose about endpoints or Python API on that page.
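To illustrate why route summaries in code are enough to drive a generated endpoint list, the sketch below walks the standard paths object of an OpenAPI document and collects one entry per operation. It is not the playground's actual implementation, and the response shape of the playground-routes endpoint is not described here; the function name list_routes and the example endpoint are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# HTTP methods that can appear as keys under an OpenAPI path item.
_HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_routes(openapi: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (METHOD, path, summary) for each operation, sorted by path."""
    routes = []
    for path, item in openapi.get("paths", {}).items():
        for method, operation in item.items():
            if method in _HTTP_METHODS:
                routes.append((method.upper(), path, operation.get("summary", "")))
    return sorted(routes, key=lambda route: (route[1], route[0]))


# Hypothetical OpenAPI fragment; the endpoint and summaries are made up.
example = {
    "paths": {
        "/api/v1/example": {
            "get": {"summary": "List example items"},
            "post": {"summary": "Create an example item"},
        }
    }
}

for method, path, summary in list_routes(example):
    print(f"{method:6} {path}  {summary}")
```

Because the list is derived from the schema, a new endpoint with a summary appears automatically, which is the reason the policy forbids a hand-written list.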
Submitting Changes¶
Pull Request Process¶
- Create a feature branch
- Make your changes
  - Write code
  - Add tests
  - Update documentation
- Run checks
# Format (paths and config match CI)
black --config pyproject.toml src/pycharter tests
isort --settings-path pyproject.toml src/pycharter tests
# Type-check
mypy src/pycharter
# Test
pytest
- Commit with descriptive message
git add .
git commit -m "Add Kafka extractor for streaming data sources
- Implement KafkaExtractor class
- Add consumer group support
- Include comprehensive tests
- Update documentation"
- Push your branch to your fork, then create a Pull Request on GitHub.
PR Guidelines¶
- Title: Clear, descriptive summary
- Description: Explain what and why
- Tests: Include tests for new functionality
- Docs: Update relevant documentation
- Breaking changes: Clearly mark if applicable
PR Template¶
## Summary
Brief description of changes.
## Changes
- Added X
- Fixed Y
- Updated Z
## Testing
How to test these changes.
## Checklist
- [ ] Tests pass
- [ ] Documentation updated
- [ ] Code formatted
- [ ] No breaking changes (or documented)
Types of Contributions¶
Bug Reports¶
Open an issue with:
- PyCharter version
- Python version
- Steps to reproduce
- Expected vs actual behavior
- Error messages/stack traces
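The version details for a bug report can be collected with a few lines of standard-library Python. This is a convenience sketch rather than a project tool: the function name environment_report is hypothetical, and it assumes the installed distribution is named pycharter, as in the pip install command in the Documentation section.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(package: str = "pycharter") -> str:
    """Return a short, paste-ready summary of the installed versions."""
    try:
        package_version = version(package)
    except PackageNotFoundError:
        package_version = "not installed"
    return (
        f"{package} version: {package_version}\n"
        f"Python version: {platform.python_version()}\n"
        f"Platform: {platform.platform()}"
    )


print(environment_report())
```

Paste the output at the top of the issue, followed by the steps to reproduce and the full stack trace.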
Feature Requests¶
Open an issue with:
- Use case description
- Proposed solution
- Alternatives considered
Code Contributions¶
Good first issues are labeled good-first-issue.
Areas we'd love help with:
- New extractors (Kafka, Snowflake, etc.)
- New loaders (BigQuery, Redshift, etc.)
- Documentation improvements
- Test coverage
- Performance optimizations
Documentation¶
- Fix typos
- Improve explanations
- Add examples
- Translate documentation
Maintainer Guides¶
- Publishing to PyPI – PyPI publishing via GitHub Releases and Trusted Publishing
- Release Workflow – Step-by-step release commands and the bin/release.sh script
Getting Help¶
- GitHub Issues: Bug reports, feature requests
- GitHub Discussions: Questions, ideas
- Email: maintainers@example.com
Recognition¶
Contributors are recognized in:
- GitHub Contributors page
- Release notes
- Documentation credits
Thank you for contributing to PyCharter! 🎉