
Contributing to PyCharter

Thank you for your interest in contributing to PyCharter! This guide will help you get started.

Code of Conduct

By participating in this project, you agree to abide by our Code of Conduct. Please be respectful and constructive in all interactions.

Getting Started

Development Setup

  1. Fork and clone the repository
git clone https://github.com/YOUR_USERNAME/pycharter.git
cd pycharter
  2. Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install development dependencies
pip install -e ".[dev,api,ui,etl]"
  4. Install pre-commit so Black and isort run on every commit (recommended)
pre-commit install

Without this step, formatting hooks will not run when you commit, and CI may fail with "would reformat" errors.

  5. Run tests to verify setup
pytest

Project Structure

pycharter/
├── src/pycharter/       # Main package
│   ├── api/            # REST API
│   ├── etl_generator/  # ETL pipelines
│   ├── quality/        # Quality monitoring
│   ├── runtime_validator/  # Validation
│   └── ...
├── tests/              # Test suite
├── docs/               # Documentation
└── docs/notebooks/     # Example Jupyter notebooks

Branch model and releases

We use a simple Git flow so anyone can contribute without learning a heavy process:

  • Branches: Work on feature branches, merge them into develop, then merge develop into main.
  • Releases: Done from main. Pushing a version tag (e.g. v0.0.33) triggers the release pipeline: tests run, the package is built and published to PyPI, and a GitHub Release is created from that tag.

To release a new version (from main):

  1. Merge develop into main and ensure main is in a good state.
  2. Create and push a tag (the tag name is the version used for PyPI and the release):
    git tag v0.0.33
    git push origin v0.0.33
    
  3. The CI workflow will run tests, build, publish to PyPI, and create the GitHub Release. No need to edit pyproject.toml for the version when using tags—the tag is the source of truth for that run.

Optionally, keep pyproject.toml’s version in sync with the latest release so local installs and docs show the right number. You can use ./bin/release.sh 0.0.33 to bump version, tag, and push (the workflow still uses the tag for the published version).
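
The release steps above rely on the tag name doubling as the published version. As an illustration only, the sketch below shows one way a tag such as v0.0.33 could be mapped to a version string. The helper name tag_to_version and the strict three-part pattern are assumptions made for this example, not the project's actual pipeline code.

```python
import re

# Assumed tag shape: a leading "v" followed by MAJOR.MINOR.PATCH (e.g. v0.0.33).
_TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def tag_to_version(tag: str) -> str:
    """Return the version encoded in a release tag (hypothetical helper).

    Args:
        tag: A Git tag name such as "v0.0.33".

    Returns:
        The version without the leading "v", e.g. "0.0.33".

    Raises:
        ValueError: If the tag does not look like vMAJOR.MINOR.PATCH.
    """
    match = _TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return ".".join(match.groups())
```

A check like this could run before tagging to catch a mistyped tag (for example, one missing the leading "v") before it reaches CI.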

Making Changes

Branch Naming

Use descriptive branch names:

  • feature/add-kafka-extractor
  • fix/validation-error-handling
  • docs/improve-etl-tutorial
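
To make the convention concrete, here is a minimal sketch of a branch-name check. The allowed prefixes are taken from the three examples above; treating them as the complete set, and requiring lowercase words joined by hyphens, are assumptions of this sketch rather than rules the project states or enforces.

```python
import re

# Prefixes come from the examples above; the exact allowed set is an assumption.
_BRANCH_PATTERN = re.compile(r"(feature|fix|docs)/[a-z0-9]+(-[a-z0-9]+)*")


def is_descriptive_branch(name: str) -> bool:
    """Return True if name looks like <prefix>/<hyphenated-description>."""
    return _BRANCH_PATTERN.fullmatch(name) is not None
```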

Coding Standards

Python Style

We use Black and isort for formatting:

# Format code (paths and config must match CI)
black --config pyproject.toml src/pycharter tests
isort --settings-path pyproject.toml src/pycharter tests

# Check formatting (run before pushing to avoid CI failures)
black --check --config pyproject.toml src/pycharter tests
isort --check --settings-path pyproject.toml src/pycharter tests

Pre-commit (recommended): to have Black and isort run automatically on each commit, install the hooks once (see step 4 in Development Setup):

pre-commit install

After that, Black and isort run on src/pycharter and tests when you commit. To run the hooks manually: pre-commit run --all-files.

Type Hints

All public functions should have type hints:

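from typing import Any, Dict, List

# ValidationResult stands for the project's result type; import it from wherever it is defined.


def validate_data(
    schema: Dict[str, Any],
    data: List[Dict[str, Any]],
    strict: bool = False,
) -> List[ValidationResult]:
    """Validate data against schema."""
    ...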

Docstrings

Use Google-style docstrings:

from typing import Any, Dict

# Options is a placeholder type for this example.


def process_record(record: Dict[str, Any], options: Options) -> Dict[str, Any]:
    """Process a single record.

    Args:
        record: The input record to process.
        options: Processing options.

    Returns:
        The processed record.

    Raises:
        ValueError: If record is invalid.
    """
    ...
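
The two templates above leave the function bodies elided. For a complete, runnable instance of both conventions, the following sketch uses a hypothetical helper, strip_string_fields, which is not part of PyCharter; it only shows how the type hints, the docstring sections, and the behaviour line up.

```python
from typing import Any, Dict


def strip_string_fields(record: Dict[str, Any]) -> Dict[str, Any]:
    """Strip surrounding whitespace from every string value in a record.

    Args:
        record: The input record; it is not modified.

    Returns:
        A new record with string values stripped and other values unchanged.

    Raises:
        ValueError: If record is empty.
    """
    if not record:
        raise ValueError("record must not be empty")
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }
```

Note that each entry under Raises corresponds to an exception the body can actually raise; keeping the two in sync is the main thing reviewers tend to check.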

Testing

Running Tests

# All tests
pytest

# Specific file
pytest tests/test_validator.py

# With coverage
pytest --cov=pycharter --cov-report=html

# Only fast tests
pytest -m "not slow"
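
The -m "not slow" filter only skips tests that carry a slow marker. A minimal sketch of how a test opts in is shown below; it assumes the project registers a marker named slow in its pytest configuration, and the test names and bodies are placeholders.

```python
import time

import pytest


@pytest.mark.slow  # assumes a "slow" marker is registered in the pytest config
def test_large_batch_finishes():
    """Stand-in for an expensive test; deselected by -m "not slow"."""
    time.sleep(0.01)  # placeholder for real long-running work
    assert sum(range(1000)) == 499500


def test_small_batch():
    """Unmarked test; still runs under -m "not slow"."""
    assert sum(range(10)) == 45
```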

Writing Tests

  • Place tests in the tests/ directory
  • Name test files test_*.py
  • Name test functions test_*
  • Use fixtures from conftest.py

import pytest
from pycharter import Validator

class TestValidator:
    def test_validate_valid_data(self, sample_schema):
        validator = Validator.from_dict(schema=sample_schema)
        result = validator.validate({"name": "Alice"})
        assert result.is_valid

    def test_validate_invalid_data(self, sample_schema):
        validator = Validator.from_dict(schema=sample_schema)
        result = validator.validate({"name": ""})
        assert not result.is_valid
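
The tests above request a sample_schema fixture without showing where it comes from. The sketch below shows how such a fixture could be defined in tests/conftest.py. The schema shape (a JSON Schema-style object with a non-empty name) is an assumption chosen to match the two test cases; the fixture in the repository may look different.

```python
# tests/conftest.py (sketch)
from typing import Any, Dict

import pytest


def build_sample_schema() -> Dict[str, Any]:
    """Return a small schema for tests (hypothetical shape)."""
    return {
        "type": "object",
        "properties": {"name": {"type": "string", "minLength": 1}},
        "required": ["name"],
    }


@pytest.fixture
def sample_schema() -> Dict[str, Any]:
    """Provide a fresh schema dict to each test that requests it."""
    return build_sample_schema()
```

Building the dict inside a plain function keeps each test isolated: one test mutating its schema cannot affect another.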

Documentation

Building Docs

# Install docs dependencies
pip install "pycharter[docs]"
# or: pip install mkdocs-material "mkdocstrings[python]"

# Serve locally (from project root, or set DOCS_ROOT)
pycharter docs serve
# or: mkdocs serve

# Build
pycharter docs build
# or: mkdocs build

Documentation Style

  • Use clear, concise language
  • Include code examples
  • Link to related documentation
  • Keep tutorials beginner-friendly

Documentation policy (single source of truth)

  • Python API: Document in docstrings; MkDocs pulls from the library. New public APIs must have docstrings and a place in docs/api/. Do not duplicate Python API prose on the in-app Documentation page.
  • REST API: Document via FastAPI route summary/description and models; Swagger is the reference. New endpoints are added in code; the in-app API Playground builds its list from GET /api/v1/docs/playground-routes (OpenAPI). Do not maintain a separate hand-written list of REST endpoints in the UI.
  • UI: The Documentation page (nav: "API Playground") is a playground and links only: try REST endpoints, link to the full Python API (MkDocs site), and link to Swagger. No long prose about endpoints or Python API on that page.

Submitting Changes

Pull Request Process

  1. Create a feature branch
git checkout -b feature/my-feature
  2. Make your changes
    • Write code
    • Add tests
    • Update documentation
  3. Run checks
# Format (paths and config must match CI)
black --config pyproject.toml src/pycharter tests
isort --settings-path pyproject.toml src/pycharter tests

# Type-check
mypy src/pycharter

# Test
pytest
  4. Commit with a descriptive message
git add .
git commit -m "Add Kafka extractor for streaming data sources

- Implement KafkaExtractor class
- Add consumer group support
- Include comprehensive tests
- Update documentation"
  5. Push and create a PR
git push origin feature/my-feature

Then create a Pull Request on GitHub.
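
If you prefer a single command for the "Run checks" step above, the checks can be chained in a small script. The sketch below is a hypothetical helper, not something shipped in the repository; the commands are copied from the formatting and testing sections of this guide, and the runner parameter exists only so the logic can be exercised without invoking the real tools.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands copied from this guide; adjust them if the CI configuration changes.
CHECKS: List[List[str]] = [
    ["black", "--check", "--config", "pyproject.toml", "src/pycharter", "tests"],
    ["isort", "--check", "--settings-path", "pyproject.toml", "src/pycharter", "tests"],
    ["pytest"],
]


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> int:
    """Run each check in order; return the first non-zero exit code, else 0."""
    for command in checks:
        code = runner(command)
        if code != 0:
            print(f"check failed: {' '.join(command)}")
            return code
    return 0
```

To use it as a script, end the file with raise SystemExit(run_checks()) so the shell sees the failing exit code.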

PR Guidelines

  • Title: Clear, descriptive summary
  • Description: Explain what and why
  • Tests: Include tests for new functionality
  • Docs: Update relevant documentation
  • Breaking changes: Clearly mark if applicable

PR Template

## Summary

Brief description of changes.

## Changes

- Added X
- Fixed Y
- Updated Z

## Testing

How to test these changes.

## Checklist

- [ ] Tests pass
- [ ] Documentation updated
- [ ] Code formatted
- [ ] No breaking changes (or documented)

Types of Contributions

Bug Reports

Open an issue with:

  • PyCharter version
  • Python version
  • Steps to reproduce
  • Expected vs actual behavior
  • Error messages/stack traces
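
The first two items can be gathered with the standard library. The snippet below is a small sketch that assumes the installed distribution is named pycharter (the name used in the pip commands in this guide); collect_versions is a hypothetical helper, not a PyCharter function.

```python
import platform
from importlib import metadata
from typing import Dict


def collect_versions(package: str = "pycharter") -> Dict[str, str]:
    """Return the installed package version and the Python version."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        package_version = "not installed"
    return {package: package_version, "python": platform.python_version()}


print(collect_versions())
```

Paste the printed dictionary into the issue alongside the steps to reproduce.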

Feature Requests

Open an issue with:

  • Use case description
  • Proposed solution
  • Alternatives considered

Code Contributions

Good first issues are labeled good-first-issue.

Areas we'd love help with:

  • New extractors (Kafka, Snowflake, etc.)
  • New loaders (BigQuery, Redshift, etc.)
  • Documentation improvements
  • Test coverage
  • Performance optimizations

Documentation

  • Fix typos
  • Improve explanations
  • Add examples
  • Translate documentation

Maintainer Guides

Getting Help

Recognition

Contributors are recognized in:

  • GitHub Contributors page
  • Release notes
  • Documentation credits

Thank you for contributing to PyCharter! 🎉