Kedro plugins
Note: This documentation is based on Kedro 0.17.1. If you spot anything that is incorrect, please create an issue or pull request.
Kedro plugins allow you to create new features for Kedro and inject additional commands into the CLI. Plugins are developed as separate Python packages that exist outside of any Kedro project.
Overview
Kedro uses setuptools, which is a collection of enhancements to the Python distutils to allow developers to build and distribute Python packages. Kedro uses various entry points in pkg_resources to provide plugin functionality.
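As a rough illustration of the entry-point mechanism, the sketch below lists everything registered under a given entry-point group. It uses the standard library's importlib.metadata instead of pkg_resources, and the helper name find_entry_points is hypothetical. It is a minimal model of the idea, not how Kedro itself discovers plugins.

```python
from importlib.metadata import entry_points


def find_entry_points(group):
    """Return {name: EntryPoint} for every entry point registered in `group`."""
    all_eps = entry_points()
    if hasattr(all_eps, "select"):
        # Python 3.10+ returns a collection that can be filtered by group.
        selected = all_eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name.
        selected = all_eps.get(group, [])
    return {ep.name: ep for ep in selected}


# Lists the names of installed plugins; empty if none are installed.
project_commands = find_entry_points("kedro.project_commands")
print(sorted(project_commands))
```

Calling `.load()` on one of the returned entry points imports the module and returns the object it names, for example a click group.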
Example of a simple plugin
Here is a simple example of a plugin that prints the pipeline as JSON:
kedrojson/plugin.py
import click
from kedro.framework.session import KedroSession


@click.group(name="JSON")
def commands():
    """ Kedro plugin for printing the pipeline in JSON format """
    pass


@commands.command()
@click.pass_obj
def to_json(metadata):
    """ Display the pipeline in JSON format """
    session = KedroSession.create(metadata.package_name)
    context = session.load_context()
    print(context.pipeline.to_json())
The plugin provides the following entry_points config in setup.py:
setup(
    entry_points={
        "kedro.project_commands": ["kedrojson = kedrojson.plugin:commands"],
    }
)
Once the plugin is installed, you can run it as follows:
kedro to_json
Working with click
Commands must be provided as click Groups.
The click Group will be merged into the main CLI Group. In the process, the options on the group are lost, as is any processing that was done as part of its callback function.
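The loss of group-level behaviour can be seen in a small sketch that uses click's own CommandCollection to merge a plugin group into a host CLI. This assumes a general merging mechanism and is not a copy of Kedro's internals. The names demo_cli and hello are made up for illustration.

```python
import click
from click.testing import CliRunner


@click.group(name="JSON")
def commands():
    """Plugin group; this callback is skipped once the group is merged."""
    click.echo("group callback ran")


@commands.command(name="hello")
def hello():
    """A hypothetical plugin command."""
    click.echo("hello from the plugin")


# Merge the plugin group into a single top-level CLI.
demo_cli = click.CommandCollection(name="demo", sources=[commands])

# Invoke the merged CLI in-process instead of from a shell.
result = CliRunner().invoke(demo_cli, ["hello"])
print(result.output)
```

Only the command's own output appears. Any set-up a plugin relies on should therefore live in the commands themselves, not in the group callback.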
Project context
When they run, plugins may request information about the current project by creating a session and loading its context:
from pathlib import Path
from kedro.framework.startup import _get_project_metadata
from kedro.framework.session import KedroSession
project_path = Path.cwd()
metadata = _get_project_metadata(project_path)
session = KedroSession.create(metadata.package_name, project_path)
context = session.load_context()
Initialisation
If the plugin needs to run initialisation code before Kedro starts, it can declare the entry_point key kedro.init. This entry point must refer to a function that currently takes no arguments, but to future-proof it you should declare it with **kwargs.
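A minimal sketch of such an initialisation function follows. The package name kedro_myplugin, the logger name and the log message are hypothetical; only the kedro.init key and the **kwargs signature come from the text above.

```python
import logging

logger = logging.getLogger("kedro_myplugin")


def init(**kwargs):
    """Run one-off plugin set-up before Kedro starts.

    No arguments are passed today; **kwargs absorbs any that are added later.
    """
    logger.debug("kedro_myplugin initialised with %s", sorted(kwargs))


# Hypothetical setup.py registration for the function above:
# setup(
#     entry_points={"kedro.init": ["kedro_myplugin = kedro_myplugin.plugin:init"]},
# )
```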
global and project commands
Plugins may also add commands to the Kedro CLI, which supports two types of commands:
global - available both inside and outside a Kedro project. Global commands use the entry_point key kedro.global_commands.
project - available only when a Kedro project is detected in the current directory. Project commands use the entry_point key kedro.project_commands.
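A plugin that offers both kinds of command might be laid out as in the sketch below. The package name kedro_myplugin, the group names and the two commands are hypothetical; only the two entry-point keys and the package_name attribute come from this page.

```python
import click


@click.group(name="myplugin-global")
def global_commands():
    """Commands usable inside or outside a Kedro project."""


@global_commands.command(name="about")
def about():
    """Print a short description of the plugin."""
    click.echo("kedro-myplugin: an example plugin")


@click.group(name="myplugin-project")
def project_commands():
    """Commands that only make sense inside a Kedro project."""


@project_commands.command(name="package")
@click.pass_obj
def package(metadata):
    """Print the package name from the metadata object passed to the command."""
    click.echo(metadata.package_name)


# Hypothetical value for the entry_points argument of setup().
ENTRY_POINTS = {
    "kedro.global_commands": ["kedro_myplugin = kedro_myplugin.plugin:global_commands"],
    "kedro.project_commands": ["kedro_myplugin = kedro_myplugin.plugin:project_commands"],
}
```

Keeping the two groups separate means that the project group can assume project metadata is available, while the global group cannot.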
Suggested command convention
We use the following command convention: kedro <plugin-name> <command>, with kedro <plugin-name> acting as a top-level command group. This is our suggested way of structuring your plugin, but it is not necessary for your plugin to work.
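One way to follow this convention is to nest a group named after the plugin inside the group that the entry point exposes. The sketch below assumes a plugin called myplugin with a single status command, both hypothetical. It uses click's CommandCollection as a stand-in for the Kedro CLI.

```python
import click
from click.testing import CliRunner


@click.group(name="myplugin-commands")
def commands():
    """Outer group: the object the entry point refers to."""


@commands.group(name="myplugin")
def myplugin():
    """Top-level command group, reached as `kedro myplugin`."""


@myplugin.command(name="status")
def status():
    """Reached as `kedro myplugin status`."""
    click.echo("myplugin is installed")


# Stand-in for the Kedro CLI, which merges plugin groups into one CLI.
kedro_like_cli = click.CommandCollection(name="kedro", sources=[commands])
result = CliRunner().invoke(kedro_like_cli, ["myplugin", "status"])
print(result.output)
```

Because the outer group is merged away, only the inner myplugin group is visible to the user as a namespace for the plugin's commands.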
Hooks
You can develop hook implementations and have them automatically registered to the project context when the plugin is installed. To enable this for your custom plugin, simply add the following entry in your setup.py:
setup(
    entry_points={"kedro.hooks": ["plugin_name = plugin_name.plugin:hooks"]},
)
where plugin.py is the module where you declare hook implementations:
import logging
from kedro.framework.hooks import hook_impl

class MyHooks:
    @hook_impl
    def after_catalog_created(self, catalog):  # pylint: disable=unused-argument
        logging.info("Reached after_catalog_created hook")

hooks = MyHooks()
Note: Here, hooks should be an instance of the class defining the hooks.
Contributing process
When you are ready to submit your code:
Create a separate repository using our naming convention for plugins (kedro-<plugin-name>)
Choose a command approach: global and / or project commands:
    All global commands should be provided as a single click group
    All project commands should be provided as another click group
    The click groups are declared through the pkg_resources entry_point system
Include a README.md describing your plugin’s functionality and all dependencies that should be included
Use GitHub tagging to tag your plugin as a kedro-plugin so that we can find it
Supported Kedro plugins
Kedro-Docker, a tool for packaging and shipping Kedro projects within containers
Kedro-Airflow, a tool for converting your Kedro project into an Airflow project
Kedro-Viz, a tool for visualising your Kedro pipelines
Community-developed plugins
Note: See the full list of plugins using the GitHub tag kedro-plugin.
Note: Your plugin needs to have an Apache 2.0 compatible license to be considered for this list.
Kedro-Pandas-Profiling, by Justin Malloy, uses Pandas Profiling to profile datasets in the Kedro catalog
find-kedro, by Waylon Walker, automatically constructs pipelines using pytest-style pattern matching
kedro-static-viz, by Waylon Walker, generates a static Kedro-Viz site (HTML, CSS, JS)
steel-toes, by Waylon Walker, prevents stepping on toes by automatically branching data paths
kedro-wings, by Tam-Sanh Nguyen, simplifies and speeds up pipeline creation by auto-generating catalog datasets
kedro-great, by Tam-Sanh Nguyen, integrates Kedro with Great Expectations, enabling catalog-based expectation generation and data validation on pipeline run
Kedro-Accelerator, by Deepyaman Datta, speeds up pipelines by parallelizing I/O in the background
kedro-dataframe-dropin, by Zain Patel, lets you swap out pandas datasets for modin or RAPIDS equivalents for specialised use to speed up your workflows (e.g. on GPUs)
kedro-kubeflow, by Mateusz Pytel and Mariusz Strzelecki, lets you run and schedule pipelines on Kubernetes clusters using Kubeflow Pipelines
kedro-mlflow, by Yolan Honoré-Rougé, Kajetan Maurycy Olszewski, and Takieddine Kadiri, facilitates MLflow integration inside Kedro projects while enforcing Kedro’s principles. Its main features are modular configuration, automatic parameter tracking, dataset versioning, packaging and serving of Kedro pipelines, and automatic synchronization between training and inference pipelines for high reproducibility of machine learning experiments and ease of deployment. A tutorial is provided in the kedro-mlflow-tutorial repo.