kedro.contrib.decorators package
This module contains function decorators, which can be used as Node
decorators. See kedro.pipeline.node.decorate
-
kedro.contrib.decorators.pandas_to_spark(spark)
Inspects the decorated function's inputs and converts all pandas DataFrame inputs to Spark DataFrames.
Note that the example below enables spark.sql.execution.arrow.enabled. For this to work, first pip install pyarrow and add pyarrow to requirements.txt. Enabling this option makes the conversion between pandas and PySpark DataFrames much faster.
Parameters: spark (SparkSession) – The Spark session singleton object used to create the PySpark DataFrames. A possible pattern you can use here is the following:
spark.py
    from pyspark.sql import SparkSession


    def get_spark():
        return (
            SparkSession.builder
            .master("local[*]")
            .appName("kedro")
            .config("spark.driver.memory", "4g")
            .config("spark.driver.maxResultSize", "3g")
            .config("spark.sql.execution.arrow.enabled", "true")
            .getOrCreate()
        )
nodes.py
    from kedro.contrib.decorators import pandas_to_spark
    from spark import get_spark


    @pandas_to_spark(get_spark())
    def node_1(data):
        data.show()  # data is pyspark.sql.DataFrame
Return type: Callable
Returns: The original function with any pandas DataFrame inputs translated to Spark.
-
kedro.contrib.decorators.retry(exceptions=<class 'Exception'>, n_times=1, delay_sec=0)
Catches exceptions from the wrapped function at most n_times, then bundles them and propagates them.
Make sure your function does not mutate its arguments, since a retry calls the function again with the same arguments.
Parameters:
- exceptions (Exception) – The superclass of exceptions to catch. By default, all exceptions are caught.
- n_times (int) – The maximum number of times the function is allowed to fail. The errors are then bundled and propagated. By default, retry only once.
- delay_sec (float) – Delay between a failure and the next retry, in seconds.
Return type: Callable
Returns: The original function with retry functionality.
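The retry behaviour described above can be approximated in plain Python. The following is a minimal sketch, not the kedro implementation: the name retry_sketch is hypothetical, it reads n_times as "one initial call plus up to n_times retries", and it models "bundling" as a single RuntimeError that lists every caught failure, which the text above does not specify.

```python
import time
from functools import wraps


def retry_sketch(exceptions=Exception, n_times=1, delay_sec=0):
    """Hypothetical stand-in for illustration; not kedro's retry."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            errors = []
            # One initial call plus up to n_times retries.
            for attempt in range(n_times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    errors.append(exc)
                    # Wait only if another attempt follows.
                    if attempt < n_times:
                        time.sleep(delay_sec)
            # Assumption: bundle all failures into one error message.
            summary = "; ".join(repr(e) for e in errors)
            raise RuntimeError(
                f"{func.__name__} failed {len(errors)} times: {summary}"
            ) from errors[-1]

        return wrapper

    return decorator


calls = []


@retry_sketch(exceptions=ValueError, n_times=2)
def flaky():
    # Fails on the first two calls, succeeds on the third.
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("not yet")
    return "ok"


result = flaky()
```

Because the wrapper passes the same args and kwargs to every attempt, a function that mutates its arguments would see modified inputs on a retry.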
-
kedro.contrib.decorators.spark_to_pandas()
Inspects the decorated function's inputs and converts all PySpark DataFrame inputs to pandas DataFrames.
Return type: Callable
Returns: The original function with any PySpark DataFrame inputs translated to pandas.
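The input-conversion idea behind these decorators can be sketched without a Spark installation. The example below is an illustration under stated assumptions, not kedro's code: convert_inputs and spark_to_pandas_sketch are hypothetical names, and a Spark DataFrame is recognised by duck typing on its toPandas() method rather than by an isinstance check.

```python
from functools import wraps


def convert_inputs(should_convert, convert):
    """Build a decorator that converts matching positional and keyword inputs."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            new_args = [convert(a) if should_convert(a) else a for a in args]
            new_kwargs = {
                k: convert(v) if should_convert(v) else v
                for k, v in kwargs.items()
            }
            return func(*new_args, **new_kwargs)

        return wrapper

    return decorator


def spark_to_pandas_sketch():
    # Assumption: anything exposing a callable toPandas() is treated as a
    # Spark DataFrame; pyspark.sql.DataFrame provides that method.
    return convert_inputs(
        lambda value: callable(getattr(value, "toPandas", None)),
        lambda value: value.toPandas(),
    )
```

Inputs that do not match the predicate are passed through unchanged, so a node can mix DataFrame and non-DataFrame arguments. The opposite direction could plausibly reuse the same helper with a predicate on pandas.DataFrame and spark.createDataFrame as the converter.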
Submodules
kedro.contrib.decorators.decorators module
This module contains function decorators, which can be used as Node
decorators. See kedro.pipeline.node.decorate
-
kedro.contrib.decorators.decorators.pandas_to_spark(spark)
Inspects the decorated function's inputs and converts all pandas DataFrame inputs to Spark DataFrames.
Note that the example below enables spark.sql.execution.arrow.enabled. For this to work, first pip install pyarrow and add pyarrow to requirements.txt. Enabling this option makes the conversion between pandas and PySpark DataFrames much faster.
Parameters: spark (SparkSession) – The Spark session singleton object used to create the PySpark DataFrames. A possible pattern you can use here is the following:
spark.py
    from pyspark.sql import SparkSession


    def get_spark():
        return (
            SparkSession.builder
            .master("local[*]")
            .appName("kedro")
            .config("spark.driver.memory", "4g")
            .config("spark.driver.maxResultSize", "3g")
            .config("spark.sql.execution.arrow.enabled", "true")
            .getOrCreate()
        )
nodes.py
    from kedro.contrib.decorators.decorators import pandas_to_spark
    from spark import get_spark


    @pandas_to_spark(get_spark())
    def node_1(data):
        data.show()  # data is pyspark.sql.DataFrame
Return type: Callable
Returns: The original function with any pandas DataFrame inputs translated to Spark.
-
kedro.contrib.decorators.decorators.retry(exceptions=<class 'Exception'>, n_times=1, delay_sec=0)
Catches exceptions from the wrapped function at most n_times, then bundles them and propagates them.
Make sure your function does not mutate its arguments, since a retry calls the function again with the same arguments.
Parameters:
- exceptions (Exception) – The superclass of exceptions to catch. By default, all exceptions are caught.
- n_times (int) – The maximum number of times the function is allowed to fail. The errors are then bundled and propagated. By default, retry only once.
- delay_sec (float) – Delay between a failure and the next retry, in seconds.
Return type: Callable
Returns: The original function with retry functionality.