kedro.io.SQLQueryDataSet
class kedro.io.SQLQueryDataSet(sql, credentials, load_args=None)

SQLQueryDataSet loads data from a provided SQL query. It uses pandas.DataFrame internally, so it supports all allowed pandas options on read_sql_query. It does not support the save method, so it is a read-only data set. To save data to a SQL server, use SQLTableDataSet.

Example:
    from kedro.io import SQLQueryDataSet
    import pandas as pd

    data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5], 'col3': [5, 6]})
    sql = "SELECT * FROM table_a"
    # The connection string must be supplied under the "con" key.
    credentials = {
        "con": "postgresql://scott:tiger@localhost/test"
    }
    data_set = SQLQueryDataSet(sql=sql, credentials=credentials)
    sql_data = data_set.load()
__init__(sql, credentials, load_args=None)

Creates a new SQLQueryDataSet.

Parameters:
- sql (str) – The SQL query statement.
- credentials (Dict[str, Any]) – A dictionary with a SQLAlchemy connection string. Users must provide the connection string 'con' through credentials. It overwrites the con parameter in load_args and save_args if one is provided there.
- load_args (Optional[Dict[str, Any]]) – Passed to the underlying pandas read_sql_query function along with the connection string. To find all supported arguments, see: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql_query.html

Raises: DataSetError – When either the sql or the con parameter is empty.

Return type: None
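
To illustrate what load_args controls, the sketch below calls pandas.read_sql_query directly against an in-memory SQLite database. The table name table_a and its columns follow the example earlier on this page. The SQLite connection and the particular options (index_col, params) are assumptions chosen for illustration, and the data set is presumed to forward such a dictionary as keyword arguments.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for a real SQL server.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_a (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
con.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [(1, 4, 5), (2, 5, 6)])
con.commit()

# SQLite uses "?" as its placeholder style; other drivers may differ.
sql = "SELECT * FROM table_a WHERE col2 > ?"

# Hypothetical load_args: any keyword accepted by pandas.read_sql_query.
load_args = {"index_col": "col1", "params": (4,)}

# Roughly what a load amounts to under this assumption: the query, the
# connection, and load_args are handed to pandas.read_sql_query.
sql_data = pd.read_sql_query(sql, con, **load_args)
con.close()
```

Here index_col turns col1 into the DataFrame index, and params binds the value for the placeholder in the query.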
Methods
__init__(sql, credentials[, load_args])    Creates a new SQLQueryDataSet.
from_config(name, config[, load_version, …])    Create a data set instance using the configuration provided.
load()    Loads data by delegation to the provided load method.
save(data)    Saves data by delegation to the provided save method.
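
Although save appears in the method list, the class description states that the data set is read-only. The sketch below shows one way such a contract could look: public load and save delegate to private methods, and the private save always raises. ReadOnlyQuerySketch and ReadOnlyError are hypothetical stand-ins written for this illustration, not Kedro's implementation, and the sketch uses an SQLite path for 'con' instead of a SQLAlchemy connection string.

```python
import sqlite3
from typing import Any, Dict, List, Tuple


class ReadOnlyError(Exception):
    """Hypothetical stand-in for the documented DataSetError."""


class ReadOnlyQuerySketch:
    """Hypothetical class mimicking the documented contract, not Kedro's code."""

    def __init__(self, sql: str, credentials: Dict[str, Any]) -> None:
        # Mirror the documented check: the query and 'con' must be non-empty.
        if not sql:
            raise ReadOnlyError("`sql` argument cannot be empty.")
        if not (credentials and credentials.get("con")):
            raise ReadOnlyError("`con` argument cannot be empty.")
        self._sql = sql
        self._con = credentials["con"]

    def load(self) -> List[Tuple[Any, ...]]:
        # The public method delegates to the private loader.
        return self._load()

    def save(self, data: Any) -> None:
        # The public method delegates to the private saver, which refuses.
        self._save(data)

    def _load(self) -> List[Tuple[Any, ...]]:
        # Assumption: 'con' is an SQLite path here, not a SQLAlchemy URL.
        conn = sqlite3.connect(self._con)
        try:
            return conn.execute(self._sql).fetchall()
        finally:
            conn.close()

    def _save(self, data: Any) -> None:
        raise ReadOnlyError("`save` is not supported on a query data set.")


sketch = ReadOnlyQuerySketch("SELECT 1 + 1", {"con": ":memory:"})
rows = sketch.load()

try:
    sketch.save(rows)
    save_error = None
except ReadOnlyError as err:
    save_error = str(err)
```

With this layout, callers see the same load and save interface on every data set, while a read-only one reports the unsupported operation through a single error type.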