kedro.contrib.io.azure package
AbstractDataSet implementation for reading/writing data to Azure Blob Storage
Submodules
kedro.contrib.io.azure.csv_blob module
AbstractDataSet implementation to access CSV files directly from
Microsoft’s Azure blob storage.
class kedro.contrib.io.azure.csv_blob.CSVBlobDataSet(filepath, container_name, credentials, blob_to_text_args=None, blob_from_text_args=None, load_args=None, save_args=None)
Bases: kedro.io.core.AbstractDataSet
CSVBlobDataSet loads and saves CSV files in Microsoft’s Azure blob storage. It uses the Azure storage SDK to read and write in Azure, and pandas to handle the CSV file locally.
Example:
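    import pandas as pd
    from kedro.contrib.io.azure.csv_blob import CSVBlobDataSet

    data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5], 'col3': [5, 6]})

    # container_name and credentials are required by the constructor;
    # the values below are placeholders, not real settings.
    data_set = CSVBlobDataSet(filepath="test.csv",
                              container_name="test_container",
                              credentials={"account_name": "<account_name>",
                                           "account_key": "<account_key>"},
                              load_args=None,
                              save_args={"index": False})
    data_set.save(data)
    reloaded = data_set.load()

    assert data.equals(reloaded)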
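The description above says the data set moves CSV text to and from blob storage and leaves parsing to pandas. A minimal sketch of that round trip, runnable without an Azure account, might look like the following. `InMemoryBlobService`, `save_csv` and `load_csv` are hypothetical names invented for this illustration; only the method names `create_blob_from_text` and `get_blob_to_text` come from the parameter documentation, and the fake returns the stored text directly, which simplifies whatever the real SDK returns.

```python
import io

import pandas as pd


class InMemoryBlobService:
    """Hypothetical stand-in for the Azure blob service; keeps text in a dict."""

    def __init__(self):
        self._blobs = {}

    def create_blob_from_text(self, container_name, blob_name, text, **kwargs):
        # Store the uploaded text under (container, blob name).
        self._blobs[(container_name, blob_name)] = text

    def get_blob_to_text(self, container_name, blob_name, **kwargs):
        # Assumption: return the text itself rather than an SDK object.
        return self._blobs[(container_name, blob_name)]


def save_csv(service, container_name, filepath, data, **save_args):
    # Serialize locally with pandas, then upload the resulting text.
    text = data.to_csv(**save_args)
    service.create_blob_from_text(container_name, filepath, text)


def load_csv(service, container_name, filepath, **load_args):
    # Download the text, then parse it locally with pandas.
    text = service.get_blob_to_text(container_name, filepath)
    return pd.read_csv(io.StringIO(text), **load_args)


service = InMemoryBlobService()
data = pd.DataFrame({"col1": [1, 2], "col2": [4, 5], "col3": [5, 6]})
save_csv(service, "test_container", "test.csv", data, index=False)
reloaded = load_csv(service, "test_container", "test.csv")
```

Writing with `index=False` keeps the row index out of the file, so the reloaded frame has the same columns as the original.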
__init__(filepath, container_name, credentials, blob_to_text_args=None, blob_from_text_args=None, load_args=None, save_args=None)
Creates a new instance of CSVBlobDataSet pointing to a concrete CSV file on Azure blob storage.
Parameters:
- filepath (str) – Path to an Azure blob of a CSV file.
- container_name (str) – Azure container name.
- credentials (Dict[str, Any]) – Credentials (account_name and account_key, or sas_token) to access the Azure blob.
- blob_to_text_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to Azure’s get_blob_to_text method: https://docs.microsoft.com/en-us/python/api/azure.storage.blob.baseblobservice.baseblobservice?view=azure-python#get-blob-to-text
- blob_from_text_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to Azure’s create_blob_from_text method: https://docs.microsoft.com/en-us/python/api/azure.storage.blob.blockblobservice.blockblobservice?view=azure-python#create-blob-from-text
- load_args (Optional[Dict[str, Any]]) – Pandas options for loading CSV files. All available arguments are listed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html All defaults are preserved.
- save_args (Optional[Dict[str, Any]]) – Pandas options for saving CSV files. All available arguments are listed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html All defaults are preserved, except “index”, which is set to False.
Return type: None
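The save_args description states that pandas defaults are kept except for “index”, which defaults to False. One way such a default could be combined with caller-supplied options is sketched below. `DEFAULT_SAVE_ARGS` and `resolve_save_args` are hypothetical names, and the assumption that a caller's value overrides the default is a common pattern rather than something this page confirms.

```python
from typing import Any, Dict, Optional

# Assumed default, taken from the save_args description: index is False.
DEFAULT_SAVE_ARGS: Dict[str, Any] = {"index": False}


def resolve_save_args(save_args: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the options that would be passed on to DataFrame.to_csv."""
    merged = dict(DEFAULT_SAVE_ARGS)  # copy so the default is never mutated
    if save_args is not None:
        merged.update(save_args)  # assumption: caller values take precedence
    return merged


no_args = resolve_save_args()
with_sep = resolve_save_args({"sep": ";"})
with_index = resolve_save_args({"index": True})
```

Under these assumptions, passing None yields only the “index” default, while any other pandas option is passed through unchanged.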