kedro.extras.datasets.api.APIDataSet
-
class kedro.extras.datasets.api.APIDataSet(url, method='GET', data=None, params=None, headers=None, auth=None, json=None, timeout=60)
Bases: kedro.io.core.AbstractDataSet
APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/master/
Example:
    from kedro.extras.datasets.api import APIDataSet

    data_set = APIDataSet(
        url="https://quickstats.nass.usda.gov",
        params={
            "key": "SOME_TOKEN",
            "format": "JSON",
            "commodity_desc": "CORN",
            "statisticcat_des": "YIELD",
            "agg_level_desc": "STATE",
            "year": 2000,
        },
    )
    data = data_set.load()
Methods
APIDataSet.__init__(url[, method, data, …]): Creates a new instance of APIDataSet to fetch data from an API endpoint.
APIDataSet.exists(): Checks whether a data set’s output already exists by calling the provided _exists() method.
APIDataSet.from_config(name, config[, …]): Create a data set instance using the configuration provided.
APIDataSet.load(): Loads data by delegation to the provided load method.
APIDataSet.release(): Release any cached data.
APIDataSet.save(data): Saves data by delegation to the provided save method.
-
__init__(url, method='GET', data=None, params=None, headers=None, auth=None, json=None, timeout=60)
Creates a new instance of APIDataSet to fetch data from an API endpoint.
Parameters:
- url (str) – The API URL endpoint.
- method (str) – The method of the request: GET, POST, PUT, DELETE, HEAD, etc.
- data (Optional[Any]) – The request payload, used for POST, PUT, etc. requests. https://requests.readthedocs.io/en/master/user/quickstart/#more-complicated-post-requests
- params (Optional[Dict[str, Any]]) – The URL parameters of the API. https://requests.readthedocs.io/en/master/user/quickstart/#passing-parameters-in-urls
- headers (Optional[Dict[str, Any]]) – The HTTP headers. https://requests.readthedocs.io/en/master/user/quickstart/#custom-headers
- auth (Union[Tuple[str], AuthBase, None]) – Anything requests accepts. Normally it’s either ('login', 'password'), or an AuthBase or HTTPBasicAuth instance for more complex cases.
- json (Union[List[~T], Dict[str, Any], None]) – The request payload, used for POST, PUT, etc. requests, passed in to the json kwarg in the requests object. https://requests.readthedocs.io/en/master/user/quickstart/#more-complicated-post-requests
- timeout (int) – The wait time in seconds for a response, defaults to 1 minute. https://requests.readthedocs.io/en/master/user/quickstart/#timeouts
Return type: None
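To make the params argument more concrete, here is a minimal sketch of how a dictionary of URL parameters ends up in the query string of the request URL. It uses the standard library's urllib.parse.urlencode as a stand-in. The requests library performs its own encoding when the call is issued, so the exact URL it builds may differ in small details (for example, a slash after the host). The helper name build_query_url is hypothetical and not part of Kedro or requests.

```python
from urllib.parse import urlencode


def build_query_url(url, params=None):
    """Append URL-encoded params to url (hypothetical helper, illustration only)."""
    if not params:
        return url
    # urlencode converts each value with str() and percent-encodes it.
    return f"{url}?{urlencode(params)}"


# Same parameters as in the class-level example.
params = {
    "key": "SOME_TOKEN",
    "format": "JSON",
    "commodity_desc": "CORN",
    "statisticcat_des": "YIELD",
    "agg_level_desc": "STATE",
    "year": 2000,
}

full_url = build_query_url("https://quickstats.nass.usda.gov", params)
print(full_url)
```

Keeping the parameters in a dictionary rather than hand-writing the query string leaves quoting and escaping to the library.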
-
exists()
Checks whether a data set’s output already exists by calling the provided _exists() method.
Return type: bool
Returns: Flag indicating whether the output already exists.
Raises: DataSetError – when underlying exists method raises error.
-
classmethod from_config(name, config, load_version=None, save_version=None)
Create a data set instance using the configuration provided.
Parameters:
- name (str) – Data set name.
- config (Dict[str, Any]) – Data set config dictionary.
- load_version (Optional[str]) – Version string to be used for load operation if the data set is versioned. Has no effect on the data set if versioning was not enabled.
- save_version (Optional[str]) – Version string to be used for save operation if the data set is versioned. Has no effect on the data set if versioning was not enabled.
Return type: AbstractDataSet
Returns: An instance of an AbstractDataSet subclass.
Raises: DataSetError – When the function fails to create the data set from its config.
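As a rough illustration of from_config, the sketch below builds a config dictionary for the same API endpoint used in the class-level example and wraps the call in a function, so nothing is imported from Kedro until the function is invoked. The data set name usda_corn_yield and the helper create_data_set are hypothetical. The type value "api.APIDataSet" assumes the short type names used in Kedro catalog entries; if your Kedro version does not resolve it, the full module path may be needed instead.

```python
from typing import Any, Dict

# Config dictionary: "type" selects the data set class (assumed short name),
# the remaining keys are passed to the constructor as keyword arguments.
config: Dict[str, Any] = {
    "type": "api.APIDataSet",
    "url": "https://quickstats.nass.usda.gov",
    "params": {
        "key": "SOME_TOKEN",
        "format": "JSON",
        "commodity_desc": "CORN",
        "statisticcat_des": "YIELD",
        "agg_level_desc": "STATE",
        "year": 2000,
    },
}


def create_data_set(name: str, config: Dict[str, Any]):
    """Build a data set from config (hypothetical helper, illustration only)."""
    # Imported lazily so the config above can be inspected without Kedro installed.
    from kedro.extras.datasets.api import APIDataSet

    return APIDataSet.from_config(name, config)
```

With Kedro installed, create_data_set("usda_corn_yield", config) would be expected to return an APIDataSet instance, or to raise DataSetError if the data set cannot be created from the config.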
-
load()
Loads data by delegation to the provided load method.
Return type: Any
Returns: Data returned by the provided load method.
Raises: DataSetError – When underlying load method raises error.
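The delegation and error wrapping described for load() can be pictured with the self-contained sketch below. It does not reproduce Kedro's source: DataSetError, SketchDataSet, StaticDataSet and FailingDataSet are stand-ins defined here purely to show the pattern, in which a public load() calls a subclass-provided _load() and re-raises any failure as DataSetError.

```python
class DataSetError(Exception):
    """Stand-in for Kedro's DataSetError, defined locally for this sketch."""


class SketchDataSet:
    """Hypothetical base class illustrating load-by-delegation."""

    def _load(self):
        raise NotImplementedError

    def load(self):
        try:
            # Delegate to the subclass-provided implementation.
            return self._load()
        except DataSetError:
            # Already the expected error type: pass it through unchanged.
            raise
        except Exception as exc:
            # Wrap any other failure so callers only need to catch DataSetError.
            message = f"Failed while loading data from {type(self).__name__}: {exc}"
            raise DataSetError(message) from exc


class StaticDataSet(SketchDataSet):
    """Returns a fixed payload; stands in for a successful load."""

    def __init__(self, payload):
        self._payload = payload

    def _load(self):
        return self._payload


class FailingDataSet(SketchDataSet):
    """Always fails; stands in for an unreachable endpoint."""

    def _load(self):
        raise ConnectionError("remote server unreachable")


loaded = StaticDataSet({"status": "ok"}).load()

try:
    FailingDataSet().load()
except DataSetError as err:
    error_message = str(err)
```

Under this pattern, calling code can wrap load() in a single except DataSetError clause regardless of which underlying error occurred.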
-
release()
Release any cached data.
Raises: DataSetError – when underlying release method raises error.
Return type: None
-
save(data)
Saves data by delegation to the provided save method.
Parameters: data (Any) – the value to be saved by provided save method.
Raises: DataSetError – when underlying save method raises error.
Return type: None
-