Metadata-Version: 2.3
Name: klondike
Version: 0.2.3
Summary: Klondike is a suite of database connectors, powered by Polars
License: MIT
Keywords: polars,bigquery,snowflake,redshift
Author: Ian Ferguson
Author-email: IANFERGUSONRVA@gmail.com
Requires-Python: >=3.8,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: google-cloud-bigquery (==3.19.0)
Requires-Dist: polars (==0.20.16)
Requires-Dist: pyarrow (==15.0.2)
Requires-Dist: snowflake-connector-python (==3.10.1)
Project-URL: Repository, https://github.com/IanRFerguson/klondike
Description-Content-Type: text/markdown

# Klondike

Klondike offers a lightweight API to read and write data to Google BigQuery using Polars DataFrames.

## Installation

### Installing Klondike
Install at the command line

```
pip install klondike
```

### Installing Rust
Since Polars leverages Rust speedups, you need to have Rust installed in your environment as well. See the Rust installation guide [here](https://www.rust-lang.org/tools/install).


## Usage

In this demo we'll connect to Google BigQuery, read data, transform it, and write it back to the data warehouse.

First, connect to the BigQuery warehouse by supplying the `BigQueryConnector()` object with the relative path to your service account credentials.

```
from klondike import BigQueryConnector

# Connect Klondike to Google BigQuery
bq = BigQueryConnector(
    app_creds="dev/nba-player-service.json"
)
```

Next, supply the object with a SQL query in the `read_dataframe()` function to redner a `DataFrame` object:

```
# Write some valid SQL
sql = """
SELECT
    *
FROM nba_dbt.cln_player__averages
ORDER BY avg_points DESC
"""


# Pull BigQuery data into a Polars DataFrame
nyk = bq.read_dataframe(sql=sql)
```

Now that your data is pulled into a local instance, you can clean and transform it using standard Polars functionality - [see the docs](https://docs.pola.rs/py-polars/html/reference/dataframe/index.html) for more information.

```
# Perform some transformations
key_metrics = [
    "avg_offensive_rating",
    "avg_defensive_rating",
    "avg_effective_field_goal_percentage"
]

summary_stats = nyk[key_metrics].describe()
```

Finally, push your transformed data back to the BigQuery warehouse using the `write_dataframe()` function:

```
# Write back to BigQuery
bq.write_dataframe(
    df=summary_stats,
    table_name="nba_dbt.summary_statistics",
    if_exists="truncate"
)
```

