Metadata-Version: 2.4
Name: jett
Version: 0.0.2
Summary: just an engine template tool
Project-URL: Homepage, https://github.com/ddeutils/jett/
Project-URL: Source Code, https://github.com/ddeutils/jett/
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: click==8.1.8
Requires-Dist: ddeutil-io[toml,yaml]==0.2.17
Requires-Dist: pydantic==2.11.7
Requires-Dist: python-dotenv==1.1.1
Requires-Dist: requests==2.32.4
Provides-Extra: arrow
Requires-Dist: pyarrow==21.0.0; extra == 'arrow'
Provides-Extra: daft
Requires-Dist: daft==0.5.21; extra == 'daft'
Provides-Extra: dev
Requires-Dist: clishelf>=0.2.22; extra == 'dev'
Requires-Dist: coverage>=7.10.0; extra == 'dev'
Requires-Dist: pre-commit>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=8.4.0; extra == 'dev'
Provides-Extra: duckdb
Requires-Dist: duckdb==1.3.2; extra == 'duckdb'
Provides-Extra: polars
Requires-Dist: polars==1.32.0; extra == 'polars'
Requires-Dist: pyiceberg==0.9.1; extra == 'polars'
Provides-Extra: spark
Requires-Dist: pyarrow==21.0.0; extra == 'spark'
Requires-Dist: pyspark[connect]==3.4.1; extra == 'spark'
Description-Content-Type: text/markdown

# Jett

[![pypi version](https://img.shields.io/pypi/v/jett)](https://pypi.org/project/jett/)
[![python support version](https://img.shields.io/pypi/pyversions/jett)](https://pypi.org/project/jett/)
[![size](https://img.shields.io/github/languages/code-size/ddeutils/jett)](https://github.com/ddeutils/jett)

**Just an Engine Template Tool** that is easy to use and extend, built for data engineers.
This project provides ETL templates for multiple DataFrame engines such as
`PySpark`, `DuckDB`, and `Polars`.

**Supported Features**:

- Dynamic engine selection via YAML templates
- JSON Schema validation support

## 📦 Installation

```shell
uv pip install -U jett
```

**Supported Engines**:

| Name    | Status | Description                                                 |
|---------|:------:|-------------------------------------------------------------|
| PySpark |   ✅    | PySpark and the `spark-submit` CLI for distributed workloads |
| DuckDB  |   ✅    | DuckDB and the DuckDB Spark API                             |
| Polars  |   ✅    | Polars for Python workloads                                 |
| Arrow   |   ✅    | Arrow for columnar Python workloads                         |
| Daft    |   ❌    | Daft for distributed Python workloads                       |
| dbt     |   ❌    | dbt for SQL workloads                                       |
| GX      |   ❌    | Great Expectations for data quality                         |
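
Before picking an engine `type` for a template, it can help to check which engine packages are importable in the current environment. The following is a minimal sketch, not part of jett: the `ENGINE_MODULES` mapping and the `available_engines` helper are hypothetical names introduced here, and the module names are the usual import names of the packages listed in the table.

```python
from importlib.util import find_spec

# Hypothetical mapping from an engine label to the import name of its package.
# This mapping is illustrative only and is not defined by jett.
ENGINE_MODULES = {
    "spark": "pyspark",
    "duckdb": "duckdb",
    "polars": "polars",
    "arrow": "pyarrow",
}


def available_engines() -> dict[str, bool]:
    """Report which engine packages can be imported in this environment."""
    return {
        name: find_spec(module) is not None
        for name, module in ENGINE_MODULES.items()
    }


if __name__ == "__main__":
    for engine, installed in available_engines().items():
        print(f"{engine}: {'installed' if installed else 'missing'}")
```

An engine reported as missing would need its package installed before a template of that type can run.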

> [!NOTE]
> **Version Tracking**:
>
> | Package |   Version    | Next Support |
> |---------|:------------:|:------------:|
> | Python  |  `3.10.13`   |  `>=3.11.0`  |
> | Spark   |   `3.4.2`    |  `>=4.0.0`   |
> | Hadoop  |     `3`      |     `3`      |
> | Java    | `openjdk@11` | `openjdk@17` |
> | PySpark |   `3.4.1`    |  `>=4.0.0`   |
> | Scala   |  `2.12.17`   |  `2.12.17`   |
> | DuckDB  |   `1.3.2`    |              |
> | Polars  |   `1.32.0`   |              |
> | Arrow   |   `21.0.0`   |              |

## 📝 Usage

For example, create a file named `etl.polars.tool` that describes the ETL stages (I use
`.tool` as the file extension so that files matching the `*.tool` pattern can be validated
against the JSON schema):

```yaml
type: polars
name: Load CSV to GGSheet
app_name: load_csv_to_ggsheet
master: local

# 1) 🚰 Load data from source
source:
  type: local
  file_format: csv
  path: ./assets/data/customer.csv

# 2) ⚙️ Transform this data.
transforms:
  - op: rename_to_snakecase
  - op: group
    transforms:
      - op: expr
        sql: "CAST(id AS string)"

# 3) 🎯 Sink result to target
sink:
  type: local
  file_type: google_sheet
  path: ./assets/landing/customer.gsheet

# 4) 📩 Metric that will send after execution.
metric:
  - type: console
    convertor: basic
  - type: restapi
    convertor: basic
    host: "localhost"
    port: 1234
```

Run the template with the Python API:

```python
from jett import Tool

tool = Tool(path="./etl.polars.tool")
tool.execute(allow_raise=True)
```
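
Because templates are expected to match the `*.tool` pattern, a small guard in front of the call above can catch a wrong file name early. This is only a sketch: `is_tool_template` and `run_template` are hypothetical helpers, not part of jett, and the only jett calls used are `Tool(path=...)` and `execute(allow_raise=True)` as shown above.

```python
from fnmatch import fnmatch
from pathlib import Path


def is_tool_template(path: str) -> bool:
    """Return True when the file name matches the `*.tool` pattern."""
    return fnmatch(Path(path).name, "*.tool")


def run_template(path: str) -> None:
    """Check the file name, then execute the template with jett."""
    if not is_tool_template(path):
        raise ValueError(f"expected a '*.tool' template file, got: {path}")

    # Imported lazily so the name check above works even without jett installed.
    from jett import Tool

    Tool(path=path).execute(allow_raise=True)
```

Calling `run_template("./etl.polars.tool")` would then run the template from the example, while a path such as `./etl.polars.yaml` fails fast with a `ValueError`.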

## 📖 Documents

This project takes its emoji from [Pipeline Emojis](https://emojidb.org/pipeline-emojis).

## 💬 Contribute

I do not expect this project to be widely adopted because it serves a specific purpose,
and for a long-term solution you can write your own code without depending on it.
For now, you can open [a GitHub issue on this project 🙌](https://github.com/ddeutils/jett/issues)
to report a bug or request a new feature.
