Metadata-Version: 2.4
Name: aptoro
Version: 0.2.0
Summary: A minimal, functional Python ETL library for reading, validating, and transforming data using YAML schemas
Project-URL: Homepage, https://github.com/plataformasindigenas/aptoro
Project-URL: Documentation, https://github.com/plataformasindigenas/aptoro#readme
Project-URL: Repository, https://github.com/plataformasindigenas/aptoro
Author: Plataformas Indígenas
License-Expression: GPL-3.0-or-later
License-File: LICENSE
Keywords: data,etl,pydantic,schema,validation,yaml
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: build; extra == 'dev'
Requires-Dist: mypy>=1.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Requires-Dist: twine; extra == 'dev'
Requires-Dist: types-pyyaml; extra == 'dev'
Provides-Extra: excel
Requires-Dist: openpyxl>=3.0; extra == 'excel'
Provides-Extra: sheets
Requires-Dist: google-auth>=2.0; extra == 'sheets'
Requires-Dist: gspread>=5.0; extra == 'sheets'
Provides-Extra: sql
Requires-Dist: sqlalchemy>=2.0; extra == 'sql'
Description-Content-Type: text/markdown

# Aptoro

Aptoro is a Xavante word for "preparing the arrows for hunting".

A minimal, functional Python ETL library for reading, validating, and transforming data using YAML schemas.

## Installation

```bash
pip install aptoro
```

Aptoro requires Python 3.11 or later. The package also declares the optional extras `excel`, `sheets`, and `sql`, which pull in additional dependencies; install one with, for example, `pip install "aptoro[excel]"`.

## Quick Start

```python
from aptoro import load, load_schema, read, validate, to_json

# All-in-one: read + validate
entries = load(source="data.csv", schema="schema.yaml")

# Or step by step:
schema = load_schema("schema.yaml")
data = read("data.csv")
entries = validate(data, schema)

# Export
json_str = to_json(entries)

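# Export with embedded metadata (for self-contained files)
json_meta = to_json(entries, schema=schema, include_meta=True)

# Write the export to disk so it can be loaded back
# (assumes to_json returns a JSON string)
with open("output.json", "w", encoding="utf-8") as f:
    f.write(json_meta)

# Load back with metadata
from aptoro import load_meta
loaded_schema, loaded_data = load_meta("output.json")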
```

## Schema Language

Define your data schema in YAML:

```yaml
name: lexicon_entry
description: Dictionary entries

fields:
  id: str
  lemma: str
  pos: str[noun|verb|adj|adv]    # Constrained values
  definition: str
  translation: str?               # Optional field
  examples: list[str]?            # Optional list
  frequency: int = 0              # Default value
```

### Type Syntax

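- Basic types: `str`, `int`, `float`, `bool`
- Optional: append `?`, as in `str?` or `list[str]?`
- Default value: append `= value`, as in `str = "default"` or `int = 0`
- Constrained: list the allowed values separated by `|`, as in `str[a|b|c]`
- Lists: `list[str]`, `list[int]`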
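
To make the grammar above concrete, here is a minimal sketch of how such type strings could be broken into their parts. It is an illustration only, not Aptoro's internal parser: `FieldSpec` and `parse_spec` are hypothetical names, and the handling of defaults (quote stripping, casting by base type) is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical illustration of the type syntax; not part of the aptoro API.
@dataclass(frozen=True)
class FieldSpec:
    base: str                      # "str", "int", "float", "bool", or "list"
    item: Optional[str] = None     # element type when base is "list"
    choices: tuple = ()            # allowed values for constrained types
    optional: bool = False         # True when the spec carries a "?"
    default: Any = None            # parsed default value, if one was given
    has_default: bool = False      # distinguishes "no default" from None

_SPEC = re.compile(
    r"^(?P<base>str|int|float|bool|list)"   # base type
    r"(?:\[(?P<inner>[^\]]*)\])?"           # [a|b|c] or [item_type]
    r"(?P<opt>\?)?"                         # optional marker
    r"(?:\s*=\s*(?P<default>.+))?$"         # = default
)

_CASTS = {"str": str, "int": int, "float": float}

def parse_spec(text: str) -> FieldSpec:
    m = _SPEC.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised type spec: {text!r}")
    base, inner = m["base"], m["inner"]
    item = inner if base == "list" else None
    choices = tuple(inner.split("|")) if inner and base != "list" else ()
    has_default = m["default"] is not None
    default = None
    if has_default:
        raw = m["default"].strip().strip("\"'")
        if base == "bool":
            default = raw.lower() == "true"
        else:
            # Assumption: list defaults are kept as raw text in this sketch.
            default = _CASTS.get(base, str)(raw)
    return FieldSpec(base, item, choices, bool(m["opt"]), default, has_default)

print(parse_spec("str[noun|verb|adj|adv]").choices)
print(parse_spec("list[str]?"))
print(parse_spec("int = 0").default)
```

A single regular expression is enough here because the syntax is flat: one base type, at most one bracketed part, then the optional marker and the default.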

### Schema Inheritance

```yaml
# child.yaml
name: child_entry
extends: base.yaml

fields:
  name: str
  # inherits fields from base.yaml
```
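
One way to picture what `extends` does is as a merge of two field mappings. The sketch below works on plain dictionaries, as they would look after loading the YAML. It rests on assumptions: `merge_schemas` is a hypothetical helper, the contents of `base.yaml` are invented for the example, and the rule that a child field overrides a parent field of the same name is a guess, not documented Aptoro behaviour.

```python
from typing import Any

def merge_schemas(base: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    """Combine a parent schema mapping with a child that extends it.

    Assumption: a field declared in the child replaces a parent field
    with the same name; all other parent fields are inherited unchanged.
    """
    # Start from the parent's top-level keys, then let the child override them.
    merged = {k: v for k, v in base.items() if k != "fields"}
    merged.update(
        {k: v for k, v in child.items() if k not in ("fields", "extends")}
    )
    # Parent fields first, child fields last so the child wins on conflicts.
    merged["fields"] = {**base.get("fields", {}), **child.get("fields", {})}
    return merged

# Hypothetical contents of base.yaml, already loaded into a dict.
base = {"name": "base_entry", "fields": {"id": "str", "created": "str?"}}
# child.yaml from the example above.
child = {"name": "child_entry", "extends": "base.yaml", "fields": {"name": "str"}}

resolved = merge_schemas(base, child)
print(resolved["name"])
print(list(resolved["fields"]))
```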

## Supported Formats

- CSV
- JSON
- YAML
- TOML
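
For readers curious how a single entry point can cover these four formats, the sketch below dispatches on the file extension. It is not Aptoro's `read` implementation: `read_any` and the `READERS` table are hypothetical names, and the assumption that each file parses directly into rows is a simplification.

```python
import csv
import json
from pathlib import Path
from typing import Any, Callable

def _read_csv(path: Path) -> Any:
    # DictReader yields one dict per row; every value is a string.
    with path.open(newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))

def _read_json(path: Path) -> Any:
    return json.loads(path.read_text(encoding="utf-8"))

def _read_yaml(path: Path) -> Any:
    import yaml  # PyYAML, imported lazily so the other formats work without it
    return yaml.safe_load(path.read_text(encoding="utf-8"))

def _read_toml(path: Path) -> Any:
    import tomllib  # standard library since Python 3.11
    with path.open("rb") as fh:
        return tomllib.load(fh)

# Hypothetical dispatch table keyed by file extension.
READERS: dict[str, Callable[[Path], Any]] = {
    ".csv": _read_csv,
    ".json": _read_json,
    ".yaml": _read_yaml,
    ".yml": _read_yaml,
    ".toml": _read_toml,
}

def read_any(path_str: str) -> Any:
    path = Path(path_str)
    try:
        reader = READERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {path.suffix!r}") from None
    return reader(path)
```

Because CSV carries no type information, its values arrive as strings, which is one reason a schema-driven validation step after reading is useful.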

## License

GNU General Public License v3.0 or later (GPL-3.0-or-later). See the `LICENSE` file for the full text.
