Metadata-Version: 2.1
Name: koza
Version: 0.1.3
Summary: Koza, an ETL framework for LinkML data models
Home-page: https://github.com/monarch-initiative/koza
Author: The Monarch Initiative
Author-email: info@monarchinitiative.org
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Operating System :: POSIX :: Linux
Requires-Dist: kgx >= 1.0.0
Requires-Dist: pydantic >=1.0.0,<2.0.0
Requires-Dist: pyyaml >=5.3.1,<6.0.0
Requires-Dist: requests >=2.24.0,<3.0.0
Requires-Dist: typer >=0.3
Requires-Dist: biolink-model-pydantic >=0.1.1,<1.0.0 ; extra == "dev"
Requires-Dist: autoflake >=1.3.1,<2.0.0 ; extra == "dev"
Requires-Dist: flake8 >=3.8.3,<4.0.0 ; extra == "dev"
Requires-Dist: black ==20.8b1 ; extra == "dev"
Requires-Dist: isort >=5.0.6,<6.0.0 ; extra == "dev"
Requires-Dist: mkdocs >=1.1.2,<2.0.0 ; extra == "doc"
Requires-Dist: pytest >=6.0.0 ; extra == "test"
Project-URL: Documentation, https://github.com/monarch-initiative/koza
Provides-Extra: dev
Provides-Extra: doc
Provides-Extra: test

[![Pyversions](https://img.shields.io/pypi/pyversions/koza.svg)](https://pypi.python.org/pypi/koza)
![](https://github.com/monarch-initiative/koza/actions/workflows/build.yml/badge.svg)
[![PyPi](https://img.shields.io/pypi/v/koza.svg)](https://pypi.python.org/pypi/koza)

### Koza

![pupa](docs/img/pupa.png) Data transformation framework

*Disclaimer*: Koza is in beta; we are looking for beta testers.

Koza transforms csv, json, yaml, jsonl, and xml files into a target
csv, json, or jsonl format based on your dataclass model. Koza can also output
data in the [KGX format](https://github.com/biolink/kgx/blob/master/specification/kgx-format.md#kgx-format-as-tsv).

**Documentation**: https://koza.monarchinitiative.org/

##### Highlights

- Author data transforms in semi-declarative Python
- Configure source files, expected columns/json properties and path filters, field filters, and metadata in yaml
- Create or import mapping files to be used in ingests (e.g., id mappings, type mappings)
- Create and use translation tables to map between source and target vocabularies


#### Installation

```bash
pip install koza
```

#### Getting Started

Send a local or remote csv file through Koza to get some basic information (headers, number of rows):

```bash
koza validate \
  --file https://raw.githubusercontent.com/monarch-initiative/koza/main/examples/data/string.tsv \
  --delimiter ' '
```
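
To make "headers, number of rows" concrete, here is a minimal sketch of the kind of summary such a check produces, written with Python's standard `csv` module. It is an illustration only, not Koza's implementation: the function name `summarize_delimited` and the sample data are hypothetical.

```python
import csv
import io


def summarize_delimited(text_stream, delimiter=","):
    """Return (headers, row_count) for a delimited text stream."""
    reader = csv.reader(text_stream, delimiter=delimiter)
    headers = next(reader, [])  # the first row is treated as the header
    row_count = sum(1 for _ in reader)  # every remaining row counts as data
    return headers, row_count


# Hypothetical space-delimited sample (made up for illustration)
sample = io.StringIO("protein1 protein2 score\nA B 10\nC D 20\n")
headers, row_count = summarize_delimited(sample, delimiter=" ")
print(headers, row_count)
```

Passing the delimiter explicitly mirrors the `--delimiter` option in the command above: with the wrong delimiter, the whole header line is read as a single column.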

Sending a json or jsonl formatted file confirms whether the file is valid json or jsonl:

```bash
koza validate \
  --file ./examples/data/ZFIN_PHENOTYPE_0.jsonl.gz \
  --format jsonl
```

```bash
koza validate \
  --file ./examples/data/ddpheno.json.gz \
  --format json \
  --compression gzip
```
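
For readers who want to see what a jsonl validity check involves, the sketch below parses a gzip-compressed stream line by line with the standard `gzip` and `json` modules. It assumes nothing about how Koza performs this check internally; the function name `first_invalid_jsonl_line` and the sample payloads are hypothetical.

```python
import gzip
import io
import json


def first_invalid_jsonl_line(binary_stream):
    """Return the 1-based number of the first line that is not valid JSON, or None."""
    # gzip.open accepts an existing binary file object; "rt" decodes it as text
    with gzip.open(binary_stream, "rt", encoding="utf-8") as lines:
        for number, line in enumerate(lines, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError:
                return number
    return None


# Hypothetical payloads: one well-formed, one with a truncated second record
good = gzip.compress(b'{"id": 1}\n{"id": 2}\n')
bad = gzip.compress(b'{"id": 1}\n{"id": \n')
print(first_invalid_jsonl_line(io.BytesIO(good)))
print(first_invalid_jsonl_line(io.BytesIO(bad)))
```

Because jsonl stores one JSON document per line, the check can stop at the first malformed line without loading the whole file into memory.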

###### Example: transforming StringDB

```bash
koza transform --source examples/string/protein-links-detailed.yaml --global-table examples/translation_table.yaml

koza transform --source examples/string-declarative/protein-links-detailed.yaml --global-table examples/translation_table.yaml
```

