Metadata-Version: 2.1
Name: lokii
Version: 1.0.0
Summary: Generate, Load, Develop and Test with consistent relational datasets!
Home-page: https://github.com/dorukerenaktas/lokii
Author: Doruk Eren Aktaş
Author-email: dorukerenaktas@gmail.com
License: MIT License
Keywords: data generation,relational datasets,development environment,testing,database
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: MIT License
Classifier: Typing :: Typed
Description-Content-Type: text/markdown
License-File: LICENSE

<img src="docs/assets/loki-logo.png" width="200" height="100" alt="Loki Logo" />

[![PyPI version](https://badge.fury.io/py/lokii.svg)](https://badge.fury.io/py/lokii)
[![Downloads](https://static.pepy.tech/personalized-badge/lokii?period=month&units=international_system&left_color=grey&right_color=brightgreen&left_text=Downloads)](https://pepy.tech/project/lokii)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Licence](https://img.shields.io/pypi/l/lokii.svg)](https://github.com/dorukerenaktas/lokii)

Generate fake datasets with ease.

## Dataset Configuration

Define a dataset using folders and specially named files. Represent each schema with a `schema_name` folder, configure
generation parameters in `table_name.json`, and write the generation script in `table_name.py`.

```
root_folder
    ├── schema_1
    │   ├── table_1.json
    │   ├── table_1.py
    │   ├── table_2.json
    │   └── table_2.py
    └── schema_2
        ├── table_3.json
        ├── table_3.py
        ├── table_4.json
        └── table_4.py
```
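
The layout above can be inspected with a few lines of Python. This is only a sketch of how such a folder could be walked,
not lokii's own loader: `discover_tables` is a hypothetical helper, and it assumes a table counts only when both its
`.json` and `.py` files are present.

```python
from pathlib import Path
from typing import Dict, List


def discover_tables(root: Path) -> Dict[str, List[str]]:
    """Map each schema folder name to the table names found inside it."""
    tables: Dict[str, List[str]] = {}
    for schema_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: a table needs both a definition (.json) and a generation (.py) file.
        tables[schema_dir.name] = sorted(
            definition.stem
            for definition in schema_dir.glob("*.json")
            if definition.with_suffix(".py").exists()
        )
    return tables
```

For the tree shown above, `discover_tables(Path("root_folder"))` would map `schema_1` to `table_1` and `table_2`, and
`schema_2` to `table_3` and `table_4`.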

### Schema Folders

Tabular data must have a `schema` in many database environments. If your dataset does not use schemas, use a
placeholder name such as `public` in Postgres or `dbo` in SQL Server.

Create a folder for every schema in your dataset. Store the table definition and table generation files under the
related schema folder.

### Table Definition Files

Table definition files store metadata and generation configuration for the tabular data. Table names are extracted
from filenames.

```json5
// table_name.json
{
  "cols": ["col1", "col2", "..."],
  "gen": {
    "type": "simple",
    "count": 1000
  }
}
```

#### Properties

##### cols
> required, type: `List[str]`

Stores column names of the table. Used for output metadata and result check assertions.

---

##### gen
> required, type: `object`

Generation config used to detect the generation order and to supply parameters to the generation function.
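
One way a generation order could be derived from `gen` objects is sketched below. It is not lokii's actual algorithm.
It assumes that a table depends on the table named in `gen.mul` (when that value is a string) and on every entry of
`gen.rels`, and that tables are keyed by a `schema.table` namespace; `dependencies` and `generation_order` are
hypothetical helpers.

```python
from typing import Any, Dict, List, Set


def dependencies(gen: Dict[str, Any]) -> Set[str]:
    """Collect the tables a `gen` object refers to (assumed: `mul` and `rels`)."""
    deps = set(gen.get("rels", []))
    mul = gen.get("mul")
    if isinstance(mul, str):  # a table namespace rather than a literal list
        deps.add(mul)
    return deps


def generation_order(gens: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return table names so that every table comes after its dependencies."""
    pending = {name: dependencies(gen) for name, gen in gens.items()}
    order: List[str] = []
    while pending:
        # A table is ready once all of its dependencies are already ordered.
        ready = sorted(n for n, deps in pending.items() if deps <= set(order))
        if not ready:
            raise ValueError(
                "circular or unknown dependency: " + ", ".join(sorted(pending))
            )
        for name in ready:
            order.append(name)
            del pending[name]
    return order
```

Tables without dependencies come first, so `"simple"` tables are ordered before the `"product"` tables that use them.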

---

##### gen.type
> required, type: `"simple" | "product"`

Generation type of the tabular data. Each option has its own required properties.

* `"simple"`: generates standalone table data that does not depend on any other table (that is, the table has no
    relations).
* `"product"`: generates relational table data whose generation function needs other tables.

##### gen.count
> required if `gen.type="simple"`, type: `int`

Number of rows to produce. Cannot be used with `gen.type="product"`.

##### gen.mul
> required if `gen.type="product"`, type: `List | str`

Table namespace or a list used as the multiplier. Each item or row in the multiplier triggers the current table's
generation function. Cannot be used with `gen.type="simple"`.

##### gen.rels
> not required, type: `List[str]`

Table relations used by the generation function.
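
The property rules above can be summarized as a small check. This is a minimal sketch written from the rules in this
section, not lokii's own validation; `validate_definition` and its error messages are hypothetical.

```python
from typing import Any, Dict, List


def validate_definition(definition: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means the definition looks valid."""
    errors: List[str] = []

    cols = definition.get("cols")
    if not isinstance(cols, list) or not all(isinstance(c, str) for c in cols):
        errors.append("cols must be a list of strings")

    gen = definition.get("gen")
    if not isinstance(gen, dict):
        errors.append("gen must be an object")
        return errors

    gen_type = gen.get("type")
    if gen_type == "simple":
        # `count` is required and `mul` is not allowed.
        if not isinstance(gen.get("count"), int):
            errors.append("gen.count is required for simple tables")
        if "mul" in gen:
            errors.append("gen.mul cannot be used with simple tables")
    elif gen_type == "product":
        # `mul` is required and `count` is not allowed.
        if not isinstance(gen.get("mul"), (list, str)):
            errors.append("gen.mul is required for product tables")
        if "count" in gen:
            errors.append("gen.count cannot be used with product tables")
    else:
        errors.append('gen.type must be "simple" or "product"')

    rels = gen.get("rels", [])
    if not isinstance(rels, list) or not all(isinstance(r, str) for r in rels):
        errors.append("gen.rels must be a list of strings")
    return errors
```

For example, `validate_definition({"cols": ["col1"], "gen": {"type": "simple", "count": 1000}})` returns an empty list.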

### Generation Files

A generation file contains a simple function that is executed for each row.

```python
# table_name.py
from typing import Any, Dict


def gen(index: int, config: Dict[str, Any]) -> Dict:
    """
    Generate a single row of the table.

    :param index: row index for this table
    :param config: generation config that includes relations, multiplicand and other settings
    """
    return {"index": index, "config": config}
```
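
To illustrate how a `gen` function might be driven once per row, here is a hedged sketch. It is not lokii's runner:
`run_simple`, `run_product` and `check_row` are hypothetical helpers, the `"multiplicand"` config key is an assumed
name, and the column check is only one possible reading of the `cols` assertions described earlier.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

GenFunc = Callable[[int, Dict[str, Any]], Dict]


def check_row(row: Dict, cols: List[str]) -> Dict:
    """Fail early when a generated row lacks one of the declared columns."""
    missing = [c for c in cols if c not in row]
    if missing:
        raise ValueError(f"row is missing columns: {missing}")
    return row


def run_simple(gen: GenFunc, cols: List[str], count: int,
               config: Optional[Dict[str, Any]] = None) -> List[Dict]:
    """Call `gen` once per row index, as a "simple" table with `count` rows would."""
    config = config or {}
    return [check_row(gen(i, config), cols) for i in range(count)]


def run_product(gen: GenFunc, cols: List[str], mul: Iterable[Any],
                config: Optional[Dict[str, Any]] = None) -> List[Dict]:
    """Call `gen` once per multiplier item, as a "product" table would."""
    config = config or {}
    return [
        # Assumption: the current item is passed under a "multiplicand" key.
        check_row(gen(i, {**config, "multiplicand": item}), cols)
        for i, item in enumerate(mul)
    ]
```

With the `gen` function shown above, `run_simple(gen, ["index", "config"], 3)` would call it with the indexes 0, 1 and 2.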


## Upload to PyPI
You can create the source distribution of the package by running the command given below:
```shell
python3 setup.py sdist
```

Install twine and upload the distribution to PyPI using the `finnetdevlab` username.
```shell
pip3 install twine
twine upload dist/*
```


## Package

The basic structure of the package is:

```
├── README.md
├── packagename
│   ├── __init__.py
│   ├── packagename.py
│   └── version.py
├── pytest.ini
├── requirements.txt
├── setup.py
└── tests
    ├── __init__.py
    ├── helpers
    │   ├── __init__.py
    │   └── my_helper.py
    ├── tests_helper.py
    └── unit
        ├── __init__.py
        ├── test_example.py
        └── test_version.py
```

## Requirements

Package requirements are handled using pip. To install them, run:

```
pip install -r requirements.txt
```

## Tests

Testing is set up using [pytest](http://pytest.org) and coverage is handled
with the pytest-cov plugin.

Run your tests with `py.test` in the root directory.

Coverage runs by default and is configured in the `pytest.ini` file.
To see an HTML coverage report, open `htmlcov/index.html` after running the tests.

## Travis CI

There is a `.travis.yml` file that is set up to run your tests for Python 2.7
and Python 3.2, should you choose to use it.
