Metadata-Version: 2.3
Name: limber-timber
Version: 0.0.2
Summary: Database Migrations Made Easy
License: MIT
Author: Daniel Tashjian
Author-email: thewopple@gmail.com
Requires-Python: >=3.10
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Framework :: Pydantic :: 2
Classifier: Framework :: Pytest
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: devtools (==0.12.2)
Requires-Dist: google-cloud-bigquery (==3.34.0)
Requires-Dist: pydantic (==2.11.4)
Requires-Dist: pyyaml (==6.0.2)
Project-URL: Homepage, https://github.com/Wopple/limber-timber
Project-URL: Repository, https://github.com/Wopple/limber-timber
Description-Content-Type: text/markdown

Limber Timber
=============

***Database Migrations Made Easy***

## Overview

This project is not ready for production use; consider it v0.0.0.

I am writing the migration system I always wanted but that does not exist (yet).

## Notable Feature Goals

- Migrations specified in data, not SQL
- Down migrations automatically inferred from up migrations
  - Yes, down migrations for drop table and drop column are automatically inferred
- Separation of database and metadata
- In-memory support
- Database adoption
- No checksums
- Manifest instead of numbered or timestamped migration filenames
- JSON schema for migration files
- Error recovery for backends that do not support DDL transactions

## Not Goals

- ORM
- Parsing SQL
- Specifying all kinds of migrations in pure data (e.g. DML migrations will use SQL)
- Preserving lost data

## Rationale

- It is cumbersome to iterate on migrations without robust down migrations
- Automatically inferred down migrations reduce developer burden
- Writing migrations in data is cleaner and not specific to a database
- Never parsing SQL reduces the complexity of the codebase
- A lightweight open source library makes it easy to add missing features
- JSON schema allows IDEs to be configured for migration file validation and auto-completion
- Separation of database and metadata allows for more flexible metadata storage options
- In-memory database and metadata allow for application unit testing
- No checksums allows for modifying migration files without breaking the migrations
- Manifest files cause git merge conflicts when parallel development has collisions

## Roadmap

These are listed in rough priority order if you are interested in contributing.

- ✅ CLI
- ✅ Publish to PyPI
- ➡️ GitHub Actions
  - ➡️ Unit Tests
  - ➡️ Release
- ✅ In-memory Database
- ✅ In-memory Metadata
- ➡️ BigQuery Database
  - ✅ Create Table
  - ✅ Drop Table
  - ✅ Rename Table
  - ✅ Set Table Partition Expiration
  - ✅ Set Table Clustering
  - ✅ Add Column
  - ✅ Drop Column
  - ✅ Rename Column
  - ✅ Set Data Type
  - ➡️ Add Struct Field
  - ➡️ Create View
  - ➡️ Create Materialized View
  - ➡️ Create Snapshot Table
  - ➡️ Create Table Clone
  - ✅ Labels
  - ✅ Tags
- ✅ BigQuery Metadata
- ✅ Database Adoption
- ➡️ Raise Unsupported Operations
- ➡️ JSON Schema
  - To validate and auto-complete migration files in IDEs
- ✅ Database Specific Validation
- ➡️ Templating
  - To make it easy to swap out repeated values by making a change in one place
- ➡️ Expand Grouped Operations
  - To handle complex operations that do not have atomic support in the backend
- ➡️ Grouped Operation Application
  - To reduce round trips with the backend and reduce migration time
- ➡️ Minimize Scan Output
  - To generate a more human readable operation file
- ➡️ Arbitrary DML SQL Migrations
- ➡️ File System Metadata
- ➡️ SQLite Database
- ➡️ SQLite Metadata
- ➡️ Postgres Database
- ➡️ Postgres Metadata
- ➡️ MySQL Database
- ➡️ MySQL Metadata

## Usage

### Create Migrations

1. Create your target manifest

> Note: All migration files can use any of these extensions:
> - `.json`
> - `.yaml`
> - `.yml`

Create a target directory with a manifest file named `manifest.yaml`.

```yaml
# target_dir/manifest.yaml
version: 1
operation_files:
- path/to/create_user_table.yaml
- path/to/enrich_user_name.yaml
```

2. Create your target migration operations

> Tip: Using a subdirectory for the operations files makes it easy to configure your IDE to apply the correct JSON schema.

Create the files listed in your manifest.

```yaml
# target_dir/path/to/create_user_table.yaml
version: 1
operations:
- kind: create_table
  data:
    table:
      name:
        database: your_project
        schema_name: your_dataset
        table_name: users
      columns:
      - name: id
        datatype: INT64
      - name: name
        datatype: STRING
```

```yaml
# target_dir/path/to/enrich_user_name.yaml
version: 1
operations:
- kind: rename_column
  data:
    table_name:
      database: your_project
      schema_name: your_dataset
      table_name: users
    from_name: name
    to_name: firstname
- kind: add_column
  data:
    table_name:
      database: your_project
      schema_name: your_dataset
      table_name: users
    column:
      name: lastname
      datatype: STRING
```

3. Check what migrations will run

```shell
poetry run liti migrate \
    -t target_dir \
    --db bigquery \
    --meta bigquery \
    --meta-table-name your_project.your_dataset._migrations
```

4. Run the migrations

```shell
poetry run liti migrate -w \
    -t target_dir \
    --db bigquery \
    --meta bigquery \
    --meta-table-name your_project.your_dataset._migrations
```

### Scan Database

You can also scan a schema / table, which will print out the operations file that generates that schema / table.

```shell
# scan a schema
poetry run liti scan \
    --db bigquery \
    --scan-database your_project \
    --scan-schema your_dataset
```

```shell
# scan a table
poetry run liti scan \
    --db bigquery \
    --scan-database your_project \
    --scan-schema your_dataset \
    --scan-table your_table
```

## Learn

Being completely new to this project, you will have no idea where to start. Here. This is where you start. This is a
crash course on what Limber Timber is and how it's put together.

### The Big Picture

Limber Timber uses the `Operation` to describe changes to a database. These operations are pure data. They can be
serialized to JSON or YAML, and can be deserialized from the same. Developers write JSON or YAML files to describe the
migrations for their application.

The `Operation` can be enhanced to become an `OperationOps`. This type brings behavior to the data. It allows you to:
- check if the operation has been applied to the database
  - useful for recovery from a failure between applying an operation and writing it to the metadata
- apply the operation, i.e. the "up" migration
- produce the inverse `Operation` that will perform the "down" migration

Down migrations are inferred from the up migrations, so developers only ever have to write the up migrations.
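
A minimal sketch of this split, assuming a toy in-memory database modeled as a dict of table name to column names. The names `RenameColumn`, `RenameColumnOps`, `is_applied`, `up`, and `down` are hypothetical illustrations, not the actual `liti` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameColumn:
    """Pure data with no behavior (hypothetical stand-in for an Operation)."""
    table_name: str
    from_name: str
    to_name: str


# Toy database: table name -> ordered list of column names (an assumption for this sketch).
Database = dict[str, list[str]]


class RenameColumnOps:
    """Behavior wrapper around the data (hypothetical stand-in for an OperationOps)."""

    def __init__(self, op: RenameColumn):
        self.op = op

    def is_applied(self, db: Database) -> bool:
        # Applied when the new name exists and the old one is gone.
        columns = db[self.op.table_name]
        return self.op.to_name in columns and self.op.from_name not in columns

    def up(self, db: Database) -> None:
        # Apply the "up" migration in place.
        columns = db[self.op.table_name]
        columns[columns.index(self.op.from_name)] = self.op.to_name

    def down(self) -> RenameColumn:
        # The inverse is just more data: the same rename with the names swapped.
        return RenameColumn(self.op.table_name, self.op.to_name, self.op.from_name)


db: Database = {"users": ["id", "name"]}
ops = RenameColumnOps(RenameColumn("users", "name", "firstname"))

# Checking first makes a re-run safe after a failure between applying and recording.
if not ops.is_applied(db):
    ops.up(db)

# Undo the change using the inferred inverse operation.
RenameColumnOps(ops.down()).up(db)
```

A rename can be inverted from the operation alone. Inverting something like a drop column would presumably also need the prior state of the table, which this sketch does not attempt to model.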

### Migration Files

Migration files start with a manifest file. The manifest points to the operation files in the order they should be
applied. Each operation file contains a list of operations in the order they should be applied. In this way,
```
# file1
[op1, op2]

# file2
[op3]
```
is exactly the same as:
```
# file1
[op1]

# file2
[op2, op3]
```

The migrational unit is the `Operation`, not the file. Grouping operations into files can help for organization, but
having a single file with all operations or many files each with one operation are both valid. There are no checksums
and no need to specially name your files. You can also organize your migrations with subdirectories; just specify the
paths in the manifest.
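
To illustrate why file boundaries do not matter, here is a small sketch that flattens a manifest into a single ordered list of operations. The `flatten` helper and the in-memory dicts standing in for files on disk are hypothetical, not part of `liti`.

```python
def flatten(manifest: list[str], files: dict[str, list[str]]) -> list[str]:
    """Return all operations in manifest order, ignoring which file each came from."""
    return [op for path in manifest for op in files[path]]


manifest = ["file1", "file2"]

# Two different groupings of the same three operations.
grouping_a = {"file1": ["op1", "op2"], "file2": ["op3"]}
grouping_b = {"file1": ["op1"], "file2": ["op2", "op3"]}

# Both groupings produce the same sequence of operations.
print(flatten(manifest, grouping_a) == flatten(manifest, grouping_b))  # prints True
```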

One major benefit to this system is if parallel developers add operations, one will merge first, and then the other will
get a merge conflict. This is much better than having migrations applied out of order (or breaking) after the fact. You
learn right away about the conflict, and the developer is prompted to resolve it. This benefit assumes all developers
are using the same style for adding new migrations: either adding a new file to the manifest, or adding a new operation
to the most recent file.

### Python Modules

`liti.core.model`

This module stores all the data models. The models are versioned, though currently there is only the one version. The
hierarchy is roughly:

> `operation.ops` > `operation.data` > `schema` > `datatype`

`liti.core.model.v1.operation.data`

These are the pure data operations. They are (de)serialized between the operation files and metadata.

`liti.core.model.v1.operation.ops`

These are the wrappers that enhance operations with behavior. There is a 1:1 relationship.

`liti.core.model.v1.datatype`

These are descriptions of column types.

`liti.core.model.v1.schema`

These are descriptions of tables and related constructs.

`liti.core.backend`

Both the database and the metadata can support different backends. You can even use different backends together. The
backends deal in both the `liti` model and backend-specific types, adapting between the two.

`liti.core.client`

These are clients used by the backends. They solely deal in backend-specific types with no dependencies on the `liti`
model.

`liti.core.base`

This module has base classes for applying default values and validating the data. They are implemented using the
Observer / Observable pattern so different backends can define their own behavior.

`liti.core.runner`

This module is for the runners associated with the various ways `liti` can be run. Main code will instantiate a runner
and run it.

