Metadata-Version: 2.4
Name: trellis-datamodel
Version: 0.6.0
Summary: Visual data model editor for dbt projects
Author: Tim Hiebenthal
Project-URL: Homepage, https://app.capacities.io/home/8b7546f6-9028-4209-a383-c4a9ba9be42a
Project-URL: Repository, https://github.com/timhiebenthal/trellis-datamodel
Project-URL: Issues, https://github.com/timhiebenthal/trellis-datamodel/issues
Keywords: dbt,data-modeling,erd,data-engineering,analytics-engineering,visualization,schema
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: dbt-core<2.0,>=1.10.5
Requires-Dist: dbt-colibri>=0.1.0
Requires-Dist: dbt-duckdb>=1.10.0
Requires-Dist: fastapi>=0.121.3
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: ruamel.yaml>=0.18.0
Requires-Dist: typer>=0.9.0
Requires-Dist: uvicorn>=0.38.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: httpx>=0.27.0; extra == "dev"
Provides-Extra: dbt-example
Requires-Dist: dbt-duckdb==1.10; extra == "dbt-example"
Requires-Dist: duckdb>=1.4.2; extra == "dbt-example"
Requires-Dist: faker>=24.0.0; extra == "dbt-example"
Requires-Dist: marimo>=0.18.0; extra == "dbt-example"
Requires-Dist: nba-api>=1.11.3; extra == "dbt-example"
Requires-Dist: pandas>=2.3.3; extra == "dbt-example"
Requires-Dist: tqdm>=4.67.1; extra == "dbt-example"
Dynamic: license-file

# Trellis Data

![Trellis Logo](resources/trellis_with_text.png)

A lightweight, local-first tool to bridge Conceptual Data Modeling, Logical Data Modeling and the Physical Implementation (currently with dbt-core).

## Motivation

**Current workflow pains:**
- ERD diagrams live in separate tools (Lucidchart, draw.io) and quickly become stale or unreadable for large projects
- Data transformations are developed in isolation from the conceptual data model
- No single view connecting business concepts to logical schema
- Stakeholders can't easily understand model structure without technical context
- Holistic Data Warehouse Automation tools exist but do not integrate well with dbt and the Modern Data Stack

**How Trellis helps:**
- Visual data model that stays in sync — reads directly from `manifest.json` / `catalog.json`
- Sketch entities with their fields and auto-generate `schema.yml` files for dbt
- Draw relationships on canvas → auto-generates dbt `relationships` tests
- Two views: **Conceptual** (entity names, descriptions) and **Logical** (columns, types, materializations), so you can switch between the high-level architecture view and the execution view
- Organize entities based on subdirectories and tags from your physical implementation
- Write descriptions or tags back to your dbt project
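
As a rough illustration of what "drawing a relationship generates a dbt `relationships` test" could produce, the sketch below builds a `schema.yml`-shaped dictionary for one foreign key. The helper name `relationships_test` and the model and column names are hypothetical, and the exact YAML that Trellis writes may differ (for example, newer dbt versions also accept a `data_tests` key in place of `tests`).

```python
def relationships_test(column: str, parent_model: str, parent_field: str) -> dict:
    """Build one column entry carrying a dbt `relationships` test (hypothetical helper)."""
    return {
        "name": column,
        "tests": [
            {
                "relationships": {
                    # dbt expects the parent model as a ref() expression string.
                    "to": f"ref('{parent_model}')",
                    "field": parent_field,
                }
            }
        ],
    }


def schema_for(child_model: str, column_entries: list) -> dict:
    """Wrap column entries in the usual top-level schema.yml structure."""
    return {"version": 2, "models": [{"name": child_model, "columns": column_entries}]}


# Illustrative names only: fct_orders.customer_id references dim_customer.customer_id.
schema = schema_for(
    "fct_orders",
    [relationships_test("customer_id", "dim_customer", "customer_id")],
)
```

The resulting dictionary can be serialized with any YAML library, for example PyYAML's `yaml.safe_dump(schema, sort_keys=False)`.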

**Two ways to get started:**
- Greenfield: draft entities and fields before writing SQL, then sync to dbt YAML
- Brownfield: document your existing data model by loading existing dbt models and utilize relationship tests to infer links

## Dimensional Modeling Support

Trellis includes native support for Kimball dimensional modeling, making it easier to design, visualize, and document star and snowflake schemas.

### Features

**Entity Classification**
- Classify entities as **fact** (transaction tables), **dimension** (descriptive tables), or **unclassified**
- Manual classification during entity creation or via context menu
- Automatic inference from dbt model naming patterns (e.g., `dim_customer` → dimension, `fct_orders` → fact)
- Configurable inference patterns in `trellis.yml`

**Smart Default Positioning**
- Facts are automatically placed in the center area of the canvas
- Dimensions are placed in an outer ring around facts
- Reduces manual layout effort for star/snowflake schemas
- Can be re-applied at any time with the "Auto-Layout" button
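
The positioning idea can be sketched as follows: facts go in a row around the canvas center and dimensions are spread evenly on a circle around them. This is a minimal sketch under assumed spacing values, not Trellis's actual layout algorithm; the function name `star_layout` and the parameters `fact_spacing` and `ring_radius` are hypothetical.

```python
import math


def star_layout(facts, dimensions, fact_spacing=300.0, ring_radius=600.0):
    """Return {entity_name: (x, y)} with facts centered and dimensions on a ring."""
    positions = {}

    # Facts: a horizontal row centered on the origin.
    offset = (len(facts) - 1) / 2
    for i, name in enumerate(facts):
        positions[name] = ((i - offset) * fact_spacing, 0.0)

    # Dimensions: evenly spaced angles on a circle around the origin.
    # The loop body never runs for an empty list, so there is no division by zero.
    for i, name in enumerate(dimensions):
        angle = 2 * math.pi * i / len(dimensions)
        positions[name] = (
            ring_radius * math.cos(angle),
            ring_radius * math.sin(angle),
        )
    return positions


layout = star_layout(
    ["fct_orders"],
    ["dim_customer", "dim_date", "dim_product", "dim_store"],
)
```

A real layout would also need to grow the ring radius when many facts are present, so that the row of facts does not overlap the ring.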

**Kimball Bus Matrix View**
- Visual matrix showing dimensions (rows) and facts (columns)
- Checkmarks (✓) indicate dimension-fact connections
- Filter by dimension name, fact name, or tags
- Click cells to highlight relationships on the canvas
- Dedicated view mode accessible from navigation bar
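
Conceptually, the bus matrix is a boolean table derived from entity types and relationships. The sketch below shows one way to compute and print it; the function names and the input shapes (a name-to-type mapping and a list of entity pairs) are assumptions for illustration and do not reflect Trellis's internal data structures.

```python
def build_bus_matrix(entity_types, relationships):
    """entity_types: {name: "fact" | "dimension" | ...}; relationships: (a, b) pairs."""
    facts = sorted(n for n, t in entity_types.items() if t == "fact")
    dims = sorted(n for n, t in entity_types.items() if t == "dimension")

    connected = set()
    for a, b in relationships:
        # A relationship may be stored in either direction, so normalize
        # every fact-dimension link to a (dimension, fact) pair.
        if entity_types.get(a) == "dimension" and entity_types.get(b) == "fact":
            connected.add((a, b))
        elif entity_types.get(a) == "fact" and entity_types.get(b) == "dimension":
            connected.add((b, a))

    matrix = {d: {f: (d, f) in connected for f in facts} for d in dims}
    return facts, dims, matrix


def render_bus_matrix(facts, dims, matrix):
    """Render dimensions as rows and facts as columns, with a checkmark per link."""
    lines = ["\t".join(["", *facts])]
    for d in dims:
        cells = ["✓" if matrix[d][f] else "" for f in facts]
        lines.append("\t".join([d, *cells]))
    return "\n".join(lines)


types = {
    "fct_orders": "fact",
    "fct_returns": "fact",
    "dim_customer": "dimension",
    "dim_date": "dimension",
}
links = [
    ("dim_customer", "fct_orders"),
    ("fct_returns", "dim_date"),
    ("dim_date", "fct_orders"),
]
facts, dims, matrix = build_bus_matrix(types, links)
print(render_bus_matrix(facts, dims, matrix))
```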

### Configuration

Enable dimensional modeling features in `trellis.yml`:

```yaml
modeling_style: dimensional_model  # Options: dimensional_model or entity_model (default)

dimensional_modeling:
  inference_patterns:
    dimension_prefix: ["dim_", "d_"]  # Prefixes for dimension tables
    fact_prefix: ["fct_", "fact_"]  # Prefixes for fact tables
```

- `modeling_style: dimensional_model` enables all dimensional modeling features
- `modeling_style: entity_model` (default) preserves current generic behavior
- Inference patterns customize how entities are auto-classified from dbt model names
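
To make the inference rule concrete, here is a minimal sketch of prefix-based classification using the patterns from the example above. The function `infer_entity_type` is a hypothetical helper, and details such as case-insensitive matching or checking fact prefixes first are assumptions rather than documented Trellis behavior.

```python
# Mirrors the `inference_patterns` block from the trellis.yml example.
INFERENCE_PATTERNS = {
    "dimension_prefix": ["dim_", "d_"],
    "fact_prefix": ["fct_", "fact_"],
}


def infer_entity_type(model_name: str, patterns: dict = INFERENCE_PATTERNS) -> str:
    """Classify a dbt model name as "fact", "dimension", or "unclassified"."""
    name = model_name.lower()  # assumption: matching ignores case
    # str.startswith accepts a tuple of candidates; an empty tuple never matches.
    if name.startswith(tuple(patterns.get("fact_prefix", []))):
        return "fact"
    if name.startswith(tuple(patterns.get("dimension_prefix", []))):
        return "dimension"
    return "unclassified"


for model in ["dim_customer", "fct_orders", "stg_payments"]:
    print(model, "->", infer_entity_type(model))
```

Models that match no prefix stay unclassified and can still be classified manually.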

### Entity Classification Workflow

**Creating New Entities:**
1. Click "Create Entity" button
2. Fill in entity name and description
3. Select entity type: Fact, Dimension, or Unclassified
4. Entity is placed on canvas according to type (facts center, dimensions outer ring)

**Loading Existing dbt Models:**
1. System automatically infers entity types from naming patterns
2. Entity type icons appear on nodes (database for fact, box for dimension)
3. Override incorrect classifications via context menu: right-click → "Set as Fact/Dimension"

**Bus Matrix Workflow:**
1. Click "Bus Matrix" icon in navigation bar
2. View dimensions (rows) and facts (columns)
3. Checkmarks show connections between entities
4. Filter to focus on specific dimensions, facts, or tags
5. Click checkmark to highlight relationship on canvas

### Use Cases

**When to Use Dimensional Modeling:**
- Designing data warehouses with star/snowflake schemas
- Following Kimball methodology
- Working with fact and dimension tables
- Documenting data models for BI stakeholders

**When to Use Entity Model:**
- Generic data modeling (not strictly dimensional)
- Mixed schema patterns
- Legacy projects with inconsistent naming
- Exploratory modeling

## Tutorial & Guide

Check out our [Full Tutorial](https://app.capacities.io/home/667ad256-ca68-4dfd-8231-e77d83127dcf) with video clips showing the core features in action. [General Information](https://app.capacities.io/home/8b7546f6-9028-4209-a383-c4a9ba9be42a) about the project is also available.

## Vision

Trellis is currently designed and tested specifically for **dbt-core**, but the vision is to be tool-agnostic. As the saying goes, *"tools evolve, concepts don't"*: data modeling concepts persist regardless of the transformation framework you use.

If this project gains traction, we might explore support for:
- **dbt-fusion** through adapter support
- **Pydantic models** as a simple output format
- Other frameworks like [SQLMesh](https://github.com/TobikoData/sqlmesh) or [Bruin](https://github.com/bruin-data/bruin) through adapter patterns, where compatibility allows

This remains a vision for now — the current focus is on making Trellis work well with dbt-core.

## Prerequisites
- **Node.js 22+ (or 20.19+) & npm**  
  - Recommended: Use [nvm](https://github.com/nvm-sh/nvm) to install a compatible version (e.g., `nvm install 22`).
  - Note: System packages (`apt-get`) may be too old for the frontend dependencies.
  - A `.nvmrc` file is included; run `nvm use` to switch to the correct version automatically.
- **Python 3.11+ & [uv](https://github.com/astral-sh/uv)**  
  - Install uv via `curl -LsSf https://astral.sh/uv/install.sh | sh` and ensure it's on your `$PATH`.
- **Make** (optional) for convenience targets defined in the `Makefile`.

## Installation

### Install from PyPI

```bash
pip install trellis-datamodel
# or with uv
uv pip install trellis-datamodel
```

### Install from Source (Development)

```bash
# Clone the repository
git clone https://github.com/timhiebenthal/trellis-datamodel.git
cd trellis-datamodel

# Install in editable mode
pip install -e .
# or with uv
uv pip install -e .
```

## Quick Start

1. **Navigate to your dbt project directory**
   ```bash
   cd /path/to/your/dbt-project
   ```

2. **Initialize configuration**
   ```bash
   trellis init
   ```
   This creates a `trellis.yml` file. Edit it to point to your dbt manifest and catalog locations.

3. **Start the server**
   ```bash
   trellis run
   ```

   The server will start on **http://localhost:8089** and automatically open your browser.

## Development Setup

For local development with hot reload:

### Install Dependencies
Run these once per machine (or when dependencies change).

1. **Backend**
   ```bash
   uv sync
   ```
2. **Frontend**
   ```bash
   cd frontend
   npm install
   ```

**Terminal 1 – Backend**
```bash
make backend
# or
uv run trellis run
```
Backend serves the API at http://localhost:8089.

**Terminal 2 – Frontend**
```bash
make frontend
# or
cd frontend && npm run dev
```
Frontend runs at http://localhost:5173 (for development with hot reload).

## Building for Distribution

To build the package with bundled frontend:

```bash
make build-package
```

This will:
1. Build the frontend (`npm run build`)
2. Copy static files to `trellis_datamodel/static/`
3. Build the Python wheel (`uv build`)

The wheel will be in `dist/` and can be installed with `pip install dist/trellis_datamodel-*.whl`.

## CLI Options

```bash
trellis run [OPTIONS]

Options:
  --port, -p INTEGER    Port to run the server on [default: 8089]
  --config, -c TEXT     Path to config file (trellis.yml or config.yml)
  --no-browser          Don't open browser automatically
  --help                Show help message
```

## dbt Metadata
- Generate `manifest.json` and `catalog.json` by running `dbt docs generate` in your dbt project.
- The UI reads these artifacts to power the ERD modeller.
- Without these artifacts, the UI loads but shows no dbt models.
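
For orientation, the sketch below shows how the two artifacts can be combined using only the standard library: `manifest.json` supplies the model nodes and `catalog.json` supplies column types, joined on the node's unique ID. The function name `load_models` and the returned shape are hypothetical; this illustrates the artifact structure and is not Trellis's own loader.

```python
import json
from pathlib import Path


def load_models(manifest_path, catalog_path):
    """List dbt models from manifest.json, enriched with column types from catalog.json."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))

    # The catalog is optional here: without it, models simply have no column types.
    catalog_file = Path(catalog_path)
    catalog = (
        json.loads(catalog_file.read_text(encoding="utf-8"))
        if catalog_file.exists()
        else {}
    )
    catalog_nodes = catalog.get("nodes", {})

    models = []
    for unique_id, node in manifest.get("nodes", {}).items():
        # manifest "nodes" also contains tests, seeds, and snapshots; keep models only.
        if node.get("resource_type") != "model":
            continue
        columns = catalog_nodes.get(unique_id, {}).get("columns", {})
        models.append(
            {
                "unique_id": unique_id,
                "name": node.get("name"),
                "path": node.get("original_file_path"),
                "columns": {col: meta.get("type") for col, meta in columns.items()},
            }
        )
    return models
```

Calling `load_models("target/manifest.json", "target/catalog.json")` from a dbt project directory after `dbt docs generate` returns one entry per model.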

## Configuration

Run `trellis init` to create a starter `trellis.yml` file in your project.
The generated file mirrors the annotated defaults in `trellis.yml.example`, so review that example when you need to customize optional sections (lineage, guidance, helpers).

Options:

- `framework`: Transformation framework to use. Currently supported: `dbt-core`. Future: `dbt-fusion`, `sqlmesh`, `bruin`, `pydantic`. Defaults to `dbt-core`.
- `dbt_project_path`: Path to your dbt project directory (relative to the config file, e.g. `trellis.yml`, or absolute). **Required**.
- `dbt_manifest_path`: Path to `manifest.json` (relative to `dbt_project_path` or absolute). Defaults to `target/manifest.json`.
- `dbt_catalog_path`: Path to `catalog.json` (relative to `dbt_project_path` or absolute). Defaults to `target/catalog.json`.
- `data_model_file`: Path where the data model YAML will be saved (relative to `dbt_project_path` or absolute). Defaults to `data_model.yml`.
- `dbt_model_paths`: List of path patterns to filter which dbt models are shown (e.g., `["3_core"]`). If empty, all models are included.
- `dbt_company_dummy_path`: Helper dbt project used by `trellis generate-company-data`. Run the command to create `./dbt_company_dummy` or update this path to an existing project.
- `lineage.enabled`: Feature flag for lineage UI + API. Defaults to `false` (opt-in).
- `lineage.layers`: Ordered list of folder names to organize lineage bands. Prefer this nested structure; legacy `lineage_layers` is deprecated.
- `exposures.enabled`: Feature flag for Exposures view mode. Defaults to `false` (opt-in). Set to `true` to enable the exposures view and API.
- `exposures.default_layout`: Default table layout for exposures view. Options: `dashboards-as-rows` (default, dashboards as rows, entities as columns) or `entities-as-rows` (exposures as columns, entities as rows). Users can manually toggle between layouts.
- `entity_creation_guidance`: Settings for the step-by-step guidance in the entity creation wizard (current defaults are shown in `trellis.yml.example`).

**Example `trellis.yml`:**
```yaml
framework: dbt-core
dbt_project_path: "./dbt_built"
dbt_manifest_path: "target/manifest.json"
dbt_catalog_path: "target/catalog.json"
data_model_file: "data_model.yml"
dbt_model_paths: []  # Empty list includes all models
dbt_company_dummy_path: "./dbt_company_dummy"
#lineage:
#  enabled: false  # Set to true to enable lineage UI/endpoints
#  layers: []
#exposures:
#  enabled: false  # Set to true to enable Exposures view (opt-in)
#  default_layout: dashboards-as-rows  # Options: dashboards-as-rows (default) or entities-as-rows
#entity_creation_guidance:
#  enabled: true  # Set false to disable the step-by-step wizard
#  push_warning_enabled: true
#  min_description_length: 10
#  disabled_guidance: []
```

Lineage and entity creation guidance sections are documented fully in `trellis.yml.example`; the CLI leaves them commented out by default.
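
The relative-path rules described in the options list can be sketched as below: `dbt_project_path` is resolved against the directory of the config file, and the manifest, catalog, and data model paths are resolved against the project directory. The function `resolve_paths` is a hypothetical helper that encodes those documented rules and defaults; it is not taken from the Trellis code base.

```python
from pathlib import Path

# Defaults as listed in the options above.
DEFAULTS = {
    "dbt_manifest_path": "target/manifest.json",
    "dbt_catalog_path": "target/catalog.json",
    "data_model_file": "data_model.yml",
}


def resolve_paths(config: dict, config_file: str) -> dict:
    """Turn the path options of a parsed config into absolute paths."""
    config_dir = Path(config_file).absolute().parent

    # dbt_project_path is required; relative values are anchored at the config file.
    project = Path(config["dbt_project_path"])
    if not project.is_absolute():
        project = config_dir / project

    resolved = {"dbt_project_path": project}
    for key, default in DEFAULTS.items():
        path = Path(config.get(key) or default)
        # Absolute values are kept as-is; relative ones live under the dbt project.
        resolved[key] = path if path.is_absolute() else project / path
    return resolved


paths = resolve_paths(
    {"dbt_project_path": "./dbt_built", "dbt_catalog_path": "/data/catalog.json"},
    "/work/trellis.yml",
)
```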


## Testing

### Frontend
**Testing Libraries:**
The following testing libraries are defined in `package.json` under `devDependencies` and are automatically installed when you run `npm install`:
- [Vitest](https://vitest.dev/) (Unit testing)
- [Playwright](https://playwright.dev/) (End-to-End testing)
- [Testing Library](https://testing-library.com/) (DOM & Svelte testing utilities)
- [jsdom](https://github.com/jsdom/jsdom) (DOM environment)

> **Playwright system dependencies (Ubuntu/WSL2)**
>
> The browsers downloaded by Playwright need a handful of native libraries. Install them before running `npm run test:e2e`:
>
> ```bash
> sudo apt-get update && sudo apt-get install -y \
>   libxcursor1 libxdamage1 libgtk-3-0 libpangocairo-1.0-0 libpango-1.0-0 \
>   libatk1.0-0 libcairo-gobject2 libcairo2 libgdk-pixbuf-2.0-0 libasound2 \
>   libnspr4 libnss3 libgbm1 libgles2-mesa libgtk-4-1 libgraphene-1.0-0 \
>   libxslt1.1 libwoff2dec0 libvpx7 libevent-2.1-7 libopus0 \
>   libgstallocators-1.0-0 libgstapp-1.0-0 libgstpbutils-1.0-0 libgstaudio-1.0-0 \
>   libgsttag-1.0-0 libgstvideo-1.0-0 libgstgl-1.0-0 libgstcodecparsers-1.0-0 \
>   libgstfft-1.0-0 libflite1 libflite1-plugins libwebpdemux2 libavif13 \
>   libharfbuzz-icu0 libwebpmux3 libenchant-2-2 libsecret-1-0 libhyphen0 \
>   libwayland-server0 libmanette-0.2-0 libx264-163
> ```

**Running Tests:**

The test suite has multiple levels to catch different types of issues:

```bash
cd frontend

# Quick smoke test (catches 500 errors, runtime crashes, ESM issues)
# Fastest way to verify the app loads without errors
npm run test:smoke

# TypeScript/compilation check
npm run check

# Unit tests
npm run test:unit

# E2E tests (includes smoke test + full test suite)
# Note: Requires backend running with test data (see Test Data Isolation below)
npm run test:e2e

# Run all tests (check + smoke + unit + e2e)
npm run test
```

**Test Levels:**
1. **`npm run check`** - TypeScript compilation errors
2. **`npm run test:smoke`** - Runtime errors (500s, console errors, ESM issues) - **catches app crashes**
3. **`npm run test:unit`** - Unit tests with Vitest
4. **`npm run test:e2e`** - Full E2E tests with Playwright

**Using Makefile:**
```bash
# From project root
make test-smoke     # Quick smoke test
make test-check     # TypeScript check
make test-unit      # Unit tests
make test-e2e       # E2E tests (auto-starts backend with test data)
make test-all       # All tests
```

**Test Data Isolation:**
E2E tests use a separate test data file (`frontend/tests/test_data_model.yml`) to avoid polluting your production data model. **Playwright automatically starts the backend** with the correct environment variable, so you don't need to manage it manually.

```bash
# Just run E2E tests - backend starts automatically with test data
make test-e2e
# OR:
# cd frontend && npm run test:e2e
```

The test data file is automatically cleaned before and after test runs via Playwright's `globalSetup` and `globalTeardown`. Your production `data_model.yml` remains untouched.

### Backend
**Testing Libraries:**
The following testing libraries are defined in `pyproject.toml` under `[project.optional-dependencies]` in the `dev` group:
- [pytest](https://docs.pytest.org/) (Testing framework)
- [httpx](https://www.python-httpx.org/) (Async HTTP client for API testing)

**Installation:**
Unlike `npm`, `uv sync` does not install optional dependencies by default. To include the testing libraries, run:
```bash
uv sync --extra dev
```

**Running Tests:**
```bash
uv run pytest
```

## Collaboration

If you want to collaborate, reach out!

## Contributing and CLA
- Contributions are welcome! Please read [`CONTRIBUTING.md`](CONTRIBUTING.md) for workflow, testing, and PR guidelines.
- All contributors must sign the CLA once per GitHub account. The CLA bot on pull requests will guide you; see [`CLA.md`](CLA.md) for details.

## Acknowledgments
- Thanks to [dbt-colibri](https://github.com/dbt-labs/dbt-colibri) for providing lineage extraction capabilities that enhance Trellis's data model visualization features.

## License
- Trellis Datamodel is licensed under the [GNU Affero General Public License v3.0](LICENSE).
- See [`NOTICE`](NOTICE) for a summary of copyright and licensing information.
