Skip to content

Database Operations Support

Supported Write Methods

The ETL orchestrator now supports the following database write operations:

1. insert

  • Inserts new records into the table
  • Does not check for existing records
  • May fail if primary key constraints are violated
  • Use case: Initial data load, append-only scenarios

2. upsert (default)

  • Inserts new records or updates existing ones based on primary key
  • Uses database-specific syntax:
  • PostgreSQL: INSERT ... ON CONFLICT ... DO UPDATE
  • MySQL: INSERT ... ON DUPLICATE KEY UPDATE
  • SQLite: INSERT OR REPLACE
  • MSSQL: MERGE statement
  • Use case: Incremental updates, maintaining latest data

3. replace

  • Deletes existing record (by primary key) then inserts new one
  • Ensures complete replacement of record
  • Use case: Full record replacement

4. update

  • Updates existing records only (requires primary key)
  • Skips records that don't exist
  • Use case: Updating existing data without inserting new records

5. delete

  • Deletes records by primary key
  • Only deletes records that match the provided primary key values
  • Use case: Removing specific records from the database

6. append

  • Similar to insert but semantically indicates appending to existing data
  • Does not check for duplicates
  • Use case: Appending new data to existing tables

7. truncate_and_load ⭐ NEW

  • Truncates the entire table (removes all existing data)
  • Then inserts all new data
  • Use case: Full table refresh, replacing all data with fresh extract
  • Warning: This operation is destructive - all existing data will be lost!

Configuration

Set the write method in your load.yaml file:

write_method: truncate_and_load  # Options: insert, upsert, replace, update, delete, append, truncate_and_load
primary_key: symbol  # Required for upsert, replace, update, delete

Examples

Truncate and Load (Full Refresh)

write_method: truncate_and_load
primary_key: symbol  # Not used for truncate, but may be needed for table creation

Upsert (Incremental Update)

write_method: upsert
primary_key: symbol

Insert Only (Append)

write_method: insert
# primary_key not required

Update Existing Records Only

write_method: update
primary_key: symbol

Database-Specific Behavior

PostgreSQL

  • truncate_and_load: Uses TRUNCATE TABLE
  • upsert: Uses INSERT ... ON CONFLICT ... DO UPDATE
  • All operations supported

MySQL

  • truncate_and_load: Uses TRUNCATE TABLE
  • upsert: Uses INSERT ... ON DUPLICATE KEY UPDATE
  • All operations supported

SQLite

  • truncate_and_load: Uses DELETE FROM (SQLite doesn't support TRUNCATE)
  • upsert: Uses INSERT OR REPLACE
  • All operations supported

MSSQL

  • truncate_and_load: Uses TRUNCATE TABLE
  • upsert: Uses MERGE statement
  • All operations supported

Performance Considerations

  • truncate_and_load: Fastest for full table replacement, but destructive
  • upsert: Good for incremental updates, handles conflicts gracefully
  • insert: Fastest for new data only, but may fail on duplicates
  • update: Efficient for updating existing records only
  • delete: Efficient for removing specific records

Best Practices

  1. Use truncate_and_load when you want to completely replace table data
  2. Use upsert for incremental updates and maintaining latest data
  3. Use insert for append-only scenarios where duplicates are not expected
  4. Use update when you only want to modify existing records
  5. Use delete for removing specific records based on primary keys