Database Operations Support¶
Supported Write Methods¶
The ETL orchestrator now supports the following database write operations:
1. insert¶
- Inserts new records into the table
- Does not check for existing records
- May fail if primary key constraints are violated
- Use case: Initial data load, append-only scenarios
2. upsert (default)¶
- Inserts new records or updates existing ones based on primary key
- Uses database-specific syntax:
- PostgreSQL:
INSERT ... ON CONFLICT ... DO UPDATE - MySQL:
INSERT ... ON DUPLICATE KEY UPDATE - SQLite:
INSERT OR REPLACE - MSSQL:
MERGEstatement - Use case: Incremental updates, maintaining latest data
3. replace¶
- Deletes existing record (by primary key) then inserts new one
- Ensures complete replacement of record
- Use case: Full record replacement
4. update¶
- Updates existing records only (requires primary key)
- Skips records that don't exist
- Use case: Updating existing data without inserting new records
5. delete¶
- Deletes records by primary key
- Only deletes records that match the provided primary key values
- Use case: Removing specific records from the database
6. append¶
- Similar to
insertbut semantically indicates appending to existing data - Does not check for duplicates
- Use case: Appending new data to existing tables
7. truncate_and_load ⭐ NEW¶
- Truncates the entire table (removes all existing data)
- Then inserts all new data
- Use case: Full table refresh, replacing all data with fresh extract
- Warning: This operation is destructive - all existing data will be lost!
Configuration¶
Set the write method in your load.yaml file:
write_method: truncate_and_load # Options: insert, upsert, replace, update, delete, append, truncate_and_load
primary_key: symbol # Required for upsert, replace, update, delete
Examples¶
Truncate and Load (Full Refresh)¶
write_method: truncate_and_load
primary_key: symbol # Not used for truncate, but may be needed for table creation
Upsert (Incremental Update)¶
Insert Only (Append)¶
Update Existing Records Only¶
Database-Specific Behavior¶
PostgreSQL¶
truncate_and_load: UsesTRUNCATE TABLEupsert: UsesINSERT ... ON CONFLICT ... DO UPDATE- All operations supported
MySQL¶
truncate_and_load: UsesTRUNCATE TABLEupsert: UsesINSERT ... ON DUPLICATE KEY UPDATE- All operations supported
SQLite¶
truncate_and_load: UsesDELETE FROM(SQLite doesn't support TRUNCATE)upsert: UsesINSERT OR REPLACE- All operations supported
MSSQL¶
truncate_and_load: UsesTRUNCATE TABLEupsert: UsesMERGEstatement- All operations supported
Performance Considerations¶
- truncate_and_load: Fastest for full table replacement, but destructive
- upsert: Good for incremental updates, handles conflicts gracefully
- insert: Fastest for new data only, but may fail on duplicates
- update: Efficient for updating existing records only
- delete: Efficient for removing specific records
Best Practices¶
- Use
truncate_and_loadwhen you want to completely replace table data - Use
upsertfor incremental updates and maintaining latest data - Use
insertfor append-only scenarios where duplicates are not expected - Use
updatewhen you only want to modify existing records - Use
deletefor removing specific records based on primary keys