MERGE Statement

The MERGE statement enables conditional updates, deletes, and inserts, based on the result of a join between a target and a source table. This operation is useful for applying data changes from transactional systems to analytic data warehouses by merging data from two tables with similar structures.

The MERGE statement supports multiple WHEN clauses, where each clause can specify actions like UPDATE, DELETE, or INSERT. Actions are applied based on the join conditions defined between the source and target tables.

Syntax:

MERGE INTO target_table [AS target_alias]
USING source_expr [AS source_alias]
ON search_condition
[WHEN MATCHED [AND search_condition] THEN
  UPDATE SET column1 = expression1, column2 = expression2, ... ]
[WHEN MATCHED [AND search_condition] THEN DELETE]
[WHEN NOT MATCHED [AND search_condition] THEN
  INSERT (column1, column2, ...) VALUES (expression1, expression2, ...)]

The WHEN MATCHED clause is executed if a row from the source table matches a row in the target table, based on the ON condition. Within this clause, you can either UPDATE specific columns or DELETE the matched rows. Multiple WHEN MATCHED clauses can be provided, each with a different condition.

The WHEN NOT MATCHED clause is executed if a row from the source table has no matching row in the target table. This clause typically inserts new rows into the target table.

  • UPDATE: Updates specified columns of the target table for matching rows. Both source and target fields can be used in the update expressions.
  • DELETE: Deletes the matching rows from the target table.
  • INSERT: Inserts new rows into the target table when no match is found, using values from the source table.

The ON clause defines the join condition between the target table and source expression, typically based on primary key or unique identifier columns. The MERGE operation evaluates the conditions in the order of the WHEN clauses, executing the first matching action and discarding subsequent clauses.

Examples:

MERGE INTO customers AS c
USING updates AS u
ON u.customer_id = c.customer_id
WHEN MATCHED AND c.status != 'inactive' THEN
  UPDATE SET c.name = u.name, c.email = u.email
WHEN MATCHED THEN DELETE
WHEN NOT MATCHED THEN
  INSERT (customer_id, name, email, status) VALUES (u.customer_id, u.name, u.email, 'active');

In this example, the MERGE operation updates customer information where IDs match and the customer is not inactive, deletes inactive customers, and inserts new customers from the source table if no match is found.

The MERGE statement is only supported for Iceberg tables.

For Iceberg tables, this operation generally uses a full outer join with the STRAIGHT_JOIN hint to combine the target and source datasets.