Replace a table with modified data and create a new snapshot

Usage

replace_table(.data, table_name, .quiet = TRUE)

Arguments

.data: A dplyr query object (tbl_lazy) with transformations
table_name: Table name to replace
.quiet: Logical, whether to suppress messages (default TRUE)

Value

Invisibly returns NULL

Details

This function is designed for schema changes or bulk transformations that should create a new versioned snapshot. It:

Collects the transformed data
Drops the existing table
Creates a new table with the updated schema/data

All operations happen within the current transaction context. Use begin_transaction() and commit_transaction() to ensure proper versioning.

When to use replace_table():

Adding new columns - DuckLake UPDATE cannot add columns; use replace_table()
Removing columns - Restructure schema with select()
Versioning needed - Creates snapshots via DROP + CREATE for time travel
Complex transformations - Apply full dplyr pipelines naturally

When to use update_table() instead:

Modifying existing column values only (no schema changes)
Performance critical and versioning not needed
Making targeted corrections to specific rows

Examples

if (FALSE) { # \dontrun{
# Add new derived columns with versioning
begin_transaction()
get_ducklake_table("adsl") |>
  mutate(
    AGE65FL = if_else(AGE >= 65, "Y", "N"),
    AGECAT = case_when(
      AGE < 65 ~ "<65",
      AGE >= 65 & AGE < 75 ~ "65-74",
      AGE >= 75 ~ ">=75"
    )
  ) |>
  replace_table("adsl")
commit_transaction()

# Remove columns and create new snapshot
begin_transaction()
get_ducklake_table("adsl") |>
  select(-AGE65FL, -AGECAT) |>
  replace_table("adsl")
commit_transaction()
} # }