Scripts and Script Arguments

The upsert script file (e.g., pg_upsert.sql) contains the following five scripts that are defined with execsql’s SCRIPT metacommand:

  • fkqa_one
  • load_staging
  • nullqa_one
  • qa_all
  • staged_to_load
  • upsert_all
  • upsert_one

The scripts staged_to_load and load_staging are intended to be the primary user interface, and are expected to be the only scripts that are directly used in most cases. The load_staging script calls the other scripts to carry out QA checks and the upsert operation. The purpose of each script, and the arguments that each expects, are described in the following sections. Some scripts use different arguments for different DBMSs.

Script FKQA_ONE

This script checks all foreign keys for a single staging table. The table name and other specifications for reporting the results are passed as script arguments. This script does not use the control table, and so can be useful when only foreign key checks need to be performed on a single staging table.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
table:The name of the table to be updated or extended with new data. The table name must be the same in the base and staging schemas.
display_errors:A value of “Yes” or “No” to indicate whether invalid foreign key values should be displayed to the user in a GUI.

Required Output Parameters

error_list:The name of an execsql substitution variable to be set to a value identifying any foreign key errors that are found. The value, if set, will be a comma-separated list of foreign key names and the number of rows having an invalid value for each foreign key.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
table:The name of the table to be updated or extended with new data. The staging table must have this name with the prefix added.
display_errors:A value of “Yes” or “No” to indicate whether invalid foreign key values should be displayed to the user in a GUI.

Required Output Parameters

error_list:The name of an execsql substitution variable to be set to a value identifying any foreign key errors that are found. The value, if set, will be a comma-separated list of foreign key names and the number of rows having an invalid value for each foreign key.

Script LOAD_STAGING

This script is the primary high-level interface for updating a set of base tables from corresponding staging tables. It carries out null checks, foreign key checks, and update and insert operations on all tables according to the specifications in the control table.

The upsert step is only carried out if no errors were found in any of the staging tables. The upsert operation is carried out within a transaction, so if any errors occur, or the user cancels the operation, no changes will be made to any table.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
control_table:The name of the control table created by the script staged_to_load.
do_commit:A flag to indicate whether or not the upsert script should commit the changes. Should be ‘Yes’ or ‘No’.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
control_table:The name of the control table created by the script staged_to_load.
do_commit:A flag to indicate whether or not the upsert script should commit the changes. Should be ‘Yes’ or ‘No’.

Script NULLQA_ONE

This script checks for nulls in non-nullable columns of a single staging table. The identity of the staging and base tables are specified as script arguments. This script does not use the control table and so can be useful when only null checks are to be carried out on a single table.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
table:The name of the table to be updated or extended with new data. The table name must be the same in the base and staging schemas.

Optional Input Parameters

exclude_null_checks:
 A comma-separated list of single-quoted column names that are to be excluded from null checks.

Required Output Parameters

error_list:The name of an execsql substitution variable to receive information about any errors found. The information will consist of a comma- delimited list of non-nullable column names, each followed by the number of rows having a null value in that column.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
table:The name of the table to be updated or extended with new data. The staging table must have this name with the prefix added.

Optional Input Parameters

exclude_null_checks:
 A comma-separated list of single-quoted column names that are to be excluded from null checks.

Required Output Parameters

error_list:The name of an execsql substitution variable to receive information about any errors found. The information will consist of a comma- delimited list of non-nullable column names, each followed by the number of rows having a null value in that column.

Script PKQA_ONE

This script checks for duplicate primary key values in a single staging table. The table name and other specifications for reporting the results are passed as script arguments. This script does not use the control table, and so can be useful when only foreign key checks need to be performed on a single staging table.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
table:The name of the table to be updated or extended with new data. The table name must be the same in the base and staging schemas.
display_errors:A value of “Yes” or “No” to indicate whether duplicate primary key values should be displayed to the user in a GUI.

Required Output Parameters

error_list:The name of an execsql substitution variable to be set to a value identifying any duplicate primary key errors that are found. The value, if set, will be a list of the number of primary key values that are duplicated, and the number of duplicates for each.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
table:The name of the table to be updated or extended with new data. The staging table must have this name with the prefix added.
display_errors:A value of “Yes” or “No” to indicate whether duplicate primary key values should be displayed to the user in a GUI.

Required Output Parameters

error_list:The name of an execsql substitution variable to be set to a value identifying any duplicate primary key errors that are found. The value, if set, will be a list of the number of primary key values that are duplicated, and the number of duplicates for each.

Script QA_ALL

This script carries out all null and foreign key checks on all staging tables identified in the control table. The results of null and foreign key checks are recorded in the null_errors and fk_errors columns of the control table, respectively.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
control_table:The name of the control table created by the script staged_to_load.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
control_table:The name of the control table created by the script staged_to_load.

Script STAGED_TO_LOAD

This script initializes the control table with the names of the tables to have their data updated or extended, and also sets default values for settings that control details of the QA checks and upsert operations.

Required Input Parameters

control_table:The name of the control table that will be (re-)created by this script.
table_list:A comma-separated list of the names of the tables that are to be modified with data from the staging tables.

Script UPDTPK_ONE

This script will update the primary key values of a single base table using new values specified in a staging table. This is an analogue to the UPSERT_ONE script, except that it only performs an update of primary key columns and does not insert any rows.

The format of the staging table used for this script is different from that used for the other upsert scripts. The staging table must have:

  • A set of columns identical to the primary key columns of the base table to be updated, with the same names and types.
  • A second set of equivalent columns, but all the column names must be prefixed with “new_”.

The staging table does not have to have a primary key defined, but the UPDTPK_ONE script requires that all of the following requirements are met:

  • The base table has a primary key defined.
  • The staging table contains a column name prefixed with “new_” for every primary key column in the base table.
  • The staging table contains at least one row with an update specified.
  • If any of the “new_”-prefixed columns are populated on a row of the staging table, all of the “new_”-prefixed columns on that row are populated.
  • For any update to be made, the values of the primary key columns in the staging table exist in the base table.
  • None of the values of the “new_”-prefixed columns exist in the base table.
  • No two or more original primary key values in the staging table have the same new primary key value.
  • There are not multiple original primary key values in the staging table that all have the same new primary key value.
  • All foreign keys from new primary key values into other base tables are valid.

There is no equivalent to UPSERT_ALL, to update primary keys in multiple tables all in one step. QA checks for such a set of simultaneous updates are infeasible given the potential complexities that result from cascading updates.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
table:The name of the base table to have its primary key updated. The staging table should have the same name.
display_errors:A value of “Yes” or “No” to indicate whether any QA check failures should be displayed to the user in a GUI.
display_changes:
 A value of “Yes” or “No” to indicate whether the changes to be made to the base table should be displayed in a GUI.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
table:The name of the base table to have its primary key updated. The staging table should have the same name, but with the ‘stage_pfx’ prefix.
display_errors:A value of “Yes” or “No” to indicate whether any QA check failures should be displayed to the user in a GUI.
display_changes:
 A value of “Yes” or “No” to indicate whether the changes to be made to the base table should be displayed in a GUI.

Script UPSERT_ALL

This script the upsert operation for all data tables that are identified in the control table. This script will update the rows_updated and rows_inserted columns of the control table as appropriate.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
control_table:The name of the control table created by the script staged_to_load.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
control_table:The name of the control table created by the script staged_to_load.

Script UPSERT_ONE

This script creates and executes SQL UPDATE and INSERT statements to move data from a single staging table to the corresponding base table. This script does not use the control table and therefore can be useful when only the upsert operation is to be performed on a single table.

PostgreSQL and SQL Server

Required Input Parameters

base_schema:The name of the database schema for the base table.
staging:The name of the database schema for the staging table.
table:The name of the table to be updated or extended with new data. The table name must be the same in the base and staging schemas.
exclude_cols:A comma-delimited list of single-quoted column names identifying the columns of the base table that are not to have their data modified.
display_changes:
 A value of “Yes” or “No” indicating whether or not the changes to be made to the base table should be displayed in a GUI.
display_final:A value of “Yes” or “No” indicating whether or not the base table should be displayed in a GUI after all data revisions have been made.
updcntvar:The name of a substitution variable that will be set to the number of rows updated.
inscntvar:The name of a substitution variable that will be set to the number of rows inserted.

MariaDB and MySQL

Required Input Parameters

stage_pfx:A table name prefix that identifies a table as a staging table.
table:The name of the table to be updated or extended with new data. The staging table must have this name with the prefix added.
exclude_cols:A comma-delimited list of single-quoted column names identifying the columns of the base table that are not to have their data modified.
display_changes:
 A value of “Yes” or “No” indicating whether or not the changes to be made to the base table should be displayed in a GUI.
display_final:A value of “Yes” or “No” indicating whether or not the base table should be displayed in a GUI after all data revisions have been made.
updcntvar:The name of a substitution variable that will be set to the number of rows updated.
inscntvar:The name of a substitution variable that will be set to the number of rows inserted.