Scripts and Script Arguments
The upsert script file (e.g., pg_upsert.sql) contains the following
five scripts that are defined with execsql’s
SCRIPT
metacommand:
fkqa_one
load_staging
nullqa_one
qa_all
staged_to_load
upsert_all
upsert_one
The scripts staged_to_load
and load_staging
are intended to be
the primary user interface, and are expected to be the only scripts
that are directly used in most cases. The load_staging
script
calls the other scripts to carry out QA checks and the upsert
operation. The purpose of each script, and the arguments that each
expects, are described in the following sections. Some scripts
use different arguments for different DBMSs.
Script FKQA_ONE
This script checks all foreign keys for a single staging table. The table name
and other specifications for reporting the results are passed as script
arguments. This script does not use the control table,
and so can be useful when only foreign key checks need to be performed
on a single staging table.
PostgreSQL and SQL Server
Required Output Parameters
error_list: | The name of an execsql substitution variable to be set to a value
identifying any foreign key errors that are found. The value, if set,
will be a comma-separated list of foreign key names and the number
of rows having an invalid value for each foreign key. |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
table: | The name of the table to be updated or extended with new data.
The staging table must have this name with the prefix added. |
display_errors: | A value of “Yes” or “No” to indicate whether invalid foreign key
values should be displayed to the user in a GUI. |
Required Output Parameters
error_list: | The name of an execsql substitution variable to be set to a value
identifying any foreign key errors that are found. The value, if set,
will be a comma-separated list of foreign key names and the number
of rows having an invalid value for each foreign key. |
Script LOAD_STAGING
This script is the primary high-level interface for updating a set
of base tables from corresponding staging tables. It carries out
null checks, foreign key checks, and update and insert operations
on all tables according to the specifications in the
control table.
The upsert step is only carried out if no errors were found in
any of the staging tables. The upsert operation is carried out
within a transaction, so if any errors occur, or the user cancels
the operation, no changes will be made to any table.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
control_table: | The name of the control table created by the script
staged_to_load . |
do_commit: | A flag to indicate whether or not the upsert script
should commit the changes. Should be ‘Yes’ or ‘No’. |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
control_table: | The name of the control table created by the script
staged_to_load . |
do_commit: | A flag to indicate whether or not the upsert script
should commit the changes. Should be ‘Yes’ or ‘No’. |
Script NULLQA_ONE
This script checks for nulls in non-nullable columns of a single
staging table. The identity of the staging and base tables are
specified as script arguments. This script does not use the
control table and so can be useful when
only null checks are to be carried out on a single table.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
table: | The name of the table to be updated or extended with new data.
The table name must be the same in the base and staging
schemas. |
Required Output Parameters
error_list: | The name of an execsql substitution variable to receive information
about any errors found. The information will consist of a comma-
delimited list of non-nullable column names, each followed by the
number of rows having a null value in that column. |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
table: | The name of the table to be updated or extended with new data.
The staging table must have this name with the prefix added. |
Optional Input Parameters
exclude_null_checks: |
| A comma-separated list of single-quoted column names that are
to be excluded from null checks. |
Required Output Parameters
error_list: | The name of an execsql substitution variable to receive information
about any errors found. The information will consist of a comma-
delimited list of non-nullable column names, each followed by the
number of rows having a null value in that column. |
Script PKQA_ONE
This script checks for duplicate primary key values in a single
staging table. The table name and other specifications for reporting
the results are passed as script arguments. This script does not
use the control table, and so can be useful
when only foreign key checks need to be performed on a single staging
table.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
table: | The name of the table to be updated or extended with new data. The table
name must be the same in the base and staging schemas. |
display_errors: | A value of “Yes” or “No” to indicate whether duplicate
primary key values should be displayed to the user in a GUI. |
Required Output Parameters
error_list: | The name of an execsql substitution variable to be set to a value
identifying any duplicate primary key errors that are found.
The value, if set, will be a list of the number of primary key
values that are duplicated, and the number of duplicates for each. |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
table: | The name of the table to be updated or extended with new data.
The staging table must have this name with the prefix added. |
display_errors: | A value of “Yes” or “No” to indicate whether duplicate primary key
values should be displayed to the user in a GUI. |
Required Output Parameters
error_list: | The name of an execsql substitution variable to be set to a value
identifying any duplicate primary key errors that are found.
The value, if set, will be a list of the number of primary key
values that are duplicated, and the number of duplicates for each. |
Script QA_ALL
This script carries out all null and foreign key checks on all
staging tables identified in the control table.
The results of null and foreign key checks are recorded in the
null_errors and fk_errors columns of the control table,
respectively.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
control_table: | The name of the control table created by the script
staged_to_load . |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
control_table: | The name of the control table created by the script
staged_to_load . |
Script STAGED_TO_LOAD
This script initializes the control table
with the names of the tables to have their data updated or
extended, and also sets default values for settings that
control details of the QA checks and upsert operations.
Required Input Parameters
control_table: | The name of the control table that will be (re-)created
by this script. |
table_list: | A comma-separated list of the names of the tables that
are to be modified with data from the staging tables. |
Script UPDTPK_ONE
This script will update the primary key values of a single base
table using new values specified in a staging table. This is an
analogue to the UPSERT_ONE script, except that it only performs
an update of primary key columns and does not insert any rows.
The format of the staging table used for this script is different
from that used for the other upsert scripts. The staging table
must have:
- A set of columns identical to the primary key columns of the
base table to be updated, with the same names and types.
- A second set of equivalent columns, but all the column names
must be prefixed with “new_”.
The staging table does not have to have a primary key defined,
but the UPDTPK_ONE script requires that all of the following
requirements are met:
- The base table has a primary key defined.
- The staging table contains a column name prefixed with “new_”
for every primary key column in the base table.
- The staging table contains at least one row with an update
specified.
- If any of the “new_”-prefixed columns are populated on a row
of the staging table, all of the “new_”-prefixed columns on
that row are populated.
- For any update to be made, the values of the primary key
columns in the staging table exist in the base table.
- None of the values of the “new_”-prefixed columns exist in
the base table.
- No two or more original primary key values in the staging
table have the same new primary key value.
- There are not multiple original primary key values in the
staging table that all have the same new primary key value.
- All foreign keys from new primary key values into other
base tables are valid.
There is no equivalent to UPSERT_ALL, to update primary keys
in multiple tables all in one step. QA checks for such a set
of simultaneous updates are infeasible given the potential
complexities that result from cascading updates.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
table: | The name of the base table to have its primary key updated.
The staging table should have the same name. |
display_errors: | A value of “Yes” or “No” to indicate whether any QA check
failures should be displayed to the user in a GUI. |
display_changes: |
| A value of “Yes” or “No” to indicate whether the changes
to be made to the base table should be displayed in a GUI. |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
table: | The name of the base table to have its primary key updated.
The staging table should have the same name, but with the
‘stage_pfx’ prefix. |
display_errors: | A value of “Yes” or “No” to indicate whether any QA check
failures should be displayed to the user in a GUI. |
display_changes: |
| A value of “Yes” or “No” to indicate whether the changes
to be made to the base table should be displayed in a GUI. |
Script UPSERT_ALL
This script the upsert operation for all data tables that are
identified in the control table. This
script will update the rows_updated and rows_inserted
columns of the control table as appropriate.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
control_table: | The name of the control table created by the script
staged_to_load . |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
control_table: | The name of the control table created by the script
staged_to_load . |
Script UPSERT_ONE
This script creates and executes SQL UPDATE and INSERT statements
to move data from a single staging table to the corresponding
base table. This script does not use the
control table and therefore can be
useful when only the upsert operation is to be performed on a
single table.
PostgreSQL and SQL Server
Required Input Parameters
base_schema: | The name of the database schema for the base table. |
staging: | The name of the database schema for the staging table. |
table: | The name of the table to be updated or extended with new data.
The table name must be the same in the base and staging
schemas. |
exclude_cols: | A comma-delimited list of single-quoted column names
identifying the columns of the base table that are not
to have their data modified. |
display_changes: |
| A value of “Yes” or “No” indicating whether or not the
changes to be made to the base table should be displayed
in a GUI. |
display_final: | A value of “Yes” or “No” indicating whether or not the
base table should be displayed in a GUI after all data
revisions have been made. |
updcntvar: | The name of a substitution variable that will be set
to the number of rows updated. |
inscntvar: | The name of a substitution variable that will be set
to the number of rows inserted. |
MariaDB and MySQL
Required Input Parameters
stage_pfx: | A table name prefix that identifies a table as a staging table. |
table: | The name of the table to be updated or extended with new data.
The staging table must have this name with the prefix added. |
exclude_cols: | A comma-delimited list of single-quoted column names
identifying the columns of the base table that are not
to have their data modified. |
display_changes: |
| A value of “Yes” or “No” indicating whether or not the
changes to be made to the base table should be displayed
in a GUI. |
display_final: | A value of “Yes” or “No” indicating whether or not the
base table should be displayed in a GUI after all data
revisions have been made. |
updcntvar: | The name of a substitution variable that will be set
to the number of rows updated. |
inscntvar: | The name of a substitution variable that will be set
to the number of rows inserted. |