Status Messages

When the load_staging script is running, it will write progress and status messages to the terminal or console. These messages provide the following information:

  • The name of each table on which non-null checks are conducted, as the checks are initiated.
  • The name of each non-nullable column that contains nulls, and the number of rows with nulls in that column.
  • The name of each table on which primary key uniqueness checks are conducted, as the check is initiated.
  • The number of duplicate primary key values in each table.
  • The name of each table on which foreign key checks are conducted, as the checks are initiated.
  • The name of each table referenced in an invalid foreign key.

These messages will appear on the console as shown in the following example:

Conducting non-null QA checks on table staging.d_study
Conducting non-null QA checks on table staging.d_location
Conducting non-null QA checks on table staging.d_sample
Conducting non-null QA checks on table staging.d_samploc
Conducting non-null QA checks on table staging.d_sampsplit
Conducting non-null QA checks on table staging.d_sampatt
    Column attribute has 1 nulls.
Conducting primary key QA checks on table staging.d_study
Conducting primary key QA checks on table staging.d_location
Conducting primary key QA checks on table staging.d_sample
Conducting primary key QA checks on table staging.d_samploc
Conducting primary key QA checks on table staging.d_sampsplit
    Duplicate key error on columns: study_id, sample_id, sample_no.
Conducting primary key QA checks on table staging.d_sampatt
Conducting foreign key QA checks on table staging.d_study
Conducting foreign key QA checks on table staging.d_location
Conducting foreign key QA checks on table staging.d_sample
Conducting foreign key QA checks on table staging.d_samploc
Conducting foreign key QA checks on table staging.d_sampsplit
Conducting foreign key QA checks on table staging.d_sampatt
    Foreign key error referencing e_sampatt.
    Foreign key error referencing d_sample.

Additional detail about QA failures may be written to the logfile.

Prompts

If the “display_changes” value is set to “Yes” in the control table, then both the results of QA errors and the data to be updated or added will be displayed interactively.

The results of a QA error check failure (in this case, a foreign key check) will be displayed like this:

_images/fk_error.png

If the ‘Cancel’ action is chosen from this display, the entire upsert script will halt, so that no further checks will be done and no data will be added or modified.

Data to be updated in the database will be displayed like this:

_images/update_prompt.png

Data to be inserted will be displayed like this:

_images/insert_prompt.png

If the ‘No’ action is chosen from either of the displays of data to be updated or inserted, then that specific update or insert operation will not be performed, but the upsert script will continue to run. Skipping an update or insert operation, however, may cause an integrity error to be reported by the DBMS when data in another table are subsequently added or modified.

Logfile Messages

A custom logfile will be written by the upsert script if specified by the following global variables:

  • logfile: The name of the logfile to use.
  • log_errors: A value of “Yes” or “No” (without quotes) that controls whether details of QA check failures are included in the logfile. A large amount of output may be generated as a result of setting this to “Yes”.
  • log_changes: A value of “Yes” or “No” (without quotes) that controls whether the new data added to each table by the upsert operation is written to the logfile. A large amount of output may be generated as a result of setting this to “Yes”.

If a large amount of information is written to the logfile as a result of logging either errors or changes, the performance of the overall script may be noticeably affected. Logging errors can provide additional detail that can help resolve those errors, and logging changes can provide a record of database changes that may be permanently retained.

QA Check Results

The custom logfile will contain information describing each QA check that is performed on each staging table. If any of the QA checks fails, details of that failure will be written to the logfile. The type of information that is written depends on the type of QA check that is conducted. More information about QA check results is written to the logfile than is displayed on the console. The logfile is therefore a more useful resource for identifying exactly what errors in the staging tables need to be corrected.

The following illustration shows the type of messages written to the logfile, including cases where the QA checks fail. For the sake of brevity, this example does not include the SQL that may optionally also be written to the logfile:

==================================================================
2019-03-24 10:00 -- Non-null QA checks on table staging.d_study
Checking column study_id.
Checking column study_name.

==================================================================
2019-03-24 10:00 -- Non-null QA checks on table staging.d_location
Checking column location.

==================================================================
2019-03-24 10:00 -- Non-null QA checks on table staging.d_sample
Checking column study_id.
Checking column sample_id.

==================================================================
2019-03-24 10:01 -- Non-null QA checks on table staging.d_samploc
Checking column study_id.
Checking column sample_id.
Checking column location.

==================================================================
2019-03-24 10:01 -- Non-null QA checks on table staging.d_sampsplit
Checking column study_id.
Checking column sample_id.
Checking column sample_no.

==================================================================
2019-03-24 10:01 -- Non-null QA checks on table staging.d_sampatt
Checking column study_id.
Checking column sample_id.
Checking column attribute.
    Column attribute has 1 nulls.
Checking column value.

==================================================================
2019-03-24 10:01 -- Primary key QA checks on table staging.d_study
Checking constraint pk_study.

==================================================================
2019-03-24 10:01 -- Primary key QA checks on table staging.d_location
Checking constraint pk_location.

==================================================================
2019-03-24 10:01 -- Primary key QA checks on table staging.d_sample
Checking constraint pk_sample.

==================================================================
2019-03-24 10:01 -- Primary key QA checks on table staging.d_samploc
Checking constraint pk_samploc.

==================================================================
2019-03-24 10:01 -- Primary key QA checks on table staging.d_sampsplit
Checking constraint pk_sampsplit.
Duplicate primary key values in staging.d_sampsplit
  study_id | sample_id | sample_no | row_count
 ----------|-----------|-----------|-----------
  C        | W004      | W004-DUP  | 2

==================================================================
2019-03-24 10:01 -- Primary key QA checks on table staging.d_sampatt
Checking constraint pk_sampatt.

==================================================================
2019-03-24 10:01 -- Foreign key QA checks on table staging.d_study

==================================================================
2019-03-24 10:01 -- Foreign key QA checks on table staging.d_location

==================================================================
2019-03-24 10:01 -- Foreign key QA checks on table staging.d_sample
Checking constraint fk_sampmassunit for table test.d_sample.
Checking constraint fk_parent for table test.d_sample.
Checking constraint fk_sampvolunit for table test.d_sample.

==================================================================
2019-03-24 10:01 -- Foreign key QA checks on table staging.d_samploc
Checking constraint fk_samploc_loc for table test.d_samploc.
Checking constraint fk_samploc_samp for table test.d_samploc.

==================================================================
2019-03-24 10:01 -- Foreign key QA checks on table staging.d_sampsplit
Checking constraint fk_parent for table test.d_sampsplit.

==================================================================
2019-03-24 10:01 -- Foreign key QA checks on table staging.d_sampatt
Checking constraint fk_sampattatt for table test.d_sampatt.
 Foreign key errors in d_sampatt referencing e_sampatt
  attribute | row_count
 -----------|-----------
  colour    | 3
  sheeen    | 1
Checking constraint fk_parent for table test.d_sampatt.
 Foreign key errors in d_sampatt referencing d_sample
  study_id | sample_id | row_count
 ----------|-----------|-----------
  A        | S02       | 1

Upsert Results

As the upsert operation is performed on each base table, information will be written to the custom logfile that records each step. This information may include the data values that are updated or inserted. When data are displayed for updates, the values that are displayed are the new data, not the pre-existing data in the base tables.

The following illustration shows the type of messages written to the logfile during the upsert operation. For the sake of brevity, this example does not include the SQL that may optionally also be written to the logfile:

==================================================================
2019-03-24 06:03 -- Performing upsert on table test.d_location

Updates:
  location | description
 ----------|---------------
  WC-017   | WC station 17

1 rows of test.d_location updated.

New data:
  location | description
 ----------|---------------
  GC-7451  | Station WT NE
  HR-2943  | Station WT NW
  HR-2491  | Station WT S

3 rows added to test.d_location.

==================================================================
2019-03-24 06:03 -- Performing upsert on table test.d_study

New data:
  study_id | study_name
 ----------|------------
  C        | Study C

1 rows added to test.d_study.

==================================================================
2019-03-24 06:03 -- Performing upsert on table test.d_sample

New data:
  study_id | sample_id | volume | volume_unit | mass | mass_unit
 ----------|-----------|--------|-------------|------|-----------
  A        | S003      |        |             | 1.1  | kg
  C        | W004      | 250.0  | mL          |      |
  B        | W003      | 1.0    | L           |      |

3 rows added to test.d_sample.

==================================================================
2019-03-24 06:03 -- Performing upsert on table test.d_samploc

New data:
  study_id | sample_id | location
 ----------|-----------|----------
  A        | S003      | GC-7451
  B        | W002      | WC-013c
  C        | W004      | WC-017

3 rows added to test.d_samploc.

==================================================================
2019-03-24 06:03 -- Performing upsert on table test.d_sampsplit

==================================================================
2019-03-24 06:04 -- Performing upsert on table test.d_sampatt

Updates:
  study_id | sample_id | attribute | value
 ----------|-----------|-----------|----------
  A        | S001      | odor      | Gasoline

1 rows of test.d_sampatt updated.

New data:
  study_id | sample_id | attribute | value
 ----------|-----------|-----------|------------
  A        | S002      | color     | Gray
  A        | S002      | odor      | None
  A        | S003      | color     | Blue-green
  B        | W001      | odor      | Petroleum
  B        | W001      | sheen     | trace
  B        | W003      | color     | Black
  C        | W004      | color     | Clear
  C        | W004      | odor      | None

8 rows added to test.d_sampatt.

==================================================================
2019-03-24 06:04 -- CHANGES COMMITTED.