Once a dimension has been defined in the metadata and created in the database, an update procedure can be generated to handle the joining of any tables and the update of the dimension records.

The Zero Key Row

WhereScape RED by default, inserts a record into the dimension with an artificial key value of zero. This record is used to link any fact records that do not have valid dimension joins. 
The values of the various columns in this record are acquired from the contents of the field Zero Key Value which is set in the Properties screen of each dimension column. 

Generating a Procedure

  1. To generate a procedure, right-click on the dimension name to edit the properties for the dimension.
  2. From the Update Procedure drop-down list select (Build Procedure...).
  3. Click OK to update the properties and start the process of generating the new procedure.
  4. A Procedure Build Type dialog appears, allowing you to select between a Cursor and Set procedure build types from the drop-down list.

A Set based procedure performs one SQL statement to join all the source tables together and then insert this data into the Dimension. This is normally the fastest method of populating a table.
See below in this section for different options using the Cursor procedure building type.

Set-Based Procedure Building types

  1. Select Set from the drop-down menu.
  2. The following window appears to enter the Update Procedure Build Options.
  3. Click on the ellipsis button on the top rightmost corner to select the Business Key Columns for the dimension.

Processing tab

Business Key

A business key can be made up of multiple columns, but it must provide a unique identifier. Where multiple columns uniquely and separately identify the dimension, choose one to act as the primary business key.
For example, a source table may have a unique constraint on both a product code and a product description. Therefore, the description as well as the code must be unique. It is of course possible to combine the two columns, but the normal practice would be to choose the code as the business key.

  • Use the column name ascending/descending buttons to sort column names. To revert to the meta column order, click on the meta column order button.
  • NULL Values - none of the columns chosen as the business key should ever contain a NULL value. See the note at the start of the Dimensions chapter.

Parameters

If WhereScape RED parameters exist in the metadata, the following window is displayed. Any parameters selected in this window (by moving them to the right pane), are included in the generated update procedure as variables.
The procedure includes code to retrieve the value of the parameter at run time and store it in the declared variable.

The variables can also be used in column transformations and in the from/where clause for the update procedure. Some databases have a 30-character limit for variable names. WhereScape RED ensures the variables added for any parameters are less than 30 characters long by creating variable names in the form v_ followed by the first 28 characters of the parameter name.
For example, a parameter called MINIMUM_ORDER_NUMBER_SINCE_LAST_SOURCE_LOAD will be available as the variable v_MINIMUM_ORDER_NUMBER_SINCE_L.

  • WhereScape RED parameters should be unique within the first 28 characters to avoid conflicting variables names.
  • If the desired parameter doesn't exist in the metadata yet, a new parameter can be added by clicking on the Add New button on the bottom leftmost corner of the Select Parameters dialog.


Refer to Parameters for more information about WhereScape RED Parameters.

Fields

Description

Insert Zero Key Record

This option enables you to include the zero key (an unknown record) for tables with an artificial key. The default for this field is set.

Include Initial Load insert

This option adds insert statement to the update procedure that runs if the target Dimension Object is empty. The benefit of this is improved performance inserting into an empty table without performing any checks to see if rows already exist. The default for this field is FALSE (i.e. an initial insert statement is not added to the procedure).

Process by Batch

This field enables the user to select a column to break up the data being processed in a loop based on the distinct values in the field. The update procedure loops on this field and performs the delete, update and/or insert for each value. If the field chosen is a date datatype (date, datetime or timestamp), then the user can specify yearly, monthly, daily or column-level looping. The default for this field is False (do not do batch processing).

Batch Processing Field

Enables selecting a field to batch process on. If you select a date field you will have the ability to process by date part. If you select a join field to process by you can choose and attribute of that related table to group by.

Include Explicit Lock

Enables locking of the table to avoid concurrent transactions. Double click the Lock Clause field to open a window that enables you to create the lock clause to insert.

Delete before insert

This option enables a delete statement to be added to the update procedure before any update or insert statement. This is a particularly useful option for purging old data and for updates based on a source system batch number. The default for this field is No which automatically grays out the Issue Warning if a Delete Occurs and the Delete Where Clause fields.

  • Issue Warning if a Delete occurs - sets the procedure to a warning state if deletes occur.
  • Delete Where Clause - The delete where clause is appended to the generated delete statement to constrain the rows deleted.

Process Method

Enables selecting whether the table should be updated using an Insert/Update statement or a Merge statement.

Source Table Locking

Enables a locking request modifier to be specified for each source table. The specified locking request modifier is applied to each source table during generated update procedures. By default this is set to 'ACCESS' which locks each row being accessed, a blank entry will result in no locking clause in the generated procedure. This option may also be presented in a separate dialog

Include Insert Statement

The Include Insert Statement option includes an insert statement in the procedure to insert new rows in the Dimension. If this option is set, the Insert New Rows Only option is available. If this option is turned off, the update procedure will not contain an insert statement. The default for this field is set (i.e. an insert statement is included).

Insert New Rows only

The insert new rows only option uses change detection to work out what rows require inserting.

  • New Row Identification Method - Method used to identify that the records in the source are not currently recorded in the target table.

Insert Hint

Enter a database hint to be used in the INSERT statement. This is SQL Server only option. Defaults can be configured in Tools > Options > Default Update Procedure Options.

Insert New Rows Only

Uses change detection to work out what rows require inserting.

New Row Identification Method

Method used to identify that records in the source are not currently recorded in the target table.

Existing Data Selection Hint

Database-compliant hint to be used for the existing data select statement.

Include Update Statement

Includes an update statement in the procedure to update changing rows in the Data Store. If this option is chosen, then the Update Changed rows only option is available.

Include Hint

Enter a database hint to be used in the INCLUDE statement. This is a SQL Server only option. Defaults can be configured in Tools > Options > Default Update Procedure Options.

Update Changed Rows Only

Uses change detection to work out what rows require updating. Choosing this option, enables the Change Row identification Method.

Change Row Identification Method

Method used to identify that records in source have changed from what is currently recorded in the target table.

Existing Data Selection Hint

Database-compliant hint to be used for the existing data select statement.

Merge Hint

Enter a database hint to be used in the MERGE statement. SQL Server only option. Defaults can be configured in Tools > Options > Default Update Procedure Options.

Source tab

If only one source table was used, adding the Business Key and checking the above fields is enough to proceed, otherwise, use the Source tab to Join the relevant tables.

Fields

Description

Distinct Data Select

Ensures duplicate rows are not added to the Dimension. This is achieved by the word DISTINCT being added to the source selected in the update procedure. The default for this field is not set.

Source Join

The From clause, including Source Join information. See the example below for Joining multiple source tables.

Where Clause

The Where Clause. Use as a filter to extract only the necessary records that fulfill a specified criterion.

Group By

The Group by clause. Use in collaboration with the SELECT statement to arrange identical data into groups.

Joining multiple source tables

WhereScape RED provides a wizard that enables you to graphically generate the join clause for an object with multiple source tables. The Source Join wizard enables you to specify the primary (driving) table and the source tables to join. A graphical representation of the tables and the join criteria is displayed by the wizard which you can manipulate to change the defined joins.

 
The Source Join wizard guides you in generating the Join Clause through the following steps:

  1. Choose Primary Table
    The first source table encountered is the default primary (driving) table which can be changed before any joins are defined.
  2. Choose Tables to Join
    Once the primary table is set, the remaining source tables are listed in this drop-down. Select a table to join and then click Add. The two tables are added to pane 4 and the basis for the join is displayed in pane 3.
  3. Defined Joins
    The join type defaults to Inner join but this can be changed from the drop down.
  4. Join criteria of selected join

    The column(s) used to join the two tables can be joined manually by clicking the corresponding join column connection point from the left-hand table and then dragging the line that appears to the connection point of the required column in the table on the right.

    Alternatively, if a column with a matching or similar name exists in the left-hand table(s) then a context menu is provided on the columns on the right-hand table. You can right-click on the column to display the context menu and then select a column from the list.

    To aid manual column join, you can expand or collapse columns in the left-hand tables and toggle between natural and alphabetic column ordering.

  5. Once the join column(s) for the table pair are defined, then any additional source tables can be added from the drop-down list in step 2 and the above column mapping process repeats to join this table. This process is repeated until all source tables are joined.
  6. Clicking the Next button or the Join query tab progresses to the Join editor window, where custom joins, database functions, etc. can be added.


If RED is unable to parse a joint statement, the user cannot navigate back to the Define joins tab, and the following warning is displayed:


Change Detection tab

For Slowly Changing Dimension columns. If the dimension was defined as a Slowly Changing Dimension, the Change Detection tab is displayed in the table Update Build Options screen.

For custom database targets, you can set the initial default values for the DSS columns that are used in the procedure generation for slowly changing dimensions. Please refer to the section DSS Columns for Custom Targets for details. The default values set are populated in the corresponding fields of the Change Detection tab.

A change detection field(s) must be selected for this Dimension type.

Fields

Description

Change Detection Fields

Click the ellipsis button to select the change detection fields that are required for the dimension

Null Support

If the Add Null support is selected, the change detect column management caters to null values in any change detect columns. If this option is not selected and null values are present, errors may occur while running the update procedure. The default for this value is ON (i.e. null values are catered for).

Update Current Records Only

The update current record-only option only applies changes to non-change detect columns on the current record. If this option is not selected, all past (non-current) rows will also be updated to reflect changes to non-change detect columns. The default for this value is ON (i.e. only the current record is updated).

Reset Dates to Initial Values

Resets dss_start date and dss_end_date date values to original values.

Start Date for Initial Member

The start date for the initial member field contains the start date for the first version of a particular business key. The value should be specified in an appropriate form, taking into account the default date format in the databases. The date may need to be explicitly cast to the current data type. The default value provided is usually cast to the correct database and can be treated as a template. The default for this field is 1 January 1900.

End Date for Current Member

The end date for the current member field contains the end date for the current version (the row with a current flag of Y and the maximum version number) of a particular business key. The value must be specified in an appropriate form, taking into account the default date format in the databases. The date may need to be explicitly cast to the current data type. The default value provided is usually cast to the correct database and can be treated as a template. The default for this field is 31 December 2999.

Start Date for New Member Entry

The start date for the new member entry field contains the start date for any subsequent rows added to the history table (not the first row for a particular business key i.e. not version 1). The value must be specified in an appropriate form, taking into account the default date format in the databases. The date may need to be explicitly cast to the current data type. The default value provided is usually cast to the correct database and can be treated as a template. The default for this field is the current date and time.

End Date for Expiring Member Entry

The end date for the expiring member entry field contains the end date for any rows updated no longer to no longer be the current row in the history table (i.e. rows that are replaced by a new current row). The value must be specified in an appropriate form, taking into account the default date format in the databases. The date may need to be explicitly cast to the current data type. The default value provided is usually cast to the correct database and can be treated as a template. The default for this field is the current date and timeless an arbitrarily small amount (for SQL Server this is 0.00000005 of a day, or about 4 thousandths of a second).


  • No labels