Dimensions are often sourced from one table in the base application. In many cases, some codes require description lookups to complete the de-normalization of the dimensional data.
The process for building a dimension is the same for most other tables and begins with the drag and drop of the Load table that contains the bulk of the dimensional information.
A dialog appears as shown below. There are four choices for the default generation of the Dimension table and its update procedure. 
For custom database targets, you can set the initial default values for the DSS columns that are used in the procedure generation for slowly changing dimensions. Refer to DSS Columns for Custom Targets for details. |
With any dimension, we identify a Business Key that uniquely identifies the dimension records.
For example in the case of the product dimension, the product code is deemed to be the business key. The code uniquely identifies each product within the dimension. The product may also have a name or description and various other attributes that distinguish it (e.g. size, shape, color, etc.).
A common question when handling dimensions is what to do when the name or description changes:
An example:
Code | Description | product_group | sub_group |
|---|---|---|---|
1235 | 15oz can of brussels sprouts | canned goods | sprouts |
This product has been sold for many years and we consequently have a very good history of sales and the product's performance in the market. The company does a '20% extra for free' promotion for 3 months during which time it increases the size of the can to 18oz. The description is also changed to be '15 + 3oz can of brussel sprouts'. At the end of the promotion, the product is reverted to its original size and the description changed back to its original name.
The question is, do we want to track the sales of the product when it had a different description (slowly changing), or should the description of the product simply change to reflect its current name (normal). For this scenario, a previous value dimension would not provide much advantage, so it is not discussed.
The decision is not a simple one, and the advantages and disadvantages of each choice are discussed below.
As mentioned above, the choice is never a simple one. Even among experienced data warehouse practitioners, there will be a variety of opinions. The decision must be based on the business requirements. In many cases keeping the analysis simple is the best choice, at least in the early stages of a data warehouse development. Slowly changing dimensions do have a place, but there is nearly always an alternate method that provides equal or better results. In the example above, a promotion dimension coupled with the product dimension could provide the same analysis results while still keeping product-only analysis simple and easy to understand.
| Do not overcomplicate the design of an analysis area. Keep it simple and avoid the unnecessary use of slowly changing dimensions. |
If you selected a 'Previous values' dimension type, the following questions will be asked. If one of the other two types were chosen then proceed directly to Dimension Properties.
The following dialog appears:
It requests the definition of each column that will be managed by storing the previous value in a separate additional column. Note that the business key cannot be used in this manner, as a change to the business key will result in a new record.
Select the columns to be managed and click OK.
The following dialog appears for each column that has been selected.
Either accept the name of the additional column or enter a new name.
Once the dimension type is chosen, the Properties screen appears. Change the storage options if required. If prototyping and the dimension are simple (i.e. one source table) then it is possible to create, load, and update the dimension in a couple of steps. If you wish to do this select the (Build Procedure...) option from the Update Procedure drop-down, and answer Create and Load to the next question.
The following window appears after the Properties screen and asks if you want to create the dimension table in the database and execute the update procedure.
If you are satisfied with the columns that will be used and do not wish to add additional columns you can click the Create and Load button.
In this case, it would be normal practice to select the (Build Procedure...) option from the 'Update Procedure' drop-down box in the previous Properties window.
If no procedure has been selected then select the Create button as no update will happen in any event.
If Create or Create and Load is selected and a new procedure creation is chosen, proceed directly to the Generating the Dimension Update Procedure section.
If you have additional columns to add or columns to delete, click Close and proceed as follows below.
It is possible to create and load the table via the Scheduler by selecting this option from the drop-down list on the Create and Load button: |
The columns defined for the dimension are displayed in the middle pane.
It is possible to delete any unwanted columns by highlighting a column name or a group of names and choosing the Delete key.
The name of a column can also be changed by selecting the column and using the right-click menu to edit its properties. Any new name must conform to the database naming standards. Good practice is to use alphanumeric and the underscore character.
Refer to the Column Properties for a more detailed description of the various fields.
When prototyping, and in the initial stages of an analysis area build it is best not to remove columns, nor to change their names to any great extent. This type of activity is best left until after end users have used the data and provided feedback. |
With the columns of the dimension table displayed in the middle pane, this pane is considered a drop target for additional columns.
It is a simple matter therefore to select columns from other tables and to drag these columns into the middle pane.
The source table shows where each column was dragged from.
In the example above, the description column is acquired from the load_product, load_prod_group, and load_prod_subgroup tables. To create the dimension table, these columns must be assigned unique names. For this example, the last two columns have been renamed to group_description and subgroup_description.
Several columns do not have a source table. These columns have been added by WhereScape RED and are added depending on earlier choices.
A description of these columns follows:
Column Name | Description |
|---|---|
dim_product_key | The unique identifier (artificial key) for the dimension. This key is used in the joins to the fact table. |
dss_start_date | Used for slowly changing dimensions. This column provides a date and time stamp when the dimension record came into existence. It is used to ascertain which dimension record should be used when multiple are available. |
dss_end_date | Used for slowly changing dimensions. This column provides a date time stamp when the dimension record ceased to be the current record. It is used to ascertain which dimension record should be used when multiple are available. |
dss_current_flag | Used for slowly changing dimensions. This flag identifies the current record where multiple versions exist. |
dss_source_system_key | Added to support dimensions that cannot be fully conformed and the inclusion of subsequent source systems. See the ancillary settings section for more details. |
dss_version | Used for slowly changing dimensions. This column contains the version number of a dimension record. Numbered from 1 upwards with the highest number being the latest or current version. It forms part of the unique constraint for the business key of a slowly changing dimension. |
dss_update_time | Indicates when the record was last updated in the data warehouse. |
dss_create_time | Indicates when the record was first created in the data warehouse |
If a Previous Value type of dimension is chosen, or, if the dimension is converted to this type, it is possible to manually add any required columns that were not defined as part of the create. The steps are:
Once the dimension has been defined in the metadata we need to physically create the table in the database.
SQL Server makes use of an identity attribute on the artificial key column.
The next section covers the Generating the Dimension Update Procedure.