As with all other load types, it is easier to use the drag-and-drop functionality to create load tables.
The drag-and-drop process for flat files is as follows:

  1. Browse to the directory and file via the appropriate Connections.
  2. Double-click on the Load Table object in the left pane to create a drop target.
  3. Drag the file from the right pane and drop it into the middle pane. 
  4. The dialog below appears. Rename the file if necessary and then click the ADD button.

    The option Add metadata columns to table is used for creating Load tables that are used in creating Data Vault objects. If this option is selected, two DSS columns (dss_record_source and dss_load_date) are included in the metadata for the table and are populated by transformations. These two DSS columns could equally be applied to other Load tables not used in a Data Vault system, but they are particularly important for complying with Data Vault standards. Refer to Data Vaults for details.

  5. The following dialog displays for file loads.

    • First Row is a Header - To change this setting after a relational database table has been defined, edit the First Row is a Header option in the Source tab of the relevant table.
    • Hive - For file loads into Hive tables, First Row is a Header is a table option. Ensure this option is selected in the file load wizard if you want to load files with header rows. To change this setting after a Hive table has been defined, edit the First Row is a Header option in the Storage tab of the relevant Hive table.

  6. The load type selected in the New Table Default Load Type field of the connection Properties screen is the pre-selected option in the Load type drop-down list.

    • To change the load type or file parsing method, use the Load type and File parsing drop-down lists.


Load type options

  • The File based load option results in a load where the bulk of the load management is handled by the scheduler.
  • The Script based load option makes WhereScape RED generate a host script, and the load table becomes a Script based load. This host script is executed by the scheduler to perform the load. For further details about Script based loads, refer to Script based load.
  • The Externally loaded option does not execute an actual load into the table; instead, it processes the actions specified in the Post Load procedure property. Any After transformations recorded against columns in an Externally loaded table are also processed.

Post load procedures can either be generated manually from a RED-provided procedure outline or generated from a RED template. Refer to Rebuilding Update Procedures for details.

For custom database targets, the option to use Post Load Scripts is provided as an alternative to a post load procedure. Post load scripts can be generated using a RED template. Refer to Rebuilding Update Procedures for details.

File parsing options

    • Single data column - with this option, the majority of the parsing work must occur in a subsequent procedure within the data warehouse. Each record is loaded into a single column, and a procedure must then be coded to parse the data.
    • Columns parsed - with this option, WhereScape RED attempts to parse the columns. You are asked for details, including the column delimiter, and then step through the columns providing names and data types. WhereScape RED attempts to guess each data type, but the guess should be checked and the field length will probably need to be adjusted.

  • The Decimal Code button shows the decimal value of each character in the lines retrieved from the source file. These decimal codes are shown in green below each line.
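
The two parsing styles, and the Decimal Code view, can be illustrated with a minimal Python sketch. This is not RED's implementation; the function names are purely illustrative:

```python
def load_single_column(line: str) -> list[str]:
    """Single data column style: the whole record lands in one column,
    leaving all parsing to a later procedure in the data warehouse."""
    return [line]

def parse_delimited(line: str, delimiter: str = ",") -> list[str]:
    """Columns parsed style: split the record on the column delimiter."""
    return [field.strip() for field in line.split(delimiter)]

def decimal_codes(line: str) -> list[int]:
    """Decimal Code view: the decimal value of each character in a line.
    Useful for spotting an unexpected delimiter or control character."""
    return [ord(ch) for ch in line]

record = "1001|Smith"
print(load_single_column(record))   # one column holding the raw record
print(parse_delimited(record, "|")) # two parsed columns
print(decimal_codes("|"))           # [124] - the decimal code of '|'
```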

  7. Once the selection on the screen above is completed, a screen displays that enables the breakdown of the source data into columns. If no delimiter is entered, width-based parsing is assumed and an additional width size is prompted for.

The following screen is an example of the file parsing technique.
Use the Back button to revert to the previous column if an incorrect width or delimiter is entered.
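
As a rough sketch of the width-based alternative (again illustrative, not RED code), each column is taken as the next fixed number of characters from the record:

```python
def parse_fixed_width(line: str, widths: list[int]) -> list[str]:
    """Width-based parsing: consume `widths[i]` characters per column
    and trim the padding. Used when no column delimiter is entered."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width].strip())
        pos += width
    return fields

# A record laid out as 4 + 10 + 3 characters.
record = "1001" + "Smith".ljust(10) + "42".rjust(3)
print(parse_fixed_width(record, [4, 10, 3]))  # ['1001', 'Smith', '42']
```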


Conversion

Understanding SQL*Loader

When using either type of loading method, SQL*Loader is called to perform the actual load. There is therefore a requirement that the executable called 'sqlldr' be in the path.

Under UNIX, this means that the UNIX user must be able to execute 'sqlldr' from the command prompt under a normal logon.

Under Windows, 'sqlldr' must execute when the command is issued from a DOS prompt.

In both the Windows and UNIX environments, sqlldr returns a result code indicating whether the load worked fully, partially, or failed. WhereScape RED makes use of this returned code to report on the results of the load.
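
A hedged sketch of how a caller might check that sqlldr is reachable on the path and interpret its result code. The exit-code mapping shown follows SQL*Loader's documented UNIX conventions (0 success, 1 failure, 2 warning, 3 fatal; Windows uses 4 for fatal); RED's own reporting logic is not shown here and may differ:

```python
import shutil

# Assumed mapping for illustration, per SQL*Loader's UNIX exit codes.
SQLLDR_RESULTS = {
    0: "success",             # load worked fully
    1: "failed",              # command-line or syntax failure
    2: "partial (warnings)",  # some rows rejected or discarded
    3: "fatal",               # fatal error during the load
}

def check_sqlldr_available() -> bool:
    """True if the 'sqlldr' executable can be found on the PATH,
    which is the prerequisite described above."""
    return shutil.which("sqlldr") is not None

def interpret_result(code: int) -> str:
    """Map an sqlldr exit code to a load outcome description."""
    return SQLLDR_RESULTS.get(code, f"unknown exit code {code}")
```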
