Flat file loads can be performed from a Windows, UNIX/Linux, or Hadoop connection. As with all other load types, it is easier to use the drag-and-drop functionality to create load tables.
Flat files can also be loaded using SQL Server Integration Services (SSIS). For Flat File load instructions using SSIS, refer to the next section Loading Data from a Flat File using SSIS.

Greenplum - To process Greenplum file loads via the Windows Scheduler, please ensure that the Windows Scheduler user has all the necessary permissions, including having the ssh host key saved.

The drag and drop process for flat files is as follows:

  1. Browse to the directory and file via the appropriate Connections.
  2. Double-click on the Load Table object in the left pane to create a drop target.
  3. Drag the file from the right pane and drop it into the middle pane. 
  4. The dialog below appears. Rename the file if necessary and then click the ADD button.

    The option Add meta data columns to table is used for creating Load tables that are used in creating Data Vault objects. If this option is selected, two DSS columns (dss_record_source and dss_load_date) are included in the meta data for the table and are populated by transformations. These two DSS columns could equally be applied to other Load tables not used in a Data Vault system, but they are particularly important for complying with the Data Vault standards. Refer to Data Vaults for details.

  5. The following dialog displays for file loads from Windows, UNIX/Linux and Hadoop connections.

    • First Row is a Header - For loads into relational database tables, changes after a table has been defined can be made by editing the First Row is a Header option in the Source tab of the relevant table.
    • Hive - For File loads into Hive tables, the First Row is a Header option is a table option. Please ensure this option is selected in the file load wizard if you want to load files with header rows. To make any changes after Hive tables have been defined, the First Row is a Header option can be found in the Storage tab of the relevant Hive table. 
  6. The load type selected in the New Table Default Load Type field in the connection Properties screen is the pre-selected option in the Load type drop-down list.
    To change the desired load type and file parsing, use the Load type and File parsing drop-down list options.

    Load type options
    Post load procedures can either be manually generated from a RED provided procedure outline or generated leveraging a RED template—refer to Rebuilding Update Procedures for details.

    For custom database targets, the option to use Post Load Scripts is provided as an alternative to post load procedure. Post load scripts can be generated using a RED template—refer to Rebuilding Update Procedures for details.


    File parsing options
  7. Once the selections on the screen above are complete, a screen displays that enables the breakdown of the source data into columns.
    • If no delimiter is entered, then width-based parsing is assumed and an additional width size is prompted for.
    • Teradata - If this is a Fixed Width file loaded via TPT, the source file format can be specified later in the File Load Source screen of the Load table's Properties window.
    The following screen is an example of the file parsing technique.
    Use the Back button to revert to the previous column if an incorrect width or delimiter is entered.

Conversion 

Oracle

During the parsing of the columns, an Oracle SQL*Loader conversion string can be used and is required in the case of dates for Oracle. Refer to the Oracle SQL*Loader manual for syntax.

Any Oracle function can be used if enclosed in quotes and with the column name prefixed by a colon. For example, when loading a column called 'product_name', use the following syntax to bring over only the first 30 characters:

product_name "substr(:product_name,1,30)"

A special variable %FILE_NAME% can be used in an Oracle File based load. This will be substituted with the actual source file name.
To load the full file name (including the path) into a field, enter this transformation:

"RTRIM('%FILE_NAME%')"

To remove the path from the full file name on a UNIX or Linux system, enter this transformation:

"SUBSTR('%FILE_NAME%',INSTR('%FILE_NAME%','/',-1,1)+1)"
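Taken together, these conversions appear as column-level expressions in the SQL*Loader control file. The fragment below is a hypothetical sketch (the table name, file name, and column names are invented for illustration) showing where such expressions sit; it is not the exact control file RED generates:

```
LOAD DATA
INFILE 'products.csv'
INTO TABLE load_product
FIELDS TERMINATED BY ','
(
  product_name  "substr(:product_name,1,30)",
  file_name     "RTRIM('%FILE_NAME%')",
  base_name     "SUBSTR('%FILE_NAME%',INSTR('%FILE_NAME%','/',-1,1)+1)"
)
```

In this sketch, %FILE_NAME% would be substituted with the actual source file name before SQL*Loader is invoked, as described above.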

SQL

After Load transformations can be processed. The conversion string can be entered in the relevant column conversion field during the parsing of the columns; however, to process the column conversion, users need to do the following after the table is created and loaded:

  1. Go into the Properties of the loaded table's relevant column(s) by double-clicking on the column(s) in the middle pane.
  2. Click the Transformation tab.
  3. Select the After load option in the Transformation Stage drop-down list.

  4. Recreate the Load table.
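Conceptually, an After Load transformation is applied as an UPDATE run against the Load table once the file data has arrived. The statement below is a hypothetical illustration (the table and column names are invented), not the exact code RED generates:

```
-- Hypothetical After Load transformation: trim and upper-case a loaded column.
UPDATE load_product
SET    product_name = UPPER(TRIM(product_name));
```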

If running from the Windows scheduler, a wildcard must be used in the file name if you wish to use the %FILE_NAME% variable. If a wildcard is not used, the literal file name appears enclosed in double quotes; when a wildcard is used, Windows looks up the file and returns the actual file name with no enclosing double quotes.

Understanding SQL*Loader

When using either type of loading method, SQL*Loader is called to perform the actual load. There is therefore a requirement that the executable called 'sqlldr' be in the path.
Under UNIX this means that the UNIX user must be able to execute 'sqlldr' from the command prompt under a normal log on.
Under Windows 'sqlldr' must execute when the command is issued from a DOS prompt.
In both the Windows and UNIX environments sqlldr returns a result code indicating if the load worked fully, partially or failed. WhereScape RED makes use of this returned code to report on the results of the load.
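The PATH check and result-code handling can be sketched in shell. The helper below is a hypothetical illustration, not part of RED; the code-to-message mapping follows the SQL*Loader exit codes documented for UNIX (0 = success, 1 = failure, 2 = warning, 3 = fatal), which should be verified against your Oracle client version:

```shell
#!/bin/sh
# Verify that sqlldr is reachable on the PATH, as RED requires.
if command -v sqlldr >/dev/null 2>&1; then
  echo "sqlldr found on PATH"
else
  echo "sqlldr NOT on PATH - fix PATH before running file loads"
fi

# Hypothetical helper: translate a sqlldr result code into a status message.
# Codes assume the documented UNIX values; verify for your platform.
interpret_sqlldr_rc() {
  case "$1" in
    0) echo "load succeeded" ;;
    2) echo "load partially succeeded (some rows rejected)" ;;
    1|3) echo "load failed (rc=$1)" ;;
    *) echo "unknown result code (rc=$1)" ;;
  esac
}

interpret_sqlldr_rc 0
interpret_sqlldr_rc 2
```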