The fields for the Flat File Source Screen are described below. See example source screens in File and Script based load Source screens.

If the file has been dragged and dropped into the load table (middle pane) then some of the fields on this tab are automatically populated.


Fields

Description

Load Type

Method of loading data into the table. The available options are dependent on the Source Connection. Defaults to the 'Default Load Type' of the Source Connection. Can be specified via the Properties screen.

Source Connection

Connection to the data source (database or file system). Can be specified via the Properties screen.

Load Script Template

Available for script loads only and only if there is a valid template available. Select the template to use when generating a load script, or select (None) to use RED's built-in load script generator. Only templates with the correct Type and Target DB will appear in this drop-down list. Refer to Templates for details.

Source File Details

Source File identification and definition information.

Source Directory

The full path (absolute path) of the folder/directory containing the Source File on the Windows or UNIX/Linux system.

Source File Name

The name of the source file containing the data to be loaded.

Source File Character Encoding

PDW option only- The character encoding of the input datafile for PDW file loads. Select between ASCII, UTF8, UTF16 or UTF16BE. The UTF8 encoding cannot be specified for a fixed width source file (i.e. when the Source File Field Delimiter is blank) because UTF8 is a variable length character encoding.

Source File Field Delimiter

Optional character that separates the fields within each record of the Source File. The delimiter identifies the end of each field. Common field delimiters are tab, comma, colon, semi-colon, pipe, tilde. If no field delimiter is specified the record is regarded as fixed-width. 

If an ASCII character value is used this field may show as an unprintable character. To enter a special character enter the uppercase string CHAR with the ASCII value in brackets (e.g. CHAR(9) ).


Source File Field Enclosure Delimiter

  • PDW Only - For PDW loads, the delimiter can be specified either as a character or as a hex value (such as 0x22 for a double quote).

Source File Record Terminator

Optional string to identify how each line/record in the Source File is ended/terminated/delineated. The system default is used when not specified. On UNIX/Linux systems, end-of-line is typically line-feed (ASCII 10). On Windows systems, end-of-line is typically carriage-return (ASCII 13) and line-feed (ASCII 10).

Source File has Field Headings/Labels

Indicates whether the first line of the Source File contains a heading/label for each field, which is not regarded as data so it should not be loaded.

Trigger File Details

Optional Trigger File identification and definition information. If a Trigger File is specified, the Source File will not be loaded until the Trigger File is available.

Trigger File Path

The purpose of the trigger file is to indicate that the copying/loading of the main file has completed and that it is now safe to load the file. Secondly the trigger file may contain control sums to validate the contents of the main load file. This field should contain the full path name to the directory in which a trigger file is located on the Windows or UNIX systems. If this field and/or the Trigger Name field is populated then the scheduler will look for this file rather than the actual load file.

Trigger File Name

Refers to the name of the file that is used as a trigger to indicate that the main load file is available. A trigger file typically contains check sums. If the trigger file exists then this is the file that is used in the Wait Seconds, and Action on wait expire processing. (see notes under Trigger Path above).

Trigger File Field Delimiter

If the trigger file provides control information then the delimiter identifies the field separation, or \n for a return. The data found will be loaded into parameter values whose names will be prefixed by the prefix specified and numbered 0 to n.

Trigger Parameter Name Prefix

If a trigger file and delimiter have been specified then the contents of the trigger file are loaded into parameters. The parameters will be prefixed by the contents of this field and suffixed by an underscore and the parameter number. These parameters can be viewed under Tools > Parameters from the WhereScape RED menu bar. The checking of these parameters can be achieved by generating a Post Load procedure. An example set of parameters may be budget_0, budget_1 and budget_2 where there are 3 values in the trigger file and the prefix is set to 'budget'.

Load Configuration

Configuration settings to control the load processing.

Check existence of Source File

This field controls whether the load process checks if the file exists before performing the load. This check can only be disabled when doing script-based TPT Loads from Windows and UNIX/Linux connections, which can improve the use of built-in TPT functionality to wait for the arrival of the file.

Wait for Source File or Trigger File

Controls whether the load process waits for the file to arrive when it is NOT available to load. Enabling this allows the wait-related properties to be specified. When a Trigger File is specified the load waits for it rather than the Source File.

Wait Limit (in seconds)

Maximum duration to wait when no file is available, which is specified in seconds e.g. 1800 seconds to wait up to 30 minutes. A value of 0 equates to no wait. If the wait time expires and the specified file cannot be found, then the load will exit with the status defined in Action, e.g. Default Action = Error

Action when Wait Limit Reached

Action to take when the Wait Limit has been reached and no file is available. The specified action impacts any remaining tasks in the currently running job. When 'Success' or 'Warning' is specified, the WhereScape RED Scheduler will continue to run any dependent tasks in the running job. In contrast, specifying the 'Error' or 'Fatal Error' actions will cause the scheduler to stop/fail the job and hold any remaining tasks when the "Wait Limit" is reached.

BULK INSERT Options

SQL Server only- Optional comma-delimited list of options to control the behavior of the SQL Server BULK INSERT command. For example, the following options will respectively lock the table; set the batch size to 100,000 rows; and fail the load when the first error occurs: TABLOCK, BATCHSIZE= 100000, MAXERRORS=0.

Archived File Details

Optional Archived file identification and definition information. Once a file has been successfully loaded or processed, it can be optionally archived by moving it and/or renaming it.

Compress Source File when Archive

Optionally compresses the successfully loaded Source File if it is archived.

Archived Source File Path

Optional full path (absolute path) of the folder/directory to MOVE the successfully loaded Source File on the Windows or UNIX/Linux system.

Archived Source File Name

Optional new name to RENAME the successfully loaded Source File to. By default, the original file name is used it is optional to rename it. However, the in-built variable $SEQUENCE$ can be used to include a unique sequence number in the new name. Likewise, the in-built variables $YYYY$, $MM$, $DD$, $HH$, $MI$ and/or $SS$ can be used to include the number of the current year, month, day, hour, minute and/or second respectively. The date/time components can be used separately or can be combined using one set of enclosing $ such as $YYYYMMDD$.

Archived Trigger File Path

Optional full path (absolute path) of the folder/directory to MOVE the successfully processed Trigger File on the Windows or UNIX/Linux system.

Archived Trigger File Name

Optional new name to RENAME the successfully processed Trigger File to. By default, the original file name is used it is optional to rename it. However, the in-built variable $SEQUENCE$ can be used to include a unique sequence number in the new name. Likewise, the in-built variables $YYYY$, $MM$, $DD$, $HH$, $MI$ and/or $SS$ can be used to include the number of the current year, month, day, hour, minute and/or second respectively. The date/time components can be used separately or can be combined together using one set of enclosing $ such as $YYYYMMDD$.

PDW File Load Options (Configuration details specific to PDW File Loads)

Load Mode

Select the loading mode from the list. The available options include: Append/Fast Append/Reload/Upsert. Note that selecting the Upsert Load Mode requires a business key to be defined.

Use Staging database

If this field is set and a staging database is defined in the target connection, then the database will be used for staging, otherwise, the target connection's database will be used for staging. To set up a Staging Database in the target connection, refer to Connections - Database for details.

Additional Command Line Parameters

Optional additional command line parameters (space separated) for PDW file loads that are not configured elsewhere. Such additional configurations include certificate verification (-N), skipping loads of empty files (-se), specifying the batch load size (-b <batchsize>), white space trimming around string fields (-c), converting empty strings to null (-E), and load failure options (-rt value|percentage, -rv <reject_value>, -rs <reject_sample_size>).


File and Script based load source screens

SQL Server File load Source screen:

SQL Server Script load Source screen from a Windows connection:

DB2 File load Source screen:

  • No labels