...

  • Python 3.8 or higher
  • PIP Manager
    • From Command Prompt (Run as Administrator), run the command below
      Code Block
      titlePIP Manager Install
      python -m pip install --upgrade pip
  • Azure Data Lake Storage Gen2
    • Azure Data Lake Storage Gen2 Account Name
    • Azure Data Lake Storage Gen2 Access Key
    • Azure Data Lake Storage Gen2 SAS Token
    • Azure Data Lake Storage Gen2 File System Name (created in Storage Explorer Preview)
    • Azure Data Lake Storage Gen2 Directory Name (created in Storage Explorer Preview)
    • Install the Python package: pip install azure-storage-file-datalake (a minimal connectivity sketch is shown after this list)
    • .NET Framework 4.8 or higher
    • Windows PowerShell version 5 or higher
  • Run these commands in "Windows PowerShell":
    Code Block
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
    Install-Module Az.Storage 
    Note: Use a 64-bit PowerShell terminal.
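
Once these prerequisites are in place, connectivity to the storage account can be checked with a short Python sketch like the one below. This is only an illustration, not part of the Enablement Pack; the account name, access key, file system, and directory values are placeholders for the items listed above, and a SAS token can be used as the credential instead of the access key.

Code Block
titleADLS Gen2 Connectivity Check (Example)
# Minimal connectivity check using the azure-storage-file-datalake package.
# All values below are placeholders; replace them with your own details.
from azure.storage.filedatalake import DataLakeServiceClient

account_name = "<storage account name>"
account_key = "<storage account access key>"
file_system = "<file system name>"
directory = "<directory name>"

# Authenticate with the account access key (a SAS token also works here).
service_client = DataLakeServiceClient(
    account_url=f"https://{account_name}.dfs.core.windows.net",
    credential=account_key,
)

# List the contents of the directory to confirm the account, file system,
# and directory prerequisites are all in place.
file_system_client = service_client.get_file_system_client(file_system)
for path in file_system_client.get_paths(path=directory):
    print(path.name)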

...

The Enablement Pack install process is entirely script-driven. The table below outlines these scripts, their purpose, and whether Run as Administrator is required.

# | Enablement Pack Setup Scripts | Script Purpose | Run as Admin | Intended Application
1 | install_Source_Enablement_Pack.ps1 | Install Python scripts and UI Config Files for browsing files from Amazon S3, Azure Data Lake Gen2, Google Drive | Yes | New and Existing installations

The PowerShell script above provides help at the command line; this can be output by passing the -help parameter to the script.

...

Run Windows PowerShell as Administrator

Code Block
titleInstall Source Connectivity Packs
<Script1 Location> Powershell -ExecutionPolicy Bypass -File .\install_Source_Enablement_Pack.ps1

If prompted, enter the source enablement pack as 'Azure'.

Azure Data Lake Storage Gen2 Connection Setup

  1. Log in to RED
  2. Check in the host script Browse_Azure_DataLakeStorageGen2 in the objects list.
  3. Check UI Configurations in the menu: Tools → UI Configurations → Maintain UI Configurations
  4. Create a new connection in RED
  5. Set the properties as shown in the screenshot below

...

  • Property Section Azure Data Lake Gen2 Storage Authentication
    • Azure Data Lake Gen2 Storage Account: Azure Data Lake Gen2 Storage Account Name. The token used to read the storage account name in the scripts is $WSL_SRCCFG_azureStorageAccountName$
    • Azure Data Lake Gen2 Storage Account Access Key (Account Key): Azure Data Lake Gen2 Storage Account Access Key, also called the Account Key. The access key is read from the environment variable WSL_SRCCFG_azureStorageAccountAccessKey
    • Azure Data Lake Gen2 Storage Account SAS Token: Azure Data Lake Gen2 Storage Account Shared Access Signature (SAS) Token. The SAS token is read from the environment variable WSL_SRCCFG_azureSASToken
  • Property Section Azure Data Lake Gen2 Storage Settings
    • Azure Data Lake Gen2 Storage File System: Azure Data Lake Gen2 Storage File System name. The token used to read the storage file system name in the scripts is $WSL_SRCCFG_azureStorageFileSystem$
    • Azure Data Lake Gen2 Storage File System Directory: Azure Data Lake Gen2 Storage Directory name where the blob exists. The token used to read the directory name in the browse script is $WSL_SRCCFG_azureStorageFileSystemDirectory$
    • File Download Path: Local directory to which the file is downloaded for data profiling from the source Azure Data Lake Gen2 Storage. For example: C:\\Source\\Subfolder or C:/Source/Subfolder/. The token used to read the path name in the browse script is $WSL_SRCCFG_fileDownloadPath$
  • Property Section Azure Data Lake Gen2 Storage File Filter Options
    • Field Headings/Labels: Indicates whether the first line of the source file contains a heading/label for each field; the heading line is not regarded as data and should not be loaded. The token used to read the field header boolean value in the script is $WSL_SRCCFG_azureDataLakeGen2FirstLineHeader$
    • File Filter Name: Indicates the source file name. Provide an Azure Blob file name pattern; the file list is filtered by file extension and file name pattern, for example:
      • *.*
      • *.<File Extension>
      • <File Name>.<File Extension>
      • <File Name Start>*

        The token used to read the File Filter Name in the scripts is $WSL_SRCCFG_azureDataLakeGen2FileFilterName$ (see the sketch after this list)
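
How RED resolves these tokens into the Python browse script is not shown here; the sketch below is only an illustration of how a browse script could consume the settings above, assuming the $WSL_SRCCFG_*$ tokens have already been substituted with their configured values and that the access key is exposed through the environment variable named above.

Code Block
titleBrowse Script Sketch (Illustrative Only)
# Illustrative sketch only: assumes the $WSL_SRCCFG_*$ tokens have been
# substituted with their configured values before the script runs, and that
# the access key is available in WSL_SRCCFG_azureStorageAccountAccessKey.
import fnmatch
import os

from azure.storage.filedatalake import DataLakeServiceClient

account_name = "$WSL_SRCCFG_azureStorageAccountName$"
file_system = "$WSL_SRCCFG_azureStorageFileSystem$"
directory = "$WSL_SRCCFG_azureStorageFileSystemDirectory$"
file_filter = "$WSL_SRCCFG_azureDataLakeGen2FileFilterName$"   # e.g. *.csv
download_path = "$WSL_SRCCFG_fileDownloadPath$"                # e.g. C:\\Source\\Subfolder
access_key = os.environ["WSL_SRCCFG_azureStorageAccountAccessKey"]

service_client = DataLakeServiceClient(
    account_url=f"https://{account_name}.dfs.core.windows.net",
    credential=access_key,
)
file_system_client = service_client.get_file_system_client(file_system)

# List files in the configured directory and apply the File Filter Name pattern.
for path in file_system_client.get_paths(path=directory):
    file_name = os.path.basename(path.name)
    if path.is_directory or not fnmatch.fnmatch(file_name, file_filter):
        continue

    # Download each matching file to the File Download Path for data profiling.
    file_client = file_system_client.get_file_client(path.name)
    local_file = os.path.join(download_path, file_name)
    with open(local_file, "wb") as handle:
        handle.write(file_client.download_file().readall())
    print(f"Downloaded {path.name} to {local_file}")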

...