WhereScape maintains and distributes a customized version of Azkaban for scheduling and orchestrating data warehouse workloads with RED. WhereScape’s version includes support for PostgreSQL, as opposed to the MySQL-based version available publicly. It is based on the Azkaban 3.x release branch and is currently distributed only with WhereScape RED. For the most part, the publicly available Azkaban documentation still applies to WhereScape’s version, except for the MySQL portions and any features beyond the 3.x version.

Note

For more information, visit the Azkaban official documentation.

The WhereScape Azkaban Scheduler is made up of a single Azkaban Web Server and one or more Azkaban Executor Servers. Both components can run on either Windows or Linux, depending on your preferred environment and the job types you need to run.

You can have both Windows and Linux Azkaban Executors in your WhereScape Azkaban Scheduler ecosystem to allow for running a wide variety of jobs across your data warehouse. Every executor has a set of accepted ‘tags’ that allow you to tag certain job types to particular executors.

Installing the WhereScape Azkaban Scheduler components on Windows is the easiest way to get started with Azkaban for RED. The Linux installation is more advanced and requires some knowledge of Linux and of package management in your particular Linux distribution, which is not covered in this install guide.

Before you begin

Common Prerequisites

  1. An existing RED metadata repository to associate with the Azkaban Scheduler (only one Azkaban Web Server Scheduler is permitted per RED repository).

  2. A PostgreSQL-compatible database for the Azkaban metadata.

    Note

    The database name must be in lowercase only and not contain any special characters that would require surrounding it with double quotes (") in queries.

    RED and Azkaban metadata can coexist in the same database as they use different schemas, ‘red’ and ‘white’ respectively. Still, in production environments, it is recommended to keep them in separate databases for performance and administrative reasons.

  3. PostgreSQL tools installed (psql, pg_dump, pg_restore).

  4. Azkaban Web and Executor Servers require Java. Only the OpenJDK 11 JRE is supported and tested. On Windows installs, the JRE within the RED install directory is used automatically, but on Linux you will need to install the JRE yourself.
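
Prerequisites 2 and 3 above can be checked before you start the install. The following is a minimal sketch in Python, not part of the WhereScape tooling. The function names are hypothetical. The name rule it applies (lowercase letters, digits and underscores, starting with a letter or underscore, at most 63 bytes) is a conservative reading of PostgreSQL's rules for identifiers that do not need double quotes. It does not check for reserved SQL keywords, which would also require quoting.

```python
import re
import shutil

# Conservative pattern for a PostgreSQL identifier that never needs quoting:
# lowercase letters, digits and underscores, not starting with a digit.
_SAFE_NAME = re.compile(r"[a-z_][a-z0-9_]*")

# PostgreSQL truncates identifiers longer than 63 bytes by default.
MAX_IDENTIFIER_BYTES = 63


def is_safe_database_name(name: str) -> bool:
    """Return True if the name is lowercase and needs no double quotes.

    Hypothetical helper: reserved SQL keywords are not checked here.
    """
    if len(name.encode("utf-8")) > MAX_IDENTIFIER_BYTES:
        return False
    return _SAFE_NAME.fullmatch(name) is not None


def missing_tools(tools=("psql", "pg_dump", "pg_restore")):
    """Return the tools from the list that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # "azkaban_meta" is only an example name, not one required by RED.
    for candidate in ("azkaban_meta", "Azkaban", "azkaban-meta"):
        verdict = "ok" if is_safe_database_name(candidate) else "rejected"
        print(f"{candidate}: {verdict}")

    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All PostgreSQL tools found on PATH.")
```

On Linux, you could also pass `("psql", "pg_dump", "pg_restore", "java")` to `missing_tools` to confirm that a JRE is on the PATH. This only confirms that a `java` executable exists, not that it is OpenJDK 11.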

Next, you can find more information on how to install the Scheduler on Windows and Linux.
