When running scheduled jobs from WhereScape RED, it's possible to configure a job to run commands when the job finishes successfully or when it fails. These commands can be set to send email notifications or create a log file about failure or success status.
Success and Failure commands can use variables and tokens. This allows them to use host scripts stored in the metadata repository instead of requiring scripts to be set up for each existing scheduler.
Success Commands
Failure Commands
In addition to supporting the existing RED 9 behavior, success/failure commands can now reference scripts from the metadata using the token format. This means you don't need to store a static script somewhere for each scheduler; the command instead picks it up from a host script stored in the metadata.
The version of Azkaban we ship has a known issue: even when conditional execution succeeds, the job in Azkaban is marked as cancelled, because in either execution path (success or failure) some items in the flow are skipped (i.e., the failure step is skipped on success and the success step is skipped on failure).
Due to the issue above, the proposed solution in the original description below might not be acceptable. In later versions of Azkaban this behavior was apparently changed so that tasks skipped due to conditional workflows do not mark the job as KILLED.
If the failure command is populated, execute the Failure Command as a system command when a job is being marked as failed by a task thread.
If the job is being failed due to a parent job dependency failure, then conditionally run the command based on the setting ‘Execute Failure Command in event of dependency failure’.
Ensure the command is executed only once for the job, rather than once for each failing thread when tasks are executing in parallel.
If the success command is populated, execute the Success Command as a system command when a job is being set to successful completion in the final stage of execution.
Expansion of existing documented tokens in the success/failure command should be supported.
The success (exit code 0) or failure (non-zero exit code) of the command should be reported in the job log, i.e.:
Running Success Command [succeeded|failed]
Running Failure Command [succeeded|failed]
Set the usual script environment variables such as WSL_META*, job, task, sequence and scheduler API prior to running the command.
Allow the existing token support for script sourcing by looking for tokens of the form $WSL_SCRIPT_<script_name_from_metadata>_CODE$ and replacing them with the full path of the metadata script, which should be written to disk at that location.
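The requirements above can be sketched in Python as a rough illustration only; this is not the plugin's actual implementation. The `scripts` dictionary is a hypothetical stand-in for the metadata repository, the `.py` file extension and the names `expand_script_tokens` and `JobCommandRunner` are assumptions, and the environment variable names are left to the caller because the text above does not spell them out.

```python
import os
import re
import subprocess
import threading

# Token form taken from the requirement above: $WSL_SCRIPT_<script_name>_CODE$
SCRIPT_TOKEN = re.compile(r"\$WSL_SCRIPT_([^$]+?)_CODE\$")


def expand_script_tokens(command, scripts, work_dir):
    """Write each referenced script to work_dir and substitute its full path.

    `scripts` is a hypothetical stand-in for the metadata repository:
    a dict mapping script name to script source text.
    """
    def write_and_substitute(match):
        name = match.group(1)
        # The file extension is an assumption made for this sketch.
        path = os.path.join(work_dir, name + ".py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(scripts[name])
        return path

    return SCRIPT_TOKEN.sub(write_and_substitute, command)


class JobCommandRunner:
    """Runs a job-level command at most once, even if several threads ask."""

    def __init__(self):
        self._lock = threading.Lock()
        self._already_run = False

    def run_once(self, label, command, extra_env=None):
        # Only the first caller gets past this guard; later callers get None.
        with self._lock:
            if self._already_run:
                return None
            self._already_run = True

        # Caller-supplied variables are layered over the current environment.
        env = {**os.environ, **(extra_env or {})}
        result = subprocess.run(command, shell=True, env=env)

        # Exit code 0 means success; anything else is reported as failed.
        outcome = "succeeded" if result.returncode == 0 else "failed"
        return f"Running {label} Command {outcome}"
```

One `JobCommandRunner` would be created per job, so that parallel task threads which all detect the failure share the same guard and the Failure Command runs a single time.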
Enable execution of success/failure commands directly in our plugin (RED-12369: RED 10 success and failure command support; status: In QE).
Create a single task at the end that takes both success and failure commands as arguments and runs one or the other after first determining the execution's state via the REST API. This would work well with the enhancement RED-12091 (Add a new method to schedulerUtils.jar such that an individual script can be run; status: Closed), where we could run any script from the metadata. This would require changes to the wsl_scheduler_publish script. These parameters could be passed to scripts: https://submitteddenied.github.io/azkaban2/documents/2.1/jobconf.html
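As a rough sketch of the decision such a final task would make, the function below picks a command from an execution description that has already been fetched. It assumes the response is a dictionary with a `nodes` list whose entries carry `id` and `status` fields (the shape we would expect from Azkaban's flow execution lookup), and that the status names in `FAILED_STATES` indicate failure. Both are assumptions to verify against the shipped Azkaban version, and `choose_command` is a hypothetical name.

```python
# Status names assumed to indicate a failed upstream task.
FAILED_STATES = {"FAILED", "KILLED", "FAILED_FINISHING"}


def choose_command(execution, final_task_id, success_command, failure_command):
    """Return the command the final task should run, or None if it is not set."""
    # Ignore the final task itself: it is still running when this is evaluated.
    upstream = [
        node for node in execution.get("nodes", [])
        if node.get("id") != final_task_id
    ]
    any_failed = any(node.get("status") in FAILED_STATES for node in upstream)
    chosen = failure_command if any_failed else success_command
    return chosen or None
```

Skipped or cancelled nodes are deliberately not treated as failures here, since conditional paths leave some steps unexecuted on every run.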
Success and Failure commands in Jobs are not yet catered for out of the box in RED 10.4.
Reqs:
Any placeholder tokens in the success/failure commands should be converted to the equivalent tokens available in Azkaban.
The wsl_scheduler_publish script should be updated to include success and failure commands.
We will soon be releasing RED version 10.6.0.2, which adds success/failure commands directly to the RED Scheduler Plugin for Azkaban so that the existing RED 9 behavior works directly in RED 10 without the need to change the standard publish scripts.