Overview:

The objective of https://pbspro.atlassian.net/browse/PP-1303 is to provide a hook event that is invoked when MoM receives a suspend request for a running job. Just before SIGSTOP is sent, an execjob_suspend hook can:

a) Modify the job’s Execution_Time, Hold_Types, and resources_used attributes
b) Cause the job to keep running
c) Set attributes and resources on the vnode(s) managed by the MoM where the job executes
d) Flag the job to be rerun

The event also addresses the following use cases:

e) Third-party licensing software: currently, in mainline, license resources are deemed to be freed on suspension, but sending a STOP signal to a job is not enough to actually free the licenses.
f) Special checkpointing operations, which can free many of the resources that preempted jobs are holding.
g) Weather-domain customers would like to customize the suspend signal so that weather workflow tools such as Cylc work properly.
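As a rough illustration of the licensing use case, a hook body for this event might look like the sketch below. It is not the final interface. The helper release_licenses() and the decision to reject when it fails are hypothetical. The sketch assumes the event object exposes job, accept(), and reject() in the same way as existing execjob events. What a rejection means for the job is defined by the interface, not by this sketch.

```python
# Minimal sketch of an execjob_suspend hook body (assumptions noted inline).

try:
    import pbs  # supplied by PBS inside a hook; absent when run standalone
except ImportError:
    pbs = None


def release_licenses(job_id):
    """Hypothetical site-specific callout that hands back third-party licenses.

    Returns True on success. A real site would contact its license manager here.
    """
    return True


def on_suspend(event, release=release_licenses):
    """Handle a suspend request just before SIGSTOP would be sent."""
    job_id = str(event.job.id)
    if release(job_id):
        # Inside PBS, accept() ends hook execution, so the return below is
        # only reached when the function is exercised outside PBS.
        event.accept()
        return "accepted"
    # Assumption: the site prefers to reject when licenses cannot be released.
    event.reject("could not release licenses for %s" % job_id)
    return "rejected"


# Run the handler only when executing as a real hook.
if pbs is not None and hasattr(pbs, "event"):
    on_suspend(pbs.event())
```

Keeping the logic in on_suspend() and passing the event in as a parameter lets the handler be exercised with a stand-in event object outside of MoM.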


Jira ID

PP-1303 - Adding Job-Suspend hook event in MoM side

Forum Discussion  

Click here

Requirements and Use cases

Click here


Interface 1: MoM hook event - “execjob_suspend”