
...

Python constant: pbs.EXECJOB_PRERESUME
Event Parameters:
- pbs.event().job - This is a pbs.job object representing the job that will be resumed. This job object cannot be modified under this hook.
- pbs.event().vnode_list[] - This is a dictionary of pbs.vnode objects, keyed by vnode name, listing the vnodes that are assigned to the job. The vnode objects in the vnode_list cannot be modified.
Hook Attributes:
- fail_action: This hook will not allow a fail action to be set.
- user: This hook will only allow the value "pbsadmin".
Details:
- An execjob_preresume hook is executed by the primary mom when a request to resume the job is received.
- An execjob_preresume hook is executed by the sister mom when a request from the primary mom to resume the job's tasks is received.
- A call to pbs.event().accept() means the hook code has executed cleanly.
- A call to pbs.event().reject() means the hook code was not able to fully accomplish its task.
  - Note: this will prevent all MoMs from resuming the job.
  - In keeping with hook design, if one execjob_preresume hook rejects the event, the remaining execjob_preresume hooks with a higher order value will not run.
- If the execjob_preresume hook script encounters an unexpected error that causes an unhandled exception, or times out due to the hook's alarm setting, the hook will act as if it had called pbs.event().reject().
  - Note: this will prevent all MoMs from resuming the job.
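
The accept/reject behavior described above can be sketched as a small hook script. This is only an illustration, not the reference implementation: restore_resources() is a hypothetical placeholder for whatever site-specific work the hook performs, handle_preresume() is a name chosen here for readability, and the guarded import exists solely so the logic can be exercised outside the PBS hook interpreter. The sketch only reads pbs.event().job and pbs.event().vnode_list, since neither can be modified under this hook.

```python
try:
    import pbs  # provided by the PBS hook interpreter; absent elsewhere
except ImportError:
    pbs = None


def restore_resources(job_id, vnode_names):
    """Hypothetical placeholder for site-specific work (e.g. recreating cgroups).

    Returns True when the job's resources are ready for the resume.
    """
    return True


def handle_preresume(event, restore=restore_resources):
    """Accept the event if the restore step succeeds, otherwise reject it."""
    job_id = event.job.id                          # read-only under this hook
    vnode_names = sorted(event.vnode_list.keys())  # dictionary keyed by vnode name
    try:
        ok = restore(job_id, vnode_names)
    except Exception as exc:
        # Reject explicitly with a message instead of leaving the
        # exception unhandled, which would act like a reject anyway.
        event.reject("preresume failed for %s: %s" % (job_id, exc))
        return
    if ok:
        event.accept()
    else:
        event.reject("preresume could not restore resources for %s" % job_id)


if pbs is not None:
    # Inside a real hook, the event comes from pbs.event().
    handle_preresume(pbs.event())
```

Keeping the logic in a function that takes the event as an argument is a design choice for this sketch: it lets the accept and reject paths be tested with a stand-in event object.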
Internal Design:
- The MS (mother superior, i.e. the primary mom) will complete the event first. If it is not rejected, the sisters will then run their hooks.
- All moms must accept the event before the job can be resumed.
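
The ordering rules above (hooks run in ascending order value and stop at the first rejection; the MS runs first; every mom must accept) can be modeled with a toy simulation. This is not PBS code: run_hooks(), can_resume(), and the (order, name, decide) tuple layout are all assumptions made for illustration, and the sisters are visited one after another here, which may not match how they actually run.

```python
def run_hooks(hooks, mom):
    """Run hooks on one mom in ascending order value.

    Each hook is an (order, name, decide) tuple, where decide(mom) returns
    True to accept and False to reject. Stops at the first rejection, so
    hooks with a higher order value do not run.
    Returns (accepted, names_of_hooks_that_ran).
    """
    ran = []
    for _order, name, decide in sorted(hooks, key=lambda h: h[0]):
        ran.append(name)
        if not decide(mom):
            return False, ran
    return True, ran


def can_resume(ms, sisters, hooks):
    """Return True only if the MS and every sister accept the event."""
    accepted, _ran = run_hooks(hooks, ms)
    if not accepted:
        return False  # the sisters never run their hooks
    return all(run_hooks(hooks, sister)[0] for sister in sisters)
```

For example, with two hooks of order 1 and 2, a rejection from the order-1 hook on any mom makes can_resume() return False, and the order-2 hook is never invoked on that mom.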
Consumer:
- Hooks like cgroups that need to take action when resource allocation changes.
  - Because the cgroups are cleaned up on suspension, they have to be recreated or modified when the job is resumed. Otherwise, the job will not have the resources it requested.
Caveats:
- Currently, when a PBS_BATCH_SignalJob request to resume a job is rejected by the mom, the server starts another scheduling cycle. If the scheduler decides the job can still be resumed, it will try again. If the execjob_preresume hook always rejects, nothing stops this loop from repeating indefinitely. This is consistent with current behavior, but the new hook makes the loop easier to enter.

...
