Proposed Design for PP-465
- PP-465
Interface: job_requeue_timeout
- Visibility: Public
- Change Control: Obsolete
- Synopsis: When the server attribute job_requeue_timeout expires, the server will now reply to the client that the job could not be rerun within the timeout period and that the rerun is still in progress.
- Details:
- The server attribute job_requeue_timeout has no clear specification of what happens to the job once the timeout is hit.
- Current behaviour: once the timeout is hit, the server returns an error to the client stating that the rerun process has timed out.
- However, the server continues rerunning the job even though an error has already been returned to the client.
- Because this "timeout" does not abort the rerun process and only produces a spurious message that the rerun has timed out, the behaviour is not correct.
- The PBS scheduler relies on this delay; if the API were made to return immediately, over-subscription could occur.
- Hence the delay/timeout will remain in place, but the server attribute job_requeue_timeout will be marked obsolete.
- The documentation will be changed to reflect that the attribute is obsolete and that the error message is spurious.
- The error message will be changed to state that the rerun is still in progress. The exact wording will be: "qrerun: Response timed out. Job rerun request still in progress for <jobid>.<server>"
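The proposed behaviour can be illustrated with a small simulation. This is only a sketch and not PBS code: the names `request_rerun`, `simulate_rerun`, and `format_timeout_message` are hypothetical, the rerun is modelled as a background thread that sleeps, and the job id and server name are made-up values. It shows that when the wait exceeds the timeout, the client receives the new "still in progress" message while the rerun itself carries on to completion.

```python
import threading
import time
from typing import Optional, Tuple


def format_timeout_message(jobid: str, server: str) -> str:
    # Wording taken from the proposal; jobid and server are placeholders.
    return ("qrerun: Response timed out. "
            f"Job rerun request still in progress for {jobid}.{server}")


def simulate_rerun(duration_s: float, done: threading.Event) -> None:
    # Hypothetical stand-in for the server-side rerun work.
    time.sleep(duration_s)
    done.set()


def request_rerun(jobid: str, server: str, rerun_duration_s: float,
                  timeout_s: float) -> Tuple[Optional[str], threading.Thread, threading.Event]:
    """Start a simulated rerun and wait up to timeout_s for it to finish.

    Returns (message, worker, done). message is None when the rerun
    finished within the timeout, otherwise the "still in progress" text.
    The rerun is never aborted when the timeout expires.
    """
    done = threading.Event()
    worker = threading.Thread(target=simulate_rerun,
                              args=(rerun_duration_s, done))
    worker.start()
    if done.wait(timeout_s):
        message = None
    else:
        message = format_timeout_message(jobid, server)
    return message, worker, done
```

In this sketch the caller can still join the worker after a timeout and observe that the rerun finished, which mirrors the point that the timeout only bounds the client's wait and does not cancel the operation.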