Overview
This design focuses on creating a reservation out of a job after the job has started running.
When a job encounters a problem with its application, data, script, etc., the job exits and the resources allocated to it are released back to the server. To run the job again after correcting the problem, the user must re-submit it, and the new job may wait a long time before it runs, depending on factors such as the number of jobs in the queue, priority, and scheduling policy. A user who needs the resources held for a period of time, so that the job can be re-submitted and run without delay, can use a reservation. Currently, however, reservations have a start and end time, so it is not yet possible to have a reservation that is schedulable.
Technical Details
- Interface 1: New '--job' option to the pbs_rsub command.
- Visibility: public
- Change Control: Stable
- Synopsis: Allow users to create a reservation out of a running job.
- Details:
- This command will create a reservation using the exec_vnodes of the job provided.
- Example -
[root@d_server /]# qstat -f 1.d_server | grep exec_vnode
exec_vnode = (vnode[0]:ncpus=1)+(vnode[0]:ncpus=1)+(vnode[0]:ncpus=1)
[root@d_server /]# pbs_rsub --job 1
R2.d_server CONFIRMED
[root@d_server /]# pbs_rstat -f | grep resv_nodes
resv_nodes = (vnode[0]:ncpus=1)+(vnode[0]:ncpus=1)+(vnode[0]:ncpus=1)
[root@d_server /]#
- This option can only be used for a job in state 'R' and substate 42.
- "request invalid for job state" will be displayed if the job is not in state R/42.
- The newly created reservation will be immediately confirmed as shown above.
- The walltime of the newly created reservation will be the same as that of the job.
- The start time of the newly created reservation will be copied from the job.
- The end time of the newly created reservation will be calculated from the start time and walltime.
- Other attributes that will be copied from the job are:
Job          Reservation
Job_Owner    Reserve_Owner
schedselect  schedselect
exec_vnode   resv_nodes
- The reservation ID will be prefixed with 'R', like the IDs of advance reservations.
- The reservation will be named R<next_available_id>.
- The job from which the reservation is created will be moved to the newly created reservation queue.
- An array job ID cannot be used with this new option.
- If the job is peer scheduled, the reservation will be created in the pulling complex.
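The rules listed for Interface 1 can be modelled with a short Python sketch. It is only an illustration, not the server implementation: the dictionary layout, the keys job_state, substate, stime, walltime, reserve_start, reserve_duration and reserve_end, and the helper names are assumptions made for the example. Only the R/42 gate, the error text, and the attribute mapping are taken from this design.

```python
# Illustrative model only; not the PBS server code.
# Job attribute -> reservation attribute, as listed in this design.
JOB_TO_RESV_ATTRS = {
    "Job_Owner": "Reserve_Owner",
    "schedselect": "schedselect",
    "exec_vnode": "resv_nodes",
}


def walltime_to_seconds(walltime):
    """Convert an 'HH:MM:SS' walltime string to seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def resv_from_job(job):
    """Build a reservation record from a running job (hypothetical dict layout)."""
    # Only a job in state 'R', substate 42 may be converted.
    if job["job_state"] != "R" or job["substate"] != 42:
        raise ValueError("request invalid for job state")

    resv = {resv_attr: job[job_attr]
            for job_attr, resv_attr in JOB_TO_RESV_ATTRS.items()}

    duration = walltime_to_seconds(job["walltime"])
    resv["reserve_start"] = job["stime"]            # start time copied from the job
    resv["reserve_duration"] = duration             # same walltime as the job
    resv["reserve_end"] = job["stime"] + duration   # end = start + walltime
    return resv


job = {
    "job_state": "R",
    "substate": 42,
    "Job_Owner": "root@d_server",
    "schedselect": "1:ncpus=1",
    "exec_vnode": "(vnode[0]:ncpus=1)+(vnode[0]:ncpus=1)+(vnode[0]:ncpus=1)",
    "stime": 1000,          # epoch seconds, made-up value
    "walltime": "01:00:00",
}
resv = resv_from_job(job)
print(resv["resv_nodes"], resv["reserve_end"])
```

In this model a job in any other state or substate raises the "request invalid for job state" error before any attribute is copied.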
- Interface 2: A new job attribute "create_resv_from"
- Visibility: public
- Change Control: Stable
- Synopsis: Allow users to mark a job for creating a reservation out of it at the time of submission (qsub) or through a runjob hook.
- Details:
- This attribute marks the job so that a reservation is created out of it.
- Example:
[root@d_server /]# qsub -Wcreate_resv_from=1 -- /bin/sleep 1111
3016.d_server
[root@d_server /]# qstat -s
d_server:
                                                                                  Req'd       Req'd    Elap
Job ID              Username       Queue    Jobname      SessID     NDS    TSK    Memory      Time   S Time
------------------- -------------- -------- ------------ ---------- ------ ------ ----------- ------ - -------
3016.d_server       root           R3017    STDIN        10824      1      1      --          --     R 00:00
   Job run at Fri Jan 10 at 22:58 on (d_server:ncpus=1)
[root@d_server /]# pbs_rstat
Resv ID    Queue    User     State             Start / Duration / End
---------------------------------------------------------------------
R3017.d_se R3017    root@d_s RN            Today 22:58 / 157680000 / Wed Jan 08 2025 2
[root@d_server /]#
- An example showing how to create a reservation out of a job is in the file hook demo.txt.
- Points 1.d.iii - 1.d.xii apply here as well.
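As a quick check of the rule that the end time is the start time plus the walltime, the pbs_rstat output in the example can be reproduced with date arithmetic. The job there requests no walltime, and the duration shown is 157680000 seconds. The sketch assumes the run took place in 2020; the output does not state the year, and it is inferred only because January 10 falls on a Friday in 2020.

```python
from datetime import datetime, timedelta

# Values taken from the example output; the year 2020 is an assumption
# (January 10 is a Friday in 2020, matching "Fri Jan 10").
start = datetime(2020, 1, 10, 22, 58)
duration = timedelta(seconds=157680000)

# End time = start time + walltime (duration).
end = start + duration

print(duration.days)                       # whole days in the duration
print(end.strftime("%Y-%m-%d %H:%M"))      # computed reservation end
```

The duration works out to exactly 1825 days (five 365-day years), and the computed end is 2025-01-08 22:58, a Wednesday, which agrees with the truncated End column in the output.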
- Interface 3: A new reservation attribute "reserve_job"
- Visibility: public
- Change Control: Stable
- Synopsis: Allow users to identify whether the reservation was created out of a job.
- Details:
- Example:
[root@d_server /]# pbs_rstat -f | grep job
reserve_job = 3016.d_server
[root@d_server /]#
- Interface 4: pbs_rsub error message when creating a reservation out of a reservation job.
- Visibility: public
- Change Control: Stable
- Synopsis: A new error message indicating that creating a reservation out of a reservation job is not allowed.
- Details:
- Example:
[root@d_server /]# pbs_rsub --job 3016
pbs_rsub: Reservation cannot be created from a reservation job
[root@d_server /]#
- Interface 5: pbs_rsub error message when creating a reservation out of an array job.
- Visibility: public
- Change Control: Stable
- Synopsis: A new error message indicating that creating a reservation out of an array job is not allowed.
- Details:
- Example:
[root@d_server /]# pbs_rsub --job 3[]
pbs_rsub: Reservation cannot be created from an array job
[root@d_server /]# pbs_rsub --job 3[1]
pbs_rsub: Reservation cannot be created from an array job
[root@d_server /]#
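The two rejection cases above (array jobs and reservation jobs) can be sketched as a small classifier. This is a hypothetical model, not pbs_rsub code: the function name, the regular expression, the order of the checks, and the idea of recognising a reservation job by an in_reservation_queue flag are all assumptions. Only the two error strings and the rejected ID forms 3[] and 3[1] come from this design.

```python
import re

# Matches array job IDs such as "3[]" and subjob IDs such as "3[1]",
# optionally followed by ".server_name". Other ID forms are not handled.
ARRAY_JOB_ID = re.compile(r"^\d+\[\d*\](\..+)?$")


def rsub_job_error(job_id, in_reservation_queue):
    """Return the error text for a rejected job, or None if neither check applies."""
    if ARRAY_JOB_ID.match(job_id):
        return "pbs_rsub: Reservation cannot be created from an array job"
    if in_reservation_queue:
        return "pbs_rsub: Reservation cannot be created from a reservation job"
    return None


print(rsub_job_error("3[]", False))
print(rsub_job_error("3[1]", False))
print(rsub_job_error("3016", True))
print(rsub_job_error("1", False))
```

A return value of None here only means that neither of these two checks rejected the job; the state R/42 requirement described for Interface 1 still applies separately.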