Overview

The main motivation behind this RFE is to ensure that jobs are preempted and high priority work runs whenever possible. The scheduler can currently preempt jobs via suspension, checkpointing, and requeuing. If the scheduler tries to preempt via checkpointing and the job requested -c n, or if the scheduler tries to preempt via requeue and the job requested -r n, the scheduler skips over the job. If we provide deletion as another method of preemption, such jobs can no longer be skipped and will be moved out of the way. This gives a high priority job the best chance to run.

Technical Details

Interface 1:

A new letter, 'D', will be accepted by the sched object's 'preempt_order' attribute. It means that jobs are preempted by deleting them. The full set of accepted letters becomes 'SCRD'.

This interface will be set by the admin.  It will be consumed by both the scheduler and the server.  The scheduler uses the interface when it decides if a job can be preempted.  The server uses the interface when it decides how a job is to be preempted.

The default preempt_order will not change (SCR).
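The effect of adding 'D' on the scheduler's "can this job be preempted" decision can be sketched as follows. This is only an illustration of the rule described in the Overview, not the scheduler's actual code: the function name and the checkpointable/rerunnable flags are hypothetical stand-ins for a job that did or did not request -c n or -r n, and the sketch ignores any other reason a method might fail.

```python
def pick_preempt_method(order, checkpointable=True, rerunnable=True):
    """Return the first usable method letter in `order`, or None.

    Hypothetical sketch: `checkpointable` is False when the job
    requested -c n, and `rerunnable` is False when it requested -r n.
    """
    for method in order:
        if method == "C" and not checkpointable:
            continue  # job asked not to be checkpointed
        if method == "R" and not rerunnable:
            continue  # job asked not to be requeued
        return method  # 'S' and 'D' are not blocked by either flag here
    return None  # no usable method: the job is skipped


# A job submitted with both -c n and -r n:
print(pick_preempt_method("CR", checkpointable=False, rerunnable=False))
print(pick_preempt_method("CRD", checkpointable=False, rerunnable=False))
```

With an order of CR the job is skipped (None), while with CRD the sketch falls through to deletion (D).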

Examples:

qmgr -c 's sched default preempt_order = RD'

qmgr -c 's sched default preempt_order = SCRD 50 R'
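To make the accepted value format concrete, here is a minimal validation sketch. It assumes the value alternates between a sequence of method letters and a percentage, starting and ending with letters, and that each percentage is the upper bound for the sequence that follows it. The function name and the returned structure are hypothetical; the real parsing lives in the scheduler and server.

```python
VALID_METHODS = set("SCRD")


def parse_preempt_order(value):
    """Parse a preempt_order string into [(upper_percent, methods), ...].

    Assumed format (not taken from the PBS sources):
    letters [percent letters]...
    """
    tokens = value.split()
    if len(tokens) % 2 == 0:
        # Covers the empty string and a value that ends on a percentage.
        raise ValueError("value must start and end with method letters")
    ranges = []
    upper = 100
    for i, tok in enumerate(tokens):
        if i % 2 == 0:
            # Even positions hold method letters.
            if set(tok) - VALID_METHODS:
                raise ValueError("invalid method letters: %r" % tok)
            ranges.append((upper, tok))
        else:
            # Odd positions hold a strictly decreasing percentage.
            if not tok.isdigit() or not 0 < int(tok) < upper:
                raise ValueError("invalid percentage: %r" % tok)
            upper = int(tok)
    return ranges


print(parse_preempt_order("RD"))         # [(100, 'RD')]
print(parse_preempt_order("SCRD 50 R"))  # [(100, 'SCRD'), (50, 'R')]
```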


The pbs_deljob() IFL call normally returns to the caller as soon as the server has received the request and started the delete process. pbs_preempt_jobs() behaves differently: the server will not reply to a pbs_preempt_jobs() call until all of the jobs have been fully preempted. For a job that is preempted by deletion, this means the server waits until the job is truly deleted before returning. The scheduler needs the jobs out of the way before it starts the high priority job. If pbs_preempt_jobs() returned sooner, the scheduler would oversubscribe the nodes until the jobs had finished being deleted.
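The "do not return until every job is gone" behavior can be illustrated with a small, purely hypothetical model. None of the names below are PBS APIs: delete_job() stands in for the server-side delete, and the log list only records the order of events.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def delete_job(job_id, log):
    # Hypothetical stand-in for the server fully deleting one job.
    log.append(("deleted", job_id))
    return job_id


def preempt_jobs_by_deletion(job_ids, log):
    """Start all deletes, then block until every one has completed."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(delete_job, j, log) for j in job_ids]
        wait(futures)  # the caller gets no reply before this point
    log.append(("returned", None))
    return [f.result() for f in futures]


events = []
preempt_jobs_by_deletion(["1.svr", "2.svr"], events)
print(events[-1])  # ('returned', None)
```

In this model the "returned" event is always last, which mirrors the guarantee the scheduler relies on before it runs the high priority job.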


Advice:

It is unwise to use a runjob hook with preemption via deletion. A runjob hook can reject the high priority job's run request. If that happens, jobs will have been deleted for no reason.






