New Sched attribute to throttle job attribute updates

Motivation: The scheduler sends job attribute updates for certain attributes to the server. These updates can become a performance bottleneck and slow down scheduling. The most problematic are the “job can’t run” updates: for each job that is not run, the scheduler updates the job comment, and possibly accrue_type, even if it might run that job in the very next cycle. Throttling such updates is therefore reasonable and can speed up the scheduler greatly. Reducing the total number of requests sent to the server should also help the server stay responsive to other, more important requests.

External changes:

  • New sched attribute “attr_update_period”: This specifies the time, in seconds, that must pass before the scheduler again sends “job can’t run” related attribute updates to the server. The throttled attributes are:

    • comment

    • estimated.exec_vnode

    • estimated.start_time

  • Default: There will be no default value, but internally PBS will behave as if the value were 0, i.e., the scheduler will send updates every cycle, just like today. Admins at sites that see a large volume of jobs might want to set this to a higher value. With a workload of 100k jobs every cycle, 50k of which wouldn’t run, I saw a 3x+ performance boost when setting this value to just 60 seconds.

  • Exception: If “accrue_type” needs to be updated for a job, the scheduler will ignore the waiting window and send all attribute updates for that job immediately, so that eligible time is accrued accurately.

    • To ameliorate this, I’m also proposing that we change the default accrue_type of jobs to ‘eligible_time’ instead of ‘initial_time’. It seems like we already document “eligible_time” as the default for accrue_type, so this shouldn’t need a separate design change document.

  • Permissions: Manager write only, everyone can read

  • Can be configured per scheduler in a multi-sched scenario

  • All “job run” related attribute updates, like pset, walltime for STF jobs, etc., will be sent like before.

  • Caveats:

    • Scheduler will determine at the start of every cycle whether it should send attribute updates or not. So, there can be some additional delay in the attribute updates getting sent. For example:

      • If attr_update_period is set to 5 mins and each sched cycle takes 2 mins, the scheduler will skip the cycles that start 2 and 4 mins after it last sent updates, and will send updates in the cycle that starts 6 mins after, even though the update period is 5 mins.

    • Depending on the configured period, attributes like the job comment, which tells users why their job isn’t running, will be updated with a delay. Admins should keep this in mind and work out which value of this attribute works best for their site.
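
The timing caveat above can be checked with a small simulation. This is only an illustrative sketch, not scheduler code: the function name is hypothetical, cycles are assumed to run back to back with a fixed duration, the very first cycle is assumed to send updates, and "period elapsed" is assumed to mean "greater than or equal to".

```python
def update_cycle_starts(period_s, cycle_s, n_cycles):
    """Return the start times (in seconds) of the cycles that send throttled updates.

    Assumptions (not from the design): fixed-length, back-to-back cycles,
    the first cycle always sends, and the comparison is >=.
    """
    sent = []
    last_sent = None
    for i in range(n_cycles):
        start = i * cycle_s
        # The decision is made once, at the start of the cycle.
        if last_sent is None or start - last_sent >= period_s:
            sent.append(start)
            last_sent = start
    return sent


# 5-minute period, 2-minute cycles: updates go out every 6 minutes.
print(update_cycle_starts(300, 120, 8))  # [0, 360, 720]
```

With a period of 0 every cycle qualifies, which matches the "behave as if the value were 0" default described above.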


Technical details:

  • At the beginning of each cycle, the scheduler will check whether attr_update_period seconds have passed since it last sent the updates, and will decide on that basis whether to send updates in that cycle.
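
A minimal sketch of this check, combined with the accrue_type exception from the external changes, might look like the following. The class and method names are hypothetical and the actual scheduler is not written this way; the sketch also assumes the timestamp is recorded at the start of a sending cycle and that an unset period behaves as 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpdateThrottle:
    """Hypothetical per-scheduler state for throttling "job can't run" updates."""
    period_s: int = 0                  # attr_update_period; 0 means send every cycle
    last_sent: Optional[float] = None  # start time of the last cycle that sent updates
    send_this_cycle: bool = True

    def start_cycle(self, now: float) -> bool:
        """Decide once, at cycle start, whether this cycle sends throttled updates."""
        due = self.last_sent is None or now - self.last_sent >= self.period_s
        self.send_this_cycle = due
        if due:
            self.last_sent = now
        return due

    def should_send(self, accrue_type_changed: bool) -> bool:
        """A job whose accrue_type must change bypasses the waiting window."""
        return self.send_this_cycle or accrue_type_changed


throttle = UpdateThrottle(period_s=300)
throttle.start_cycle(now=0)        # first cycle: sends
throttle.start_cycle(now=120)      # 2 mins later: throttled
print(throttle.should_send(accrue_type_changed=False))  # False
print(throttle.should_send(accrue_type_changed=True))   # True
```

Keeping the decision in one per-cycle flag mirrors the design's choice to evaluate the period only at cycle start, which is also what produces the extra delay described in the caveats.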