Overview

There is a desire for jobs to be placed on nodes of a single type.  The type groupings can have many values.  Job submitters do not care which value of the group their job is placed on, as long as all the nodes for the job have the same value for the group.  In PBS today, this grouping is called a placement set.  Placement sets can be defined complex-wide, queue-wide, or just for the job.  A more specific placement set overrides a less specific one (e.g. queue placement sets override the complex-wide placement sets).  The new desire is for more finely grained grouping: instead of applying to the whole job, grouping should be able to apply to part of a resource request, but not all of it.

Technical Details

Glossary

chunk complex - The part of a select statement between the pluses: one or more identical chunks specified in the form N:chunk.

New PBS resource: group

There is a new PBS resource named group.  It can only be requested in the select statement.  The value of the group resource is a placement set resource (just like the resources in node_group_key).  When a chunk complex requests group, placement sets are created based on the specified resource, and the chunks from that chunk complex are placed on nodes where the resource is set to the same value.  A select statement can contain multiple group resources, with the same or different values, as long as no single chunk complex contains two group requests.  If two chunk complexes request the same group resource, each chunk complex is evaluated individually.  This means the different chunk complexes can be placed on different placement sets in the same placement pool.
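As an illustration of the rule that a chunk complex may carry at most one group request, here is a minimal Python sketch that splits a select statement into chunk complexes.  The names ChunkComplex and parse_select are hypothetical and are not part of PBS.  The sketch assumes a missing chunk count means 1 and ignores quoted resource values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChunkComplex:
    count: int                   # N in "N:chunk"
    resources: Dict[str, str]    # e.g. {"ncpus": "1"}
    group: Optional[str] = None  # per-chunk grouping resource, if requested


def parse_select(select: str) -> List[ChunkComplex]:
    """Split a select statement on '+' and extract each chunk complex's group."""
    complexes = []
    for part in select.split("+"):
        fields = part.split(":")
        # Assumption: a leading integer is the chunk count, defaulting to 1.
        count = 1
        if fields and fields[0].isdigit():
            count = int(fields[0])
            fields = fields[1:]
        resources: Dict[str, str] = {}
        group = None
        for item in fields:
            name, _, value = item.partition("=")
            if name == "group":
                if group is not None:
                    # Two group requests in one chunk complex are not allowed.
                    raise ValueError(f"two group requests in {part!r}")
                group = value
            else:
                resources[name] = value
        complexes.append(ChunkComplex(count, resources, group))
    return complexes
```

For example, parse_select("2:ncpus=1:group=shape+2:ncpus=1") yields two chunk complexes, the first with group "shape" and the second with no group.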


Interaction with other placement sets

Currently a job can be run within one pool of placement sets.  These will come from the server, queue, or job.  The job's place=group overrides the queue's node_group_key, which overrides the server's node_group_key.  Per-chunk grouping will work similarly.  If a chunk complex requests a group, it will override any other placement sets for the job.  If a job has multiple chunk complexes where some request a group and others do not, the chunk complexes that do not request a group will be placed over all nodes available to the job (e.g. if the job is in a queue with nodes associated with it, only those nodes).  It is invalid to request both place=group and per-chunk grouping.
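The precedence described above can be sketched as a small Python function.  This is only an illustration: effective_groups is a hypothetical name, and node_group_key is simplified to a single resource name.  A result of None for a chunk complex means "all nodes available to the job".

```python
from typing import List, Optional


def effective_groups(
    chunk_groups: List[Optional[str]],
    job_place_group: Optional[str] = None,
    queue_node_group_key: Optional[str] = None,
    server_node_group_key: Optional[str] = None,
) -> List[Optional[str]]:
    """Return the grouping resource used for each chunk complex."""
    per_chunk = any(g is not None for g in chunk_groups)
    if per_chunk and job_place_group is not None:
        raise ValueError("place=group and per-chunk grouping cannot be combined")
    if per_chunk:
        # Per-chunk grouping overrides every other placement set for the job.
        # Chunk complexes without a group stay None (any available node).
        return list(chunk_groups)
    # Otherwise: job place=group, then queue, then server node_group_key.
    for key in (job_place_group, queue_node_group_key, server_node_group_key):
        if key is not None:
            return [key] * len(chunk_groups)
    return [None] * len(chunk_groups)
```

With a server node_group_key of color and chunk groups ["shape", None], the function returns ["shape", None]: the server's key is ignored because one chunk complex requested a group.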

...

shape=triangle - nodes 3-4, 7-8

Example 1: Interaction between server's node_group_key with multiple chunk complexes.  One chunk complex has per-chunk grouping and the other does not.

node_group_key=color

select=2:ncpus=1:group=shape+2:ncpus=1

...

chunk complex 2:ncpus=1 will be run on any node

all nodes - 1-8

Example 2: Per-chunk grouping with two chunk complexes with different groups.

no per-job placement

select=2:ncpus=1:group=color+2:ncpus=1:group=shape

...

shape=triangle - nodes 3-4, 7-8


Example 3: Per-chunk grouping where two chunk complexes request the same group.

no per-job placement

select=2:ncpus=1:group=color+2:ncpus=1:group=color

...

The difference between per-job place=group=color and this request is that the two chunk complexes can be placed on different placement sets.  It is possible for the first chunk complex to be placed on color=blue and the second chunk complex to be placed on color=red.
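To make the difference concrete, here is a rough Python sketch of placing each chunk complex independently.  The node names and colors in NODES are hypothetical, each node is assumed to hold exactly one chunk, and the greedy first-fit choice is a simplification rather than the scheduler's actual algorithm.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical nodes: three blue, two red.
NODES: Dict[str, Dict[str, str]] = {
    "n1": {"color": "blue"},
    "n2": {"color": "blue"},
    "n3": {"color": "blue"},
    "n4": {"color": "red"},
    "n5": {"color": "red"},
}


def build_psets(nodes: Dict[str, Dict[str, str]], resource: str) -> Dict[str, List[str]]:
    """Group node names by their value for the given resource."""
    psets: Dict[str, List[str]] = {}
    for name, res in nodes.items():
        psets.setdefault(res.get(resource, ""), []).append(name)
    return psets


def place(
    chunk_complexes: List[Tuple[int, Optional[str]]],
    nodes: Dict[str, Dict[str, str]],
) -> Optional[List[List[str]]]:
    """Place each (count, group) chunk complex on its own placement set."""
    used: Set[str] = set()
    placement = []
    for count, group in chunk_complexes:
        # No group means one big set of all nodes available to the job.
        psets = build_psets(nodes, group) if group else {"": list(nodes)}
        chosen = None
        for value in sorted(psets):
            free = [n for n in psets[value] if n not in used]
            if len(free) >= count:
                chosen = free[:count]
                break
        if chosen is None:
            return None  # this sketch does not model spanning
        used.update(chosen)
        placement.append(chosen)
    return placement
```

In this sketch, place([(2, "color"), (2, "color")], NODES) puts the first chunk complex on the blue nodes and the second on the red nodes, while a single request for four chunks grouped by color, place([(4, "color")], NODES), finds no placement set large enough.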

Interaction with placement set spanning

Currently, if no placement set is large enough (when empty) to fit a job, the job will span across all nodes available to the job.  This can be controlled with the scheduler's do_not_span_psets attribute.  If do_not_span_psets is true and a job cannot fit within any placement set, the job will not span and will never run.

...

If the job requests per-chunk grouping and any chunk cannot fit, the entire job will span.  This is true regardless of whether other chunks can fit in their placement sets.
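The all-or-nothing spanning rule can be sketched in Python as follows.  The function names are hypothetical, "fits" is simplified to counting nodes in an empty placement set (one chunk per node), and applying do_not_span_psets to the per-chunk case in the same way as the per-job case is an assumption of this sketch.

```python
from typing import Dict, List, Tuple


def fits_when_empty(count: int, psets: Dict[str, List[str]]) -> bool:
    """True if at least one placement set has enough nodes for all chunks."""
    return any(len(members) >= count for members in psets.values())


def spanning_decision(
    requests: List[Tuple[int, Dict[str, List[str]]]],
    do_not_span_psets: bool,
) -> str:
    """requests holds (chunk count, placement sets) for each chunk complex."""
    if all(fits_when_empty(count, psets) for count, psets in requests):
        return "place within placement sets"
    # One chunk complex that cannot fit affects the whole job.
    if do_not_span_psets:
        return "never runs"  # assumption: same rule as per-job grouping
    return "entire job spans"
```

Note that the decision is made for the job as a whole: a single chunk complex that cannot fit changes the outcome even when every other chunk complex fits.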


Interaction with only_explicit_psets

The only_explicit_psets scheduler attribute tunes placement set creation.  If a node does not have a grouping resource set on it, it is usually added to a resource="" (e.g. color="") placement set.  If only_explicit_psets is true, then the resource="" placement set is not created, and those nodes are not available for placement.

There is no change in this behavior with per-chunk grouping.  If only_explicit_psets is true, then the resource="" placement set is not created for that chunk complex's placement pool.
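A minimal Python sketch of this behavior for a single chunk complex's placement pool is shown below.  The function name and the node data are hypothetical; only the treatment of the resource="" placement set follows the description above.

```python
from typing import Dict, List


def build_placement_sets(
    nodes: Dict[str, Dict[str, str]],
    resource: str,
    only_explicit_psets: bool = False,
) -> Dict[str, List[str]]:
    """Group nodes by resource value, optionally dropping the "" set."""
    psets: Dict[str, List[str]] = {}
    for name, res in nodes.items():
        value = res.get(resource, "")
        if value == "" and only_explicit_psets:
            # Nodes without the resource are not available for placement.
            continue
        psets.setdefault(value, []).append(name)
    return psets


# Hypothetical nodes: n3 has no color set.
example_nodes = {
    "n1": {"color": "blue"},
    "n2": {"color": "red"},
    "n3": {},
}
```

With example_nodes, build_placement_sets(example_nodes, "color") includes a "" set containing n3, whereas passing only_explicit_psets=True leaves only the blue and red sets.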

Clarifications

The nodes used to create placement sets:

...