Overview
Placement sets are used to force a job to run on nodes that are alike in certain ways. Placement sets are created from resources of type string_array that are set on nodes. A placement set is the set of nodes on which such a resource has the same value.
Placement sets can be enabled at the server, queue, or job level. Until now, the whole job had to fit within a single placement set. This feature will allow placement sets to be requested per chunk.
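The grouping described above can be sketched in a few lines of Python. This is only an illustration: the node names and the dict layout are hypothetical, and nodes where the resource is unset are simply skipped, which is a simplification and may not match what the scheduler does.

```python
from collections import defaultdict


def build_placement_sets(nodes, resource):
    """Group node names by each value of `resource`.

    `nodes` maps a node name to a dict of resource name -> list of strings,
    loosely mirroring a string_array resource (hypothetical layout).
    Nodes without the resource are left out (a simplifying assumption).
    """
    psets = defaultdict(list)
    for name, resources in nodes.items():
        for value in resources.get(resource, []):
            psets[value].append(name)
    return dict(psets)


# Hypothetical nodes; node4 has no color set.
example_nodes = {
    "node1": {"color": ["blue"]},
    "node2": {"color": ["blue"]},
    "node3": {"color": ["red"]},
    "node4": {},
}
color_sets = build_placement_sets(example_nodes, "color")
print(color_sets)
```

Because the resource is modeled as a list, a node that carries several values would appear in one placement set per value.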
Technical Details
Glossary
chunk complex - The part of a select statement that lies between the pluses: one or more identical chunks specified in the form N:chunk.
New PBS resource: group
There is a new PBS resource named group. It can only be requested in the select statement. The value of the group resource will be a placement set resource (just like the resources in node_group_key). When a chunk complex requests group, placement sets will be created based on the specified resource, and the chunks from that chunk complex will be placed on nodes where the resource is set to the same value. A select statement can contain multiple group requests with different values, as long as no chunk complex contains more than one group request. Two chunk complexes can contain the same group=resource value; each chunk complex will be evaluated individually. This means the different chunk complexes can be placed on different placement sets in the same placement pool.
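To make the one-group-per-chunk-complex rule concrete, here is a minimal Python sketch that splits a select string into chunk complexes and rejects a chunk complex with two group requests. The function name parse_select is hypothetical, and the parsing is deliberately naive: it assumes values never contain quoted colons or pluses, and it is not how PBS itself parses select statements.

```python
def parse_select(select):
    """Split a select string into (count, resources) chunk complexes.

    Raises ValueError if a single chunk complex requests group twice.
    Naive parsing: assumes no quoted ':' or '+' inside resource values.
    """
    complexes = []
    for part in select.split("+"):
        fields = part.split(":")
        count = 1
        # A leading field without '=' is the chunk count N in N:chunk.
        if "=" not in fields[0]:
            count = int(fields[0])
            fields = fields[1:]
        resources = {}
        for field in fields:
            key, _, value = field.partition("=")
            if key == "group" and "group" in resources:
                raise ValueError(f"more than one group in chunk complex {part!r}")
            resources[key] = value
        complexes.append((count, resources))
    return complexes


parsed = parse_select("2:ncpus=1:group=color+2:ncpus=1:group=shape")
print(parsed)
```

Two chunk complexes that both request group=color are accepted by this sketch, since the restriction applies within a chunk complex, not across them.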
Interaction with other placement sets
Currently, a job runs within one pool of placement sets, which comes from the server, queue, or job. The job's place=group overrides the queue's node_group_key, which overrides the server's node_group_key. Per-chunk placement will work slightly differently. If both per-job and per-chunk placement are requested together, the per-chunk placement sets will be made by breaking the per-job placement sets apart according to the per-chunk resource.
Example:
Nodes | Color | Shape |
---|---|---|
1-2 | blue | square |
3-4 | blue | triangle |
5-6 | red | square |
7-8 | red | triangle |
Current:
If the server has node_group_key=color and a job requests place=group=shape, the job's request overrides the server's setting and the job will be placed on one of:
shape=square, which consists of nodes 1, 2, 5, and 6
shape=triangle, which consists of nodes 3, 4, 7, and 8
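The current override order can be sketched as follows, using the example table as data. The helper names are hypothetical and the node table is modeled as a plain dict with one value per resource; this illustrates the precedence rule and is not scheduler code.

```python
# The example table: node number -> resources.
NODES = {
    1: {"color": "blue", "shape": "square"},
    2: {"color": "blue", "shape": "square"},
    3: {"color": "blue", "shape": "triangle"},
    4: {"color": "blue", "shape": "triangle"},
    5: {"color": "red", "shape": "square"},
    6: {"color": "red", "shape": "square"},
    7: {"color": "red", "shape": "triangle"},
    8: {"color": "red", "shape": "triangle"},
}


def effective_group_key(job=None, queue=None, server=None):
    """Return the first key that is set, in job > queue > server order."""
    for key in (job, queue, server):
        if key:
            return key
    return None


def placement_sets(nodes, key):
    """Group node numbers by the value of `key`."""
    psets = {}
    for node, resources in nodes.items():
        psets.setdefault(resources[key], []).append(node)
    return psets


key = effective_group_key(job="shape", server="color")
print(key, placement_sets(NODES, key))
```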
Example 1: Interaction between server's node_group_key and per-chunk placement
If the server has node_group_key=color and a job requests select=group=shape, the job will be placed on one of:
color=blue+shape=square: nodes 1-2
color=blue+shape=triangle: nodes 3-4
color=red+shape=square: nodes 5-6
color=red+shape=triangle: nodes 7-8
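One way to picture "breaking the per-job placement sets apart" is to group nodes by the pair of values (per-job resource, per-chunk resource). The sketch below does that for the example table; the function name is hypothetical and the approach is an assumption about how the split could be modeled, not a description of the scheduler's internals.

```python
from collections import defaultdict

# The example table: node number -> resources.
NODES = {
    1: {"color": "blue", "shape": "square"},
    2: {"color": "blue", "shape": "square"},
    3: {"color": "blue", "shape": "triangle"},
    4: {"color": "blue", "shape": "triangle"},
    5: {"color": "red", "shape": "square"},
    6: {"color": "red", "shape": "square"},
    7: {"color": "red", "shape": "triangle"},
    8: {"color": "red", "shape": "triangle"},
}


def combined_placement_sets(nodes, job_key, chunk_key):
    """Split each per-job set by the per-chunk resource value."""
    psets = defaultdict(list)
    for node, resources in nodes.items():
        psets[(resources[job_key], resources[chunk_key])].append(node)
    return dict(psets)


combined = combined_placement_sets(NODES, "color", "shape")
for (color, shape), members in combined.items():
    print(f"color={color}+shape={shape}: nodes {members}")
```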
Example 2: Interaction between the server's node_group_key and multiple chunk complexes. One chunk complex has per-chunk placement and the other does not.
node_group_key=color
select=2:ncpus=1:group=shape+2:ncpus=1
Chunk complex 2:ncpus=1:group=shape will be placed on one of the same placement sets as in Example 1
Chunk complex 2:ncpus=1 will be placed on one of:
color=blue nodes 1-4
color=red nodes 5-8
Example 3: Per-chunk placement with two chunk complexes with different groups.
no per-job placement
select=2:ncpus=1:group=color+2:ncpus=1:group=shape
Chunk complex 2:ncpus=1:group=color will be placed on color=blue (nodes 1-4) or color=red (nodes 5-8)
Chunk complex 2:ncpus=1:group=shape will be placed on shape=square (nodes 1-2, 5-6) or shape=triangle (nodes 3-4, 7-8)
Example 4: Per-chunk placement where two chunk complexes request the same group.
no per-job placement
select=2:ncpus=1:group=color+2:ncpus=1:group=color
Both chunk complexes will be placed similarly: color=blue (nodes 1-4) or color=red (nodes 5-8). The difference between per-job place=group=color and this request is that the two chunk complexes can be placed on different placement sets. It is possible for the first chunk complex to be placed on color=blue and the second chunk complex to be placed on color=red.
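The difference from per-job placement shows up when no single placement set has room for the whole job. The sketch below assumes one node per chunk and a first-fit choice of placement set, and it invents a situation where only nodes 1, 2, 5, and 6 are free; all of that is hypothetical and only meant to illustrate that chunk complexes are evaluated individually.

```python
def place_complex(free, count):
    """First-fit: take `count` nodes from the first set with enough free nodes.

    `free` maps a placement set value to its list of free nodes and is
    updated in place. Returns (value, nodes) or None if nothing fits.
    Assumes one node per chunk (a simplification).
    """
    for value, nodes in free.items():
        if len(nodes) >= count:
            free[value] = nodes[count:]
            return value, nodes[:count]
    return None


# Hypothetical state: only two nodes are free in each color.
free_by_color = {"blue": [1, 2], "red": [5, 6]}

# Per-job place=group=color: all four chunks must share one set.
per_job = place_complex(dict(free_by_color), 4)

# Per-chunk group=color on both chunk complexes: each is placed on its own.
free = dict(free_by_color)
first = place_complex(free, 2)
second = place_complex(free, 2)
print(per_job, first, second)
```

In this made-up state the per-job request finds no placement set, while the per-chunk request puts the first chunk complex on color=blue and the second on color=red.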
Interaction with do_not_span_psets
Currently, if no placement set is large enough (when empty) to fit a job, the job will span across all nodes. This can be controlled with the scheduler's do_not_span_psets attribute. If do_not_span_psets is true and a job cannot fit within any placement set, the job will never run.
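The current per-job behavior amounts to a small decision, sketched below. Sizes are counted in nodes of an empty placement set and one node per chunk is assumed; the function name and the returned strings are hypothetical labels, not PBS output.

```python
def per_job_outcome(nodes_needed, pset_sizes, do_not_span_psets=False):
    """Decide how a job is handled under per-job placement sets.

    `pset_sizes` holds the size of each placement set when empty.
    """
    if any(nodes_needed <= size for size in pset_sizes):
        return "run in a placement set"
    if do_not_span_psets:
        return "never run"
    return "span all nodes"


# With node_group_key=color in the example table, both sets have 4 nodes.
print(per_job_outcome(3, [4, 4]))
print(per_job_outcome(6, [4, 4]))
print(per_job_outcome(6, [4, 4], do_not_span_psets=True))
```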
If no per-chunk placement is requested, per-job placement set spanning will work as it does today.
If only per-chunk placement is requested, it will work similarly to per-job placement set spanning, but it will only affect the chunk complex. If the chunk complex cannot fit into any placement set, it will span over all nodes. If multiple chunk complexes have different groups, spanning for each chunk complex is evaluated individually.
If both per-job placement and per-chunk placement are requested, spanning will happen in a tiered fashion. We first try to place chunks using their per-job+per-chunk placement sets. If any chunk cannot fit, we will attempt to place the whole job in the per-job placement sets. If the job still cannot fit, we will span over all nodes as we do today.
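The tiered fallback can be sketched as a three-step decision. As before, this assumes one node per chunk and compares against placement set sizes when empty; the function name, argument layout, and tier labels are hypothetical, and do_not_span_psets is left out because its interaction with the tiers is not spelled out here.

```python
def pick_tier(chunk_needs, combined_sizes, per_job_sizes):
    """Return which tier a job with both kinds of placement would use.

    chunk_needs:    nodes needed by each chunk complex.
    combined_sizes: for each chunk complex, the sizes of the
                    per-job+per-chunk sets it could use.
    per_job_sizes:  sizes of the per-job placement sets.
    """
    # Tier 1: every chunk complex fits in one of its combined sets.
    if all(
        any(need <= size for size in sizes)
        for need, sizes in zip(chunk_needs, combined_sizes)
    ):
        return "per-job+per-chunk"
    # Tier 2: the whole job fits in a single per-job set.
    if sum(chunk_needs) <= max(per_job_sizes, default=0):
        return "per-job"
    # Tier 3: span over all nodes.
    return "span"


# node_group_key=color with group=shape on the example table:
# four combined sets of 2 nodes, two per-job sets of 4 nodes.
combined = [[2, 2, 2, 2]]
print(pick_tier([2], combined, [4, 4]))
print(pick_tier([3], combined, [4, 4]))
print(pick_tier([5], combined, [4, 4]))
```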