
Overview

Users want jobs to be placed on nodes of a single type. A type grouping can have many values. Job submitters do not care which value of the group their job is placed on, as long as all the nodes for the job have the same value for the group. In PBS today, this grouping is called a placement set. Placement sets can be defined complex-wide, queue-wide, or per-job. More specific placement sets take priority over less specific ones (e.g. queue placement sets take priority over complex-wide sets). The new requirement is finer-grained grouping: instead of per-job, grouping should be per-rank or per set of ranks.

Technical Details

Glossary

chunk complex - The part of a select statement between two pluses: one or more identical chunks specified in the form N:chunk.

New PBS resource: group

There is a new PBS resource named group. It can only be requested in the select statement. The value of the group resource will be a placement set resource (just like the resources in node_group_key). When a chunk complex requests group, placement sets will be created based on the specified resource, and the chunks from that chunk complex will be placed on nodes where the resource is set to the same value. A select statement can contain multiple group resources with different values, as long as no chunk complex contains two group resource requests. Two chunk complexes can request the same group=resource value; each chunk complex will be evaluated individually. This means the different chunk complexes can be placed on different placement sets in the same placement pool.
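The rule that a chunk complex may carry at most one group request can be illustrated with a small parser. This is only a sketch, not PBS code: parse_select is a hypothetical helper, it takes the text after "select=", and it assumes a simplified syntax with no quoted values.

```python
def parse_select(select):
    """Split a select value into chunk complexes (the parts between pluses)."""
    complexes = []
    for part in select.split("+"):
        fields = part.split(":")
        # A leading integer is the N in N:chunk; a bare chunk counts as 1.
        count = 1
        if fields[0].isdigit():
            count = int(fields[0])
            fields = fields[1:]
        resources = {}
        for field in fields:
            name, _, value = field.partition("=")
            # Reject a second group request inside the same chunk complex.
            if name == "group" and "group" in resources:
                raise ValueError("two group requests in chunk complex: " + part)
            resources[name] = value
        complexes.append({"count": count, "resources": resources})
    return complexes


for cc in parse_select("2:ncpus=1:group=color+2:ncpus=1:group=shape"):
    print(cc["count"], cc["resources"].get("group"))
```

Different chunk complexes may name different (or identical) group resources; only a repeat within one chunk complex raises an error in this sketch.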


Interaction with other placement sets

Currently a job can run within only one pool of placement sets, which comes from the server, the queue, or the job. The job's place=group overrides the queue's node_group_key, which overrides the server's node_group_key. Per-chunk placement sets will work similarly. If a chunk complex requests a group, that request overrides any other placement sets for the job when placing that chunk complex. If a job has multiple chunk complexes where some request a group and others do not, the chunk complexes that did not request a group will be placed using the placement sets that apply to the job as a whole.
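The precedence order described above can be sketched as a simple lookup. The function and parameter names are hypothetical, and the sketch assumes each level names a single resource (or None when unset).

```python
def effective_group(chunk_group=None, job_group=None, queue_key=None, server_key=None):
    """Return the grouping resource used to place one chunk complex.

    Most specific wins: the chunk complex's group, then the job's place=group,
    then the queue's node_group_key, then the server's node_group_key.
    """
    for candidate in (chunk_group, job_group, queue_key, server_key):
        if candidate is not None:
            return candidate
    return None  # no placement sets apply


# A chunk complex that requests group=shape ignores the server's color key.
print(effective_group(chunk_group="shape", server_key="color"))  # shape
# A chunk complex without a group falls back to the server's key.
print(effective_group(server_key="color"))  # color
```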

Example:

Nodes   Color   Shape
1-2     blue    square
3-4     blue    triangle
5-6     red     square
7-8     red     triangle


Current:

If the server has node_group_key=color and a job requests place=group=shape, the job will be placed on one of:

shape=square, which consists of nodes 1, 2, 5, and 6

shape=triangle, which consists of nodes 3, 4, 7, and 8
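The placement sets listed above follow from grouping the example nodes by the value of the requested resource. A minimal sketch, using a hypothetical NODES dictionary that mirrors the example table:

```python
from collections import defaultdict

# Node number -> resource values, copied from the example table.
NODES = {
    1: {"color": "blue", "shape": "square"},
    2: {"color": "blue", "shape": "square"},
    3: {"color": "blue", "shape": "triangle"},
    4: {"color": "blue", "shape": "triangle"},
    5: {"color": "red", "shape": "square"},
    6: {"color": "red", "shape": "square"},
    7: {"color": "red", "shape": "triangle"},
    8: {"color": "red", "shape": "triangle"},
}


def placement_sets(nodes, resource):
    """Group node numbers by the value they have for the given resource."""
    sets = defaultdict(list)
    for node_id in sorted(nodes):
        sets[nodes[node_id][resource]].append(node_id)
    return dict(sets)


print(placement_sets(NODES, "shape"))
# {'square': [1, 2, 5, 6], 'triangle': [3, 4, 7, 8]}
print(placement_sets(NODES, "color"))
# {'blue': [1, 2, 3, 4], 'red': [5, 6, 7, 8]}
```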

Example 1: Interaction between server's node_group_key and per-chunk placement

If the server has node_group_key=color and a job requests select=group=shape, the job will be placed on:

shape=square nodes 1-2, 5-6

shape=triangle nodes 3-4, 7-8


Example 2: Interaction of the server's node_group_key with multiple chunk complexes, where one chunk complex has per-chunk placement and the other does not.

node_group_key=color

select=2:ncpus=1:group=shape+2:ncpus=1

Chunk complex 2:ncpus=1:group=shape will be run on the same node selection as example 1

Chunk complex 2:ncpus=1 will be run on:

color=blue nodes 1-4

color=red nodes 5-8


Example 3: Per-chunk placement with two chunk complexes with different groups.

no per-job placement

select=2:ncpus=1:group=color+2:ncpus=1:group=shape

Chunk complex 2:ncpus=1:group=color will be placed on color=blue (nodes 1-4) or color=red (nodes 5-8)

Chunk complex 2:ncpus=1:group=shape will be placed on shape=square (nodes 1-2, 5-6) or shape=triangle (nodes 3-4, 7-8)


Example 4: Per-chunk placement where two chunk complexes request the same group.

no per-job placement

select=2:ncpus=1:group=color+2:ncpus=1:group=color

Both chunk complexes will be placed similarly: color=blue (nodes 1-4) or color=red (nodes 5-8). The difference between per-job place=group=color and this request is that the two chunk complexes can be placed on different placement sets. It is possible for the first chunk complex to be placed on color=blue and the second chunk complex to be placed on color=red.
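The difference can be made concrete by enumerating which placement sets each chunk complex may land on. The sketch below ignores node capacity and only lists the allowed combinations; the helper names are hypothetical.

```python
from itertools import product

# color placement sets from the example table
COLOR_SETS = {"blue": [1, 2, 3, 4], "red": [5, 6, 7, 8]}


def per_chunk_choices(sets, n_complexes):
    """Each chunk complex picks its placement set independently."""
    return list(product(sorted(sets), repeat=n_complexes))


def per_job_choices(sets, n_complexes):
    """All chunk complexes must share one placement set."""
    return [(name,) * n_complexes for name in sorted(sets)]


print(per_chunk_choices(COLOR_SETS, 2))
# [('blue', 'blue'), ('blue', 'red'), ('red', 'blue'), ('red', 'red')]
print(per_job_choices(COLOR_SETS, 2))
# [('blue', 'blue'), ('red', 'red')]
```

The mixed combinations, such as ('blue', 'red'), are only available with per-chunk placement.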

Interaction with placement set spanning

Currently, if no placement set is large enough (when empty) to fit a job, the job will span across all nodes. This can be controlled with the scheduler's do_not_span_psets attribute. If do_not_span_psets is true and a job cannot fit within any placement set, the job will never run.
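The current per-job behaviour amounts to a three-way decision. The sketch below is illustrative only: span_decision is a hypothetical function, and it assumes for simplicity that every chunk needs one whole node, so a placement set's size when empty is just its node count.

```python
def span_decision(chunks_needed, sets, do_not_span_psets=False):
    """Decide how a job is treated, based on placement set sizes when empty."""
    if any(len(nodes) >= chunks_needed for nodes in sets.values()):
        return "fits in a placement set"
    if do_not_span_psets:
        return "never runs"
    return "spans all nodes"


color_sets = {"blue": [1, 2, 3, 4], "red": [5, 6, 7, 8]}
print(span_decision(3, color_sets))                          # fits in a placement set
print(span_decision(5, color_sets))                          # spans all nodes
print(span_decision(5, color_sets, do_not_span_psets=True))  # never runs
```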

If no per-chunk placement is requested, per-job placement set spanning will work as it does today.

If only per-chunk placement is requested, spanning will work similarly to per-job placement set spanning, but it will only affect the chunk complex. If the chunk complex cannot fit into any placement set, it will span over all nodes. If multiple chunk complexes have different groups, spanning for each chunk complex is evaluated individually.

If both per-job placement and per-chunk placement are requested, spanning will happen in a tiered fashion. We first try to place chunks using their per-chunk placement sets. If any chunk cannot fit, we will attempt to place the whole job in the per-job placement sets. If the job still cannot fit, we will span over all nodes as we do today.
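The tiered fallback can be sketched as follows. This is not the scheduler's algorithm: place_tiered and first_fit are hypothetical names, every chunk is assumed to need one whole node, fit is checked against placement set sizes when empty, and overlap between the sets chosen for different chunk complexes is ignored.

```python
# Placement sets from the example table, keyed by resource name.
SETS = {
    "color": {"blue": [1, 2, 3, 4], "red": [5, 6, 7, 8]},
    "shape": {"square": [1, 2, 5, 6], "triangle": [3, 4, 7, 8]},
}


def first_fit(chunks_needed, sets):
    """Name of the first placement set with enough nodes, else None."""
    for name in sorted(sets):
        if len(sets[name]) >= chunks_needed:
            return name
    return None


def place_tiered(complexes, sets_by_resource, job_resource, total_nodes):
    """complexes: list of (chunk_count, group resource or None)."""
    # Tier 1: each chunk complex in its own per-chunk placement set;
    # a chunk complex without a group uses the per-job placement sets.
    choices = [
        first_fit(count, sets_by_resource[res or job_resource])
        for count, res in complexes
    ]
    if all(choice is not None for choice in choices):
        return ("per-chunk", choices)
    # Tier 2: the whole job inside a single per-job placement set.
    total = sum(count for count, _ in complexes)
    choice = first_fit(total, sets_by_resource[job_resource])
    if choice is not None:
        return ("per-job", choice)
    # Tier 3: span over all nodes, as is done today.
    if total <= total_nodes:
        return ("span", None)
    return ("cannot run", None)


print(place_tiered([(2, "shape"), (2, None)], SETS, "color", 8))
# ('per-chunk', ['square', 'blue'])
print(place_tiered([(5, "shape")], SETS, "color", 8))
# ('span', None)
```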

