
Overview

There is a desire for jobs to be placed on nodes of a single type.  The type groupings can have many values.  Job submitters do not care which value of the group their job is placed on, as long as all the nodes for the job have the same value for the group.  In PBS today, this grouping is called a placement set.  Placement sets can be defined complex-wide, queue-wide, or just for the job.  More specific placement sets override less specific ones (e.g. queue placement sets override the complex-wide sets).  The new desire is for more finely grained grouping.  Instead of per-job, grouping should be per-rank or per set of ranks.

Technical Details

Glossary

chunk complex - The part of a select statement between the pluses: one or more identical chunks, specified in the form N:chunk.

New PBS resource: group

There is a new PBS resource named group.  It can only be requested in the select statement.  The value of the group resource is the name of a placement set resource (just like the resources in node_group_key).  When a chunk complex requests group, placement sets will be created based on the specified resource, and the chunks from that chunk complex will be placed on nodes where the resource is set to the same value.  A select statement can contain multiple group resources with different values as long as there are not two group resource requests in the same chunk complex.  Two chunk complexes can contain the same group=resource value; each chunk complex will be evaluated individually.  This means the different chunk complexes can be placed on different placement sets in the same placement pool.
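The rule that a chunk complex may request group at most once can be illustrated with a small Python sketch.  This is not PBS code: parse_select is a hypothetical helper, and it assumes a simplified select syntax with no quoted resource values containing ':' or '+'.

```python
def parse_select(select):
    """Split a select string into chunk complexes.

    Returns a list of (count, resources) tuples, one per chunk complex.
    Hypothetical helper for illustration; assumes no quoted values.
    """
    complexes = []
    for part in select.split("+"):
        fields = part.split(":")
        # A leading integer is the chunk count; it defaults to 1 when omitted.
        count = 1
        if fields[0].isdigit():
            count = int(fields[0])
            fields = fields[1:]
        resources = {}
        for field in fields:
            name, _, value = field.partition("=")
            # Only one group request is allowed per chunk complex.
            if name == "group" and "group" in resources:
                raise ValueError(
                    f"chunk complex {part!r} requests group more than once"
                )
            resources[name] = value
        complexes.append((count, resources))
    return complexes


if __name__ == "__main__":
    for count, resources in parse_select("2:ncpus=1:group=shape+2:ncpus=1"):
        print(count, resources.get("group", "(no group requested)"))
```

Different chunk complexes may each carry their own group request, so select=2:ncpus=1:group=color+2:ncpus=1:group=shape is accepted by this sketch, while 1:ncpus=1:group=color:group=shape is rejected.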


Interaction with other placement sets

Currently a job can be run within one pool of placement sets.  The pool will come from the server, the queue, or the job.  The job's place=group overrides the queue's node_group_key, which overrides the server's node_group_key.  Per-chunk placement sets will work similarly.  If a chunk complex requests a group, that group will override any other placement sets that would otherwise apply to that chunk complex.  If a job has multiple chunk complexes where some request a group and others do not, the chunk complexes that did not request a group will be placed by the per-job placement sets.
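The override order described above can be sketched as a simple first-match lookup.  The function name and its parameters are hypothetical, and the sketch assumes each level names a single grouping resource (or None when unset).

```python
def effective_group_key(chunk_group=None, job_group=None,
                        queue_key=None, server_key=None):
    """Return the grouping resource that applies to one chunk complex.

    Most specific wins: per-chunk group, then the job's place=group,
    then the queue's node_group_key, then the server's node_group_key.
    Hypothetical helper; each argument is a resource name or None.
    """
    for key in (chunk_group, job_group, queue_key, server_key):
        if key is not None:
            return key
    return None  # no placement sets apply at any level


if __name__ == "__main__":
    # A chunk complex that requests group=shape on a server with
    # node_group_key=color is grouped by shape.
    print(effective_group_key(chunk_group="shape", server_key="color"))
    # A chunk complex without its own group falls back to the server key.
    print(effective_group_key(server_key="color"))
```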

Example:

Nodes  Color  Shape
1-2    blue   square
3-4    blue   triangle
5-6    red    square
7-8    red    triangle


Current:

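If the server has node_group_key=color and a job requests place=group=shape, the job's request overrides the server setting and the job will be placed entirely within one of the following placement sets:

shape=square, which consists of nodes 1, 2, 5, and 6

shape=triangle, which consists of nodes 3, 4, 7, and 8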
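The way placement sets follow from node resource values can be reproduced with a short Python sketch.  The NODES dictionary mirrors the example table; placement_sets is an illustrative helper, not a PBS function.

```python
# Node attributes from the example table (node number -> resource values).
NODES = {
    1: {"color": "blue", "shape": "square"},
    2: {"color": "blue", "shape": "square"},
    3: {"color": "blue", "shape": "triangle"},
    4: {"color": "blue", "shape": "triangle"},
    5: {"color": "red", "shape": "square"},
    6: {"color": "red", "shape": "square"},
    7: {"color": "red", "shape": "triangle"},
    8: {"color": "red", "shape": "triangle"},
}


def placement_sets(nodes, resource):
    """Group node numbers by their value of the given resource."""
    sets = {}
    for node_id in sorted(nodes):
        value = nodes[node_id].get(resource)
        if value is not None:  # nodes without the resource join no set
            sets.setdefault(value, []).append(node_id)
    return sets


if __name__ == "__main__":
    print(placement_sets(NODES, "shape"))
    print(placement_sets(NODES, "color"))
```

Grouping by shape gives the two sets listed above, and grouping by color gives blue (nodes 1-4) and red (nodes 5-8).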

Example 1: Interaction between the server's node_group_key and multiple chunk complexes.  One chunk complex has per-chunk placement and the other does not.

node_group_key=color

select=2:ncpus=1:group=shape+2:ncpus=1

Chunk complex 2:ncpus=1:group=shape will be placed using its own group=shape placement sets:

shape=square nodes 1-2, 5-6

shape=triangle nodes 3-4, 7-8

Chunk complex 2:ncpus=1 requests no group, so it will be placed using the server's node_group_key=color placement sets:

color=blue nodes 1-4

color=red nodes 5-8


Example 2: Per-chunk placement with two chunk complexes with different groups.

no per-job placement

select=2:ncpus=1:group=color+2:ncpus=1:group=shape

Chunk complex 2:ncpus=1:group=color will be placed on color=blue (nodes 1-4) or color=red (nodes 5-8)

Chunk complex 2:ncpus=1:group=shape will be placed on shape=square (nodes 1-2, 5-6) or shape=triangle (nodes 3-4, 7-8)


Example 3: Per-chunk placement where two chunk complexes request the same group.

no per-job placement

select=2:ncpus=1:group=color+2:ncpus=1:group=color

Both chunk complexes will be placed similarly:  color=blue (nodes 1-4) or color=red (nodes 5-8).  The difference between per-job place=group=color and this request is that the two chunk complexes can be placed on different placement sets.  It is possible for the first chunk complex to be placed on color=blue and the second chunk complex to be placed on color=red.
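The difference from per-job placement can be made concrete by enumerating the combinations each approach allows.  This is a sketch only: the helper names are hypothetical, and it ignores node capacity and availability, which a real scheduler would also check.

```python
from itertools import product

# Placement sets for group=color, taken from the example table.
COLOR_SETS = {"blue": [1, 2, 3, 4], "red": [5, 6, 7, 8]}


def per_chunk_assignments(groups, pools):
    """Every combination of placement sets when each chunk complex
    picks independently.  groups lists the group resource requested by
    each chunk complex; pools maps resource name -> {value: nodes}.
    """
    choices = [sorted(pools[group]) for group in groups]
    return list(product(*choices))


def per_job_assignments(group, pools, n_complexes):
    """With per-job place=group, all chunk complexes share one set."""
    return [(value,) * n_complexes for value in sorted(pools[group])]


if __name__ == "__main__":
    pools = {"color": COLOR_SETS}
    print(per_chunk_assignments(["color", "color"], pools))
    print(per_job_assignments("color", pools, 2))
```

Per-chunk placement allows four combinations for the two chunk complexes, including blue for the first and red for the second, while per-job placement allows only the two combinations where both land on the same color.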

Interaction with placement set spanning

Currently, if no placement set is large enough (when empty) to fit a job, the job will span across all nodes.  This can be controlled with the scheduler's do_not_span_psets attribute.  If do_not_span_psets is true and a job cannot fit within any placement set, the job will not span and will never run.

If no per-chunk placement is requested, per-job placement set spanning will work like it does today.

Spanning will not change.  The decision to span will be made at the job level.  Regardless of whether the entire job cannot fit in any per-job placement set or a single chunk complex cannot fit in any per-chunk placement set, the entire job will span over all nodes.  The interaction with do_not_span_psets will not change either.
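The job-level spanning decision can be summarized in a small sketch.  The function, its parameters, and the returned strings are hypothetical labels chosen for illustration; the inputs are assumed to be precomputed "fits when empty" checks.

```python
def placement_decision(fits_job_pset, chunk_complex_fits, do_not_span_psets):
    """Decide how a job is treated with respect to placement sets.

    fits_job_pset: True if the job fits in some per-job placement set
        when that set is empty (pass True if no per-job sets apply).
    chunk_complex_fits: one bool per chunk complex that requested group,
        True if it fits in some per-chunk placement set when empty.
    do_not_span_psets: the scheduler attribute of the same name.
    """
    if fits_job_pset and all(chunk_complex_fits):
        return "place_in_psets"
    # A single failure at either level affects the whole job.
    if do_not_span_psets:
        return "never_run"
    return "span_all_nodes"


if __name__ == "__main__":
    # One chunk complex cannot fit in any per-chunk placement set,
    # so the entire job spans.
    print(placement_decision(True, [True, False], do_not_span_psets=False))
    # The same job never runs when spanning is disabled.
    print(placement_decision(True, [True, False], do_not_span_psets=True))
```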
