Architecture Design
Synopsis
As part of the BASIL 1.7 project, PBS will make a System Query to get KNL Node information. One vnode per KNL node will be created from this information.
...
Additional attributes such as numa_cfg, hbm_cache_pct and hbm_size_mb will also be considered when creating KNL vnodes.
Current behavior
PBS makes an INVENTORY Query request (using BASIL 1.4).
The Query response (from ALPS) is an XML representation of Compute Nodes.
Flow of control
New behavior
PBS will make a SYSTEM Query request (using BASIL 1.7) to collect information on KNL Nodes.
...
The following table shows how the System Query attributes (in the XML response) map to the elements of the basil.h structure (basil_system_element_t) that is populated with the parsed XML information.
XML attribute name | Corresponding Structure element name (in basil.h) | Expected Values | Comments |
---|---|---|---|
role | role | batch, interactive | This attribute is used for KNL node determination. The structure element "role" will be set to "UNKNOWN" when unexpected attribute values are encountered in the XML response. |
state | state | up, down, unavailable, routing, suspect, admin | This attribute is used for KNL node determination. The structure element "state" will be set to "UNKNOWN" when unexpected attribute values are encountered in the XML response. |
speed | speed | Value cannot be an empty string, cannot be negative, cannot be "0". | |
numa_nodes | numa_nodes | Value cannot be an empty string, cannot be negative, cannot be "0". | This attribute is ignored during KNL vnode creation. |
dies | n_dies | Value cannot be an empty string, cannot be negative, can be "0". | This attribute is ignored during KNL vnode creation. |
compute_units | compute_units | Value cannot be an empty string, cannot be negative, can be "0". | This attribute will be displayed in 'resources_available.nppus'. |
cpus_per_cu | cpus_per_cu | Value cannot be an empty string, cannot be negative, cannot be "0". | This attribute will be displayed in 'resources_available.vps_per_ppu'. The product of compute_units and cpus_per_cu will be displayed in 'resources_available.ncpus'. |
page_size_kb | avlmem | Value of attribute page_size_kb cannot be an empty string, cannot be negative, cannot be "0". avlmem holds the product of page_size_kb and page_count. | This represents conventional DRAM memory (will be displayed as 'resources_available.mem'). |
page_size_kb | pgszl2 | pgszl2 holds X, where 2^X is the page size (page_size_kb) expressed in bytes. | |
page_count | avlmem (refer to the avlmem note above, under "Expected Values") | Value cannot be an empty string, cannot be negative, can be "0". | |
accels | accel_name | Not every Node group in the System 1.7 XML response has this attribute. When it is present, the attribute value cannot be an empty string. | If this attribute is present in the XML response, its value is captured during XML parsing. However, the attribute is ignored during subsequent KNL vnode creation, i.e. KNL vnodes will be created without it. KNL nodes cannot have GPUs. |
accel_state | accel_state | Not every Node group in the System 1.7 XML response has this attribute. When it is present, the attribute value should be "up" or "down". | If this attribute is present in the XML response, its value is captured during XML parsing, and the structure element "accel_state" is set to "UNKNOWN" when unexpected values are encountered. However, the attribute is ignored during subsequent KNL vnode creation, i.e. KNL vnodes will be created without it. |
numa_cfg | numa_cfg | a2a, snc2, snc4, hemi, quad. This attribute will always have a value (non-empty string) for KNL Nodes. The value will be an empty string for non-KNL Nodes. | |
hbm_size_mb | hbmsize | Value of hbm_size_mb cannot be negative. This attribute will always have a value (non-empty string) for KNL Nodes. This will be an empty string for non-KNL Nodes. | This represents High Bandwidth MCDRAM memory (in MB) (will be displayed as 'resources_available.hbmem'). |
hbm_cache_pct | hbm_cfg | Value of hbm_cache_pct will be one of 0, 25, 50 or 100. This attribute will always have a value (non-empty string) for KNL Nodes. This will be an empty string for non-KNL Nodes. | |
None | nidlist | The Rangelist of Node IDs. | The XML response does not have a specific attribute name corresponding to the "nidlist" structure element. During XML parsing, the Rangelist of Node IDs (in the incoming XML) is assigned to the "nidlist" structure element. This is repeated for every Node group in the XML response. |
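
The validation rules in the table above could be sketched in C roughly as follows. This is an illustration only, not the PBS parser: every name prefixed with sketch_ is hypothetical and is not taken from basil.h. Two behaviors are assumptions that the table does not cover: avlmem is kept in KB with no unit conversion, and a page size that is not a power of two is rejected with -1.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical enum; the real basil.h representation may differ. */
typedef enum {
	SKETCH_ROLE_BATCH,
	SKETCH_ROLE_INTERACTIVE,
	SKETCH_ROLE_UNKNOWN
} sketch_role_t;

/* Map the XML "role" value to an enum, falling back to UNKNOWN. */
sketch_role_t
sketch_parse_role(const char *val)
{
	if (val != NULL && strcmp(val, "batch") == 0)
		return SKETCH_ROLE_BATCH;
	if (val != NULL && strcmp(val, "interactive") == 0)
		return SKETCH_ROLE_INTERACTIVE;
	return SKETCH_ROLE_UNKNOWN;
}

/*
 * Parse a numeric attribute value. Rejects NULL, empty strings, negative
 * values, trailing non-digit characters and overflow. "0" is rejected
 * unless allow_zero is non-zero. Returns 0 and stores the value in *out
 * on success, -1 otherwise.
 */
int
sketch_parse_count(const char *val, int allow_zero, long *out)
{
	char *end = NULL;
	long n;

	if (val == NULL || *val == '\0')
		return -1;
	errno = 0;
	n = strtol(val, &end, 10);
	if (errno != 0 || end == val || *end != '\0')
		return -1;
	if (n < 0 || (n == 0 && !allow_zero))
		return -1;
	*out = n;
	return 0;
}

/* avlmem: product of page_size_kb and page_count (assumed to stay in KB).
 * Returns -1 on invalid input or overflow. */
long
sketch_avlmem_kb(long page_size_kb, long page_count)
{
	if (page_size_kb <= 0 || page_count < 0)
		return -1;
	if (page_count > LONG_MAX / page_size_kb)
		return -1;
	return page_size_kb * page_count;
}

/* pgszl2: X such that 2^X is the page size in bytes.
 * Assumption: returns -1 when page_size_kb is not a power of two. */
int
sketch_pgszl2(long page_size_kb)
{
	int x = 10; /* 1 KB = 2^10 bytes */

	if (page_size_kb <= 0 || (page_size_kb & (page_size_kb - 1)) != 0)
		return -1;
	while (page_size_kb > 1) {
		page_size_kb >>= 1;
		x++;
	}
	return x;
}
```

Under these assumptions, page_size_kb="4" with page_count="1000" yields an avlmem of 4000 (KB) and a pgszl2 of 12, since 2^12 bytes is 4 KB. An attribute such as compute_units would be parsed with allow_zero set, while cpus_per_cu would be parsed with it cleared.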
Handling unexpected attribute values
...