Revision to PBS_MOM_NODE_NAME configuration variable
Links
- Link to discussion on Developer Forum: http://community.pbspro.org/t/allow-dots-in-pbs-mom-node-name/2160
- Link to pull request: https://github.com/openpbs/openpbs/pull/1779
Overview
This design is a revision of PP-277: Multinode jobs may fail to start.
Technical Details
PBS_MOM_NODE_NAME configuration variable
PBS_MOM_NODE_NAME is a configuration variable that may be defined in the pbs.conf configuration file. It ensures that when MoM starts up, the name it uses for the natural vnode is consistent with the name used when the node was created on the server. The value is used when MoM builds its list of local vnodes at startup. The list consists of either the natural vnode alone or a list of local vnodes (configured either with a v2 configuration file or with an exechost_startup or exechost_periodic hook). MoM cannot check what the value is on the server, because the server may not be running when MoM is started.
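As an illustration of where the variable lives, the following Python sketch reads PBS_MOM_NODE_NAME from the text of a pbs.conf file. It assumes simple KEY=value lines with optional `#` comments; the helper name `read_pbs_conf` and the host names are hypothetical, and this is not how MoM itself loads its configuration.

```python
def read_pbs_conf(text):
    """Parse KEY=value lines, skipping blank lines and '#' comments.

    Assumption: pbs.conf holds one KEY=value pair per line.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical pbs.conf contents; the host names are made up.
SAMPLE_CONF = """\
PBS_SERVER=headnode
PBS_START_MOM=1
PBS_MOM_NODE_NAME=node01.cluster.example.com
"""

conf = read_pbs_conf(SAMPLE_CONF)
print(conf.get("PBS_MOM_NODE_NAME"))
```

Using `conf.get(...)` returns `None` when the variable is not defined, which keeps the "defined" and "not defined" cases easy to tell apart.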
If PBS_MOM_NODE_NAME is defined in the pbs.conf configuration file, MoM sets the name of the natural vnode to the value of PBS_MOM_NODE_NAME verbatim, without any checks. If PBS_MOM_NODE_NAME is not defined, MoM assumes that the name of the natural vnode is the (non-canonicalized) hostname returned by gethostname(), truncated after the first dot.
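The selection rule can be sketched in a few lines of Python. This is only an illustration of the described behavior, not MoM's actual code: the function name `natural_vnode_name` is hypothetical, the configuration is assumed to be available as a dictionary, and `gethostname` is passed in as a parameter so the rule can be exercised without touching the real host.

```python
import socket


def natural_vnode_name(conf, gethostname=socket.gethostname):
    """Return the natural vnode name according to the rule described above.

    conf: dict of pbs.conf settings (an assumption of this sketch).
    """
    configured = conf.get("PBS_MOM_NODE_NAME")
    if configured is not None:
        # Defined: used verbatim, dots included, with no checks.
        return configured
    # Not defined: non-canonicalized hostname, cut at the first dot.
    return gethostname().split(".", 1)[0]


# Hypothetical host names for illustration.
print(natural_vnode_name({"PBS_MOM_NODE_NAME": "node01.cluster.example.com"}))
print(natural_vnode_name({}, gethostname=lambda: "node02.cluster.example.com"))
```

The first call keeps the full dotted name because the variable is defined; the second falls back to the short host name.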
PBS_MOM_NODE_NAME also serves as a fallback for the hostname when MoM's call to gethostname() fails. In this case, PBS_MOM_NODE_NAME must be defined, and its value must comply with the relevant RFCs.
If the call to gethostname() fails and PBS_MOM_NODE_NAME is either undefined, or defined with a value that does not conform to the RFCs, the following message will be printed to the log:
Unable to obtain my host name
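The fallback path might look like the Python sketch below. Several parts are assumptions: the design does not say here which RFCs apply, so `is_valid_hostname` uses the commonly cited hostname rules (labels of letters, digits, and hyphens, no leading or trailing hyphen, at most 63 characters per label and 253 overall); the function names are hypothetical; and returning `None` after logging stands in for whatever MoM actually does at that point.

```python
import re
import socket

# Assumed hostname rule (not taken from the design text): each label is
# 1 to 63 letters, digits, or hyphens and does not start or end with '-'.
_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name):
    """Check a name against the assumed hostname rule."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


def obtain_hostname(conf, gethostname=socket.gethostname, log=print):
    """Return the hostname, falling back to PBS_MOM_NODE_NAME on failure."""
    try:
        return gethostname()
    except OSError:
        fallback = conf.get("PBS_MOM_NODE_NAME")
        if fallback is not None and is_valid_hostname(fallback):
            return fallback
        # Undefined or non-conforming value: log the message shown above.
        log("Unable to obtain my host name")
        return None
```

Passing `gethostname` and `log` as parameters is a choice made for this sketch so that the failure branch can be exercised with stubs.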
Once the hostname is obtained, MoM will ensure the hostname resolves properly by calling get_fullhostname(). If the hostname fails to resolve, the following message will be printed to the log:
Unable to resolve my host name
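The resolution check can be approximated in Python as follows. This sketch does not call PBS's get_fullhostname(); it substitutes the standard library's `socket.getaddrinfo` with the `AI_CANONNAME` flag as a stand-in, and the function name `resolve_hostname` is hypothetical. Returning `None` after logging is likewise an assumption about what happens next.

```python
import socket


def resolve_hostname(hostname, resolver=socket.getaddrinfo, log=print):
    """Return a canonical name for hostname, or None if it does not resolve."""
    try:
        infos = resolver(hostname, None, flags=socket.AI_CANONNAME)
    except socket.gaierror:
        log("Unable to resolve my host name")
        return None
    # Each entry is (family, type, proto, canonname, sockaddr); the
    # canonical name, when provided, is on the first entry.
    canonical = infos[0][3] if infos else ""
    return canonical or hostname
```

The `resolver` parameter exists only so the check can be tried with a stub instead of a live name service.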