...

  • Info for hook writers

    • The type for the modifyvnode event is pbs.MODIFYVNODE

    • modifyvnode hooks run at the server

    • Hooks registered to the modifyvnode event will execute after the vnode's state attribute is changed by the server

    • Two objects are available to modifyvnode event hook writers:

      1. pbs.event().vnode: this read-only object’s attributes represent the new/current state of the vnode (i.e., after the server has successfully changed the vnode state attribute)

      2. pbs.event().vnode_o: this read-only object’s attributes appear as they were prior to the server changing the vnode state

    • Two new functions have been added to the Python vnode object:

      1. extract_state_strs() returns a list of the string values currently set in the vnode’s state bits

      2. extract_state_ints() returns a list of the integer values currently set in the vnode’s state bits

    • A pbs.event().accept() call terminates hook execution, as does pbs.event().reject(). The vnode and vnode_o event objects are unaffected by either call.

    • Deprecated vnode state constants

ND_FREE
ND_OFFLINE
ND_DOWN
ND_STALE
ND_JOBBUSY
ND_JOB_EXCLUSIVE
ND_RESV_EXCLUSIVE
ND_BUSY
ND_PROV
ND_WAIT_PROV
ND_UNRESOLVABLE
ND_SLEEP
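
As an illustration of how the two event objects and extract_state_strs() might be combined, the sketch below computes which state names a vnode gained and lost across a change. StubVnode and state_delta are hypothetical names introduced here; in a real hook the objects would come from pbs.event().vnode and pbs.event().vnode_o. The sample lists are copied from the PBS log excerpt later on this page.

```python
class StubVnode:
    """Hypothetical stand-in for pbs.event().vnode / vnode_o (illustration only)."""

    def __init__(self, name, state_strs):
        self.name = name
        self._state_strs = list(state_strs)

    def extract_state_strs(self):
        # The real method is provided by the PBS vnode object; this stub
        # simply returns the canned list it was built with.
        return list(self._state_strs)


def state_delta(vnode, vnode_o):
    """Return (gained, lost) state names between the old and new vnode."""
    new = set(vnode.extract_state_strs())
    old = set(vnode_o.extract_state_strs())
    return sorted(new - old), sorted(old - new)


# Values taken from the "state string list" log line on this page.
vnode = StubVnode("pbsdev-centos7-mvn6-mom1",
                  ["ND_STATE_OFFLINE", "ND_STATE_VNODE_UNAVAILABLE"])
vnode_o = StubVnode("pbsdev-centos7-mvn6-mom1",
                    ["ND_STATE_FREE", "ND_STATE_VNODE_AVAILABLE"])

gained, lost = state_delta(vnode, vnode_o)
print("gained:", gained)
print("lost:", lost)
```

Because both objects are read-only, a comparison like this has no side effects on the vnode itself.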

  • New vnode state constant changes

      • New constants

ND_STATE_FREE
ND_STATE_OFFLINE
ND_STATE_DOWN
ND_STATE_DELETED
ND_STATE_STALE
ND_STATE_JOBBUSY
ND_STATE_JOB_EXCLUSIVE
ND_STATE_RESV_EXCLUSIVE
ND_STATE_BUSY
ND_STATE_UNKNOWN
ND_STATE_NEEDS_HELLOSVR
ND_STATE_INIT
ND_STATE_PROV
ND_STATE_WAIT_PROV
ND_STATE_UNRESOLVABLE
ND_STATE_SLEEP
ND_STATE_OFFLINE_BY_MOM
ND_STATE_MARKEDDOWN
ND_STATE_NEED_ADDRS
ND_STATE_MAINTENANCE
ND_STATE_NEED_CREDENTIALS
ND_STATE_VNODE_AVAILABLE
ND_STATE_VNODE_UNAVAILABLE
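
Comparing the deprecated and new constant lists suggests that each deprecated ND_<NAME> constant has an ND_STATE_<NAME> counterpart. Assuming that naming pattern is the intended replacement (the lists imply it but do not state it), a helper for migrating existing hooks could look like the sketch below. DEPRECATED_NAMES, NEW_NAMES, and new_constant_name are hypothetical names used only for this example.

```python
# Names copied from the deprecated and new constant lists on this page.
DEPRECATED_NAMES = [
    "ND_FREE", "ND_OFFLINE", "ND_DOWN", "ND_STALE", "ND_JOBBUSY",
    "ND_JOB_EXCLUSIVE", "ND_RESV_EXCLUSIVE", "ND_BUSY", "ND_PROV",
    "ND_WAIT_PROV", "ND_UNRESOLVABLE", "ND_SLEEP",
]

NEW_NAMES = {
    "ND_STATE_FREE", "ND_STATE_OFFLINE", "ND_STATE_DOWN", "ND_STATE_DELETED",
    "ND_STATE_STALE", "ND_STATE_JOBBUSY", "ND_STATE_JOB_EXCLUSIVE",
    "ND_STATE_RESV_EXCLUSIVE", "ND_STATE_BUSY", "ND_STATE_UNKNOWN",
    "ND_STATE_NEEDS_HELLOSVR", "ND_STATE_INIT", "ND_STATE_PROV",
    "ND_STATE_WAIT_PROV", "ND_STATE_UNRESOLVABLE", "ND_STATE_SLEEP",
    "ND_STATE_OFFLINE_BY_MOM", "ND_STATE_MARKEDDOWN", "ND_STATE_NEED_ADDRS",
    "ND_STATE_MAINTENANCE", "ND_STATE_NEED_CREDENTIALS",
    "ND_STATE_VNODE_AVAILABLE", "ND_STATE_VNODE_UNAVAILABLE",
}


def new_constant_name(deprecated_name):
    """Map a deprecated ND_<NAME> to ND_STATE_<NAME> (assumed naming rule)."""
    if deprecated_name not in DEPRECATED_NAMES:
        raise KeyError("not a deprecated constant: %s" % deprecated_name)
    candidate = "ND_STATE_" + deprecated_name[len("ND_"):]
    if candidate not in NEW_NAMES:
        raise KeyError("no new constant listed for %s" % deprecated_name)
    return candidate


renames = {old: new_constant_name(old) for old in DEPRECATED_NAMES}
for old, new in renames.items():
    print("pbs.%s -> pbs.%s" % (old, new))
```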

  • Example hook script that records current and previous vnode values in the PBS log only if the vnode has just become unavailable (for example, it went down or was offlined):

    # VnodeDownReport draft 20201102 19:46
    # Sample modifyvnode event hook script
    import pbs
    import sys
    
    try:
       e = pbs.event()
       vnode = e.vnode      # Represents the current (recently changed) state
       vnode_o = e.vnode_o  # Represents the state prior to the change
    
       if ((int(vnode.state)) & pbs.ND_STATE_VNODE_UNAVAILABLE) and not ((int(vnode_o.state)) & pbs.ND_STATE_VNODE_UNAVAILABLE):
          #
          # A node just went down. Report current and previous vnode values.
          #
          # Reports attributes in "Table 5-7: Vnode Attributes" from the 2020.1 Hooks Guide,
          # EXCEPT:
          #   arch (vnode attribute not defined in demo deployment)
          #   hpcbp_enable (vnode attribute not defined in demo deployment)
          #   hpcbp_stage_protocol (vnode attribute not defined in demo deployment)
          #   hpcbp_webservice_address (vnode attribute not defined in demo deployment)
          #   hpcbp_user_name (vnode attribute not defined in demo deployment)
          #   topology_info (due to output size)
          #
    
          # Demonstrate the new vnode state list functions
          vnode_state_str_list = ",".join(vnode.extract_state_strs())
          vnode_o_state_str_list = ",".join(vnode_o.extract_state_strs())
          vnode_state_int_list = ','.join([str(_) for _ in vnode.extract_state_ints()])
          vnode_o_state_int_list = ','.join([str(_) for _ in vnode_o.extract_state_ints()])
    
          # First print the state values
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;state: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, hex(vnode.state), hex(vnode_o.state)))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;state string list: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode_state_str_list, vnode_o_state_str_list))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;state int list: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode_state_int_list, vnode_o_state_int_list))
    
          # Next print the remaining vnode members
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;comment: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.comment, vnode_o.comment))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;current_aoe: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.current_aoe, vnode_o.current_aoe))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;in_multivnode_host: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.in_multivnode_host, vnode_o.in_multivnode_host))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;jobs: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.jobs, vnode_o.jobs))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;last_state_change_time: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, str(vnode.last_state_change_time), str(vnode_o.last_state_change_time)))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;Mom: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.Mom, vnode_o.Mom))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;ntype: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, hex(vnode.ntype), hex(vnode_o.ntype)))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;pcpus: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.pcpus, vnode_o.pcpus))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;pnames: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.pnames, vnode_o.pnames))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;Port: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.Port, vnode_o.Port))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;Priority: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.Priority, vnode_o.Priority))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;provision_enable: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.provision_enable, vnode_o.provision_enable))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;queue: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.queue, vnode_o.queue))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;resources_assigned: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.resources_assigned, vnode_o.resources_assigned))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;resources_available: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.resources_available, vnode_o.resources_available))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;resv: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.resv, vnode_o.resv))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;resv_enable: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.resv_enable, vnode_o.resv_enable))
          pbs.logmsg(pbs.LOG_DEBUG, \
             '%s;%s;sharing: vnode=%s vnode_o=%s' % \
             (e.hook_name, vnode.name, vnode.sharing, vnode_o.sharing))
       e.accept()
    except SystemExit:
       # accept() and reject() end the hook via SystemExit; this is not an error
       pass
    except Exception:
       pbs.event().reject("%s hook failed with %s" % (pbs.event().hook_name, sys.exc_info()[:2]))
    • PBS log excerpt showing the vnode state change after the host is offlined with "sudo pbsnodes -o pbsdev-centos7-mvn6-mom1":

      11/04/2020 05:08:24.288615;0004;Server@pbsdev-centos7-mvn6-server;Node;pbsdev-centos7-mvn6-mom1;attributes set:  at request of root@pbsdev-centos7-mvn6-server.pbsdev-centos7-mvn6.local
      11/04/2020 05:08:24.294421;0100;Server@pbsdev-centos7-mvn6-server;Node;pbsdev-centos7-mvn6-mom1;set_vnode_state;vnode.state=0x1 vnode_o.state=0x0 vnode.last_state_change_time=1604466504 vnode_o.last_state_change_time=1604466244 state_bits=0x1 state_bit_op_type_str=Nd_State_Set state_bit_op_type_enum=0
      11/04/2020 05:08:24.296099;0800;Server@pbsdev-centos7-mvn6-server;Hook;hook_perf_stat;label=hook_modifyvnode_VnodeDownReport_278 action=server_process_hooks profile_start
      11/04/2020 05:08:24.296171;0400;Server@pbsdev-centos7-mvn6-server;Hook;VnodeDownReport;started
      11/04/2020 05:08:24.296208;0086;Server@pbsdev-centos7-mvn6-server;Svr;Server@pbsdev-centos7-mvn6-server;Compiling script file: </var/spool/pbs/server_priv/hooks/VnodeDownReport.PY>
      11/04/2020 05:08:24.296986;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;state: vnode=0x1 vnode_o=0x0
      11/04/2020 05:08:24.297004;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;state string list: vnode=ND_STATE_OFFLINE,ND_STATE_VNODE_UNAVAILABLE vnode_o=ND_STATE_FREE,ND_STATE_VNODE_AVAILABLE
      11/04/2020 05:08:24.297012;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;state int list: vnode=1,409903 vnode_o=0,8400
      11/04/2020 05:08:24.297021;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;comment: vnode=None vnode_o=None
      11/04/2020 05:08:24.297029;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;current_aoe: vnode=None vnode_o=None
      11/04/2020 05:08:24.297037;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;in_multivnode_host: vnode=None vnode_o=None
      11/04/2020 05:08:24.297053;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;jobs: vnode=None vnode_o=None
      11/04/2020 05:08:24.297062;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;last_state_change_time: vnode=1604466504 vnode_o=1604466244
      11/04/2020 05:08:24.297071;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;Mom: vnode=pbsdev-centos7-mvn6-mom1.pbsdev-centos7-mvn6.local vnode_o=pbsdev-centos7-mvn6-mom1.pbsdev-centos7-mvn6.local
      11/04/2020 05:08:24.297079;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;ntype: vnode=0x0 vnode_o=0x0
      11/04/2020 05:08:24.297087;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;pcpus: vnode=4 vnode_o=4
      11/04/2020 05:08:24.297095;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;pnames: vnode=None vnode_o=None
      11/04/2020 05:08:24.297103;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;Port: vnode=15002 vnode_o=15002
      11/04/2020 05:08:24.297110;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;Priority: vnode=None vnode_o=None
      11/04/2020 05:08:24.297118;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;provision_enable: vnode=None vnode_o=None
      11/04/2020 05:08:24.297126;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;queue: vnode=None vnode_o=None
      11/04/2020 05:08:24.297137;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;resources_assigned: vnode=accelerator_memory=0kb,hbmem=0kb,mem=0kb,naccelerators=0,ncpus=0,vmem=0kb vnode_o=accelerator_memory=0kb,hbmem=0kb,mem=0kb,naccelerators=0,ncpus=0,vmem=0kb
      11/04/2020 05:08:24.297146;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;resources_available: vnode=arch=linux,host=pbsdev-centos7-mvn6-mom1,mem=2038904kb,ncpus=4,vnode=pbsdev-centos7-mvn6-mom1 vnode_o=arch=linux,host=pbsdev-centos7-mvn6-mom1,mem=2038904kb,ncpus=4,vnode=pbsdev-centos7-mvn6-mom1
      11/04/2020 05:08:24.297154;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;resv: vnode=None vnode_o=None
      11/04/2020 05:08:24.297163;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;resv_enable: vnode=1 vnode_o=1
      11/04/2020 05:08:24.297171;0006;Server@pbsdev-centos7-mvn6-server;Hook;Server@pbsdev-centos7-mvn6-server;VnodeDownReport;pbsdev-centos7-mvn6-mom1;sharing: vnode=1 vnode_o=1
      11/04/2020 05:08:24.297186;0800;Server@pbsdev-centos7-mvn6-server;Hook;hook_perf_stat;label=hook_modifyvnode_VnodeDownReport_278 action=run_code walltime=0.000320 cputime=0.000000
      11/04/2020 05:08:24.297246;0400;Server@pbsdev-centos7-mvn6-server;Hook;VnodeDownReport;finished
      11/04/2020 05:08:24.297285;0800;Server@pbsdev-centos7-mvn6-server;Hook;hook_perf_stat;label=hook_modifyvnode_VnodeDownReport_278 action=server_process_hooks walltime=0.001183 cputime=0.000000 profile_stop
      11/04/2020 05:08:24.297300;0004;Server@pbsdev-centos7-mvn6-server;Node;pbsdev-centos7-mvn6-mom1;attributes set: state + offline
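
The hook's if-condition can be checked against the values in this excerpt. The sketch below assumes ND_STATE_VNODE_UNAVAILABLE is the bit mask 409903, a value inferred from the matching order of the "state string list" and "state int list" entries above rather than read from the PBS source. became_unavailable is a hypothetical helper name.

```python
# Assumed value, inferred from the log excerpt (not read from PBS itself).
ND_STATE_VNODE_UNAVAILABLE = 409903  # 0x6412f


def became_unavailable(new_state, old_state, mask=ND_STATE_VNODE_UNAVAILABLE):
    """True only when an 'unavailable' bit is set now and none was set before."""
    return bool(new_state & mask) and not (old_state & mask)


# vnode.state=0x1 and vnode_o.state=0x0, as logged when the host was offlined.
offlined = became_unavailable(0x1, 0x0)
# A vnode that was already offline should not be reported again.
still_offline = became_unavailable(0x1, 0x1)
# A vnode returning to free is not a "went unavailable" transition.
back_to_free = became_unavailable(0x0, 0x1)

print(offlined, still_offline, back_to_free)
```

Testing both the new and the old state is what restricts the report to the transition itself; testing only the new state would log a report on every later state change while the vnode remains unavailable.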
  • Internals

    • New functional tests for vnode state changes defined in pbs_hook_modifyvnode_state_changes.py:

      • Includes tests that induce state changes via various operations (e.g., mom stop, offline mom, server restart, etc.)

      • Includes checks that verify the expected node state constants exist

    • Reuses existing vnode object logic where possible; two functions added to class _vnode in _svrtypes.py:

      • extract_state_strs() returns a list of the string values set in the vnode’s state bits

      • extract_state_ints() returns a list of the integer values set in the vnode’s state bits

    • New code for propagating vnode state changes has been added, including: 

      • New batch request structure:

        /* ModifyVnode - used for node state changes */
        struct rq_modifyvnode {
        	struct pbsnode *rq_vnode_o; /* old/previous vnode state */
        	struct pbsnode *rq_vnode; /* new/current vnode state */
        };
      • New event type:

        HOOK_EVENT_MODIFYVNODE
      • New event object:

        #define EVENT_VNODE_O_OBJECT	EVENT_OBJECT ".vnode_o"
      • New event param:

        #define PY_EVENT_PARAM_VNODE_O    "vnode_o"
      • A call to process_hooks() has been added to set_vnode_state() in node_manager.c to fire off the modifyvnode event

      • New pbs log entry added to set_vnode_state() in node_manager.c:

        	snprintf(local_log_buffer, LOG_BUF_SIZE-1,
        		"set_vnode_state;vnode.state=0x%lx vnode_o.state=0x%lx "
        		"vnode.last_state_change_time=%d vnode_o.last_state_change_time=%d "
        		"state_bits=0x%lx state_bit_op_type_str=%s state_bit_op_type_enum=%d", pnode->nd_state, 
        		vnode_o->nd_state, time_int_val, last_time_int, state_bits, get_vnode_state_op(type), type);
        	log_event(PBSEVENT_DEBUG2, PBS_EVENTCLASS_NODE, LOG_INFO,
        		pnode->nd_name, local_log_buffer);
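
For anyone post-processing server logs, the message produced by this format string can be split back into typed fields. The following is a minimal sketch; parse_set_vnode_state is a hypothetical helper, not part of PBS, and the sample message is the set_vnode_state entry from the log excerpt earlier on this page.

```python
HEX_FIELDS = ("vnode.state", "vnode_o.state", "state_bits")
INT_FIELDS = ("vnode.last_state_change_time",
              "vnode_o.last_state_change_time",
              "state_bit_op_type_enum")


def parse_set_vnode_state(message):
    """Split a 'set_vnode_state;key=value ...' message into a dict."""
    prefix = "set_vnode_state;"
    if not message.startswith(prefix):
        raise ValueError("not a set_vnode_state message")
    fields = {}
    for token in message[len(prefix):].split():
        key, _, value = token.partition("=")
        fields[key] = value
    for key in HEX_FIELDS:
        fields[key] = int(fields[key], 16)   # written with 0x%lx
    for key in INT_FIELDS:
        fields[key] = int(fields[key])       # written with %d
    return fields


sample = ("set_vnode_state;vnode.state=0x1 vnode_o.state=0x0 "
          "vnode.last_state_change_time=1604466504 "
          "vnode_o.last_state_change_time=1604466244 "
          "state_bits=0x1 state_bit_op_type_str=Nd_State_Set "
          "state_bit_op_type_enum=0")

parsed = parse_set_vnode_state(sample)
seconds_in_previous_state = (parsed["vnode.last_state_change_time"]
                             - parsed["vnode_o.last_state_change_time"])
print(parsed["state_bit_op_type_str"], seconds_in_previous_state)
```

The sketch relies on the field values containing no spaces, which holds for every field in the format string above.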

...