...

Overview

In situations that demand high job throughput, the cgroup hook's overhead can be substantial. A significant part of this overhead arises because, whenever the hook instantiates the NodeUtils class, it always discovers the devices on the host and calls nvidia-smi to check whether there are GPUs to manage.

...

The only net result of skipping device discovery is that the dictionary recording the devices to manage becomes empty. Since no event will actually use the dictionary to manage anything when the subsystem is disabled, this does no harm (and even does a lot of good by culling irrelevant content from the MoM log messages).
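As a rough illustration of the idea, the following Python sketch skips discovery when the devices subsystem is disabled and leaves the device dictionary empty. It is not the hook's actual code: the class name NodeUtilsSketch, the helpers devices_enabled and discover_devices, and the configuration layout (cfg["cgroup"]["devices"]["enabled"]) are all assumptions made for this example.

```python
import subprocess


def devices_enabled(cfg):
    """Return True if the devices subsystem is enabled (assumed config layout)."""
    return bool(cfg.get("cgroup", {}).get("devices", {}).get("enabled", False))


def discover_devices():
    """Hypothetical stand-in for the expensive device discovery step."""
    devices = {}
    try:
        # "nvidia-smi -L" lists the GPUs visible on the host, one per line.
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
        )
        gpus = [ln for ln in result.stdout.splitlines() if ln.startswith("GPU ")]
    except OSError:
        gpus = []  # nvidia-smi is not installed or cannot be executed
    if gpus:
        devices["gpu"] = gpus
    return devices


class NodeUtilsSketch:
    """Illustrative only; not the real NodeUtils class."""

    def __init__(self, cfg, discover=discover_devices):
        self.cfg = cfg
        if devices_enabled(cfg):
            # Pay the discovery cost only when the result will be used.
            self.devices = discover()
        else:
            # Subsystem disabled: nothing will consult this dictionary.
            self.devices = {}


# With the subsystem disabled, nvidia-smi is never invoked.
node = NodeUtilsSketch({"cgroup": {"devices": {"enabled": False}}})
print(node.devices)  # prints {}
```

Passing the discovery function as a parameter is only a convenience for this sketch: it makes it easy to confirm that discovery is not called when the subsystem is disabled.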

...