Alternate security model to CLUSTERADDR2


Overview

This proposal eliminates the use of IS_CLUSTERADDR2 in favor of an alternate scheme.

What is CLUSTERADDR2?

Whenever the server receives a request or response from a mom, it checks the sender against the list of IP addresses known to the server. The server builds this list from the moms added to the cluster using qmgr. This ensures that the server communicates only with the moms in its cluster.

Inter-mom communication needs a similar mechanism to ensure that a mom talks only to moms in its own cluster. For that, every mom needs to maintain the list of IP addresses in the cluster, so the server distributes the list of mom IP addresses to every mom in the cluster whenever a node is added or deleted. This is done with an inter-server command known as IS_CLUSTERADDR2.

Why is it bad?

A. Not Scalable:

In larger clusters with more than 10K moms, this mechanism is inefficient: the message must be broadcast to all the moms in the cluster on every addition and deletion of a mom. The problem is worse in clusters that grow and shrink dynamically, as with cloud bursting.
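The cost of the broadcast can be estimated with a small calculation. The sketch below is illustrative only: it assumes each update carries the full address list and that an address occupies 4 bytes (IPv4), neither of which is stated above, and the function name `broadcast_cost` is hypothetical.

```python
from typing import Tuple


def broadcast_cost(num_moms: int, bytes_per_addr: int = 4) -> Tuple[int, int]:
    """Return (messages, payload bytes) for one node addition or deletion."""
    # Assumption: the server sends one update to every mom.
    messages = num_moms
    # Assumption: every update carries the address of every mom.
    payload = messages * num_moms * bytes_per_addr
    return messages, payload


messages, payload = broadcast_cost(10_000)
print(messages, payload)  # 10000 400000000
```

Under these assumptions, a single change in a 10K-mom cluster triggers 10,000 messages carrying roughly 400 MB of address data in total.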

Proposal:

Every server and mom will have a common secret key, manually configured in a file in a directory readable only by root. The sender uses this key to produce a payload derived from its own IP address and sends it to the receiver. Because both sides hold the same key, the receiver can confirm that the sender is part of the same cluster.

The sender will generate a SHA-256 hash over (IP address + key) and send it to the sister mom. The sister mom will generate the same hash over (sender's IP address + shared key) and check whether the two match.

Replay attacks are not possible unless the attacker can spoof the sender's IP address, which is itself difficult. This scheme does not replace authentication such as reserved ports or munge, which will still be required.
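The exchange can be sketched in Python as follows. This is a minimal illustration, not PBS Pro code: the function names `make_token` and `verify_token` are hypothetical, and the byte layout (the IP address as ASCII text, immediately followed by the key) is an assumption, since the proposal only specifies SHA-256 over IP address + key.

```python
import hashlib
import hmac


def make_token(sender_ip: str, shared_key: bytes) -> str:
    """Return the SHA-256 hex digest of (sender IP + shared key)."""
    # Assumed layout: ASCII IP address immediately followed by the key bytes.
    return hashlib.sha256(sender_ip.encode("ascii") + shared_key).hexdigest()


def verify_token(peer_ip: str, token: str, shared_key: bytes) -> bool:
    """Recompute the digest from the peer's IP address and compare."""
    expected = make_token(peer_ip, shared_key)
    # compare_digest does not reveal where the two strings first differ.
    return hmac.compare_digest(expected, token)
```

In this sketch the sender would call `make_token` with its own address, and the receiving mom would call `verify_token` with the address it observes on the connection rather than one supplied by the sender, so a token captured from one host does not verify from another.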

Interface changes

This scheme requires the same password to be present in a pbs.key file in every $PBS_HOME/$daemon_priv/ directory, where the daemon is either server or mom.
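A daemon could load the key along the following lines. This is a sketch under stated assumptions: the function name `load_cluster_key` is hypothetical, and both the permission check and the stripping of surrounding whitespace are illustrative choices that the proposal does not specify.

```python
import os
import stat


def load_cluster_key(path: str) -> bytes:
    """Read the shared key, refusing files that group or others can access."""
    st = os.stat(path)
    # Any group/other permission bit means the file is not owner-only.
    if stat.S_IMODE(st.st_mode) & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} must be accessible by its owner only")
    with open(path, "rb") as fh:
        # Assumption: surrounding whitespace (e.g. a trailing newline) is not part of the key.
        key = fh.read().strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key
```

A mom would call this with the path to pbs.key in its own private directory. The sketch checks only the permission bits; a real implementation would presumably also verify that the file is owned by root.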


Pros:

  1. Scalable
  2. Easy to implement and maintain

Cons:

  1. The admin has to manually configure the key across all moms (not really an issue when installing with a cluster manager)






