Systemd Service Enhancement

Follow the PBS Pro Design Document Guidelines.

Overview

Currently PBS has a single systemd unit, pbs.service. Starting this service starts all the daemons together, and there is no easy way to start them separately. If a daemon is stopped by a mechanism other than systemctl, or if one or more daemons are killed, systemctl fails to report the correct status: systemctl status still shows pbs.service as running. The desired approach is to make each daemon a separate service with its own unit file, so that each daemon can be started, stopped, or restarted individually when required.

Proposal

At present the single pbs.service unit calls the pbs_init.d script, which starts all the daemons. The pbs_init.d script takes only one argument, which is one of "start", "stop", "restart", or "status". To make each daemon a separate service, each daemon will have its own unit file, but the proposal is to keep using the same pbs_init.d script to start and stop each service. The required change is for pbs_init.d to accept an action ("start", "stop", "restart", or "status") followed by a daemon name ("pbs_server", "pbs_mom", "pbs_comm", or "pbs_sched"). The second argument, the daemon name, selects which daemon to act on, and all the checks that the init.d script currently performs for that daemon continue to run. The first argument then determines whether the service is started, stopped, restarted, or queried for status.
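The two-argument interface described above could be validated along the following lines. This is only a sketch: the function name parse_args, the usage message, and the fallback to "all" when no daemon name is given are assumptions made for illustration, not the shipped pbs_init.d code.

```shell
#!/bin/sh
# Hypothetical sketch of two-argument handling for pbs_init.d.
# parse_args and its messages are illustrative, not the real script.

parse_args() {
    action="$1"
    daemon="$2"

    # First argument: the action to perform
    case "$action" in
        start|stop|restart|status) ;;
        *)
            echo "Usage: pbs_init.d {start|stop|restart|status} [daemon]" >&2
            return 2
            ;;
    esac

    # Second argument: the daemon the action applies to
    case "$daemon" in
        pbs_server|pbs_mom|pbs_comm|pbs_sched) ;;
        "")
            # Assumption: no daemon name keeps the current all-daemons behaviour
            daemon="all"
            ;;
        *)
            echo "Unknown daemon: $daemon" >&2
            return 2
            ;;
    esac

    # Print the validated pair so the caller can dispatch on it
    echo "$action $daemon"
}
```

For example, `parse_args start pbs_mom` prints `start pbs_mom`, while an unknown action or daemon name returns a non-zero status before any daemon-specific checks run.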

Systemd will be the default way of starting PBS services. All the PBS_START switches used by SysV will be removed from pbs.conf. A user who wants to use SysV services must add these switches to pbs.conf manually. In this way SysV services remain supported.
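To illustrate how the SysV path could tell whether a user has added a switch back, here is a minimal sketch that checks a pbs.conf-style file for a PBS_START switch set to 1. The helper name sysv_switch_enabled is hypothetical, and treating a missing switch as disabled is an assumption of this sketch, not a statement of how pbs_init.d behaves.

```shell
#!/bin/sh
# Hypothetical helper: report whether a SysV start switch is set to 1
# in a pbs.conf-style file. The function name is illustrative.

sysv_switch_enabled() {
    conf_file="$1"    # path to a pbs.conf-style file
    daemon="$2"       # one of: server, mom, sched, comm

    # Map the short daemon name to its PBS_START switch
    case "$daemon" in
        server) key="PBS_START_SERVER" ;;
        mom)    key="PBS_START_MOM" ;;
        sched)  key="PBS_START_SCHED" ;;
        comm)   key="PBS_START_COMM" ;;
        *)      return 2 ;;
    esac

    # Assumption: a missing switch counts as disabled
    grep -q "^${key}=1\$" "$conf_file"
}
```

With this sketch, a user who wants SysV to start only the MoM would add `PBS_START_MOM=1` to pbs.conf, and `sysv_switch_enabled /etc/pbs.conf mom` would then return success.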

Note: 

Using systemd and SysV together is undefined. Use either systemd or SysV, not both.

Example

Here is an example of a separate unit file for each daemon and how the daemons can be managed.

[root@jitendra openpbs]# cat /usr/lib/systemd/system/pbs_mom.service
[Unit]
Documentation=man:pbs(8)
SourcePath=/opt/pbs/libexec/pbs_init.d
Description=Portable Batch System Mom
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target
DefaultDependencies=true

[Service]
Type=forking
Restart=no
TimeoutStartSec=0
TimeoutStopSec=5min
Delegate=yes
IgnoreSIGPIPE=no
GuessMainPID=no
KillMode=process
ExecStart=/opt/pbs/libexec/pbs_init.d start pbs_mom
ExecStop=/opt/pbs/libexec/pbs_init.d stop pbs_mom
TasksMax=infinity

[Install]
WantedBy=multi-user.target
[root@jitendra openpbs]#


[root@jitendra openpbs]# cat /usr/lib/systemd/system/pbs_server.service

[Unit]
Documentation=man:pbs(8)
SourcePath=/opt/pbs/libexec/pbs_init.d
Description=Portable Batch System Server
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target
DefaultDependencies=true

[Service]
Type=forking
Restart=no
TimeoutStartSec=0
TimeoutStopSec=5min
Delegate=yes
IgnoreSIGPIPE=no
GuessMainPID=no
ExecStart=/opt/pbs/libexec/pbs_init.d start pbs_server
ExecStop=/opt/pbs/libexec/pbs_init.d stop pbs_server
TasksMax=infinity

[Install]
WantedBy=multi-user.target
[root@jitendra openpbs]#
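Since the two unit files above differ only in the daemon name, the description, and the KillMode line, the set of four could be produced from one template. The following is a sketch under assumptions: print_pbs_unit is a hypothetical helper, and the output for pbs_comm and pbs_sched (including their description text) is assumed, because this document does not show those files.

```shell
#!/bin/sh
# Hypothetical generator that prints a unit file for one PBS daemon.
# It mirrors the pbs_mom and pbs_server samples; the pbs_comm and
# pbs_sched variants are assumptions.

print_pbs_unit() {
    daemon="$1"
    case "$daemon" in
        pbs_server) label="Server" ;;
        pbs_mom)    label="Mom" ;;
        pbs_comm)   label="Comm" ;;     # assumed description text
        pbs_sched)  label="Sched" ;;    # assumed description text
        *) echo "Unknown daemon: $daemon" >&2; return 2 ;;
    esac

    cat <<EOF
[Unit]
Documentation=man:pbs(8)
SourcePath=/opt/pbs/libexec/pbs_init.d
Description=Portable Batch System $label
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target
DefaultDependencies=true

[Service]
Type=forking
Restart=no
TimeoutStartSec=0
TimeoutStopSec=5min
Delegate=yes
IgnoreSIGPIPE=no
GuessMainPID=no
EOF
    # Only the pbs_mom sample sets KillMode=process
    if [ "$daemon" = "pbs_mom" ]; then
        echo "KillMode=process"
    fi
    cat <<EOF
ExecStart=/opt/pbs/libexec/pbs_init.d start $daemon
ExecStop=/opt/pbs/libexec/pbs_init.d stop $daemon
TasksMax=infinity

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, `print_pbs_unit pbs_sched > pbs_sched.service` would write a candidate file for review before it is installed under /usr/lib/systemd/system.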


[root@jitendra jitenr]# systemctl status pbs_server
● pbs_server.service - Portable Batch System
Loaded: loaded (/opt/pbs/libexec/pbs_init.d; enabled; vendor preset: disabled)
Active: inactive (dead)
Docs: man:pbs(8)

Oct 05 09:20:58 jitendra pbs_init.d[75608]: Stopping PBS pbs_server
Oct 05 09:20:58 jitendra pbs_init.d[75608]: Shutting server down with qterm.
Oct 05 09:20:58 jitendra su[75655]: (to postgres) root on none
Oct 05 09:20:58 jitendra su[75680]: (to postgres) root on none
Oct 05 09:21:01 jitendra su[75804]: (to postgres) root on none
Oct 05 09:21:01 jitendra pbs_init.d[75608]: PBS server - was pid: 73534
Oct 05 09:21:01 jitendra su[75872]: (to postgres) root on none
Oct 05 09:21:01 jitendra su[75906]: (to postgres) root on none
Oct 05 09:21:02 jitendra pbs_init.d[75608]: Waiting for shutdown to complete
Oct 05 09:21:03 jitendra systemd[1]: Stopped Portable Batch System.
[root@jitendra jitenr]# systemctl start pbs_server
[root@jitendra jitenr]# systemctl status pbs_server
● pbs_server.service - Portable Batch System
Loaded: loaded (/opt/pbs/libexec/pbs_init.d; enabled; vendor preset: disabled)
Active: active (running) since Mon 2020-10-05 11:12:25 PDT; 10s ago
Docs: man:pbs(8)
Process: 88397 ExecStart=/opt/pbs/libexec/pbs_init.d start pbs_server (code=exited, status=0/SUCCESS)
Tasks: 3
Memory: 6.7M
CGroup: /system.slice/pbs_server.service
+-89721 /opt/pbs/sbin/pbs_ds_monitor monitor
+-89748 /usr/bin/postgres -D /var/spool/pbs/datastore -p 15007
+-89759 postgres: logger process
+-89763 postgres: checkpointer process
+-89764 postgres: writer process
+-89765 postgres: wal writer process
+-89766 postgres: autovacuum launcher process
+-89767 postgres: stats collector process
+-89860 postgres: postgres pbs_datastore 192.168.37.154(44600) idle
+-89890 /opt/pbs/sbin/pbs_server.bin

Oct 05 11:12:21 jitendra su[89602]: (to postgres) root on none
Oct 05 11:12:21 jitendra pbs_init.d[88397]: *** End of /opt/pbs/libexec/pbs_habitat
Oct 05 11:12:21 jitendra pbs_init.d[88397]: Home directory /var/spool/pbs updated.
Oct 05 11:12:22 jitendra su[89685]: (to postgres) root on none
Oct 05 11:12:22 jitendra su[89722]: (to postgres) root on none
Oct 05 11:12:24 jitendra su[89835]: (to postgres) root on none
Oct 05 11:12:25 jitendra pbs_init.d[88397]: Connecting to PBS dataservice...connected to PBS dataservice@jitendra
Oct 05 11:12:25 jitendra pbs_init.d[88397]: Licenses valid for 10000000 Floating hosts
Oct 05 11:12:25 jitendra pbs_init.d[88397]: PBS server
Oct 05 11:12:25 jitendra systemd[1]: Started Portable Batch System.
[root@jitendra jitenr]# systemctl start pbs_comm
[root@jitendra jitenr]# systemctl start pbs_mom
[root@jitendra jitenr]# systemctl start pbs_sched
[root@jitendra jitenr]# systemctl status pbs_sched
● pbs_sched.service - Portable Batch System
Loaded: loaded (/opt/pbs/libexec/pbs_init.d; enabled; vendor preset: enabled)
Active: active (running) since Mon 2020-10-05 11:13:17 PDT; 5s ago
Docs: man:pbs(8)
Process: 91428 ExecStart=/opt/pbs/libexec/pbs_init.d start pbs_sched (code=exited, status=0/SUCCESS)
Tasks: 5
Memory: 2.7M
CGroup: /system.slice/pbs_sched.service
+-91508 /opt/pbs/sbin/pbs_sched

Oct 05 11:13:16 jitendra systemd[1]: Starting Portable Batch System...
Oct 05 11:13:16 jitendra pbs_init.d[91428]: Starting PBS pbs_sched
Oct 05 11:13:17 jitendra pbs_init.d[91428]: PBS sched
Oct 05 11:13:17 jitendra systemd[1]: Started Portable Batch System.
[root@jitendra jitenr]# systemctl status pbs_mom
● pbs_mom.service - Portable Batch System
Loaded: loaded (/opt/pbs/libexec/pbs_init.d; enabled; vendor preset: disabled)
Active: active (running) since Mon 2020-10-05 11:13:13 PDT; 21s ago
Docs: man:pbs(8)
Process: 91267 ExecStart=/opt/pbs/libexec/pbs_init.d start pbs_mom (code=exited, status=0/SUCCESS)
Tasks: 2
Memory: 2.7M
CGroup: /system.slice/pbs_mom.service
+-91334 /opt/pbs/sbin/pbs_mom

Oct 05 11:13:13 jitendra systemd[1]: Starting Portable Batch System...
Oct 05 11:13:13 jitendra pbs_init.d[91267]: Starting PBS pbs_mom
Oct 05 11:13:13 jitendra pbs_init.d[91267]: PBS mom
Oct 05 11:13:13 jitendra systemd[1]: Started Portable Batch System.
[root@jitendra jitenr]# systemctl status pbs_comm
● pbs_comm.service - Portable Batch System
Loaded: loaded (/opt/pbs/libexec/pbs_init.d; enabled; vendor preset: disabled)
Active: active (running) since Mon 2020-10-05 11:13:09 PDT; 31s ago
Docs: man:pbs(8)
Process: 91055 ExecStart=/opt/pbs/libexec/pbs_init.d start pbs_comm (code=exited, status=0/SUCCESS)
Tasks: 5
Memory: 412.0K
CGroup: /system.slice/pbs_comm.service
+-91143 /opt/pbs/sbin/pbs_comm

Oct 05 11:13:08 jitendra systemd[1]: Starting Portable Batch System...
Oct 05 11:13:08 jitendra pbs_init.d[91055]: Starting PBS pbs_comm
Oct 05 11:13:09 jitendra pbs_init.d[91055]: PBS comm
Oct 05 11:13:09 jitendra pbs_init.d[91055]: /opt/pbs/sbin/pbs_comm ready (pid=91143), Proxy Name:jitendra:17001, Threads:4
Oct 05 11:13:09 jitendra systemd[1]: Started Portable Batch System.


Similarly, the services can be stopped with systemctl stop (pbs_mom | pbs_comm | pbs_server | pbs_sched).
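When all four services need the same action, the individual systemctl calls can be generated in a loop. The sketch below only prints the commands so they can be reviewed first. pbs_service_cmds is a hypothetical helper, and the service order simply follows the start order used in the session above; this document does not define a required order.

```shell
#!/bin/sh
# Sketch: print the systemctl command for each PBS service.
# pbs_service_cmds is a hypothetical helper; it prints commands
# rather than running them.

pbs_service_cmds() {
    action="$1"    # start, stop, restart, or status
    for svc in pbs_server pbs_comm pbs_mom pbs_sched; do
        echo "systemctl $action $svc"
    done
}
```

Running `pbs_service_cmds stop` lists the four stop commands; piping the output to `sh` as root would execute them.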