Record for Job Suspend and Resume events in Accounting Logs
Overview:
There are many events in the lifecycle of a job. This EDD focuses on the accounting records logged for a job's suspend and resume events. Currently, a job's resource usage is available only when the job ends, in the "E" record of the accounting logs. However, the resources a job requests and uses can change during its life. The objective of this EDD is to record the job's resource usage correctly at the suspend and resume events.
'z' record:
- Upon job suspension, a 'z' record shall be written to the accounting logs.
- The record shall consist of:
  - resources_released (if available)
  - resources_used
Examples:
1. restrict_res_to_release_on_suspend is set. The resources_released list shall be available only when this server attribute is set, for example with:
qmgr -c 's s restrict_res_to_release_on_suspend+=ncpus'
a. Submit a job requesting ncpus, then suspend it:
04/04/2020 00:56:06;z;1003.pbsserver;resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.ncpus=2 resources_used.vmem=0kb resources_used.walltime=00:00:00 resources_released=(pbsserver:ncpus=2)
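b. Submit a job requesting ncpus and memory, then suspend it. In this example only ncpus appears in resources_released, matching the attribute setting above: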
04/04/2020 00:55:12;z;1002.pbsserver;resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=3432kb resources_used.ncpus=1 resources_used.vmem=336452kb resources_used.walltime=00:00:32 resources_released=(pbsserver:ncpus=1)
2. restrict_res_to_release_on_suspend is unset. The record has no resources_released list:
04/04/2020 00:52:45;z;1002.pbsserver;resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.ncpus=1 resources_used.vmem=0kb resources_used.walltime=00:00:00
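The 'z' records above can be read programmatically. The following is a minimal sketch, not part of PBS: parse_accounting_record is a hypothetical helper name, and it assumes the layout seen in the examples (four semicolon-separated fields, with a message of space-separated key=value pairs whose values contain no spaces).

```python
from datetime import datetime

def parse_accounting_record(line):
    """Split one accounting log line into timestamp, type, job id, and attributes.

    Assumes the layout shown in the examples: date;type;job id;message,
    where the message holds space-separated key=value pairs.
    """
    stamp, rec_type, job_id, message = line.rstrip("\n").split(";", 3)
    attrs = {}
    for token in message.split():
        # Split on the first '=' only, so values such as
        # "(pbsserver:ncpus=2)" are kept whole.
        key, _, value = token.partition("=")
        attrs[key] = value
    return {
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "type": rec_type,
        "job": job_id,
        "attrs": attrs,
    }

line = (
    "04/04/2020 00:56:06;z;1003.pbsserver;"
    "resources_used.cpupercent=0 resources_used.cput=00:00:00 "
    "resources_used.mem=0kb resources_used.ncpus=2 resources_used.vmem=0kb "
    "resources_used.walltime=00:00:00 resources_released=(pbsserver:ncpus=2)"
)
rec = parse_accounting_record(line)
print(rec["type"], rec["attrs"].get("resources_released"))
# prints: z (pbsserver:ncpus=2)
```

Using .get() for resources_released returns None for records written while restrict_res_to_release_on_suspend is unset, so both cases can be handled by the same code.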
'r' record:
- Upon resuming a job, an 'r' record shall be written to the accounting logs.
Example:
04/04/2020 00:54:43;r;1002.pbsserver;
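With both record types in the log, the time a job spent suspended can be derived by pairing each 'z' record with the next 'r' record for the same job. The sketch below is illustrative only: suspended_seconds is a hypothetical helper, it assumes every line has the four semicolon-separated fields shown in the examples, and it treats two of the sample records for job 1002 as one suspend/resume pair purely for demonstration.

```python
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

def suspended_seconds(lines):
    """Sum, per job, the time between each 'z' record and the next 'r' record."""
    suspended_at = {}  # job id -> time of the pending 'z' record
    totals = {}        # job id -> accumulated suspended seconds
    for line in lines:
        stamp, rec_type, job_id, _message = line.rstrip("\n").split(";", 3)
        when = datetime.strptime(stamp, TIME_FORMAT)
        if rec_type == "z":
            suspended_at[job_id] = when
        elif rec_type == "r" and job_id in suspended_at:
            delta = when - suspended_at.pop(job_id)
            totals[job_id] = totals.get(job_id, 0.0) + delta.total_seconds()
    return totals

sample = [
    "04/04/2020 00:52:45;z;1002.pbsserver;resources_used.ncpus=1 resources_used.walltime=00:00:00",
    "04/04/2020 00:54:43;r;1002.pbsserver;",
]
print(suspended_seconds(sample))
# prints: {'1002.pbsserver': 118.0}
```

An 'r' record with no preceding 'z' record for the same job is ignored rather than treated as an error, which keeps the sketch usable on a log that starts mid-suspension.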