Enhancing accounting logs for job lifecycle
Links
- Link to discussion on Developer Forum: http://community.pbspro.org/t/new-accounting-records-to-track-the-jobs-lifecycle/1794
Overview
Several pieces of data needed to see the full lifecycle of a job are not written to the accounting log. Currently the main source of data about a job is the 'E' record, which describes the job after it has run. If anything changed during the life of the job, that change is not recorded.
Technical Details
Enhanced 'Q' record
- The current 'Q' record only prints the queue the job is going into
- Since the 'Q' record is the first record of the job, it should contain more information about what the job was submitted with:
- queue (existing)
- user
- group
- project
- Account_Name (shown as "account")
account (CSA accounting string)
- jobname
- reservation ID of the reservation the job is in (shown as "resvID")
- reservation name (shown as "resvname")
- ctime
- qtime
- etime (will be 0 for a routing queue)
- dependencies (shown as "depend")
- Resource_List resources including select and place
- array indices submitted for job arrays (shown as "array_indices")
- Since the 'Q' record is also printed when jobs are moved between queues (including when routing queues route jobs between queues), each such record will contain the same fields, but with the most up-to-date values.
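The field list above can be sketched as a small formatter. This is only an illustration, not the server's implementation: the `job` dictionary layout and the `format_q_record` name are hypothetical, the relative position of fields that do not appear in the example record in this document (account, resvID, resvname, array_indices) is an assumption, and so is the alphabetical ordering of resources.

```python
# Hypothetical sketch: build the message part of an enhanced 'Q' record.
# Field order follows the example record in this document where it is
# shown; the placement of the remaining fields is an assumption.
Q_FIELDS = [
    "user", "group", "project", "account", "jobname", "queue",
    "resvID", "resvname", "ctime", "qtime", "etime",
    "depend", "array_indices",
]


def format_q_record(job):
    """Return 'key=value' pairs for every field that is set on the job."""
    parts = [f"{key}={job[key]}" for key in Q_FIELDS if job.get(key) is not None]
    # Resources are emitted as Resource_List.<name>=<value>; sorting by
    # name is an assumption made here to keep the output deterministic.
    for name, value in sorted(job.get("Resource_List", {}).items()):
        parts.append(f"Resource_List.{name}={value}")
    return " ".join(parts)
```

Fields that are not set on the job (for example resvID for a job outside a reservation) are simply omitted in this sketch.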
New 'a' record when a job is altered
- Whenever a job attribute is altered, whether via qalter or a server hook, an 'a' record will be emitted to the accounting log.
- The record will consist of keyword-value pairs of the form attribute name '=' attribute value.
- The attribute value is as you would see it in qstat -f (minus the line wrapping) or 'UNSET' if the value is being unset.
- If unsetting a value causes it to return to a default, the default will be printed instead of UNSET.
- Due to the repetitive nature of the scheduling cycle, alterations made by the scheduler will not be logged.
- This includes:
- Modification of walltime for STF jobs
- pset
- Any estimated resource (e.g. start_time)
- Job's comment
- accrue_type
- ptime (job attribute used by the scheduler for preemption)
- Any attribute set internally by the server will not be logged. These are mostly read-only PBS attributes.
- The job's resources_used values are not logged when they are updated by MoM.
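The rules above for the 'a' record can be sketched as follows. This is a minimal illustration under stated assumptions, not the server code: the function name, the `changes` and `defaults` mappings, and the `requester` argument are all hypothetical, and a value of `None` is used here to stand for "unset".

```python
UNSET = "UNSET"
SCHEDULER = "scheduler"  # hypothetical label for scheduler-originated alters


def format_alter_record(changes, defaults=None, requester="qalter"):
    """Return the message part of an 'a' record, or None if nothing is logged.

    changes  : attribute name -> new value, or None when being unset
    defaults : attribute name -> default value the attribute falls back to
    """
    # Alters coming from the scheduler are not logged at all.
    if requester == SCHEDULER:
        return None
    defaults = defaults or {}
    parts = []
    for name, value in changes.items():
        if value is None:
            # Print the default the attribute returns to, else UNSET.
            value = defaults.get(name, UNSET)
        parts.append(f"{name}={value}")
    return " ".join(parts)
```

For example, `format_alter_record({"Resource_List.walltime": "1:00:00"})` yields `Resource_List.walltime=1:00:00`, matching the form of the qalter example in this document.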
Examples
% qsub -lselect=1:ncpus=1 -l place=pack -Wdepend=afterany:1008
10/22/2019 12:19:37;Q;1009.mars;user=bmann group=staff project=_pbs_project_default jobname=STDIN queue=workq ctime=1571771977 qtime=1571771977 etime=0 depend=afterany:1008.mars@mars Resource_List.ncpus=1 Resource_List.nodect=1 Resource_List.place=pack Resource_List.select=1:ncpus=1
% qalter -lwalltime=1:00:00 1009
10/22/2019 12:20:19;a;1009.mars;Resource_List.walltime=1:00:00
% qalter -Wdepend=afterany:1008,afterok:1013 1009
10/22/2019 12:20:19;a;1009.mars;depend=afterany:1008.mars@mars,afterok:1013.mars@mars
% qalter -lmin_walltime=1:00 -l max_walltime=1:00:00 1009
10/22/2019 12:22:57;a;1009.mars;Resource_List.min_walltime=00:01:00 Resource_List.max_walltime=01:00:00
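A tool that consumes these records might split them as in the sketch below. It assumes the layout seen in the examples above (timestamp, record type, job ID and message separated by semicolons, with space-separated key=value pairs in the message). The `parse_record` name is hypothetical, and values that themselves contain spaces are not handled.

```python
from datetime import datetime


def parse_record(line):
    """Split one accounting log line into its parts (illustrative only)."""
    # Only the first three ';' are separators; the message may contain more.
    timestamp, rtype, job_id, message = line.rstrip("\n").split(";", 3)
    attrs = {}
    for token in message.split():
        # Split on the first '=' only, so a value such as
        # select=1:ncpus=1 keeps its embedded '='.
        key, _, value = token.partition("=")
        attrs[key] = value
    return {
        "time": datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S"),
        "type": rtype,
        "id": job_id,
        "attrs": attrs,
    }
```

Collecting the 'Q' record and every following 'a' record for one job ID, in timestamp order, would then give the job's attribute history before its 'E' record.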