Support in PTL for deletion of large number of jobs
Overview
PTL performance tests run a large number of jobs. The tearDown() of these tests, or the setUp() of subsequent tests, cleans up by deleting these jobs with qdel. This qdel operation takes a very long time, and consequently the tests time out. The delay arises because the scheduler is constantly cycling and may tell the server to start the very jobs we are trying to delete. This keeps the server busy, and the delete requests must wait until the server finishes talking to the scheduler at the start of the scheduling cycle. In addition, terminating a running job triggers a lot of MoM and server<->MoM activity, which adds further delay.
- Discussion forum: http://community.pbspro.org/t/add-support-in-ptl-to-speed-up-deletion-of-large-number-of-jobs/1400
- Pull request: https://github.com/PBSPro/pbspro/pull/973
The changes proposed to support deletion of a large number of jobs in PTL are listed below.
Interface addition
The following new interface will be added.
In fw/ptl/lib/pbs_testlib.py,
- Interface: _cleanup_large_num_jobs(job_ids=None, runas=None)
Visibility: Private
Synopsis: Helper function to delete a large number of jobs. It will be called from cleanup_jobs() if the number of jobs to delete is greater than 100.
Details: This function will get the process IDs of the running jobs and kill them directly. It will then delete all jobs from the server using 'qdel -Wforce'.
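The two steps described in the Details above can be sketched as follows. This is a minimal illustration, not the PTL implementation: plan_large_cleanup() and kill_pids() are hypothetical names, the job records are assumed to be a plain dict of job id to attributes, and taking the process ID from the job's session_id attribute is an assumption. The sketch also kills processes on the local host only, whereas real jobs may run on remote MoM hosts.

```python
import os
import signal


def plan_large_cleanup(jobs):
    """Return (pids_to_kill, qdel_argv) for a mapping of job id -> attributes.

    Assumption: a running job has job_state 'R' and exposes its top-level
    process through a 'session_id' attribute.
    """
    pids = []
    for attrs in jobs.values():
        if attrs.get("job_state") == "R" and attrs.get("session_id"):
            pids.append(int(attrs["session_id"]))
    # All jobs, running or not, are then removed from the server in one call.
    qdel_argv = ["qdel", "-Wforce"] + list(jobs)
    return pids, qdel_argv


def kill_pids(pids, kill=os.kill):
    """Send SIGKILL to each pid, skipping processes that have already exited.

    'kill' is injectable so the logic can be exercised without real processes.
    """
    killed = []
    for pid in pids:
        try:
            kill(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            pass  # the job's process is already gone
    return killed
```

Building the qdel argument list separately from running it keeps the sketch free of side effects; a caller would pass qdel_argv to whatever command runner the framework provides.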
Interface updates
The following updates will be made to existing interfaces.
In fw/ptl/lib/pbs_testlib.py,
- Interface: cleanup_jobs(extend=None, runas=None)
Synopsis: Updated to handle deletion of a large number of jobs
Details: This method is updated to delete a large number of jobs.
- Scheduling will be turned 'off' before job deletion and turned back 'on' before exiting.
- If the number of jobs to delete is 100 or fewer, it uses qdel. If the number is greater than 100, it calls _cleanup_large_num_jobs().
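The control flow described above might look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual PTL code: the function takes the job list and three callables (set_scheduling, qdel, cleanup_large) as hypothetical parameters so that it runs standalone, whereas the real cleanup_jobs(extend=None, runas=None) is a method with a different signature. The threshold follows the "greater than 100" wording used for _cleanup_large_num_jobs.

```python
LARGE_JOB_THRESHOLD = 100  # more than this many jobs takes the bulk path


def cleanup_jobs(job_ids, set_scheduling, qdel, cleanup_large):
    """Delete job_ids with scheduling paused and return the path taken.

    All three callables are hypothetical stand-ins for framework operations:
    set_scheduling(bool) toggles the server's scheduling attribute,
    qdel(job_ids) deletes jobs normally, and cleanup_large(job_ids)
    plays the role of _cleanup_large_num_jobs().
    """
    if not job_ids:
        return "none"
    # Pause scheduling so the scheduler cannot start jobs mid-deletion.
    set_scheduling(False)
    try:
        if len(job_ids) > LARGE_JOB_THRESHOLD:
            cleanup_large(job_ids)
            path = "bulk"
        else:
            qdel(job_ids)
            path = "qdel"
    finally:
        # Restore scheduling even if deletion raised an error.
        set_scheduling(True)
    return path
```

Using try/finally is a design choice of this sketch: it ensures scheduling is turned back on before exiting even when a deletion fails, so a failed cleanup does not leave later tests with scheduling disabled.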
In fw/ptl/utils/pbs_testsuite.py,
- Interface: tearDown()
- Synopsis: Updated to handle job deletion at the end of test execution
- Details: tearDown() will call cleanup_jobs() to make sure jobs are deleted at the end of every test execution
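As a rough illustration of the tearDown() behaviour, the sketch below uses the standard unittest module with a stand-in server. FakeServer, CleanupTestCase, and ExampleTest are hypothetical names invented for this example; the real PTL base test class and server object are considerably richer.

```python
import unittest


class FakeServer:
    """Hypothetical stand-in for the PTL server object."""

    def __init__(self):
        self.jobs = []

    def submit(self, name):
        self.jobs.append(name)

    def cleanup_jobs(self):
        # The real method would pause scheduling and delete jobs via qdel.
        self.jobs.clear()


class CleanupTestCase(unittest.TestCase):
    """Base class whose tearDown() always removes leftover jobs."""

    server = FakeServer()

    def tearDown(self):
        self.server.cleanup_jobs()


class ExampleTest(CleanupTestCase):
    def test_submit(self):
        self.server.submit("job-1")
        self.assertEqual(len(self.server.jobs), 1)
```

Because unittest runs tearDown() after each test method whenever setUp() succeeded, regardless of whether the test passed, no jobs from one test remain queued when the next test starts.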