Analysis: Build PBS using CMake
Links
Link to discussion on Developer Forum: https://community.openpbs.org/t/analysis-build-pbs-using-cmake/2348
Link to dev branch: https://github.com/minghui-liu/openpbs/tree/cmake
Overview
This page documents an exploratory project to build PBS using CMake. It begins with a brief introduction to CMake, followed by a comparison of CMake with the build tools we use today. Finally, it details the changes that we need to make to build PBS using CMake.
What is CMake?
CMake is an open source system for managing the build and packaging process of software in a compiler-independent manner. It is designed to generate build files (makefiles on Unix/Linux, projects and solutions for Microsoft Visual Studio on Windows) that are then used with the native build tools. Once integrated with PBS, it will replace autotools, which we currently use. It also ships with a packaging tool, CPack, that generates packages for different Unix/Linux distributions and even Windows installers. For more information, please read the CMake website: https://cmake.org/overview/
CMake vs Autotools
| | Autotools | CMake |
|---|---|---|
| Supported Platforms | Autotools does not have good support for systems other than Unix/Linux. It assumes make and other GNU tools, relies heavily on shell scripting, and follows GNU coding standards. On Windows it requires a Unix-like environment such as Cygwin. | CMake is cross-platform and compiler-independent. It uses native tools and generates native build files on each platform: makefiles on Unix/Linux, Visual Studio projects on Windows, and Xcode projects on Mac OS X. CMake can also generate project files for other common IDEs such as Eclipse and Code::Blocks. |
| Language | Autotools uses the m4 language and shell scripts. The syntax is not very readable, but most developers already know shell scripting. | CMake uses its own configuration language, which is more readable and maintainable than m4 and shell scripts. However, it is a new language that developers need to learn. |
| Headers and Library Functions | Autotools wants us to list all required headers and library functions in the configure.ac file. | CMake provides built-in modules for checking headers and symbols, and find_package() for locating common dependencies, so fewer checks need to be written by hand. |
| Popularity | Autotools is declining in popularity. | More C/C++ projects are moving to CMake. CMake modules are easy to find and reuse from other projects. |
| Interface | Autotools is command line only. | CMake has colored command line output, progress reporting, and interactive tools (ccmake and cmake-gui). |
Summary of Changes
To integrate CMake we need the following changes:
Create a CMakeLists.txt file under each directory of PBS
Create a .cmake directory for custom CMake modules
Modify all files with .in extension, and replace configurable values with CMake style variable placeholders (@VARIABLE@)
Modify RPM scripts (pbs_postinstall, pbs_preuninstall, etc.) to accommodate CPack
Remove autotools files: configure.ac, autogen.sh, .m4 files, Makefile.am and Makefile.in
Remove RPM spec file (openpbs.spec.in)
Build PBS using CMake
CMake uses a hierarchy of configuration files named CMakeLists.txt (the file name is case-sensitive). One is created under each directory of PBS that contains something to be built or installed. A higher-level CMakeLists.txt file includes its subdirectories by using the add_subdirectory() function. Here is an example of our root-level CMakeLists.txt including the src and doc directories.
# add the src and doc subdirectories
add_subdirectory(src)
add_subdirectory(doc)
Variables and Options
CMake uses its own language that consists of functions (CMake calls them "commands") and variables. Variables are defined with the set() function. For example, in the top-level CMakeLists.txt file, we create a variable of string type that stores the home directory of PBS, with a default value and a description string. The CACHE keyword tells CMake to cache this variable between builds. Cached variables are stored in a file named CMakeCache.txt under the build directory.
set(PBS_HOME_DIR /var/spool/pbs CACHE STRING "PBS home directory")
Options are boolean variables that can be turned ON or OFF. They are usually used to enable or disable features, by turning them into C preprocessor definitions and using #ifdef in the source code to compile features conditionally. For example, in the top-level CMakeLists.txt we have an option to enable PTL with a default value of ON.
option(ENABLE_PTL "Enable PTL" ON)
Both variables and options can be overridden through the command line or graphical tools before a build.
Dependency Packages
PBS requires a number of packages to be installed on the system before it can be built. To find these libraries and packages we use the find_package() function and give it the name of the package. CMake has a list of common packages it knows how to find. It achieves this with built-in CMake modules, each one responsible for finding a package. However, some of our dependencies, such as UndoLR, Editline and Ical, are not on this list. Therefore we created our own CMake modules to find them. These custom modules are placed under the .cmake directory of PBS.
CMake can find some packages by component. Here we require both the interpreter (python3) and the development (python3-devel) package of Python3.
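A minimal sketch of such a call is shown below. It assumes CMake 3.12 or newer, where the built-in FindPython3 module is available; the REQUIRED keyword and the status messages are illustrative additions rather than part of the PBS build files.

```cmake
# Sketch: request the interpreter and the development files (headers and
# libraries) of Python 3 as separate components. Assumes CMake >= 3.12.
find_package(Python3 REQUIRED COMPONENTS Interpreter Development)

# Illustrative only: report what was found.
message(STATUS "Python3 interpreter:  ${Python3_EXECUTABLE}")
message(STATUS "Python3 include dirs: ${Python3_INCLUDE_DIRS}")
message(STATUS "Python3 libraries:    ${Python3_LIBRARIES}")
```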
The find_package() function and its modules store their results in variables. By convention these variables are prefixed with the name of the package. For example, after we tell CMake to find Python3, variables including Python3_FOUND, Python3_INCLUDE_DIRS and Python3_LIBRARIES will be set. The first records whether Python3 was found; the latter two contain the paths to the Python3 header files and libraries. These variables will be useful later when we conditionally compile Python features and need to link to Python3.
System Introspection
Some PBS features require certain system capabilities. CMake offers many ways to inspect the system for features, such as checking whether a header file exists on the system, or whether a symbol exists in a header file. We can use them to find out whether features that PBS depends on, such as eventfd and epoll, are available on the system. Below is an example of checking for sys/eventfd.h and storing the result in the HAVE_SYS_EVENTFD_H variable.
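A sketch of this check, using CMake's built-in CheckIncludeFile module (the exact form used on the development branch may differ):

```cmake
# Load the module that provides check_include_file().
include(CheckIncludeFile)

# Sets HAVE_SYS_EVENTFD_H to a true value if <sys/eventfd.h> can be included.
# The result is cached, so the check runs only on the first configure.
check_include_file("sys/eventfd.h" HAVE_SYS_EVENTFD_H)
```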
And below is how we figure out whether the system has epoll, by checking for epoll_create and epoll_pwait in the sys/epoll.h header and storing the results in PBS_HAVE_EPOLL and PBS_HAVE_EPOLL_PWAIT.
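A sketch of these two checks, using the built-in CheckSymbolExists module (again, the development branch may phrase them differently):

```cmake
# Load the module that provides check_symbol_exists().
include(CheckSymbolExists)

# Each call compiles and links a small test program that includes the given
# header and references the symbol; the result variable is cached.
check_symbol_exists(epoll_create "sys/epoll.h" PBS_HAVE_EPOLL)
check_symbol_exists(epoll_pwait  "sys/epoll.h" PBS_HAVE_EPOLL_PWAIT)
```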
check_type_size() lets us check whether a type exists and, if it does, what its size is.
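As a rough illustration, the sketch below checks two types. The types and result variable names are chosen for the example only and are not taken from the PBS build files.

```cmake
include(CheckTypeSize)

# sys/types.h, stdint.h and stddef.h are included automatically when present.
# On success SIZEOF_SSIZE_T holds the size in bytes and HAVE_SIZEOF_SSIZE_T
# is true.
check_type_size("ssize_t" SIZEOF_SSIZE_T)

# Types declared in other headers need those headers listed explicitly.
set(CMAKE_EXTRA_INCLUDE_FILES "sys/socket.h")
check_type_size("socklen_t" SIZEOF_SOCKLEN_T)
unset(CMAKE_EXTRA_INCLUDE_FILES)

if(NOT HAVE_SIZEOF_SOCKLEN_T)
    message(STATUS "socklen_t is not available on this system")
endif()
```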
If an OS feature cannot be determined from header files and symbols, CMake also allows us to check for it by testing whether a block of C code compiles or runs. For example, here is another way of checking whether the system has epoll and storing the result in PBS_HAVE_EPOLL.
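A sketch of the compile-based variant, using the built-in CheckCSourceCompiles module. The test program is an assumption written for this example. Note that the result is cached, so if PBS_HAVE_EPOLL was already set by an earlier check in the same build directory, this check is skipped.

```cmake
include(CheckCSourceCompiles)

# The snippet must compile and link for PBS_HAVE_EPOLL to be set to true.
check_c_source_compiles("
#include <sys/epoll.h>
int main(void)
{
    int fd = epoll_create(1);
    return fd < 0;
}
" PBS_HAVE_EPOLL)
```

check_c_source_runs(), from the CheckCSourceRuns module, takes the same arguments but also executes the resulting program, which is useful when the behaviour matters and not just the declaration.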
All of these checks are placed in src/include/CMakeLists.txt.
File Generation
With all the information about packages and system features stored in cmake variables, we can now use them to generate headers, C source files and other files that are needed to build PBS.
For generating header files, CMake offers two handy directives, #cmakedefine and #cmakedefine01, which are written in the input file and turn CMake variables and options into C preprocessor definitions. #cmakedefine becomes a #define if the CMake variable has a non-false value, and a commented-out #undef otherwise. #cmakedefine01 always emits a #define but gives it the value 1 or 0 depending on whether the CMake variable is set to a non-false value. For example, in pbs_config.h.in:
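The contents of pbs_config.h.in are not reproduced here. As a stand-in, the following self-contained script sketches the behaviour of both directives; it can be run with cmake -P. The file names demo.cmake, demo_config.h.in and demo_config.h are hypothetical, and the two variables are set by hand instead of by real system checks.

```cmake
# demo.cmake (hypothetical name); run with: cmake -P demo.cmake
set(HAVE_SYS_EVENTFD_H 1)   # pretend the header check succeeded
set(PBS_HAVE_EPOLL OFF)     # pretend the epoll check failed

# Write a tiny template next to this script (a stand-in for pbs_config.h.in).
set(template "${CMAKE_CURRENT_LIST_DIR}/demo_config.h.in")
set(header   "${CMAKE_CURRENT_LIST_DIR}/demo_config.h")
file(WRITE "${template}" [=[
#cmakedefine HAVE_SYS_EVENTFD_H
#cmakedefine01 PBS_HAVE_EPOLL
]=])

# Process the template into the header.
configure_file("${template}" "${header}")

# The generated header now contains:
#   #define HAVE_SYS_EVENTFD_H
#   #define PBS_HAVE_EPOLL 0
file(READ "${header}" generated)
message("${generated}")
```

Had HAVE_SYS_EVENTFD_H been false, its line would have been written as /* #undef HAVE_SYS_EVENTFD_H */ instead.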
For other types of files, and for using the values of CMake variables in files directly, CMake has two formats of variable placeholders, @VARIABLE@ and ${VARIABLE}. The first one is preferred because the second one conflicts with shell script variables.
To generate files (CMake calls this step "configure") and replace CMake variables with their values, we can use the configure_file() function in the CMakeLists.txt file under the same directory as the file to be generated. To tell CMake to ignore the ${VARIABLE} format we can append @ONLY at the end. For example, in src/include/CMakeLists.txt we ask CMake to generate pbs_config.h from pbs_config.h.in, and only replace variables in the @VARIABLE@ format.
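A sketch of that call is shown below. Relative paths would work as well, since configure_file() resolves the input against the current source directory and the output against the current binary directory; they are spelled out here only for clarity.

```cmake
# In src/include/CMakeLists.txt:
# generate pbs_config.h in the binary tree from the template in the source
# tree, replacing only @VARIABLE@ placeholders and leaving ${VARIABLE} alone.
configure_file(
    ${CMAKE_CURRENT_SOURCE_DIR}/pbs_config.h.in
    ${CMAKE_CURRENT_BINARY_DIR}/pbs_config.h
    @ONLY)
```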
Besides generating files by variable replacement from an input file, CMake also supports custom file generation by using add_custom_command() and add_custom_target(). We need to supply the dependency (input file), the output file name and the command for generating the file.
We use this to generate the libattr files from XML attribute definitions using attr_parser.py. Instead of calling add_custom_command() for each attribute definition file, a function named gen_attrdef_file is created and called with the file names to avoid repetition.
The example below is how we generate job_attr_def.c from master_job_attr_def.xml by using buildutils/attr_parser.py.
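A sketch of such a rule follows. The command-line options of attr_parser.py are not shown in this document, so they are kept in a hypothetical placeholder variable, ATTR_PARSER_ARGS, that would have to be filled in with the real options naming the input and output files. The target name gen_job_attr_def is also an assumption, as is the use of Python3_EXECUTABLE from find_package(Python3 ... Interpreter).

```cmake
set(attr_xml    ${CMAKE_CURRENT_SOURCE_DIR}/master_job_attr_def.xml)
set(attr_parser ${CMAKE_SOURCE_DIR}/buildutils/attr_parser.py)
set(attr_out    ${CMAKE_CURRENT_BINARY_DIR}/job_attr_def.c)

# Re-run the parser whenever the XML definition or the parser itself changes.
# ATTR_PARSER_ARGS is a placeholder for the script's real options.
add_custom_command(
    OUTPUT  ${attr_out}
    COMMAND ${Python3_EXECUTABLE} ${attr_parser} ${ATTR_PARSER_ARGS}
    DEPENDS ${attr_xml} ${attr_parser}
    COMMENT "Generating job_attr_def.c"
    VERBATIM)

# A custom command only runs if something depends on its output; this target
# (or a library that lists job_attr_def.c as a source) provides that.
add_custom_target(gen_job_attr_def DEPENDS ${attr_out})
```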
Since we do this for all the attribute definition files under src/lib/libattr, we wrap the command in a function to avoid repeating it.
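One way such a function could look is sketched below. Only the name gen_attrdef_file comes from this page; its parameter list is an assumption, and any extra arguments are forwarded to attr_parser.py because the script's real options are not documented here.

```cmake
# gen_attrdef_file(<xml file> <generated file> [parser options...])
# Hypothetical signature: extra arguments (ARGN) are passed to the parser.
function(gen_attrdef_file xml_name out_name)
    set(parser ${CMAKE_SOURCE_DIR}/buildutils/attr_parser.py)
    add_custom_command(
        OUTPUT  ${CMAKE_CURRENT_BINARY_DIR}/${out_name}
        COMMAND ${Python3_EXECUTABLE} ${parser} ${ARGN}
        DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${xml_name} ${parser}
        COMMENT "Generating ${out_name} from ${xml_name}"
        VERBATIM)
endfunction()

# One call per attribute definition file; the other files follow the same
# pattern.
gen_attrdef_file(master_job_attr_def.xml job_attr_def.c)
```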
Note that CMake does not place generated files in the source directory. Instead it creates an output tree (the binary tree, in CMake terms) under your build folder, where all generated files and built executables and libraries go. So if you called cmake .. from a folder called build, CMake will write the output of src/include/pbs_config.h.in to build/src/include/pbs_config.h. To distinguish between the source tree and the binary tree, we can use the ${CMAKE_SOURCE_DIR} and ${CMAKE_BINARY_DIR} prefixes.
Now we can create a build directory, cd into it, and run cmake [path to root dir of PBS] from there. You should see the generated CMake files and the binary tree.
Targets
With all the required files generated, we list all the targets that need to be built. A target can be anything; it is simply the build system's abstraction for tracking dependencies. Typically, though, we list libraries and executables as targets. The example below shows how the static library liblog is built, and how include directory and linker information is given to CMake. Note that CMake automatically prepends the lib prefix to a library name.
add_library() tells CMake about a library target and its source files. target_include_directories() tells CMake its include directories (where to look for header files; think of the -I option of gcc), and target_link_libraries() tells CMake which libraries the target links against (think of the -l option of gcc). PRIVATE and PUBLIC are inheritance keywords (there is a third one, INTERFACE). They determine how a target's properties transfer to other targets that link to it: PRIVATE applies only to the target itself, INTERFACE only to the targets that link to it, and PUBLIC to both.
Executable targets are created by using the add_executable() function in a similar way. Here is how we build pbs_probe in src/tools/CMakeLists.txt.
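A sketch of the pbs_probe target, assuming a single source file named pbs_probe.c and assuming that COMMON_INCLUDES and COMMON_LIBS have been set earlier in the same file:

```cmake
# In src/tools/CMakeLists.txt (source file name is an assumption).
add_executable(pbs_probe pbs_probe.c)

# Header search paths: shared tool includes plus the Python 3 headers.
target_include_directories(pbs_probe PRIVATE
    ${COMMON_INCLUDES}
    ${Python3_INCLUDE_DIRS})

# Libraries to link: shared tool libraries plus the Python 3 library.
target_link_libraries(pbs_probe PRIVATE
    ${COMMON_LIBS}
    ${Python3_LIBRARIES})
```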
In this example we use variables instead of hardcoded paths. The ${Python3_INCLUDE_DIRS} and ${Python3_LIBRARIES} variables come from find_package(Python3 ...). ${COMMON_INCLUDES} and ${COMMON_LIBS} are two variables we created that contain the lists of include directories and libraries that all executables under src/tools share.
After this step, we will be able to run cmake and then make (on Linux) to build PBS.
Install Rule
Now that library and executable targets are listed, we need to decide how they are installed. The install rules for targets and files are given to CMake by the install() function. These decide where things get installed and what permissions they are given. They apply both to a manual install (make install on Unix/Linux) and to a package install.
As an example, here we install binaries under src/tools to sbin directory.
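A sketch of such a rule, assuming the GNUInstallDirs module is used to provide the sbin path and using pbs_probe as the only listed target (the real rule presumably lists every tool):

```cmake
# Provides CMAKE_INSTALL_SBINDIR, CMAKE_INSTALL_LIBDIR, etc.
include(GNUInstallDirs)

# Install the executable to ${CMAKE_INSTALL_PREFIX}/sbin.
install(TARGETS pbs_probe
    RUNTIME DESTINATION ${CMAKE_INSTALL_SBINDIR})
```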
DESTINATION can be a relative path, an absolute path or a variable. Here we use the GNU install path variables as relative paths. They are string variables that contain relative paths for different types of files, such as doc, lib, sbin, etc. For a comprehensive list of the available GNU install directory variables, please see the documentation of CMake's GNUInstallDirs module. All relative paths are relative to ${CMAKE_INSTALL_PREFIX}, which is set to a default value of /opt/pbs in our top-level CMakeLists.txt.
Besides targets, install() can be used to install files. There are different file type keywords, such as FILES for normal non-executable files, PROGRAMS for executable files, and DIRECTORY for directories. Here is an example of how we install shell scripts under src/cmds/scripts to ${CMAKE_INSTALL_PREFIX}/sbin. Because we used the PROGRAMS keyword, CMake will install these scripts with executable permission.
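A sketch of such a rule. The script names are hypothetical placeholders, since the actual list of scripts is not given on this page.

```cmake
include(GNUInstallDirs)

# PROGRAMS installs with owner/group/world execute permission by default.
# The file names below are placeholders for the real scripts.
install(PROGRAMS
    example_script_1
    example_script_2
    DESTINATION ${CMAKE_INSTALL_SBINDIR})
```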
In the example below we install modulefile to ${CMAKE_INSTALL_PREFIX}/etc.
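A sketch of this rule, assuming modulefile is itself generated into the current binary directory by configure_file(); if it is a plain source file, the path would point at the source tree instead.

```cmake
# FILES installs without execute permission. A relative DESTINATION is
# resolved against ${CMAKE_INSTALL_PREFIX}, giving <prefix>/etc here.
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/modulefile
    DESTINATION etc)
```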
At this point, we can install PBS by running make install under the build directory.
Packaging
CMake has a package generation tool named CPack that can generate packages in multiple formats at once. To use CPack we need to give CMake information about packaging by setting variables. There are variables generic to all package generators, and generator-specific variables. For example, below are generic variables such as package name, vendor, version, description, etc. At the end we set the default list of package formats to RPM only.
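A sketch of the generic settings. The variable names are standard CPack variables, but every value shown is a placeholder for illustration, and PROJECT_VERSION assumes that the top-level project() call declares a version.

```cmake
# Generic CPack variables (values are placeholders).
set(CPACK_PACKAGE_NAME "openpbs")
set(CPACK_PACKAGE_VENDOR "Example Vendor")
set(CPACK_PACKAGE_VERSION "${PROJECT_VERSION}")
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "PBS workload manager")

# Default list of package formats: RPM only.
set(CPACK_GENERATOR "RPM")

# include(CPack) must come after all CPACK_* variables have been set,
# including any generator-specific ones.
include(CPack)
```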
The following are variables we set that are specific to the RPM generator.
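A sketch of RPM-specific settings. The variable names are standard CPack RPM generator variables, but which ones PBS actually sets, their values, and the script locations are assumptions made for this example.

```cmake
# RPM-specific CPack variables (values and paths are placeholders).
# These must be set before include(CPack) is called.
set(CPACK_RPM_PACKAGE_LICENSE "Example License")
set(CPACK_RPM_PACKAGE_GROUP "System Environment/Base")
set(CPACK_RPM_PACKAGE_REQUIRES "python3")

# Scriptlets run by rpm after install and before uninstall.
set(CPACK_RPM_POST_INSTALL_SCRIPT_FILE
    "${CMAKE_BINARY_DIR}/pbs_postinstall")
set(CPACK_RPM_PRE_UNINSTALL_SCRIPT_FILE
    "${CMAKE_BINARY_DIR}/pbs_preuninstall")
```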
To generate packages, call cpack from the build directory, or run make package or cmake --build . --target package. To generate the source package, call cmake --build . --target package_source or make package_source.
Demo
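If you want to try building PBS using CMake yourself, please git clone the development branch (linked at the top of this page) and follow the steps described above: create a build directory, run cmake against the PBS root directory from inside it, run make to build, and then run make install or cpack as needed.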
Future Work
Complete all PBS build options and system feature tests
Correctly mark all target attributes with inheritance keywords. Currently most of them are marked as PRIVATE.
Generate multiple RPM packages. Currently only a single RPM package that contains all PBS files is generated. We need to split it into server, client, execution, devel and ptl packages.
Generate packages in more formats, such as DEB and Windows installer.
Test on more platforms. Currently building PBS using CMake is only tested on CentOS 7.