Following the kernel #2 – Kernel Build System

maj 28, 2021 Following the kernel, Linux

Following the kernel #2 – Kernel Build System

Multi-platform projects like Linux or U-Boot need a flexible tool to configure all the conditional stuff inside and run the build in a concise form. Our series is long before going into Linux code. However, I will describe Kbuild system in the example of Linux source code. This is the original version without any modifications. If you want to see it I recommend you clone the repository given in the previous part. The way it works is the same on U-Boot, but it may differ at some points.

Makefile scripts show how things can be made complicated, but it is a standard living much longer than I do and it does the job for all these years. Their strength is that behind the enigmatic form, hides simple shell invocations, which are very flexible. I expect from you basic knowledge in this topic.

scripts/Makefile.build

The central place of Kbuild is the Makefile.build script, placed under scripts directory. It is a smart, handy helper which is generically used in every source compilation. It gathers configuration and accepts one parameter obj, which is a path to a directory with a source code to build.

If you want to build some kernel module in Kbuild, just call make -f scripts/Makefile.build obj=path/to/kernel/module, or with Kbuild helper variable make $(build)=path/to/kernel/module. Of course, it is a simplification. I haven’t mentioned about configuration step, but this is a general idea.

# Init all relevant variables used in kbuild files so
# 1) they have correct type
# 2) they do not inherit any value from the environment
obj-y :=
obj-m :=
lib-y :=
lib-m :=
always :=
always-y :=
always-m :=
targets :=
subdir-y :=
subdir-m :=
EXTRA_AFLAGS   :=
EXTRA_CFLAGS   :=
EXTRA_CPPFLAGS :=
EXTRA_LDFLAGS  :=
asflags-y  :=
ccflags-y  :=
cppflags-y :=
ldflags-y  :=

subdir-asflags-y :=
subdir-ccflags-y :=

The first thing done in it is the initialization of build variables. At this point, you may see the modular nature of Linux. The variables with the suffix -y corresponds to objects compiled into the Kernel (in the case of U-Boot everything is compiled and linked into one binary). The suffix -m, as you may expect, corresponds to modules loaded during Linux runtime. These variables tell Makefile.build what should be compiled and eventually, are there any local build options. These variables are supplied by a developer in the local module Makefile script. I will describe it later.

Below this code, there are several inclusions of other files. Now I will briefly describe what they do.

include/config/auto.conf

This one includes generated Makefile-readable configuration. It contains all parameters, which decide, how Linux code should be compiled. At this point I can tell you, that the repository contains a lot of prepared, specific boards configurations, so you don’t have to make it on your own. Usually, you choose one of it, and create small changes, up to your needs.

# Read auto.conf if it exists, otherwise ignore
-include include/config/auto.conf

- before the include clause cause that there is no error if the file doesn’t exist. In the bare, cloned repository, there is no include/config directory, since we didn’t specify the configuration. More info about it I will write later in this article. Let’s assume, that configuration is right there. Below you can find a sample configuration, taken from the U-Boot repository:

#
# Automatically generated file; DO NOT EDIT.
# U-Boot 2021.04-rc1 Configuration
#
CONFIG_ENV_SUPPORT=y
CONFIG_DISPLAY_BOARDINFO=y
CONFIG_CMD_BOOTM=y
CONFIG_ENV_FAT_FILE="uboot.env"
CONFIG_CMD_EXT4=y
CONFIG_SYS_OMAP24_I2C_SPEED=100000
scripts/Kbuild.include

Now you see, that customization stuff is accessible at the early beginning of Makefile analysis. Right after that, we can see the next inclusion. It contains Kbuild-specific Makefile helper definitions.

include scripts/Kbuild.include

If you are experienced, using Makefile scripts, you may see that it’s problematic with special character usage. The first thing done in this included script is defining such characters as variables:

# Convenient variables
comma   := ,
quote   := "
squote  := '
empty   :=
space   := $(empty) $(empty)
space_escape := _-_SPACE_-_
pound := \#

For example, comma sign is used by Makefile in many functions, such as subst or call, as an argument separator. But if it is placed in a statement as a variable $(comma), it is only substituted by Makefile without interpreting it as a special character. After that, there are some other helpers, commonly used in Kbuild scripts.

I will briefly describe more complicated ones. The first of them is filechk. It is quite complex, but let’s puzzle it out:

define filechk
	$(Q)set -e;						\
	mkdir -p $(dir $@);					\
	trap "rm -f $(dot-target).tmp" EXIT;			\
	{ $(filechk_$(1)); } > $(dot-target).tmp;		\
	if [ ! -r $@ ] || ! cmp -s $@ $(dot-target).tmp; then	\
		$(kecho) '  UPD     $@';			\
		mv -f $(dot-target).tmp $@;			\
	fi
endef

define directive creates Makefile macro, which can be called in this form:

filechk_sample = echo $(KERNELRELEASE)
version.h: FORCE
	$(call filechk,sample)

Pay attention to the ; \ signs in the filechk definition. Everything is run in one /bin/sh invocation (every recipe line in Makefile calls a separate shell with a command supplied there if there is a single line, everything is called in one shell environment). $(Q) variables are optional and say if the command should be written to the stdout. set -e cause immediate exit on any command failure within this function. mkdir -p generates the target directory and the parameter -p suppresses error if the directory already exists.

trap function might be new for you. It is the exit handler installed in our /bin/sh invocation. For example, if we enter CTRL+C combination, during the build, the makefile will be stoped, but before it, the commands between the quotes will be executed –rm -f $(dot-target).tmp.

dot-target is another macro defined in Kbuild.include file. It is quite simple, so I will not tell much on that. Briefly speaking it generates the name of the temporary (prefixed with a dot) file for a target. In our case it is .version.h. If the build is interrupted, we should remove it to allow a clean build, in later make invocation.

Last 4 lines are most important. { $(filechk_$(1)); } > $(dot-target).tmp; executes in recipe the commands defined before. $(1) is the argument passed in call statement. In our example it is sample, so as the effect, our recipe executes filecheck_sample function –echo $(KERNELRELEASE)and redirects it output to the temporary file mentioned before.

Right after that, there is an if statement. It uses Makefile recipe variable $@ which is the target file, in our example it is version.h, to check its existence and accessibility (-r). If the file exists it also compares its content with the generated temporary file. If any of these conditions are true (file not exists or it has different content), the file is updated with new, generated content (mv -f $(dot-target).tmp $@;).

So the filechk macro defines a way of checking the content of a generated file before creating it. It is very useful. If you have ever written more complex Makefile scripts with generating files, you probably noticed, that every generation of such file triggers not necessary builds of other targets, which rely on it – Makefile looks at the last touch timestamp of the file, not its content.

The next function is try-run:

try-run = $(shell set -e;		\
	TMP=$(TMPOUT)/tmp;		\
	TMPO=$(TMPOUT)/tmp.o;		\
	mkdir -p $(TMPOUT);		\
	trap "rm -rf $(TMPOUT)" EXIT;	\
	if ($(1)) >/dev/null 2>&1;	\
	then echo "$(2)";		\
	else echo "$(3)";		\
	fi)

At the beginning, it sets environment variables TMP, TMPO and makes output directory $(TMPOUT) – this variable should be set before try-run call. You are already familiar with trap. The purpose of it is the same, but in this case, it removes the ouput directory on exit. As the first argument, the function gets the command to be executed. If it succeeds, the second argument is outputted. If it files, the third one.

Worth noting here is that TMP and TMPO environment variables are accessible in the wrapped command. They are valid only inside try-run call. A good example of usage is a next definition:

as-option = $(call try-run,\
	$(CC) $(KBUILD_CFLAGS) $(1) -c -x assembler /dev/null -o "$$TMP",$(1),$(2))

It runs the compiler $(CC) with default C-related flags $(KBUILD_CFLAGS) and given as the as-option parameter assembly-related flag $(1). It tells the compiler (-c) to stop at the compilation step (without linking), since we only want to check the passed option. As the name of the functions says, it is related to the assembly, so we tell to the $(CC) that the language is assembler (-x assembler). As the input file /dev/null is given (as I have mentioned it is just the compiler check). The output file is a temporary $$TMP file defined in try-run. We must supply it here with double $$ sign, because otherwise, Makefile would try to resolve it during definition as a variable (like for example $(CC)).

Sample usage of as-option is placed in arch/arm/Makefile script.

AFLAGS_NOWARN	:=$(call as-option,-Wa$(comma)-mno-warn-deprecated,-Wa$(comma)-W)

This statement assigns to variable AFLAGS_NOWARN result of the as-option call. It runs the compiler with the -Wa,-mno-warn-deprecated flag. If it is supported, the -Wa,-W value is assigned to the AFLAGS_NOWARN. Otherwise, the variable remains empty.

Next important part are shorthand definitions, common build statements:

###
# Shorthand for $(Q)$(MAKE) -f scripts/Makefile.build obj=
# Usage:
# $(Q)$(MAKE) $(build)=dir
build := -f $(srctree)/scripts/Makefile.build obj

###
# Shorthand for $(Q)$(MAKE) -f scripts/Makefile.dtbinst obj=
# Usage:
# $(Q)$(MAKE) $(dtbinst)=dir
dtbinst := -f $(srctree)/scripts/Makefile.dtbinst obj

###
# Shorthand for $(Q)$(MAKE) -f scripts/Makefile.clean obj=
# Usage:
# $(Q)$(MAKE) $(clean)=dir
clean := -f $(srctree)/scripts/Makefile.clean obj

When we want to build an object (statically linked or loaded as a module), the make -f scripts/Makefile.build obj=<module_dir> must be called. With the encapsulating build variable it is easier since the call looks like this: make $(build)=<module_dir>. Similar purposes have dtbinst (device-tree related) and clean variables.

I encourage you to read all of these Makefile helpers. They are commonly used around Kbuild scripts, and knowing them lets you read them fluently. I tried to describe a few, so it should be easier to read all others.

Local Kbuild/Makefile

After Kbuild.include helpers, the local Kbuild/Makefile script is included:

# The filename Kbuild has precedence over Makefile
kbuild-dir := $(if $(filter /%,$(src)),$(src),$(srctree)/$(src))
kbuild-file := $(if $(wildcard $(kbuild-dir)/Kbuild),$(kbuild-dir)/Kbuild,$(kbuild-dir)/Makefile)
include $(kbuild-file)

This is the part filled by the module developer. It defines files, which should be built into the kernel (obj-y) or as a loadable module (obj-m). These are the most common definitions. At the beginning of Makefile.build I have shown all variables, which might be supplied here. These are lib- which will be built as statically linked libraries, always- are targets, which are – as the name says – are always triggered during the build of our module. Good examples are generated header files. The recipe for it should be described in our local Makefile. The next variable is targets – this is a variable related to the way, the Kbuild works, and to if_changed helper, defined in Kbuild.include. Briefly, it helps with a situation in which the command building something changed in Makefile, so the rebuild of this particular thing should be done. If you want to read more about it, please refer to https://www.kernel.org/doc/Documentation/kbuild/makefiles.rst chapter 3.12 – Command change detection. subdir- are variables that trigger descending into subdirectories in Kbuild framework. More about it I will tell you later. All other variables with *flags* part lets apply specific build flags locally in our module.

A simple example of such local Makefile is placed in drivers/tty/serial/8250/Makefile :

obj-$(CONFIG_SERIAL_8250)		+= 8250.o 8250_base.o
8250-y					:= 8250_core.o
8250-$(CONFIG_SERIAL_8250_PNP)		+= 8250_pnp.o
8250_base-y				:= 8250_port.o
8250_base-$(CONFIG_SERIAL_8250_DMA)	+= 8250_dma.o
8250_base-$(CONFIG_SERIAL_8250_DWLIB)	+= 8250_dwlib.o
8250_base-$(CONFIG_SERIAL_8250_FINTEK)	+= 8250_fintek.o
obj-$(CONFIG_SERIAL_8250_GSC)		+= 8250_gsc.o
obj-$(CONFIG_SERIAL_8250_PCI)		+= 8250_pci.o
obj-$(CONFIG_SERIAL_8250_EXAR)		+= 8250_exar.o
obj-$(CONFIG_SERIAL_8250_HP300)		+= 8250_hp300.o

The often shorthand is concatenating obj- with configuration variable like $(CONFIG_SERIAL_8250). If CONFIG_SERIAL_8250 is declared as y, it is built into the kernel. If this tristate variable is set to m these files are built as a loadable module. If it is set to n it is not built at all

Some modules may be linked from several compilation units (.c files). This situation is presented in 8250.o and 8250_base.o files. After placing them in obj-y/m definition, the Makefile script defines which files must be compiled before linking into 8250.o/8250_base.o. It is done in 8250-y and 8250_base-y variables.

scripts/Makefile.lib

The next part of Makefile.build is Makefile.lib script. It defines all commands, which are used to build the kernel. In the beginning we can find backward compatibility variables support code

asflags-y  += $(EXTRA_AFLAGS)
ccflags-y  += $(EXTRA_CFLAGS)
cppflags-y += $(EXTRA_CPPFLAGS)
ldflags-y  += $(EXTRA_LDFLAGS)

EXTRA_ variables are now deprecated. New modules should use xxflags-y variables in their Makefile scripts (described above). But old ones also must be properly compiled.

# When an object is listed to be built compiled-in and modular,
# only build the compiled-in version
obj-m := $(filter-out $(obj-y),$(obj-m))

The important thing here is a declaration, that if a module is configured for both built-into kernel and as a loadable module, the built-in option is chosen.

# Libraries are always collected in one lib file.
# Filter out objects already built-in
lib-y := $(filter-out $(obj-y), $(sort $(lib-y) $(lib-m)))

Some objects may be built as libraries, but similarly, if they are declared as obj-y, they are filtered out from lib-y.

# Subdirectories we need to descend into
subdir-ym := $(sort $(subdir-y) $(subdir-m) \
			$(patsubst %/,%, $(filter %/, $(obj-y) $(obj-m))))

This part needs more comment. Kbuild system is designed to traverse all configured directories from root folders to more specialized ones if they are configured. As we can see in the subdir-ym definition, these subdirectories might be passed from local Makefile through subdir-y, subdir-m but also with obj-y and obj-m variables – but these one are treated like that, only when they end with a /.

Repository structure reflects a logical design of kernel. For example, device drivers are placed under the drivers folder, under this location, we can find tty directory (teletypewriter). For now, it is a broad area of devices, mainly based on serial ports, but also the virtual devices are inside this group

The tty directory in turn has some generic stuff for all tty devices and serial directory which conform to more specialized device drivers like omap-serial.c. A completely separate directory is arch that includes stuff dependent on the core architecture or the fs directory, containing file system-related code.

The general principle is to start building from the main Makefile, placed in a root directory and descend to more specific directories, chosen according to the configuration. The example code is Makefile in drivers directory, always including tty subsystem directory:

obj-y				+= tty/

Descending is implemented in Makefile.build file, line 488-503:

__build: $(if $(KBUILD_BUILTIN), $(targets-for-builtin)) \
	 $(if $(KBUILD_MODULES), $(targets-for-modules)) \
	 $(subdir-ym) $(always-y)
	@:

endif

# Descending
# ---------------------------------------------------------------------------

PHONY += $(subdir-ym)
$(subdir-ym):
	$(Q)$(MAKE) $(build)=$@ \
	$(if $(filter $@/, $(KBUILD_SINGLE_TARGETS)),single-build=) \
	need-builtin=$(if $(filter $@/built-in.a, $(subdir-builtin)),1) \
	need-modorder=$(if $(filter $@/modules.order, $(subdir-modorder)),1)

subdir-ym is one of the last prerequisites performed in __build target. The recipe for it is taking one by one each subdirectory (the order has significance) and recursive calling make $(build)=$@ with some conditional parameters and the current target variable $@.

Worth noting here is that descending is often related to configuration. We may add subdir- variables conditionally, according to CONFIG_ variables.

The next part, of Makefile.build which I’d like to enlighten, are build rules for source code files. Kbuild supports code written in C and assembly. If you scroll this file, you will find definitions like:

quiet_cmd_cc_s_c = CC $(quiet_modtag)  $@
      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $<

$(obj)/%.s: $(src)/%.c FORCE
	$(call if_changed_dep,cc_s_c)

The cmd_cc_s_c abbreviation tells, that its command using CC (C Compiler) program to generate .S file (assembly) from the C source. This is an intermediate step, which is sometimes helpful to debug problems and see how the human-readable machine code of our module looks like. Next, a similar command is cmd_cpp_i_c which generates with C preprocessor code in .i files (pure C after preprocessing). All other rules are written similarly.

In this example, you may see how if_changed... helper is used. Build command is defined as cmd_cc_s_c and is passed (without cmd_ prefix) to the helper. Besides that quite_cmd_cc_s_c is defined to tell Kbuild, how this command should be displayed at some level of verbosity. FORCE is given as a prerequisite, to let if_changed_dephelper decide if the file should be built or not.

The whole path between C (or assembly) source code and object file is presented below. as you may see, Kbuild adds some extra steps. Some of them are optional, some are essential to kernel work. We will not describe them in detail, because it is a topic for a whole series. Maybe I will go back here when it will be needed to code analysis.

define rule_cc_o_c
	$(call cmd_and_fixdep,cc_o_c)
	$(call cmd,gen_ksymdeps)
	$(call cmd,checksrc)
	$(call cmd,checkdoc)
	$(call cmd,objtool)
	$(call cmd,modversions_c)
	$(call cmd,record_mcount)
endef

define rule_as_o_S
	$(call cmd_and_fixdep,as_o_S)
	$(call cmd,gen_ksymdeps)
	$(call cmd,objtool)
	$(call cmd,modversions_S)
endef
Root Makefile

Now after reading all these raw definitions it’s time to split it up. This is the job of the main Makefile script in the kernel repo. This is the root caller of Makefile.build recursion. Unfortunately, it’s long and messy, but after following each called rule we will know what is most essential.

Configuration

As I mentioned above, CONFIG_ variables, placed in auto.conf are choosing drivers, architecture, and all customization stuff. At first sight, you might be afraid, that the whole kernel configuration is in your hands. Fortunately, it is not as bad. There are predefined configuration files for each architecture and specific boards. They are placed under arch/$ARCH/configs/$PLATFORM_defconfig files. For example, defconfig dedicated for BeagleBone Black (and some other boards) is placed under arch/arm/configs/omap2plus_defconfig. The terminology here might be misleading. OMAP is the larger SoC containing ARM processor (for example AM335x) and multimedia coprocessor, probably that is why Linux puts Sitara AM335x processor under this config file.

To configure our cross-compilation we need to set up a couple of things. The first one is assigning ARCH variable (in the terminal before executing make). In our case, it must be set to arm (export ARCH=arm). If it is not set properly, the Makefile script will choose the architecture of the underlying build machine. This variable lets Makefile decide which arch/$SRCARCH/Makefile should be included:

include arch/$(SRCARCH)/Makefile
export KBUILD_DEFCONFIG KBUILD_KCONFIG CC_VERSION_TEXT

config: outputmakefile scripts_basic FORCE
	$(Q)$(MAKE) $(build)=scripts/kconfig $@

%config: outputmakefile scripts_basic FORCE
	$(Q)$(MAKE) $(build)=scripts/kconfig $@

The next thing is telling Kbuild which toolchain should be used. By default, build machine gcc is used. In embedded systems however, it’s rarely used. We must explicitly assign chosen one to CROSS_COMPILE variable. In my case, it is export CROSS_COMPILE=arm-linux-gnueabihf-. You can download such toolchains from Linaro https://releases.linaro.org/components/toolchain/binaries/latest-5/arm-linux-gnueabihf/. Of course, after uncompressing it, you must add it to your local PATH variable.

When we are done, the next step is choosing default defconfig. So we exectue make omap2plus_defconfig. This will run the recipe for the target %config placed in the upper Makefile snippet. It will descend into scripts/kconfig and run Makefile.build on the local Makefile script. As the result, a host program called conf will be built and executed. This rule is implemented in the following part of scripts/kconfig/Makefile:

%_defconfig: $(obj)/conf
	$(Q)$< $(silent) --defconfig=arch/$(SRCARCH)/configs/$@ $(Kconfig)

$< variable is first prerequisite (theconf) program, and it’s called with parameters --defconfig=arch/arm/configs/omap2plus_defconfig. This program gathers all the parameters supplied in omap2plus_defconfig and environment variables declared by make.

The output, generated by it, is include/config directory, containing all the configuration (CONFIG_ variables), readable in C language (as header files) and auto.conf mentioned before. The input data are environment variables, Kconfig files from all Linux subdirectories (which probably are validators), and the most important defconfig file – in our case omap2plus_defconfig.

Running the build

When these files are generated, we are almost there. The only thing to do is running our long-lasting kernel build. If our ARCH and CROSS_COMPILE variables are still in the terminal session, we can just execute make. The default target is all, triggers vmlinux target.

# Final link of vmlinux with optional arch pass after final link
cmd_link-vmlinux =                                                 \
	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
	+$(call if_changed,link-vmlinux)

We will focus on the most important parts – vmlinux-deps are all the objects linked into the kernel image, in a single variable. Let’s see how it’s defined.

$ cat Makefile | grep -w "vmlinux-deps .="
vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS)

$ cat Makefile | grep -w "KBUILD_LDS \|KBUILD_VMLINUX_OBJS \|KBUILD_VMLINUX_LIBS "
KBUILD_VMLINUX_OBJS := $(head-y) $(patsubst %/,%/built-in.a, $(core-y))
KBUILD_VMLINUX_OBJS += $(addsuffix built-in.a, $(filter %/, $(libs-y)))
KBUILD_VMLINUX_OBJS += $(patsubst %/, %/lib.a, $(filter %/, $(libs-y)))
KBUILD_VMLINUX_LIBS := $(filter-out %/, $(libs-y))
KBUILD_VMLINUX_LIBS := $(patsubst %/,%/lib.a, $(libs-y))
KBUILD_VMLINUX_OBJS += $(patsubst %/,%/built-in.a, $(drivers-y))
export KBUILD_LDS          := arch/$(SRCARCH)/kernel/vmlinux.lds

$ cat Makefile | grep -w "core-y\|libs-y\|drivers-y"
core-y		:= init/ usr/
drivers-y	:= drivers/ sound/
drivers-y	+= net/ virt/
libs-y		:= lib/
core-y		+= kernel/ certs/ mm/ fs/ ipc/ security/ crypto/ block/

$ cat arch/arm/Makefile | grep -w "head-y"
head-y		:= arch/arm/kernel/head$(MMUEXT).o

This is the glue, joining all the things described previously. The main Makefile after preparing configuration (auto.conf and generated headers) descend into main source directories (init, drivers, etc.) and after building all configured objects, it links built-in.a (this archive contains all related kernel objects) using scripts/link-vmlinux.sh script. The makefile recipe, doing it is called descend:

vmlinux-dirs	:= $(patsubst %/,%,$(filter %/, \
		     $(core-y) $(core-m) $(drivers-y) $(drivers-m) \
		     $(libs-y) $(libs-m)))
...
build-dirs	:= $(vmlinux-dirs)
...
# Handle descending into subdirectories listed in $(build-dirs)
# Preset locale variables to speed up the build process. Limit locale
# tweaks to this spot to avoid wrong language settings when running
# make menuconfig etc.
# Error messages still appears in the original language
PHONY += descend $(build-dirs)
descend: $(build-dirs)
$(build-dirs): prepare
	$(Q)$(MAKE) $(build)=$@ \
	single-build=$(if $(filter-out $@/, $(filter $@/%, $(KBUILD_SINGLE_TARGETS))),1) \
	need-builtin=1 need-modorder=1
Summary

Kbuild system is a really broad topic. I have mentioned the most important skeleton, which hopefully eases the understanding of more complicated parts.

In the next chapter, I will describe debug environment setup. And finally, we will dive into some interesting stuff.

Rafał

PrzezRafał

Dodaj komentarz

Twój adres e-mail nie zostanie opublikowany. Wymagane pola są oznaczone *