Linux Kernel's __is_constexpr Macro

Posted 2019-01-12 00:11

Question:

How does the __is_constexpr(x) macro of the Linux Kernel work? What is its purpose? When was it introduced? Why was it introduced?

/*
 * This returns a constant expression while determining if an argument is
 * a constant expression, most importantly without evaluating the argument.
 * Glory to Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
 */
#define __is_constexpr(x) \
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

For a discussion on different approaches to solve the same problem, see instead: Detecting Integer Constant Expressions in Macros

Answer 1:

Linux Kernel's __is_constexpr macro

Introduction

The __is_constexpr(x) macro can be found in the Linux kernel's include/linux/kernel.h:

/*
 * This returns a constant expression while determining if an argument is
 * a constant expression, most importantly without evaluating the argument.
 * Glory to Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
 */
#define __is_constexpr(x) \
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

It was introduced during the merge window for Linux Kernel v4.17 (commit 3c8ba0d61d04, 2018-04-05), although the discussion around it had started a month earlier.

The macro is notable for taking advantage of subtle details of the C standard: the conditional operator's rules for determining the type of its result (6.5.15.6) and the definition of a null pointer constant (6.3.2.3.3).

In addition, it relies on sizeof(void) being allowed (and being different from sizeof(int)), which is a GNU C extension.


How does it work?

The macro's body is:

(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

Let's focus on this part:

((void *)((long)(x) * 0l))

Note: the (long)(x) cast is intended to allow x to have pointer types and to avoid warnings on u64 types on 32-bit platforms. However, this detail is not important for understanding the key points of the macro.

If x is an integer constant expression (6.6.6), then it follows that ((long)(x) * 0l) is an integer constant expression of value 0. Therefore, (void *)((long)(x) * 0l) is a null pointer constant (6.3.2.3.3):

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant

If x is not an integer constant expression, then (void *)((long)(x) * 0l) is not a null pointer constant, regardless of its value.

Knowing that, we can see what happens afterwards:

8 ? ((void *)((long)(x) * 0l)) : (int *)8

Note: the second 8 literal is intended to avoid compiler warnings about creating pointers to unaligned addresses. The first 8 literal could simply be 1. However, these details are not important for understanding the key points of the macro.

The key here is that the conditional operator returns a different type depending on whether one of the operands is a null pointer constant (6.5.15.6):

[...] if one operand is a null pointer constant, the result has the type of the other operand; otherwise, one operand is a pointer to void or a qualified version of void, in which case the result type is a pointer to an appropriately qualified version of void.

So, if x was an integer constant expression, then the second operand is a null pointer constant and therefore the type of the expression is the type of the third operand, which is a pointer to int.

Otherwise, the second operand is a pointer to void and thus the type of the expression is a pointer to void.
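The type rule can be observed directly with C11's _Generic, which selects a branch based on the type of an unevaluated expression. The following is a minimal sketch, not part of the kernel; IS_INT_PTR and cond_type_with_runtime_zero are hypothetical names introduced only for illustration.

```c
/* Hypothetical helper: 1 if the expression has type int *, 0 if void *. */
#define IS_INT_PTR(e) _Generic((e), int *: 1, void *: 0)

/* (void *)0 is a null pointer constant, so the conditional takes the
 * type of the other operand, int *. */
_Static_assert(IS_INT_PTR(1 ? (void *)0 : (int *)0) == 1,
               "null pointer constant: result type is int *");

/* Hypothetical helper: reports the type chosen when the void * operand
 * is built from a run-time value. */
static int cond_type_with_runtime_zero(long n)
{
        /* (void *)(n * 0l) is a null pointer at run time, but it is not
         * a null pointer constant, so the conditional has type void *. */
        return IS_INT_PTR(1 ? (void *)(n * 0l) : (int *)0);
}
```

Calling cond_type_with_runtime_zero with any argument is expected to return 0, since the result type does not depend on the value of n, only on whether the operand is an integer constant expression.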

Therefore, we end up with two possibilities:

sizeof(int) == sizeof(*((int *) (NULL))) // if `x` was an integer constant expression
sizeof(int) == sizeof(*((void *)(....))) // otherwise

Under the GNU C extension, sizeof(void) == 1, which differs from sizeof(int). Therefore, if x was an integer constant expression, the result of the macro is 1; otherwise, 0.

Moreover, since the macro only compares two sizeof expressions for equality, the result is itself an integer constant expression (6.6.3, 6.6.6):

Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated.

An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof operator.

Therefore, in summary, the __is_constexpr(x) macro returns an integer constant expression of value 1 if the argument is an integer constant expression. Otherwise, it returns an integer constant expression of value 0.
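As a quick check of this summary, the sketch below feeds the macro's result to _Static_assert, which only accepts integer constant expressions. It assumes a compiler that supports the GNU sizeof(void) extension (GCC or Clang). The macro body is copied from the kernel under the hypothetical name IS_CONSTEXPR to stay clear of reserved identifiers; BUF_LEN and probe_variable are likewise made up for the example.

```c
/* Local copy of the kernel macro body under a hypothetical name. */
#define IS_CONSTEXPR(x) \
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

enum { BUF_LEN = 16 };

/* Integer constant expressions yield 1. */
_Static_assert(IS_CONSTEXPR(42), "integer literal");
_Static_assert(IS_CONSTEXPR(BUF_LEN * 2 + 1), "enumeration constant arithmetic");
_Static_assert(IS_CONSTEXPR(sizeof(long)), "sizeof expression");

static int probe_variable(int n)
{
        /* A run-time value yields 0, and that 0 is still a constant
         * expression, so it can be checked at compile time. */
        _Static_assert(!IS_CONSTEXPR(n), "n is not a constant expression");

        /* The argument sits inside sizeof and is never evaluated,
         * so n is not incremented here. */
        (void)IS_CONSTEXPR(n++);
        return n;
}
```

If the argument were evaluated, probe_variable would return its input plus one; with the macro as written it should return the input unchanged.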


Why was it introduced?

The macro came to be during the effort to remove all Variable Length Arrays (VLAs) from the Linux kernel.

To facilitate this, it was desirable to enable GCC's -Wvla warning kernel-wide, so that the compiler would flag every instance of a VLA.

When the warning was enabled, it turned out that GCC reported many arrays as VLAs that were never intended to be. For instance, in fs/btrfs/tree-checker.c:

#define BTRFS_NAME_LEN 255
#define XATTR_NAME_MAX 255

char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];

A developer may expect max(BTRFS_NAME_LEN, XATTR_NAME_MAX) to resolve to 255, and namebuf to be treated as an ordinary (non-VLA) array. However, this depends on what the max(x, y) macro expands to.

The key issue is that GCC generates VLA-code if the array's size is not an (integer) constant expression as defined by the C standard. For instance:

#define not_really_constexpr ((void)0, 100)

int a[not_really_constexpr];

According to the C90 standard, ((void)0, 100) is not a constant expression (6.6), due to the comma operator being used (6.6.3). In this case, GCC opts to emit VLA code, even though it knows the size is a compile-time constant. Clang, in contrast, does not.

Since the max(x, y) macro in the kernel was not a constant expression, GCC triggered the warnings and generated VLA code where kernel developers did not intend it.

Therefore, a few kernel developers tried to develop alternative versions of the max and other macros to avoid the warnings and the VLA code. Some attempts tried to leverage GCC's __builtin_constant_p builtin, but no approach worked with all the versions of GCC that the kernel supported at the time (gcc >= 4.4).

At some point, Martin Uecker proposed a particularly clever approach that did not use builtins (taking inspiration from glibc's tgmath.h):

#define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) : (int*)1)))

While the approach uses a GCC extension, it was nevertheless well received and became the key idea behind the __is_constexpr(x) macro, which appeared in the kernel after a few iterations with other developers. The macro was then used to implement max and other macros that must be constant expressions in order to keep GCC from generating VLA code.
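To illustrate how such a macro can keep max constant, here is a simplified sketch. It is not the kernel's actual implementation: SKETCH_MAX, CMP_MAX, CMP_MAX_ONCE, namebuf_size and max_once are hypothetical names, and the kernel's type checking is omitted. It assumes GCC or Clang, since it uses __builtin_choose_expr, __typeof__ and statement expressions.

```c
/* Local copy of the kernel macro body under a hypothetical name. */
#define IS_CONSTEXPR(x) \
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

/* Plain comparison: a constant expression when both arguments are. */
#define CMP_MAX(x, y) ((x) > (y) ? (x) : (y))

/* Evaluates each argument exactly once (GNU statement expression). */
#define CMP_MAX_ONCE(x, y) ({           \
        __typeof__(x) _a = (x);         \
        __typeof__(y) _b = (y);         \
        _a > _b ? _a : _b; })

/* Pick the constant-friendly form only when both arguments qualify. */
#define SKETCH_MAX(x, y)                                            \
        __builtin_choose_expr(IS_CONSTEXPR(x) && IS_CONSTEXPR(y),   \
                              CMP_MAX(x, y), CMP_MAX_ONCE(x, y))

#define BTRFS_NAME_LEN 255
#define XATTR_NAME_MAX 255

static unsigned long namebuf_size(void)
{
        char namebuf[SKETCH_MAX(BTRFS_NAME_LEN, XATTR_NAME_MAX)];

        /* sizeof applied to a VLA is not a constant expression, so this
         * assertion only compiles if namebuf is a fixed-size array. */
        _Static_assert(sizeof(namebuf) == 255, "namebuf must not be a VLA");
        return sizeof(namebuf);
}

static int max_once(int *p, int limit)
{
        /* Non-constant arguments take the single-evaluation branch,
         * so *p is incremented exactly once. */
        return SKETCH_MAX((*p)++, limit);
}
```

The design point is that __builtin_choose_expr discards the unselected branch at compile time, so constant arguments reduce to a plain conditional expression while everything else keeps the side-effect-safe form.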