Will gcc skip this check for integer overflow?

For example, given the following code:

int f(int n)
{
    if (n < 0)
        return 0;
    n = n + 100;
    if (n < 0)
        return 0;
    return n;
}

      

Assuming you pass a number very close to INT_MAX (less than 100 away), could the compiler generate code that returns a negative value?

Here is an excerpt regarding this issue from Descent to C by Simon Tatham:

"The GNU C compiler (gcc) generates code for this function that can return a negative integer if you pass (for example) the maximum value that represents a capable int. Since the compiler knows after the first if statement that n is positive, then it assumes no integer overflow occurs and uses this assumption to conclude that n should be positive after the addition, so it removes the second if statement entirely and returns the result of the unchecked addition. "

I wondered whether the same problem exists with C++ compilers, and whether I have to be careful that my integer overflow checks are not optimized away.

1 answer


Short answer

We cannot say for every case whether the compiler will optimize out the check in your example, but we can test it against gcc 4.9 using the Godbolt interactive compiler with the following code:

int f(int n)
{
    if (n < 0) return 0;

    n = n + 100;

    if (n < 0) return 0;

    return n;
}

int f2(int n)
{
    if (n < 0) return 0;

    n = n + 100;

    return n;
}

      

and we can see that it generates identical code for both versions, which means it does indeed elide the second check:

f(int):  
    leal    100(%rdi), %eax #, tmp88 
    testl   %edi, %edi  # n
    movl    $0, %edx    #, tmp89
    cmovs   %edx, %eax  # tmp88,, tmp89, D.2246
    ret
f2(int):
    leal    100(%rdi), %eax #, tmp88
    testl   %edi, %edi  # n
    movl    $0, %edx    #, tmp89 
    cmovs   %edx, %eax  # tmp88,, tmp89, D.2249
    ret
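
To see the effect at run time rather than only in the generated assembly, a small driver along the following lines can be used (a hypothetical test harness, not from the original question). Built with gcc -O2, the call with INT_MAX may well print a negative number, since the optimizer has dropped the second check; the result of the overflowing addition itself is not guaranteed by the language:

#include <limits.h>
#include <stdio.h>

int f(int n)
{
    if (n < 0) return 0;
    n = n + 100;            /* signed overflow here is undefined behavior */
    if (n < 0) return 0;    /* the optimizer may assume this is never true */
    return n;
}

int main(void)
{
    /* With optimization enabled this may print a negative value; the exact
       output is not guaranteed, since the overflow is undefined behavior. */
    printf("%d\n", f(INT_MAX));
    return 0;
}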

      

Long answer

When your code invokes undefined behavior or relies on potential undefined behavior (signed integer overflow in this example), then yes, the compiler can assume that the undefined behavior does not occur and optimize accordingly. The most infamous example is probably the removal of a null pointer check in the Linux kernel. The code was as follows:

struct foo *s = ...;
int x = s->f;
if (!s) return ERROR;
... use s ..

      

The logic was that since s was dereferenced, it must not be a null pointer, otherwise the behavior would already be undefined, so the compiler optimized away the if (!s) check. The linked article says:

The problem is that dereferencing s on line 2 permits the compiler to infer that s is not null (if the pointer is null, then the function is undefined; the compiler can simply ignore this case). Thus, the null check on line 3 gets silently optimized away, and now the kernel contains an exploitable bug if an attacker can find a way to invoke this code with a null pointer.
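
The usual fix is to perform the null check before any dereference. A minimal sketch of that ordering (with a hypothetical struct foo and error value, not the actual kernel code) looks like this:

struct foo { int f; };

int read_field(struct foo *s)
{
    /* Check the pointer before dereferencing it, so there is no earlier
       dereference the compiler could use to discard the null check. */
    if (!s)
        return -1;   /* hypothetical error value */
    return s->f;
}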



This applies to both C and C++, which have similar language regarding undefined behavior. In both cases, the standard tells us that the results of undefined behavior are unpredictable, although what exactly is undefined may differ between the two languages. The draft C++ standard defines undefined behavior as follows:

behavior for which this International Standard imposes no requirements

and includes the following note (emphasis mine):

Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed.

The draft C11 standard has similar language.

Checking for signed overflow properly

Your check is not the proper way to guard against signed integer overflow: you need to test before performing the operation, and skip the operation if it would overflow. CERT has a good reference on how to prevent signed integer overflow for the various operations. For the addition case it recommends the following:

#include <limits.h>

void f(signed int si_a, signed int si_b) {
  signed int sum;
  if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
      ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
    /* Handle error */
  } else {
    sum = si_a + si_b;
  }
  /* ... */
}

      

If we plug this code into godbolt, we can see that the checks are not optimized away, which is the behavior we would expect.
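
Applying the same idea to the function from the question gives a version whose guard cannot be optimized away, because it never performs the overflowing addition. This is only a sketch following the CERT pattern, with the overflow case arbitrarily mapped to 0:

#include <limits.h>

int f_checked(int n)
{
    if (n < 0)
        return 0;
    /* Test against INT_MAX before adding, so no signed overflow can occur
       and there is no undefined behavior for the compiler to exploit. */
    if (n > INT_MAX - 100)
        return 0;   /* hypothetical policy: refuse instead of overflowing */
    return n + 100;
}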
