I am having a hard time understanding these comments about integer overflow detection

The text below is from Section I (Introduction) of the paper Understanding Integer Overflow in C/C++ (emphasis mine):

Detecting integer overflows is relatively straightforward by using a modified compiler to insert runtime checks. However, reliable detection of overflow errors is surprisingly difficult because overflow behaviors are not always bugs. The low-level nature of C and C++ means that bit- and byte-level manipulation of objects is commonplace; the line between mathematical and bit-level operations can often be quite blurry. Wraparound behavior using unsigned integers is legal and well-defined, and there are code idioms that deliberately use it. On the other hand, C and C++ have undefined semantics for signed overflow and shift past bitwidth: operations that are perfectly well-defined in other languages such as Java. C/C++ programmers are not always aware of the distinct rules for signed vs. unsigned types in C, and may naively use signed types in intentional wraparound operations. If such uses were rare, compiler-based overflow detection would be a reasonable way to perform integer error detection. If it is not rare, however, such an approach would be impractical and more sophisticated techniques would be needed to distinguish intentional uses from unintentional ones.

I don't understand why compiler-based detection would be impractical for detecting wraparound on signed types if such use is not rare. Also, why do we need to distinguish between intentional and unintentional uses? Both are undefined by the standard.



3 answers


Detecting integer overflows at runtime is not a problem. Newer languages like Swift do this automatically and reliably.

The problem is that while signed integer overflows are undefined behavior in C and C++, there is a great deal of code in which they occur, and because the compiler silently ignores them, everything works fine.



If you start detecting integer overflows, that code will crash your application. And of course, these overflows will not happen when the developer runs the application, or when a tester runs it, but only after the program has shipped to customers, who will be very upset if their application crashes at the most inconvenient and costly moment simply because you decided to disallow undefined behavior that worked fine.
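As a rough illustration of the kind of code this answer has in mind, consider a string hash of the form `h = h * 31 + c` accumulated in a signed `int`. The sketch below does not perform the overflowing operation itself; instead it computes each step in a wider type and reports whether a checking build would have trapped. The function names are hypothetical, and the code assumes `int` is narrower than `long long` (true on common platforms, but not guaranteed by the standard).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: computes h * 31 + c in a wider type and reports
   whether the mathematical result still fits in an int.
   Assumes int is narrower than long long. */
static bool step_checked(int h, unsigned char c, int *out)
{
    long long wide = (long long)h * 31 + c;
    if (wide > INT_MAX || wide < INT_MIN)
        return false;               /* a checking build would trap here */
    *out = (int)wide;
    return true;
}

/* Returns true if hashing s in a signed int accumulator never overflows. */
bool hash_is_overflow_free(const char *s)
{
    int h = 0;
    for (; *s != '\0'; s++) {
        if (!step_checked(h, (unsigned char)*s, &h))
            return false;
    }
    return true;
}
```

With a 32-bit `int`, short strings pass, but a string of only six or more lowercase letters already exceeds `INT_MAX`. Code like this tends to pass every test with short inputs and overflow only on real-world data.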



To detect overflows at compile time in all but the simplest cases, the compiler would have to consider every possible input that might affect a variable and compute every value the variable could take.

This is obviously unrealistic.
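A small sketch of why this is hard: whether the multiplication below overflows depends entirely on the argument the caller passes, which is usually only known at runtime. The function name and the factor of 1000 are made up for illustration; the check itself is a standard precondition test.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical conversion: seconds to milliseconds.
   The compiler cannot know at compile time whether `seconds` is small
   enough, so the only general option is a runtime check like this one. */
bool to_millis_checked(int seconds, int *millis)
{
    if (seconds > INT_MAX / 1000 || seconds < INT_MIN / 1000)
        return false;               /* seconds * 1000 would overflow */
    *millis = seconds * 1000;
    return true;
}
```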

An example of intentional overflow is relying on the wraparound as a side effect to accomplish something else. Here's a contrived example for a circular buffer:



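 #include <stdint.h>
 #include <stdio.h>

 /* Put a single char in each slot (the fill pattern is an assumption). */
 static void init_keys(char keys[256])
 {
   for (int i = 0; i < 256; i++)
     keys[i] = (char)i;
 }

 int main(void)
 {
   uint8_t index = 8;   /* 8-bit unsigned: 255 + 1 wraps to 0 */
   char keys[256];
   int letter;

   init_keys(keys);
   while ((letter = getchar()) != EOF) {
     letter ^= (unsigned char)keys[index];
     index++;           /* wraps around, so no modulus is needed */
     printf("Encoded: %c\n", letter);
   }
   return 0;
 }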

      

In this example we create an 8-bit unsigned integer that wraps to 0 at 255 + 1. We rely on that wraparound to index the circular buffer directly, rather than using a modulus, which would be more typical.
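For comparison, here is a minimal sketch of the two ways to advance the index. The function names are hypothetical. The first relies on 8-bit unsigned wraparound, which is well-defined in C; the second uses an explicit modulus and works for any buffer size.

```c
#include <stdint.h>

/* Advance by relying on 8-bit unsigned wraparound: 255 becomes 0. */
uint8_t next_wrapping(uint8_t index)
{
    index++;
    return index;
}

/* Advance with an explicit modulus; size may be any nonzero value. */
unsigned next_modulus(unsigned index, unsigned size)
{
    return (index + 1u) % size;
}
```

For a 256-entry buffer the two agree on every index; the wrapping version only works because the buffer size happens to match the range of the type.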



There are five sensible ways to handle overflow, whether signed or unsigned:

  • Trap. Usually needs additional instructions.
  • Saturate. Rarely available natively; usually needs additional instructions.
  • Wrap around. Always available natively for two's complement, but not for other representations. Used for unsigned types.
  • Undefined behavior. Always available natively, and allows the compiler to make optimizations. Used for signed types.
  • Arbitrary result. Always available natively. Only of interest when wrap-around is not available natively. It is weaker than UB, which is both its biggest advantage and its biggest disadvantage.

UB is good for optimization, trapping is good for error detection, and wrap-around and saturation are sometimes the required behavior.
An arbitrary result fills the gap where wrap-around is expensive but full UB isn't warranted.
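Three of the policies listed above can be written out by hand in portable C. This is only a sketch with hypothetical function names: the trap-style version reports the overflow to the caller rather than aborting, and the wrap-around version is shown on `unsigned`, where wrapping is defined by the language.

```c
#include <limits.h>
#include <stdbool.h>

/* Trap-style: detect the overflow before it happens and report it. */
bool add_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Saturate: clamp to the nearest representable value. */
int add_saturate(int a, int b)
{
    int r;
    if (add_checked(a, b, &r))
        return r;
    return b > 0 ? INT_MAX : INT_MIN;
}

/* Wrap around: unsigned addition is reduced modulo UINT_MAX + 1. */
unsigned add_wrap(unsigned a, unsigned b)
{
    return a + b;
}
```

Note that the checked and saturating versions need extra comparisons, which matches the "additional instructions" remark in the list.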

Now, sometimes the compiler can prove that an operation cannot overflow, so it doesn't need to handle that case. That is often true for loop counters and the like, so the extra work isn't as large as it sounds. But tracking which values data can take is imperfect even with the full source available, and barriers to inlining, such as separate compilation and semantic interposition where it is allowed, make it impossible.
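A sketch of the distinction, using a hypothetical `sum_checked` function: the loop counter is bounded by the loop condition, so a check on `i++` is provably unnecessary, while the accumulation depends on the data and still needs its check.

```c
#include <limits.h>
#include <stdbool.h>

/* Sums count ints, reporting failure if the total would overflow. */
bool sum_checked(const int *values, int count, int *total)
{
    int sum = 0;
    /* i < count <= INT_MAX on every iteration, so i++ cannot overflow;
       a compiler can prove this and omit any check on the counter. */
    for (int i = 0; i < count; i++) {
        int v = values[i];
        /* This addition depends on the data, so the check must stay. */
        if ((v > 0 && sum > INT_MAX - v) || (v < 0 && sum < INT_MIN - v))
            return false;
        sum += v;
    }
    *total = sum;
    return true;
}
```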
