Optimizing your code to avoid forking

I just came across this article: Calculate minimum or maximum two integers without branching

It starts with "[o] n some rare machines where branching is expensive ...".

I used to think that branching is always expensive as it often forces the processor to flush and restart the execution pipeline (for example, see Why is it faster to process a sorted array than an unsorted array? ).

This leaves me with a couple of questions:

  • Is the author of the article wrong? Or this article may have been written one time before the forking was an issue (I can't seem to find a date on it).
  • Do modern processors have a way to complete minimal branches like in (x < y) ? x : y

    without degrading performance?
  • Or do all modern compilers just implement this hack automatically? Specifically, what is Java doing? Moreover, a function Math.min(...)

    is just a triple instruction ...
+3


source to share


1 answer


Is the author of the article wrong? Or this article may have been written one time before the forking was an issue (I can't seem to find a date on it).

The oldest comment is 5 years old, so this is not hot news. However, unpredictable forking is always expensive and it was 5 years ago. At the same time, this has only gotten worse as modern CPUs can do a lot more per cycle, and a mispredicted branch is therefore more work cost.

But in a sense, the author is right. Most of the processors are not found on our PCs and servers, but in embedded devices where the situation is different.

Do modern processors have a way to complete minimal branches like in (x <y)? x: y without degrading performance?

Yes and no. AFAIK Math.max

always translates to conditional move, which means no branching. You own max

, may or may not use it, depending on the statistics collected by the JVM.



No silver bullet. Branching is faster with predictable results. Figuring out which pattern the processor recognizes is difficult. The JVM just looks at how often it gets a branch and uses a magic threshold of about 18%. See my own question and answer for details .

Or do all modern compilers just implement this hack automatically? Specifically, what is Java doing? Moreover, its function Math.min (...) is just a ternary statement ...

It is actually a compiler. Whenever the JITc sees this method, it handles it specifically. When you copy a method, it receives no special handling.

In this case, the internal environment is not very useful as it is heavily optimized anyway. For methods like Long#numberOfLeadingZeros

, intrinsic value is important, since the code is quite long and slow, and modern CPUs get this in one cycle.

+6


source







All Articles