Should fxam work for single precision floating point values?

This question arose from "Why does isnormal() say a value is normal when it isn't?"

The C compiler generates the following code to detect whether a 32-bit float passed to isnormal() is normal or not:

    flds    24(%esp)
    fxam; fstsw %ax;
    andw    $17664, %ax
    cmpw    $1024, %ax
    sete    %al

      

(The full code can be viewed here.)

Is this code correct? The program appears to behave incorrectly, reporting that a number is normal when it is not. We suspect that the number is being checked for normality as a double here.



1 answer


I checked the Intel instruction reference manual, linked from fooobar.com/tags/x86 / ... .

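There is only one version of the fxam instruction, and it operates on 80-bit registers. So yes, this (inefficiently) checks the 80-bit temporary for normality. (A single test $1024, %eax would be cheaper than masking and then using cmp, but note that it only checks C2, which fxam also sets for infinities and denormals, so it is not an exact replacement.)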
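To make the mask values in the question's code easier to read, here is a small C sketch that decodes the fxam class from a status word as stored by fstsw. The enum and function names are made up for illustration; the bit positions (C0 = bit 8, C2 = bit 10, C3 = bit 14) and the class encodings follow my reading of the fxam entry in the Intel manual.

```c
#include <assert.h>
#include <stdint.h>

/* x87 status-word condition-code bits, as stored by fstsw. */
#define X87_C0 0x0100u
#define X87_C2 0x0400u
#define X87_C3 0x4000u
#define X87_CLASS_MASK (X87_C3 | X87_C2 | X87_C0)

/* The question's andw constant is exactly this mask. */
static_assert(X87_CLASS_MASK == 17664, "C3|C2|C0 should be 0x4500");

/* Hypothetical names, used only for this illustration. */
enum x87_class {
    X87_UNSUPPORTED, X87_NAN, X87_NORMAL, X87_INFINITY,
    X87_ZERO, X87_EMPTY, X87_DENORMAL, X87_RESERVED
};

static enum x87_class fxam_class(uint16_t status_word)
{
    switch (status_word & X87_CLASS_MASK) {
    case 0:               return X87_UNSUPPORTED;
    case X87_C0:          return X87_NAN;
    case X87_C2:          return X87_NORMAL;    /* 1024, the cmpw constant */
    case X87_C2 | X87_C0: return X87_INFINITY;
    case X87_C3:          return X87_ZERO;
    case X87_C3 | X87_C0: return X87_EMPTY;
    case X87_C3 | X87_C2: return X87_DENORMAL;
    default:              return X87_RESERVED;  /* C3, C2 and C0 all set */
    }
}
```

Since infinity and denormal both have C2 set, testing bit 10 alone cannot distinguish them from normal; the mask followed by a compare against 1024 can.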

According to this, flds itself will raise a Denormal exception. I think this means that it tests the actual source and not the result of the conversion to 80 bits. This page says that a denormal exception will set a bit in the status word.
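The difference between classifying the source and classifying the widened value can be seen without any x87 code. The sketch below is only an analogy: it uses double as a stand-in for the 80-bit format (double already has enough exponent range to hold every float denormal as a normal number), and the helper names are invented for this example.

```c
#include <assert.h>
#include <float.h>

/* The smallest float denormal is 2^(FLT_MIN_EXP - FLT_MANT_DIG) and the
   smallest normal double is 2^(DBL_MIN_EXP - 1); this checks that every
   float denormal lands in the normal range of double. */
static_assert(FLT_MIN_EXP - FLT_MANT_DIG >= DBL_MIN_EXP - 1,
              "double must represent float denormals as normal numbers");

/* Hypothetical helper: nonzero magnitude below the smallest normal float. */
static int is_denormal_as_float(float x)
{
    float mag = x < 0.0f ? -x : x;
    return mag != 0.0f && mag < FLT_MIN;
}

/* Hypothetical helper: the same check applied to a double. */
static int is_denormal_as_double(double x)
{
    double mag = x < 0.0 ? -x : x;
    return mag != 0.0 && mag < DBL_MIN;
}
```

For a value such as 1e-40f, is_denormal_as_float reports a denormal, while is_denormal_as_double applied to the same value after widening does not. The same thing happens, with even more headroom, when fxam inspects the 80-bit register.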

The Intel reference does not say anything about fld setting the status word beyond setting the C1 flag and leaving C0, C2 and C3 undefined. It says that you can get a #D FPU exception if the source is denormal, but that this won't happen if the source is 80 bits.

I don't know whether the status word will actually be set for denormals unless FPU exceptions are enabled. I'm not an expert on this. My reading of this page (and the control-word section) is that the FPU status word is updated after most instructions. If the D bit is set in the control word (which is the default, meaning the exception is masked), then denormal operands set the D bit in the status word. If it has been unmasked, an exception will happen.

So, I think the float test function for denormal would look like this:

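isdenormalf:
    fnclex           # exception flags are sticky, so clear any left by earlier instructions
    flds (%rdi)      # sets FPU status based on the input to the 32->80-bit conversion
    fstsw %ax
    fstp %st(0)      # pop
    test $2, %al     # D flag is bit 1; avoid 16-bit ops (%ax), they're slow on Intel
    setnz %al        # nonzero means denormal; or branch on the flags directly if your compiler is smart
    ret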

      



I haven't tried this, so it might be completely bogus. Writing this so that it inlines without a store/reload of data we want to keep in a register could be non-trivial. Maybe take the address of the arg, return a float (so it can stay in an x87 register), and have an output argument with the condition.

I don't see an instruction that can check a float in an SSE register for denormal.

I think I have a (slow) way to check for denormals with SSE4.1 or AVX ROUNDSS. You have to use a different version depending on the sign of the input.

For positive values:

  • Round toward +inf with denormals-are-zero set.
  • Round toward +inf without denormals-are-zero.
  • If the two rounding results differ, then denormals-are-zero had an effect (meaning the input was denormal).

Negative numbers need to be rounded toward -inf, not +inf, otherwise -0.xx will always be rounded to zero. So this will take a branch, two ROUNDSS instructions, and a comparison. IEEE floating point bit hacks are likely to be faster.
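As a rough idea of what such a bit hack could look like, here is a sketch in C. It assumes float is a 32-bit IEEE-754 binary32 value, and the function name is invented for this example: a denormal has an all-zero exponent field and a nonzero mantissa.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit IEEE-754 float");

/* Hypothetical helper: returns 1 if x is denormal, 0 otherwise. */
static int is_denormal_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);          /* well-defined type pun */

    uint32_t exponent = bits & 0x7F800000u;  /* 8 exponent bits */
    uint32_t mantissa = bits & 0x007FFFFFu;  /* 23 mantissa bits */

    /* The sign bit is ignored, so no branch on the sign is needed. */
    return exponent == 0 && mantissa != 0;
}
```

This works on the integer representation only, so it does not depend on the FPU state (rounding mode, denormals-are-zero, or exception flags).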
