How to check for inf with AVX intrinsics (__m256)

What is the best way to check whether an AVX __m256 (a vector of 8 floats) contains any inf? I tried

__m256 X=_mm256_set1_ps(1.0f/0.0f);
_mm256_cmp_ps(X,X,_CMP_EQ_OQ);

      

but this compares as true. Note that this method will detect nan (which compares as false). So one way is to check X!=nan && 0*X==nan:

__m256 Y=_mm256_mul_ps(X,_mm256_setzero_ps());   // 0*X=nan if X=inf
_mm256_andnot_ps(_mm256_cmp_ps(Y,Y,_CMP_EQ_OQ),
                 _mm256_cmp_ps(X,X,_CMP_EQ_OQ));

      

However, this looks somewhat lengthy. Is there a faster way?


3 answers


If you want to check if a vector has any infinities:

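#include <immintrin.h>
#include <limits>

bool has_infinity(__m256 x){
    const __m256 SIGN_MASK = _mm256_set1_ps(-0.0f);   // only the sign bit is set
    const __m256 INF = _mm256_set1_ps(std::numeric_limits<float>::infinity());

    x = _mm256_andnot_ps(SIGN_MASK, x);      // clear the sign bit: |x|
    x = _mm256_cmp_ps(x, INF, _CMP_EQ_OQ);   // all-ones in lanes where |x| == inf
    return _mm256_movemask_ps(x) != 0;       // true if any lane matched
}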
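
As a usage sketch, the same test can be applied to a whole array by walking it 8 floats at a time and finishing the remainder with scalar code. The name array_has_infinity and the AVX_TARGET macro are my own, not from the answer above; the macro assumes GCC or Clang and only exists so the file builds without -mavx.

```cpp
#include <immintrin.h>
#include <cmath>
#include <cstddef>
#include <limits>

// Assumption: GCC/Clang. Enables AVX for the marked function only.
// With other compilers, enable AVX through the compiler's own switch.
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper: true if any of the n floats at data is +inf or -inf.
AVX_TARGET bool array_has_infinity(const float* data, std::size_t n) {
    const __m256 sign_mask = _mm256_set1_ps(-0.0f);
    const __m256 inf = _mm256_set1_ps(std::numeric_limits<float>::infinity());

    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 x = _mm256_loadu_ps(data + i);    // unaligned load of 8 floats
        x = _mm256_andnot_ps(sign_mask, x);      // clear the sign bit: |x|
        x = _mm256_cmp_ps(x, inf, _CMP_EQ_OQ);   // all-ones where |x| == inf
        if (_mm256_movemask_ps(x) != 0) return true;
    }
    for (; i < n; ++i) {                         // scalar tail, fewer than 8 left
        if (std::isinf(data[i])) return true;
    }
    return false;
}
```

NaN elements are not reported, because the ordered compare against inf is false for NaN.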

      



If you want a vector mask of infinite values:

#include <immintrin.h>
#include <limits>

__m256 is_infinity(__m256 x){
    const __m256 SIGN_MASK = _mm256_set1_ps(-0.0f);   // only the sign bit is set
    const __m256 INF = _mm256_set1_ps(std::numeric_limits<float>::infinity());

    x = _mm256_andnot_ps(SIGN_MASK, x);      // clear the sign bit: |x|
    x = _mm256_cmp_ps(x, INF, _CMP_EQ_OQ);   // all-ones in lanes where |x| == inf
    return x;
}
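
A vector mask like this is useful when you want to act per lane rather than just branch. Here is a minimal sketch of two things you might do with it: turn it into one bit per lane, or use it to zero out the infinite lanes. The function names and the AVX_TARGET macro are hypothetical (the macro assumes GCC or Clang and lets the file build without -mavx).

```cpp
#include <immintrin.h>
#include <limits>

// Assumption: GCC/Clang. Enables AVX for the marked functions only.
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Same mask as above: all-ones in every lane that holds +inf or -inf.
AVX_TARGET static inline __m256 infinity_mask(__m256 x) {
    const __m256 sign_mask = _mm256_set1_ps(-0.0f);
    const __m256 inf = _mm256_set1_ps(std::numeric_limits<float>::infinity());
    return _mm256_cmp_ps(_mm256_andnot_ps(sign_mask, x), inf, _CMP_EQ_OQ);
}

// Bit i of the result is set if p[i] is +inf or -inf.
AVX_TARGET int infinity_lane_bits(const float* p) {
    return _mm256_movemask_ps(infinity_mask(_mm256_loadu_ps(p)));
}

// Overwrites infinite elements of p[0..7] with 0.0f, leaves the rest alone.
AVX_TARGET void zero_infinities(float* p) {
    __m256 x = _mm256_loadu_ps(p);
    __m256 mask = infinity_mask(x);
    _mm256_storeu_ps(p, _mm256_andnot_ps(mask, x));   // (~mask) & x
}
```

Both functions take a pointer to 8 floats so that callers built without AVX never pass a __m256 by value.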

      



I think the best solution is to use vptest instead of vmovmskps.

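#include <immintrin.h>

bool has_infinity(const __m256 &x) {
    __m256 s   = _mm256_andnot_ps(_mm256_set1_ps(-0.0f), x);               // clear the sign bit: |x|
    __m256 cmp = _mm256_cmp_ps(s, _mm256_set1_ps(1.0f/0.0f), _CMP_EQ_OQ);  // all-ones where |x| == inf
    __m256i cmpi = _mm256_castps_si256(cmp);                               // reinterpret only, no instruction
    return !_mm256_testz_si256(cmpi, cmpi);                                // vptest: any bit set?
}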

      



The intrinsic _mm256_castps_si256 is there just to make the compiler happy: "This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency."

vptest is superior to vmovmskps because it sets the zero flag and vmovmskps does not. With vmovmskps the compiler must generate a test instruction to set the zero flag.
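
If you want to check this on your own compiler, one way is to put both variants side by side and compare the generated assembly with optimizations on. The sketch below only sets that up; it makes no claim about what a particular compiler emits. The function names and the AVX_TARGET macro are my own (the macro assumes GCC or Clang and lets the file build without -mavx).

```cpp
#include <immintrin.h>
#include <limits>

// Assumption: GCC/Clang. Enables AVX for the marked functions only.
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Shared step: all-ones in every lane of p[0..7] that holds +inf or -inf.
AVX_TARGET static inline __m256 inf_mask(const float* p) {
    const __m256 sign_mask = _mm256_set1_ps(-0.0f);
    const __m256 inf = _mm256_set1_ps(std::numeric_limits<float>::infinity());
    __m256 abs_x = _mm256_andnot_ps(sign_mask, _mm256_loadu_ps(p));
    return _mm256_cmp_ps(abs_x, inf, _CMP_EQ_OQ);
}

// Variant 1: extract the lane sign bits into an integer, then compare with 0.
AVX_TARGET bool any_inf_movemask(const float* p) {
    return _mm256_movemask_ps(inf_mask(p)) != 0;
}

// Variant 2: test the mask register against itself with vptest.
AVX_TARGET bool any_inf_ptest(const float* p) {
    __m256i m = _mm256_castps_si256(inf_mask(p));
    return !_mm256_testz_si256(m, m);   // testz returns 1 when (m & m) == 0
}
```

The two functions should return the same result for every input; only the instruction sequence after the compare differs.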



I had an idea, but unfortunately it only helps if you want to check that ALL elements are infinite.

With AVX2 you can check that all elements are infinite with PTEST. I got the idea of using xor to compare for equality from EOF's comment on this question, which I used for my answer there. I thought I could make a shorter version of test-for-any-inf, but of course pxor only works as a test for all 256b being equal.

#include <immintrin.h>
#include <limits>

bool all_infinity(__m256 x){
    const __m256i SIGN_MASK = _mm256_set1_epi32(0x7FFFFFFF);  // -0.0f inverted
    const __m256i INF = _mm256_castps_si256(
        _mm256_set1_ps(std::numeric_limits<float>::infinity()));

    // _mm256_xor_si256 takes integer vectors, so x is cast first (the cast generates no instruction).
    __m256i diff = _mm256_xor_si256(_mm256_castps_si256(x), INF);  // other than sign bit, diff will be all-zero only if all the bits match.
    return _mm256_testz_si256(diff, SIGN_MASK); // flags are ready to branch on directly
}
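
If AVX2 is not available, the same idea should also work with plain AVX by doing the xor in the float domain (_mm256_xor_ps) and casting afterwards. This is a variant I swapped in, not the code above; the name all_infinity8 and the AVX_TARGET macro are hypothetical (the macro assumes GCC or Clang and lets the file build without -mavx).

```cpp
#include <immintrin.h>
#include <limits>

// Assumption: GCC/Clang. Enables AVX for the marked function only.
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// True only if every one of p[0..7] is +inf or -inf.
AVX_TARGET bool all_infinity8(const float* p) {
    const __m256i abs_mask = _mm256_set1_epi32(0x7FFFFFFF);   // everything except the sign bit
    const __m256 inf = _mm256_set1_ps(std::numeric_limits<float>::infinity());

    // xor leaves a zero exponent and mantissa only in lanes whose bits match inf.
    __m256 diff = _mm256_xor_ps(_mm256_loadu_ps(p), inf);

    // testz returns 1 when (diff & abs_mask) is all zero, i.e. only sign bits differ.
    return _mm256_testz_si256(_mm256_castps_si256(diff), abs_mask) != 0;
}
```

A single finite or NaN element makes the result false, since its exponent or mantissa bits differ from those of inf.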

      

AVX512 has __mmask8 _mm512_fpclass_pd_mask (__m512d a, int imm8) (vfpclasspd); see the Intel manual. Its output is a mask register, and I haven't looked into testing / branching on the value there. But you can check for any / all of +/- zero, +/- inf, Q / S NaN, denormal, negative.
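
For what it's worth, here is an untuned sketch of how the fpclass intrinsic could be used for the any-inf test on 8 doubles. It assumes the imm8 category bits 0x08 (+inf) and 0x10 (-inf) as listed in the Intel manual, and that vfpclasspd needs the AVX512DQ extension. The function names are hypothetical, and the target attribute and __builtin_cpu_supports are GCC/Clang specifics; the scalar fallback is only there so the code still runs on CPUs without AVX512DQ.

```cpp
#include <immintrin.h>
#include <cmath>

// Assumption: GCC/Clang. Enables AVX512F + AVX512DQ for this function only.
__attribute__((target("avx512f,avx512dq")))
static bool any_inf8_avx512(const double* p) {
    __m512d v = _mm512_loadu_pd(p);                 // unaligned load of 8 doubles
    __mmask8 m = _mm512_fpclass_pd_mask(v, 0x18);   // 0x08 = +inf, 0x10 = -inf
    return m != 0;                                  // one mask bit per lane
}

// True if any of p[0..7] is +inf or -inf.
bool any_inf8(const double* p) {
    if (__builtin_cpu_supports("avx512dq")) {       // runtime CPU check
        return any_inf8_avx512(p);
    }
    for (int i = 0; i < 8; ++i) {                   // portable fallback
        if (std::isinf(p[i])) return true;
    }
    return false;
}
```

Since __mmask8 is an ordinary integer type at the source level, comparing it with 0 is enough to branch on; an all-lanes test would compare it with 0xFF instead.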
