rint is not present in Visual Studio 2010 math.h: equivalent of CUDA rint

I am porting CUDA code to C++ using Visual Studio 2010. The CUDA code uses the function rint, which does not seem to be present in Visual Studio 2010's math.h, so it seems I need to implement it myself.

According to the linked CUDA documentation, rint rounds x to the nearest integer value in floating-point format, with halfway cases rounded towards zero.

I think I could use a cast to int, which discards the fractional part and thus effectively rounds towards zero, so I ended up with the following function:

inline double rint(double x)
{
    int temp; temp = (x >= 0. ? (int)(x + 0.5) : (int)(x - 0.5));
    return (double)temp;
}

      

which has two different casts: one to int and one to double.

I have three questions:

  • Is the above function fully equivalent to CUDA rint for "small" numbers? Will it fail for "big" numbers that cannot be represented as int?
  • Is there a more efficient way to compute rint than using two casts?

Thank you in advance.


1 answer


The quoted description of rint() in the CUDA documentation is incorrect. The functions that round to an integer with a floating-point result map to the IEEE-754 (2008) rounding modes as follows:

trunc()   // round towards zero
floor()   // round down (towards negative infinity)
ceil()    // round up (towards positive infinity)
rint()    // round to nearest or even (i.e. ties are rounded to even)
round()   // round to nearest, ties away from zero

      

Generally, these functions behave as described in the C99 standard. For rint(), the standard specifies that the function rounds according to the current rounding mode (which defaults to round-to-nearest-or-even). Because CUDA does not support dynamic rounding modes, all functions that are defined to use the current rounding mode use round-to-nearest-or-even. Here are some examples that show the difference between round() and rint():

argument  rint()  round()
1.5       2.0     2.0
2.5       2.0     3.0
3.5       4.0     4.0
4.5       4.0     5.0
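
The table above can be reproduced with a small sketch. It assumes a C++11 toolchain that provides std::rint and std::round in <cmath> (MSVC 2010 does not) and the default round-to-nearest-even rounding mode; RoundingRow and compare_rounding are hypothetical names used only for illustration.

```cpp
#include <cmath>

// Hypothetical helper type: one row of the comparison table.
struct RoundingRow {
    double argument;
    double rint_result;   // ties go to the even neighbour (default rounding mode)
    double round_result;  // ties go away from zero
};

// Hypothetical helper: evaluates both functions for one argument.
inline RoundingRow compare_rounding(double x)
{
    RoundingRow row = { x, std::rint(x), std::round(x) };
    return row;
}
```

For the tie case 2.5, compare_rounding(2.5) gives 2.0 for rint and 3.0 for round, matching the second row of the table.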

      

round() can be easily emulated along the lines of the code you posted; I am not aware of a simple emulation for rint(). Note that you should not use an intermediate cast to an integer type, because "int" supports a narrower numeric range than the integers that are exactly representable as a "double". Use trunc(), ceil(), or floor() instead, as appropriate.
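
As an illustration of emulating round() without an integer cast, here is a minimal sketch that uses only floor() and ceil(). The name my_round is hypothetical, and the code has only been reasoned through for ordinary finite inputs, NaN and infinities; treat it as a starting point rather than a vetted replacement.

```cpp
#include <cmath>

// Hypothetical name; emulates round(): nearest integer, ties away from zero.
inline double my_round(double x)
{
    // Truncate towards zero with floor()/ceil(); no cast to int is involved,
    // so values beyond the range of int are handled as well.
    double t = (x >= 0.0) ? std::floor(x) : std::ceil(x);

    // Fractional part, with the same sign as x. For a double this
    // subtraction is exact, and it is zero once |x| >= 2^52.
    double frac = x - t;

    if (frac >= 0.5)  t += 1.0;  // positive tie or above: step away from zero
    if (frac <= -0.5) t -= 1.0;  // negative tie or below: step away from zero
    return t;
}
```

Comparing the fractional part with 0.5, instead of adding 0.5 to x first, avoids the case where x + 0.5 itself gets rounded (for example for the largest double below 0.5).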



Since rint() is part of both the current C and C++ standards, I am a bit surprised that MSVC does not include this function; I would suggest checking MSDN to see whether a substitute is offered. If your platforms support SSE4, you could use the SSE intrinsics _mm_round_sd() and _mm_round_pd(), defined in smmintrin.h, with the rounding mode _MM_FROUND_TO_NEAREST_INT, to implement the functionality of CUDA's rint().
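
A minimal sketch of the SSE4.1 route might look as follows. The wrapper name rint_sse41 and the HAVE_SSE41_ROUND macro are hypothetical; the compile-time guard is an assumption that lets the code fall back to std::nearbyint (C++11) when SSE4.1 is not enabled in the compiler, and it does not check at run time whether the CPU actually supports SSE4.1.

```cpp
#include <cmath>

// Assumption: use the intrinsic only when the compiler targets SSE4.1
// (GCC/Clang define __SSE4_1__) or when building with MSVC for x86/x64.
#if defined(__SSE4_1__) || (defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86)))
#include <smmintrin.h>  // SSE4.1 intrinsics, including _mm_round_sd
#define HAVE_SSE41_ROUND 1
#else
#define HAVE_SSE41_ROUND 0
#endif

// Hypothetical wrapper: round to nearest, ties to even, like CUDA's rint().
inline double rint_sse41(double x)
{
#if HAVE_SSE41_ROUND
    __m128d v = _mm_set_sd(x);  // x in the low lane, 0.0 in the high lane
    // Round the low lane to the nearest integer (ties to even) and
    // suppress the inexact exception.
    v = _mm_round_sd(v, v, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
    return _mm_cvtsd_f64(v);    // extract the low lane
#else
    // Fallback: rounds according to the current rounding mode,
    // which is round-to-nearest-even by default.
    return std::nearbyint(x);
#endif
}
```

Unlike rint() from the C standard, the intrinsic path does not depend on the current rounding mode, because the mode is encoded in the instruction itself.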

While (in my experience) the SSE intrinsics are portable across Windows, Linux and Mac OS X, you may want to avoid hardware-specific code. In that case, you could try the following code (lightly tested):

double my_rint(double a)
{
    const double two_to_52 = 4.5035996273704960e+15;
    double fa = fabs(a);
    double r = two_to_52 + fa;
    if (fa >= two_to_52) {
        r = a;
    } else {
        r = r - two_to_52;
        r = _copysign(r, a);
    }
    return r;
}

      

Note that MSVC 2010 also seems to lack the standard copysign() function, so I had to substitute _copysign(). The above code assumes that the current rounding mode is round-to-nearest-even (which is the default). Adding 2**52 forces the rounding to occur at the bit position that corresponds to the integer unit bit. Note that this also assumes that pure double-precision computation is performed. On platforms that use a higher precision for intermediate results, it may be necessary to declare "fa" and "r" as volatile.
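
For compilers that do provide the standard functions, the same technique could be sketched with std::copysign and with the intermediates declared volatile, as suggested above for platforms with extended-precision intermediates. The name my_rint_portable is hypothetical; the code assumes C++11 <cmath> and the default round-to-nearest-even mode.

```cpp
#include <cmath>

// Hypothetical name; same 2^52 technique, written against standard C++11.
inline double my_rint_portable(double a)
{
    const double two_to_52 = 4503599627370496.0;  // 2^52

    // volatile forces each intermediate to be stored as a double, so the
    // addition below is rounded at double precision.
    volatile double fa = std::fabs(a);
    if (fa >= two_to_52) {
        return a;  // no fractional bits left: already an integer (or infinity)
    }

    // Adding 2^52 pushes the fractional bits out of the significand, so the
    // hardware rounds to an integer; subtracting 2^52 again is exact.
    volatile double r = two_to_52 + fa;
    r = r - two_to_52;

    double result = r;
    return std::copysign(result, a);  // restore the sign, including -0.0
}
```

For example, 2.5 and -2.5 come out as 2.0 and -2.0, and 3.5 comes out as 4.0, in line with the rint() column of the table earlier in this answer.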
