VS2012 C ++ generates questionable code for conditional move (cmov)

I have a ringbuffer which is written by one manufacturer and is read by N users. Since this is a ringbuffer, it will be less than the current minimum consumer index for the manufacturer's index. The position of the producer and the consumer is tracked by their own Cursor

.

class Cursor
{
public:
    inline int64_t Get() const { return iValue; }
    inline void Set(int64_4 aNewValue)
    {
        ::InterlockedExchange64(&iValue, aNewValue);
    }

private:
    int64_t iValue;
};

//
// Returns the ringbuffer position of the furthest-behind Consumer
//
int64_t GetMinimum(const std::vector<Cursor*>& aCursors, int64_t aMinimum = INT64_MAX)
{
    for (auto c : aCursors)
    {
        int64_t next = c->Get();
        if (next < aMinimum)
        {
            aMinimum = next;
        } 
    }

    return aMinimum;
}

      

Looking at the generated assembly code, I see:

    mov rax, 922337203685477580   // rax = INT64_MAX
    cmp rdx, rcx    // Is the vector empty?
    je  SHORT $LN36@GetMinimum
    npad    10
$LL21@GetMinimum:
    mov r8, QWORD PTR [rdx]    // r8 = c
    cmp QWORD PTR [r8+56], rax // compare result of c->Get() and aMinimum
    cmovl   rax, QWORD PTR [r8+56] // if it less then aMinimum = result of c->Get()
    add rdx, 8                 // next vector element
    cmp rdx, rcx    // end of the vector?
    jne SHORT $LL21@GetMinimum
$LN36@GetMinimum:
    fatret  0   // beautiful friend, the end

      

I don't see how the compiler thinks it is correct to read the value c->Get()

, compare it with aMinimum

and then conditionally move the RE-READ value c->Get()

to aMinimum

. In my opinion, this value was changed between instructions cmp

and cmovl

. If I am correct, the following scenario is possible:

  • aMinimum

    currently set to 2

  • c->Get()

    returns 1

  • is executed cmp

    and the flag is less-than

    set

  • another thread updates the current value of the current current value c

    to 3

  • cmovl

    sets aMinimum

    to 3

  • The producer sees 3 and overwrites the data in position 2 of the call buffer, although it hasn't been processed yet.

Have I stared at him for too long? It shouldn't be something like:

mov rbx, QWORD PTR [r8+56]
cmp rbx, rax 
cmovl rax, rbx 

      

+3


source to share


1 answer


You are not using atomicity or any kind of interthread sequencing operations around your access to iValue

(the same would probably be true for what could be changed iValue

in another thread, but we will see that it doesn’t matter), so the compiler can assume it will remain the same between two assembly lines of code. If another thread changes iValue

, you have undefined behavior.

If your code is for thread safety, you will need to use atomics, locks, or some kind of sequencing operation.

The C ++ 11 standard formalizes this in section 1.10 Multithreading and Data Scheduling, which is not particularly lightweight. I think the parts related to this example:

Clause 10:

Grade A is ordered before grade B if

  • A performs a deallocate operation on an atomic object M, and on another thread, B performs a consume operation on M and reads the value written by any side effect in the allocation sequence headed by A, or
  • for some estimate X, A is ordered up to X and X is dependent on B.

If we say that score A matches a function Cursor::Get()

, and score B will match some invisible code that modifies iValue

. Evaluating A ( Cursor::Get()

) does not perform any operations on the atomic object, and is not a dependency ordered before anything else (so there is no "X" here).

And if we say that the grade A matches the code that modifies iValue

and B matches Cursor::Get()

, we can make the same conclusion. Thus, there is no order-dependent to relationship between Cursor::Get()

and the modifier iValue

.

Therefore, Cursor::Get()

it is not an ordered label before it can be changed iValue

.

Item 11:

Assessment. Inter-threading occurs before score B if

  • A syncs with B or
  • A is a dependency ordered before B, or
  • for some estimate X
    • A is synchronized with X and X is sequenced to B, or
    • A is sequenced before X and X inter-thread events occur before B, or
    • The inter-thread process happens before X and X inter-thread events happen before B.


Again, none of these conditions are met, so inter-flow does not occur earlier.

Clause 12

A grade occurs before B grade if:

  • A is sequenced to B or
  • Inter-stream happens before B.

We have shown that no "inter-thread" operation occurs before the "other". And the term "sequenced before" is defined in 1.9 / 13 "Program Execution" as applying only to evaluations that occur in the same thread ("interleaved before" is a C ++ 11 replacement for the old terminology of "sequence point"). Since we are talking about single stream operations, A cannot be sequenced to B.

So, at this point, we find out what Cursor::Get()

does not happen before the modification iValue

that occurs in another thread (and vice versa). Finally, we get the bottom line for this in paragraph 21:

Program execution contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic and does not occur before the other. Any such data race results in undefined behavior.

So, if you want to use Cursor::Get()

in one thread and modify something iValue

in another thread, you need to use atomatics or some other sequencing operation (mutex or such) to avoid undefined behavior.

Please note that according to the standard it is volatile

not enough to ensure consistency between threads. The Microsoft compiler may provide some additional promises to volatile

support well-defined interthread behavior, but this support is customizable, so my suggestion is to avoid using it volatile

for new code. Here's a bit about what MSDN has to say about it ( http://msdn.microsoft.com/en-us/library/vstudio/12a04hfd.aspx ):

ISO Compliance

If you are familiar with the C # volatile keyword, or are familiar with the behavior of volatile in earlier versions of Visual C ++, be aware that the C ++ 11 ISO Standard volatile keyword is different and supported in Visual Studio when the / volatile parameter is: iso compiler. (For ARM it is listed by default). The volatile keyword in C ++ 11 standard code should only be used for hardware access; do not use it for inter-thread communication. For communication between threads, use methods such as std :: atomic from the C ++ Standard Template Library.

Microsoft specification

When the / volatile: ms compiler option is used, by default, when non-ARM architectures are targeted, the compiler generates additional code to maintain order among references to volatile objects in addition to maintaining the order of references to other global objects. In particular:

  • Writing to a volatile object (also known as volatile write) has Release semantics; that is, a reference to a global or static object that occurs before writing to a volatile object in a command sequence occurs before that volatile write is written to the compiled binary.

  • Reading a volatile object (also known as a volatile read) has Acquire semantics; that is, a reference to a global or static object that occurs after reading volatile memory in a sequence of instructions after that volatile reading in a compiled binary.

This allows volatile objects to be used for locks and memory releases in multithreaded applications.

+3


source







All Articles