Do I need std::atomic<bool>, or is a plain (POD) bool good enough?

Consider this code:

// global
std::atomic<bool> run{true};  // brace-init; "= true" does not compile before C++17

// thread 1
while (run) { /* do stuff */ }

// thread 2
/* do stuff until it is time to shut down */
run = false;

      

Do I need the overhead associated with an atomic variable? My intuition is that a read or write of a boolean variable is more or less atomic (this is a standard g++ / Linux / Intel setup), and if there is some read/write weirdness and my loop on thread 1 stops one pass early or late as a result, I'm not really worried about it for this application.

Or is there some other consideration that I am missing here? Looking at the profile, it seems my code spends quite a lot of time in std::atomic_bool::operator bool, and I would rather have that time spent in the loop itself.

1 answer


You need to use std::atomic, both to avoid unwanted optimizations (the compiler reading the value once and then either looping forever or never looping) and to get correct behavior on systems without a strongly ordered memory model. x86 is strongly ordered, so after a write the next read will see it; on other systems, unless the threads flush the CPU cache to main RAM for other reasons, the write may not become visible for a long time, if ever.

However, you can improve performance. By default, std::atomic operations use sequentially consistent ordering, which is overkill for a single flag value. You can speed it up by using load / store with an explicit (and less strict) memory ordering, so that each load doesn't have to use the most paranoid mode to maintain consistency.

For example, you can do:



// global
std::atomic<bool> run{true};  // brace-init; "= true" does not compile before C++17

// thread 1
while (run.load(std::memory_order_acquire)) { /* do stuff */ }

// thread 2
/* do stuff until it is time to shut down */
run.store(false, std::memory_order_release);

      

On an x86 machine, any ordering weaker than sequential consistency (the default, and the strictest) usually ends up doing nothing beyond ensuring that the instructions are executed in a specific order; no bus lock or the like is required, thanks to the strongly ordered memory model. So, apart from guaranteeing that the value is actually read from memory rather than cached in a register and reused, using atomics this way is essentially free on x86, and on non-x86 machines it makes your code correct (which it otherwise would not be).
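To try this out end to end, here is a minimal sketch that wraps the two snippets above into something you can compile. The helper name run_worker_for, the iteration counter, and the sleep standing in for thread 2's work are my own placeholders, not anything from the question; treat it as an illustration under those assumptions rather than the one right way to structure a shutdown flag.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Shared shutdown flag (brace-init works in C++11 and later).
std::atomic<bool> run{true};

// Hypothetical helper for illustration: runs a worker until the flag is
// cleared and returns how many loop iterations the worker completed.
std::uint64_t run_worker_for(std::chrono::milliseconds duration) {
    assert(duration.count() >= 0);

    run.store(true, std::memory_order_relaxed);  // reset before the worker starts
    std::uint64_t iterations = 0;

    // thread 1: poll the flag with acquire ordering instead of the
    // default sequentially consistent load
    std::thread worker([&iterations] {
        while (run.load(std::memory_order_acquire)) {
            ++iterations;  // stand-in for "do stuff"
        }
    });

    // thread 2 (here, the calling thread): the sleep is a placeholder for
    // "do stuff until it is time to shut down"
    std::this_thread::sleep_for(duration);
    run.store(false, std::memory_order_release);

    // join() synchronizes with the worker's exit, so reading the counter
    // afterwards is not a data race
    worker.join();
    return iterations;
}
```

Calling run_worker_for(std::chrono::milliseconds(20)) from main returns once the worker has seen the cleared flag. With g++ on Linux you will probably need -pthread when building.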
