Why does adding a statement that is never executed cause performance degradation in my code?

Background:

I am writing a lock-free stack and working on optimizing it.

I found that adding sleeps after failed compare-and-exchange operations results in significantly higher throughput when testing in highly parallel scenarios:

void stack::push(node* n)
{
    node old_head, new_head{ n };
    n->n_ = nullptr;

    if (head_.compare_exchange_weak(old_head, new_head))
        return;

    for (;;)
    {
        n->n_ = old_head.n_;
        new_head.create_id(old_head);
        if (head_.compare_exchange_weak(old_head, new_head))
            return;

        // testing conditions _never_ reach here, so why does this line make the program slower??
        std::this_thread::sleep_for(std::chrono::nanoseconds(5));

        // debug break is used to confirm execution never reaches here
        __debugbreak();
    }
}

      

(The complete code can be found here on GitHub.)

I sleep only after compare_exchange has failed twice. The first attempt effectively acts as a load(), except when the stack is empty. Sounds reasonable, right? It is an easy optimization. But...

Here's what I didn't expect:

Adding the sleep code results in a significant reduction in throughput (roughly 19% in the numbers below) in scenarios where the sleep code never runs! That it never runs is confirmed by the added __debugbreak.

Example numbers:

test conditions:
----------------------
data_count = 1
loop_count = 100000000
thread_count = 1


sleep code commented out
-------------------------------
operations per second: 75357000
operations per second: 74487000
operations per second: 74571000
operations per second: 75357000
operations per second: 75843000
operations per second: 74183000
operations per second: 74822000
operations per second: 74321000
operations per second: 75301000
operations per second: 73991000

with sleep code
-------------------------------
operations per second: 60716000
operations per second: 61031000
operations per second: 61236000
operations per second: 60957000
operations per second: 60808000
operations per second: 60642000
operations per second: 60734000
operations per second: 60661000
operations per second: 60422000
operations per second: 61162000

      

This was measured with the latest version of Xcode 5. I see a similar difference in numbers when using Visual Studio 2013.

So what's going on here? Why does the code show significantly lower throughput when I add something that never gets executed?


1 answer


Adding the sleep adds another branch. Without the sleep, the loop ends in a single conditional jump back to the top of the loop when compare_exchange_weak returns false. With the sleep, there is a conditional branch to the function epilogue when compare_exchange_weak returns true, plus an unconditional jump back to the top of the loop after the sleep.
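
To make the two loop shapes concrete, here is a minimal sketch. It is not the code from the question: it assumes a simplified stack built on a plain std::atomic<Node*> with no ABA tag, and the names Node, push_no_sleep, and push_with_sleep are made up for illustration. The comments describe one plausible code layout. What a given compiler actually emits depends on the compiler and its optimization flags, so comparing the disassembly of both variants is the way to confirm it.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical node type for illustration only.
struct Node {
    Node* next = nullptr;
    int value = 0;
};

// Variant A: plain retry loop, no backoff.
// Plausible layout: the CAS is followed by ONE conditional jump
// back to the top of the loop on failure; success falls through
// to the return.
inline void push_no_sleep(std::atomic<Node*>& head, Node* n) {
    Node* old_head = head.load(std::memory_order_relaxed);
    do {
        n->next = old_head;  // link the new node in front of the current head
    } while (!head.compare_exchange_weak(old_head, n,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}

// Variant B: same loop with a sleep on the failure path.
// Plausible layout: the CAS is followed by a conditional branch
// OUT of the loop (to the epilogue) on success; the failure path
// calls sleep_for and then needs an unconditional jump back to
// the top of the loop. The extra call also lives in the loop body
// even if it is never executed at run time.
inline void push_with_sleep(std::atomic<Node*>& head, Node* n) {
    Node* old_head = head.load(std::memory_order_relaxed);
    for (;;) {
        n->next = old_head;
        if (head.compare_exchange_weak(old_head, n,
                                       std::memory_order_release,
                                       std::memory_order_relaxed))
            return;
        std::this_thread::sleep_for(std::chrono::nanoseconds(5));
    }
}
```

Both variants behave identically whenever the compare-exchange succeeds on the first try, so any throughput difference in that case comes from how the surrounding code is laid out, not from the sleep itself.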


