Capturing thread_local variable by reference in lambda does not work as expected

I am building a system with multiple threads in which one thread can queue work onto another thread and wait for its completion. I am using mutexes and condition variables for synchronization. To avoid creating a new mutex and condition variable for each operation, I tried to use a thread_local mutex / condition variable pair for each waiting thread. Unexpectedly, this does not work, and I would like to understand why.

My code, which queues work onto a different thread and waits for it, looks basically like this:

/* thread_local */ std::mutex mtx;
/* thread_local */ std::condition_variable cv;
bool done = false;  

io_service.post([&]() {
    // Execute the handler in context of the io thread
    functionWhichNeedsToBeCalledInOtherThread();

    // Signal completion to unblock the waiter
    {
        std::lock_guard<std::mutex> lock(mtx);
        done = true;
    }
    cv.notify_one();
});

// Wait until queued work has been executed in io thread
{
    std::unique_lock<std::mutex> lk(mtx);
    while (!done) cv.wait(lk);
}

      

This works fine if the sync objects are not thread_local. When I add thread_local, the waiting thread waits forever, indicating that the condition variable is never signaled. I have a feeling that, despite capturing by reference, the thread_local objects of another thread are being used inside the lambda. I can even confirm that the capture is not doing the right thing by checking the address of mtx inside and outside the lambda: they are not the same.

This raises two questions:

  • Is this a compiler bug or a design bug? I am using Visual Studio 2015 and have not tested other compilers yet.
  • Is capturing thread_local variables by reference even allowed?

I can work around the problem by creating explicit references to the thread_local variables outside the lambda and using those references inside it. However, I find the behavior unexpected and would like an explanation of whether it is correct.
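The workaround mentioned above might look roughly like the following sketch. It assumes a plain std::thread in place of io_service.post, and run_and_wait is a hypothetical helper name that does not appear in the original code.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

thread_local std::mutex mtx;
thread_local std::condition_variable cv;

// Hypothetical helper: runs `work` on another thread and blocks the caller
// until the work has finished. Returns true once completion was signaled.
bool run_and_wait(const std::function<void()>& work) {
    // Bind the CALLING thread's instances to ordinary local references.
    // Local references are real captures, unlike the thread_local names.
    std::mutex& m = mtx;
    std::condition_variable& c = cv;
    bool done = false;

    // Assumption: std::thread stands in for io_service.post here.
    std::thread worker([&m, &c, &done, &work] {
        work();
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        c.notify_one();
    });

    {
        std::unique_lock<std::mutex> lk(m);
        c.wait(lk, [&done] { return done; });
    }
    worker.join();
    return done;
}
```

Calling run_and_wait([] { /* work */ }) from any thread then waits on that thread's own mutex and condition variable, and the worker signals the same pair.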



2 answers


What you are observing is the correct behavior, as you are not actually capturing anything. Objects with static and thread storage duration are directly accessible, so for efficiency reasons a [&] capture does not affect them. You can, however, init-capture the corresponding thread-local object:



io_service.post([&mtx = mtx, &cv = cv]() {
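The difference can be checked with a small experiment. The sketch below is an illustration only: slot is a hypothetical stand-in for the thread_local mutex, and std::thread replaces io_service.post. Note that ordinary locals such as done still need to be captured alongside the init-captures.

```cpp
#include <cassert>
#include <thread>

thread_local int slot = 0;  // hypothetical stand-in for the thread_local mutex

// With [&], `slot` is not captured at all. Inside the lambda the name refers
// to the instance of whichever thread executes the lambda body.
bool default_capture_sees_callers_slot() {
    int* caller_addr = &slot;
    int* seen = nullptr;
    auto task = [&] { seen = &slot; };
    std::thread(task).join();
    return seen == caller_addr;
}

// With an init-capture, the reference is bound when the lambda is created,
// so it keeps pointing at the creating thread's instance.
bool init_capture_sees_callers_slot() {
    int* caller_addr = &slot;
    int* seen = nullptr;
    auto task = [&seen, &s = slot] { seen = &s; };
    std::thread(task).join();
    return seen == caller_addr;
}
```

The first function reports a mismatch, which matches the address check described in the question; the second reports the same address on both sides.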

      



For a mutex to work, every thread that needs to synchronize must lock the same mutex. What thread_local does is create a different mutex for each thread. If your threads each have their own independent mutex, they cannot communicate through it. You need one mutex shared by all of your threads.

The same goes for condition variables. All threads must talk to the same condition variable. This means that it doesn't make sense to have a separate condition variable for each thread.



As for your lambda, each thread that executes it uses its own copy of the thread_local variables. Since the mutex and condition variable accessed inside the lambda belong to a different thread than the ones the waiter uses, there is no synchronization: your lambda operates on a completely different set of variables.
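A minimal sketch of the per-thread copies, using a hypothetical thread_local counter instead of a mutex so that the effect is directly observable:

```cpp
#include <cassert>
#include <thread>

thread_local int counter = 0;  // hypothetical example variable

// Increments `counter` on a new thread and returns the value that thread saw.
// The new thread starts with its own freshly initialized copy.
int bump_on_other_thread() {
    int seen = 0;
    std::thread([&seen] { seen = ++counter; }).join();
    return seen;
}
```

Whatever value the calling thread has stored in counter, the new thread starts from its own zero-initialized copy and leaves the caller's copy untouched.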
