In multithreaded programming, doesn't synchronization negate the benefits of concurrent execution?

I have a dilemma regarding the use of multithreading in an application I am working on. I have a workflow that changes the state of an object, which is not a problem for a single threaded operation. However, to improve performance, I plan on using multiple threads.

As far as I understand, since the state will be shared between threads, each thread must acquire a lock on the state before it can do any work. Wouldn't that defeat the purpose of multithreading? It seems that multiple threads would not actually run concurrently, so the result would be no better than a single-threaded program.

Is my analysis correct? If I am wrong, would someone please clarify the concept?

+3




5 answers


Short answer: concurrency is tricky. Real concurrency, with multiple concurrent writers, is really hard.

What you need to decide is what your actual consistency guarantees should be. Does every reader need to see every write, guaranteed? If so, you will be forced to serialize all writes in some way (using locks, for example), and your next effort should be to do as much work as possible outside the lock, so that each lock is held for as short a time as possible.

One way to keep a lock held for as short a time as possible is to use a lock-free algorithm. Most lock-free algorithms are based on an atomic compare-and-set primitive, such as the classes in the java.util.concurrent.atomic package. They can be very high performance, but designing a correct non-blocking algorithm can be subtle. One simple non-blocking algorithm is to create a new (immutable) state object and then atomically make it the "live" state, retrying in a loop if another writer installed a different state in the meantime. (This approach is good enough for many applications, but it is vulnerable to livelock if you have too many writers.)
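That retry loop can be sketched in a few lines. This is a minimal, hypothetical illustration using an immutable Integer as the "state" for simplicity; a real application would use its own immutable state class:

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch of the non-blocking pattern described above:
// build a new immutable state, then atomically publish it.
public class CasCounter {
    private final AtomicReference<Integer> state = new AtomicReference<>(0);

    public int increment() {
        while (true) {
            Integer current = state.get();            // snapshot the live state
            Integer next = current + 1;               // derive a new state from it
            if (state.compareAndSet(current, next)) { // publish only if unchanged
                return next;
            }
            // another writer won the race; loop and retry against the fresh state
        }
    }

    public int get() {
        return state.get();
    }
}
```

No lock is ever held; contention only costs a retry, which is the source of the livelock risk mentioned above.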



If you can get away with a weaker consistency guarantee, then many other optimizations are possible. For example, you can use thread-local caches so that each thread sees its own view of the data and can write in parallel; you then need to deal with the consequences of stale or inconsistent data. Most approaches in this vein are eventually consistent: writes may not be visible to all readers at once, but they are guaranteed to reach all readers in the end.
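As a rough sketch of that per-thread-write idea (the same trick java.util.concurrent.atomic.LongAdder uses internally): each thread increments its own cell, and a reader sums the cells, possibly seeing a slightly stale total. The class below is hypothetical, not a library API:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Writers never contend: each thread owns a private cell.
// Readers merge the cells, giving an eventually consistent total.
public class PerThreadCounter {
    private final ThreadLocal<AtomicLong> local = ThreadLocal.withInitial(AtomicLong::new);
    private final Set<AtomicLong> cells = ConcurrentHashMap.newKeySet();

    public void increment() {
        AtomicLong cell = local.get();
        cells.add(cell);        // register this thread's cell (idempotent)
        cell.incrementAndGet(); // uncontended write
    }

    // may lag behind in-flight increments, but converges once writers pause
    public long sum() {
        long total = 0;
        for (AtomicLong cell : cells) {
            total += cell.get();
        }
        return total;
    }
}
```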

This is an active area of research, and a complete answer could fill a book (several books, in fact!). If you are just getting started in this field, I would recommend reading Java Concurrency in Practice by Goetz et al., as it gives a good introduction to the topic and lots of practical advice on building concurrent systems successfully.

+2




Your interpretation of the limits of multithreading and concurrency is correct. Since the lock on the shared state must be acquired by a thread before it can do its work (and other threads must wait while it is held), you are essentially serializing the work of multiple threads back into one.

The best way to fix this is to redesign your program to limit the size of the critical section. As we learned in my operating systems course on process synchronization,

only one critical section should be executed at any given time



The term critical section comes from operating systems rather than Java concurrency specifically, but it still illustrates the concept.

What does limiting the critical section mean? For example, say you have a program that manages a single bank account (unrealistic, but it illustrates my point). If a lock on the account must be acquired by a thread before the balance can be updated, the naive option would be to have only one thread working on balance updates at any time, with no concurrency: the entire program would be the critical section. However, suppose there is other logic that must also be executed, for example warning other banks about the updated balance. You can require the lock on the bank account state only while updating the balance, and not while warning other banks. This reduces the size of the critical section and lets other threads do the work of warning other banks while one thread updates the balance.
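A minimal sketch of that bank-account example (the class and the notifyOtherBanks stand-in are hypothetical): only the balance update sits inside the synchronized block, and the slow notification happens after the lock is released.

```java
// Narrowing the critical section: hold the lock only for the balance
// update; do the slow cross-bank notification outside it.
public class Account {
    private long balance;
    private final Object lock = new Object();

    public long deposit(long amount) {
        long newBalance;
        synchronized (lock) {          // critical section: kept as small as possible
            balance += amount;
            newBalance = balance;
        }
        notifyOtherBanks(newBalance);  // slow work done without holding the lock
        return newBalance;
    }

    // stand-in for the expensive "warn other banks" step from the answer
    private void notifyOtherBanks(long newBalance) {
        // e.g. a network call; other threads may update the balance meanwhile
    }
}
```

While one thread is inside notifyOtherBanks, another thread can already acquire the lock and apply the next deposit.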

Please comment if anything is unclear. You already seem to understand the limits of concurrency, but hopefully this shows possible steps toward implementing it.

+1




Your exact requirements are not entirely clear, but you have correctly identified the limits that multithreading can have.
Concurrent thread execution makes sense when some relatively autonomous tasks can be executed concurrently by separate threads or groups of threads.

If your scenario looks like this: you start 5 threads, yet at any moment only one thread is active and the rest are waiting on a locked resource, then multithreading does not make sense and may even add overhead from CPU context switches.

I think in your use case, multithreading can be used for:

  • tasks that do not change state
  • a task that changes state, if it can be split into multiple independent parts, each with a minimal set of shared instructions, so that it can benefit from multithreading.
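The second bullet can be sketched as a parallel sum: a hypothetical example where each thread works on its own slice of the input and its own output slot, so the only shared step is the final merge after join():

```java
import java.util.Arrays;

// Splitting one task into independent chunks: threads share no
// mutable state except disjoint slots of the partial-results array.
public class ParallelSum {
    public static long sum(long[] data, int threads) throws InterruptedException {
        long[] partial = new long[threads];            // one slot per thread
        Thread[] workers = new Thread[threads];
        int chunk = (data.length + threads - 1) / threads;
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                int from = id * chunk;
                int to = Math.min(from + chunk, data.length);
                for (int i = from; i < to; i++) {
                    partial[id] += data[i];            // private slot, no lock
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();                                  // the only synchronization point
        }
        return Arrays.stream(partial).sum();           // cheap single-threaded merge
    }
}
```

join() establishes the happens-before edge that makes the workers' writes to partial visible to the merging thread.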
+1




As far as I understand, since the state will be shared between threads, each thread must acquire a lock on the state before it can do any work. Wouldn't that defeat the purpose of multithreading?

The short answer is, "It depends." It is rare to have a multithreaded application that shares no data. So sharing data, even when it requires a full lock, does not necessarily mean your multi-threaded application will perform no better than a single-threaded one.

The big question is how frequently the state must be updated by each thread. If the threads read the state, do their parallel processing (which takes time), and then update the state at the end, you may well see a performance boost. On the other hand, if every step of processing needs to be coordinated between the threads in some way, they may waste most of their time fighting over the state object. Reducing this dependency on shared state will improve your multi-threaded performance.

There are also more efficient ways to update a state variable that can avoid locks. Something like the following pattern is used a lot:

 private AtomicReference<State> sharedState;
 ...
 // inside a thread's processing loop
 while (true) {
     // do the processing job
     State existingState = sharedState.get();
     // create a new state object from the existing one plus our processing
     State newState = updateState(existingState);
     // if the shared state hasn't changed in the meantime, install the new one
     if (sharedState.compareAndSet(existingState, newState)) {
         break;
     }
     // otherwise another thread changed the state first; re-read and try again
 }

One way to handle state changes is to have a coordinating thread. It is the only thread that reads the state, and it generates jobs for the worker threads. When a job completes, the worker adds its update to a BlockingQueue, which the coordinating thread reads, updating the state in turn. The processing threads then never have to compete for access to the shared state.
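A minimal sketch of that coordinator pattern (the class and method names are hypothetical): workers only ever call submit(), and the single coordinating thread is the only code that touches the state field, so no lock is needed on it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Workers post updates to a queue; one coordinating thread drains
// the queue and is the sole owner of the mutable state.
public class Coordinator {
    private final BlockingQueue<Long> updates = new ArrayBlockingQueue<>(100);
    private long state; // only ever touched by the coordinating thread

    // called from worker threads when a job completes
    public void submit(long update) throws InterruptedException {
        updates.put(update); // blocks if the queue is full
    }

    // run on the single coordinating thread
    public long applyNext(int count) throws InterruptedException {
        for (int i = 0; i < count; i++) {
            state += updates.take(); // apply updates one at a time, unlocked
        }
        return state;
    }
}
```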

+1




Think of it this way:

  • Synchronization means blocking
  • Concurrency means parallelization

You don't always need synchronization. You can use an AtomicReference as a wrapper for your shared mutable state.

You can also use StampedLock, which improves concurrency by allowing optimistic reads, and accumulators such as LongAccumulator for parallel updates. These features were added in Java 8.
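An optimistic read with StampedLock looks roughly like this (a sketch close to the pattern in the StampedLock documentation, using a hypothetical Point class): the reader pays nothing unless a writer slipped in mid-read, in which case it falls back to a real read lock.

```java
import java.util.concurrent.locks.StampedLock;

// Optimistic reads (Java 8+): read without blocking, then validate
// that no writer intervened; fall back to a read lock only if one did.
public class Point {
    private double x, y;
    private final StampedLock lock = new StampedLock();

    public void move(double dx, double dy) {
        long stamp = lock.writeLock();         // exclusive write
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead(); // no blocking, just a stamp
        double cx = x, cy = y;                 // read without holding any lock
        if (!lock.validate(stamp)) {           // did a writer intervene?
            stamp = lock.readLock();           // pessimistic fallback
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.sqrt(cx * cx + cy * cy);
    }
}
```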

Another way to avoid synchronization is to use immutable objects, which are safe to publish and share and need no synchronization at all. I would add that you should prefer immutable objects anyway, regardless of concurrency, as they simplify your object's state space.
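An immutable state object is just final fields and no setters; "mutation" returns a fresh object. The class below is a hypothetical example, not part of any library:

```java
// Immutable: all fields final, no setters. Once safely published,
// instances can be read from any thread without synchronization.
public final class AccountState {
    private final long balance;

    public AccountState(long balance) {
        this.balance = balance;
    }

    public long balance() {
        return balance;
    }

    // returns a new object instead of changing this one
    public AccountState deposit(long amount) {
        return new AccountState(balance + amount);
    }
}
```

Objects like this pair naturally with the AtomicReference compare-and-set pattern shown in another answer here.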

0



