Fine-grained multithreading - how much does a worker task cost?

I use the work_pile pattern: the threads are always running and wait on a semaphore for new function pointers + data to arrive in the queue. It is what Apple's marketing people now call Grand Central Dispatch and advertise as the best thing since sliced bread.
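For readers who have not seen the pattern, here is a minimal sketch of what such a work pile might look like. The post shows no code, so every name here (`WorkPile`, `TaskFn`, `Task`, `add_one`) is hypothetical, and a mutex plus condition variable stands in for the semaphore mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical names throughout; this is a sketch, not the poster's code.
using TaskFn = void (*)(void*);          // function pointer ...
struct Task { TaskFn fn; void* data; };  // ... plus the data it works on

class WorkPile {
public:
    explicit WorkPile(std::size_t n_threads) {
        assert(n_threads > 0);
        for (std::size_t i = 0; i < n_threads; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain whatever is still queued, then joins them.
    ~WorkPile() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    void push(TaskFn fn, void* data) {
        {
            std::lock_guard<std::mutex> lock(m_);
            queue_.push_back(Task{fn, data});
        }
        cv_.notify_one();  // wake one sleeping worker
    }

private:
    void run() {
        for (;;) {
            Task task;
            {
                std::unique_lock<std::mutex> lock(m_);
                // Sleep until there is work or the pile is shutting down.
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // shutting down, nothing left
                task = queue_.front();
                queue_.pop_front();
            }
            task.fn(task.data);  // run the task outside the lock
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Task> queue_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: started after the rest
};

// Example task (hypothetical): atomically increment a counter.
void add_one(void* p) { static_cast<std::atomic<int>*>(p)->fetch_add(1); }
```

Every `push` costs at least one lock/unlock and possibly a thread wake-up, and that is the overhead the question is about.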

I'm just wondering how to tell whether it helps to split a short task into two even shorter ones. Is there a rule of thumb for judging whether it is worth queuing a new work item?

+2




3 answers


Two possible answers:

  • It depends.
  • Try it and measure.


I prefer the second one.

In any case, if the two tasks always run one after the other (i.e. sequentially), then I believe there is no advantage in separating them.
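A rough way to try it out might look like the sketch below. The workload (`sum_range`) and the helpers `sum_split` and `seconds_per_call` are made up for illustration. Note that it spawns a fresh thread for the second half, which is more expensive than pushing onto an already running work pile, so treat the result as a pessimistic estimate.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical workload: sum the elements v[lo] .. v[hi - 1].
std::uint64_t sum_range(const std::vector<std::uint32_t>& v,
                        std::size_t lo, std::size_t hi) {
    return std::accumulate(v.begin() + static_cast<std::ptrdiff_t>(lo),
                           v.begin() + static_cast<std::ptrdiff_t>(hi),
                           std::uint64_t{0});
}

// The same work split in two: upper half on a helper thread,
// lower half on the calling thread.
std::uint64_t sum_split(const std::vector<std::uint32_t>& v) {
    const std::size_t mid = v.size() / 2;
    std::uint64_t upper = 0;
    std::thread helper([&] { upper = sum_range(v, mid, v.size()); });
    const std::uint64_t lower = sum_range(v, 0, mid);
    helper.join();  // join makes the helper's write to `upper` visible
    return lower + upper;
}

// Average wall-clock seconds per call of f over `reps` calls.
template <typename F>
double seconds_per_call(F&& f, int reps) {
    assert(reps > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) f();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / reps;
}
```

To use it, time something like `[&] { sink = sum_range(v, 0, v.size()); }` against `[&] { sink = sum_split(v); }` for a range of vector sizes, with `sink` declared `volatile` so the compiler cannot discard the work. The size at which the split version starts winning is your break-even point on that machine.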

+1




The limit on parallel speedup is set by how many cores you have and how parallel the algorithm is. Various kinds of overhead, including locking, reduce the achievable concurrency and can shrink or even wipe out the benefit of multithreading. This is why it works best with independent, long-running tasks. Having said that, as long as the overhead doesn't swallow the performance gain, it pays to split even a short task among the cores.
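As a back-of-the-envelope version of that last sentence, here is the break-even condition under idealised assumptions that are mine, not the poster's: the task divides evenly, the pieces are fully independent, and a core is idle for each piece. The symbols T, o and n are introduced only for this sketch.

```latex
% T: run time of the unsplit task
% o: total overhead added by splitting (queueing, wake-up, synchronisation)
% n: number of pieces, each running on its own otherwise idle core
% (\text requires the amsmath package)
\[
  \frac{T}{n} + o < T
  \quad\Longleftrightarrow\quad
  o < T\left(1 - \frac{1}{n}\right),
  \qquad\text{so for } n = 2:\ o < \frac{T}{2}.
\]
```

In other words, splitting a task in two can only pay off if the task takes more than twice as long as the overhead of handing off one half, and in practice the margin needs to be larger because the ideal assumptions rarely hold.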



+1




Short answer: you need to think about resources + workload + benchmarking.

Here are some of the ways this can go wrong:

  • Do you have idle threads? Is each piece of work so short that a thread finishes it while other threads are still sitting idle waiting for reassignment (i.e. more threads than work)?
  • Do you have enough work? Does the overall task complete so quickly that additional threads are not worth considering? Remember that adding threads adds a (sometimes small) but measurable amount of overhead.
  • Do you have the resources? Do you have more threads to give? Do you have CPU cycles sitting idle?
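For the resources question, one simple check is to cap the number of workers at what the hardware reports. A small sketch, with `worker_count` being a made-up helper name:

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical helper: never ask for more workers than hardware threads.
unsigned worker_count(unsigned wanted) {
    assert(wanted > 0);
    // hardware_concurrency() is only a hint and may return 0 if unknown.
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;  // fall back to a single worker
    return std::min(wanted, hw);
}
```

This only tells you how many hardware threads exist, not how many are actually idle, so it complements measurement rather than replacing it.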

So, in short, I would say that you need to look before you leap. If you already have code that works as it is, that's like money in the bank. Is it worth spending more time improving the performance of this code, or is the return on investment too low (or negative!)?

+1








