Threads accessing a single cache line

I came across a suggestion that threads shouldn't access the same cache lines, and I really can't figure out why. While searching on this topic I came across this question: Multiple threads and cpu cache, where one of the answers suggests:

you just don't want two threads to simultaneously try to access data located on the same cache line

The way I see it, the cache stores memory pages for quick access by a process, and, as it says here: http://en.wikipedia.org/wiki/Thread_%28computing%29#How_threads_differ_from_processes

threads share their address space

So for two threads there should be no problem accessing the same cache line: if the page is in the cache, a thread accessing that memory will get a cache hit regardless of what the other thread does.

I've heard the advice against threads accessing the same cache line on several different occasions, so it can't be a myth. What am I missing here?

+3




2 answers


Why is it not recommended? It's a read/write speed optimization when working on a multi-core processor.

In this case it is faster to avoid cache locking (the LOCK# signal) and to suppress the cache line bouncing needed to maintain cache coherence, by performing reads and writes on different cache lines.

You are correct that this is not a problem that must be avoided because something would otherwise not work. It is just one recommended speed optimization.

Thinking about the processor's internal caches is an extremely low-level speed optimization. For most typical programming tasks the speed bottleneck lies elsewhere, and following the Intel Guide for Developing Multithreaded Applications is sufficient.
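To make the recommendation concrete, here is a minimal C++ sketch of the usual layout fix (my own illustration, not code from the original answer): align per-thread data so that each thread's hot field lands on its own cache line. The 64-byte line size is an assumption that holds for current x86 processors; where available, C++17's std::hardware_destructive_interference_size expresses the same idea portably.

```cpp
#include <cstddef>

// Assumed line size: 64 bytes on current x86 CPUs.
// C++17's std::hardware_destructive_interference_size (in <new>)
// is the portable way to obtain this value where supported.
constexpr std::size_t kLineSize = 64;

// Prone to "false sharing": both counters live on one cache line,
// so two threads writing them concurrently bounce the line between
// cores even though they never touch each other's data.
struct SharedLine {
    long a;  // written by thread 1
    long b;  // written by thread 2 -- same line as 'a'
};

// The fix: alignas pushes each counter onto its own cache line,
// so each core can keep its line exclusively without interference.
struct PaddedLines {
    alignas(kLineSize) long a;  // thread 1's line
    alignas(kLineSize) long b;  // thread 2's line
};

static_assert(sizeof(PaddedLines) >= 2 * kLineSize,
              "counters occupy separate cache lines");
```

The trade-off is memory: each padded field now occupies a whole line, so this is only worth doing for data that is actually contended.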




See also

Some illustrations of cache lines are available in the Intel® 64 and IA-32 Architectures Software Developer's Manual.


+3




On most (probably all, but I don't have exhaustive knowledge of the hardware) multi-core processors, the cache will lock the affected cache line while one core writes to the corresponding memory. Other cores trying to access the same cache line are therefore delayed.



You can share the same data between threads as long as it is only read (or rarely updated), but if you keep writing to it, the hidden serialization of access will give performance equivalent to running all the threads on a single core (actually a bit worse, due to the latency of the cache locks).
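A tiny benchmark sketch along these lines (my own hypothetical illustration, not the answerer's code; exact timings vary by machine) makes that serialization visible: two threads incrementing counters on the same cache line typically run much slower than the same threads incrementing counters on separate lines.

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

constexpr long kIters = 50'000'000;

// Two counters that (very likely) share one cache line.
struct { std::atomic<long> a{0}, b{0}; } same_line;

// The same counters forced onto separate 64-byte lines (assumed line size).
struct { alignas(64) std::atomic<long> a{0};
         alignas(64) std::atomic<long> b{0}; } separate_lines;

// Run two threads, each hammering its own counter, and time the pair.
double run(std::atomic<long>& x, std::atomic<long>& y) {
    auto t0 = std::chrono::steady_clock::now();
    std::thread t1([&] { for (long i = 0; i < kIters; ++i)
                             x.fetch_add(1, std::memory_order_relaxed); });
    std::thread t2([&] { for (long i = 0; i < kIters; ++i)
                             y.fetch_add(1, std::memory_order_relaxed); });
    t1.join();
    t2.join();
    return std::chrono::duration<double>(
               std::chrono::steady_clock::now() - t0).count();
}

int main() {
    std::printf("same line:      %.2fs\n", run(same_line.a, same_line.b));
    std::printf("separate lines: %.2fs\n", run(separate_lines.a, separate_lines.b));
}
```

Build with something like g++ -O2 -pthread; on typical hardware the "same line" run is several times slower, which is exactly the hidden serialization described above.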

+2








