L1 cache access time
This page:
http://www.7-cpu.com/cpu/IvyBridge.html
gives the following L1 cache access latencies for Ivy Bridge:
- L1 Data Cache Latency = 4 cycles for simple access via pointer
- L1 Data Cache Latency = 5 cycles for access with complex address calculation (size_t n, *p; n = p[n]).
By "simple", do they mean that the pointer is the same size as the machine word? So if the pointer is 32-bit and the OS is 32-bit, the access is "simple", and otherwise it incurs the "complex" latency?
I don't quite understand their explanation for the difference between the two latencies.
The full x86 effective address looks like displacement + base + index * scale (where displacement is a constant, base and index are registers, and scale is 1, 2, 4, or 8).
They seem to call an address simple if only displacement (or possibly an optional base term) is present, and index * scale certainly falls into the complex category.
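As a rough illustration of the two cases, here is a minimal sketch in C. The function names (effective_address, chase_pointer, chase_index) are made up for this example, and the instructions shown in the comments are only what a compiler would plausibly emit on x86-64; the actual code depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Effective address as described above: displacement + base + index * scale.
   Hypothetical helper, only here to make the formula concrete. */
uintptr_t effective_address(uintptr_t base, uintptr_t index,
                            unsigned scale, intptr_t displacement)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uintptr_t)displacement;
}

/* "Simple" pattern: each load address is just the pointer loaded by the
   previous step. Plausibly compiles to something like: mov rax, [rax] */
void **chase_pointer(void **p, size_t steps)
{
    while (steps--)
        p = (void **)*p;
    return p;
}

/* "Complex" pattern from the question: n = p[n]. The address needs
   base + index * scale, plausibly: mov rax, [rdi + rax*8] */
size_t chase_index(const size_t *p, size_t n, size_t steps)
{
    while (steps--)
        n = p[n];
    return n;
}
```

In both loops every load depends on the result of the previous one, so the time per step approximates the load-to-use latency. That is presumably how figures like the ones in the question are measured, and the only difference between the two loops is the addressing mode.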
Update: Indeed, Intel's optimization manual has this statement (for Sandy Bridge): the common load latency is five cycles. When using a simple addressing mode, base plus offset that is less than 2048, the load latency can be four cycles. See also Table 2-12, "Effect of Addressing Modes on Load Latency".
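The rule quoted from the manual can be written down as a small sketch. The function expected_load_latency is hypothetical and only encodes the sentence above; it assumes a non-negative offset (the quoted text does not say how negative displacements are treated) and it is not a model of the real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: L1 load latency in cycles according to the quoted
   rule. has_index says whether the addressing mode uses an index register;
   offset is the constant displacement, assumed non-negative here. */
int expected_load_latency(bool has_index, uint32_t offset)
{
    /* Simple mode: base plus an offset below 2048, no index register. */
    bool simple = !has_index && offset < 2048;
    return simple ? 4 : 5;
}
```

Under these assumptions, a load like [rax + 16] falls in the four-cycle case, while [rdi + rax*8] or [rax + 4096] falls in the five-cycle case.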