What is PDE Cache?
I have the following ARM based SoC specifications:
- L1 Data Cache = 32 KB, 64 B / line, 2-WAY, LRU
- L2 Cache = 1 MB, 64 B / line, 16-WAY
- L1 Data TLB (for loads): 32 entries, fully associative
- L2 Data TLB: 512 entries, 4-WAY
- PDE cache: 16 entries (one entry per 1 MB of virtual space)
What is the PDE cache? It looks like a TLB to me, but I'm not sure.
Answer
It looks like the PDE (Page Directory Entry) cache is an intermediate table walk cache, which can be implemented separately from the TLB.
The Cortex-A15 MPCore processor implements dedicated caches that store intermediate levels of translation table entries as part of the table walk.
Interesting. ARM does not mention a PDE cache in the Cortex-A15 or Cortex-A57 documentation, or in the ARMv7 and ARMv8 programming guides.
PDE usually stands for Page Directory Entry, so this could be a dedicated cache that stores those entries and is used together with the TTBR register during address translation.
ARM has several "intermediate table walk caches" whose entries are tagged with the ASID (Address Space ID) and VMID (Virtual Machine ID) fields, so the PDE cache and the intermediate table walk cache look like the same kind of structure. In the documentation, "intermediate table walk caches" store intermediate levels of translation table entries ... so they could hold page directory entries.
The TLB caches complete translations; it does not mirror a coherent chunk of memory per se. (Because it is not coherent, it can go stale if the page map changes, so software must enforce coherency by flushing it.)
However, the page map itself lives in memory, and as such every part of it can also be cached, whether in the general-purpose cache hierarchy or in special dedicated caches like the PDE cache. This is implementation specific; different processors may handle it differently.
An access that hits the TLB (at any of its levels) does not need this data, but a TLB miss triggers a page walk, which issues memory reads from the page map. These reads can hit in the caches if they hold the page map data, instead of going all the way to memory.
Since a page walk is a long, serialized, critical chain of accesses (even more so with virtualization), you can imagine how important it is to cut the latency of these accesses by caching them. A dedicated cache for any of the page map levels, which saves those entries from competing with normal data lines (which are far more likely to thrash the cache), is therefore often very useful for performance.
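To make the benefit concrete, here is a minimal Python sketch of a two-level walk that counts page-table memory reads with and without a PDE cache. It is a toy model, not how any particular core works: the class and function names are made up for illustration, both structures are modeled as fully associative LRU, pages are assumed to be 4 KB, and the 32-entry TLB and 16-entry PDE cache sizes are simply taken from the question.

```python
from collections import OrderedDict

PAGE_SHIFT = 12      # assumption: 4 KB pages
SECTION_SHIFT = 20   # one first-level entry per 1 MB, as in the question


class LRUCache:
    """Tiny fully associative LRU cache keyed by an integer tag (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, tag):
        if tag in self.entries:
            self.entries.move_to_end(tag)  # mark as most recently used
            return True
        return False

    def fill(self, tag):
        self.entries[tag] = True
        self.entries.move_to_end(tag)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def count_walk_reads(addresses, tlb_entries=32, pde_entries=16):
    """Count page-table memory reads for a trace of virtual addresses.

    A TLB hit costs no reads. On a TLB miss, the first-level read is
    skipped when the PDE cache already holds that 1 MB region.
    Pass pde_entries=0 to model a design without a PDE cache.
    """
    tlb = LRUCache(tlb_entries)
    pde_cache = LRUCache(pde_entries)
    reads = 0
    for va in addresses:
        page = va >> PAGE_SHIFT
        if tlb.lookup(page):
            continue
        region = va >> SECTION_SHIFT
        if pde_entries == 0 or not pde_cache.lookup(region):
            reads += 1  # first-level descriptor fetched from memory
            if pde_entries:
                pde_cache.fill(region)
        reads += 1      # second-level descriptor fetched from memory
        tlb.fill(page)
    return reads


# Touch 64 distinct pages that all sit inside the same 1 MB region.
trace = [p << PAGE_SHIFT for p in range(64)]
without_pde = count_walk_reads(trace, pde_entries=0)
with_pde = count_walk_reads(trace)
```

For this trace every access misses the TLB, so the model without a PDE cache performs 128 table reads, while the model with one performs 65: only the very first walk has to fetch the first-level entry.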
PDE ("Page Directory Entry") is x86 architecture terminology for a top-level table entry *, equivalent to a "first-level descriptor" in ARM VMSA terms.
Assuming this is the source of the data in the question, it presumably refers to the Cortex-A15 "intermediate table walk cache", in which case the name is not entirely appropriate, as that cache can actually hold any level of translation.
* in IA-32 at least - 64-bit mode has levels above this
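As a rough illustration of why a first-level descriptor corresponds to "one entry per 1 MB", here is a small Python sketch. It assumes the ARMv7 short-descriptor format with 4 KB small pages and a single table covering the whole 32-bit address space; the function names are hypothetical, and the entry counts are the ones from the question.

```python
SECTION_SIZE = 1 << 20   # 1 MB covered by one first-level descriptor
PAGE_SIZE = 1 << 12      # assumption: 4 KB small pages


def split_va(va):
    """Split a 32-bit VA into (first-level index, second-level index, offset)."""
    l1_index = (va >> 20) & 0xFFF   # VA[31:20], 4096 first-level entries
    l2_index = (va >> 12) & 0xFF    # VA[19:12], 256 second-level entries
    offset = va & 0xFFF             # VA[11:0], byte within the 4 KB page
    return l1_index, l2_index, offset


def reach_bytes(entries, bytes_per_entry):
    """Address range covered when every entry is valid and maps a distinct region."""
    return entries * bytes_per_entry


l1_index, l2_index, offset = split_va(0x12345678)
pde_reach = reach_bytes(16, SECTION_SIZE)    # PDE cache from the question
l2_tlb_reach = reach_bytes(512, PAGE_SIZE)   # L2 data TLB from the question
```

Under these assumptions, the 16-entry PDE cache covers 16 MB of virtual space, compared with 2 MB for the 512-entry L2 TLB when only 4 KB pages are in use, so a walk that misses every TLB level can often still skip the first-level fetch.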