C++ alignment (when to use alignas)

I recently learned about the size and alignment of structures. I am familiar with how the alignas() specifier works and how to use it. I've seen examples of correct usage (in terms of semantics, not real-life use cases) and of how it changes the size of a type or variable.

However, I don't know when it is useful in my code. Could you please provide some use cases where a developer has to manually specify data alignment?



1 answer


alignas comes in handy in many latency-sensitive multithreaded applications, for example high-frequency trading applications.

alignas gives you tighter control over how your objects are laid out in CPU cache lines, which speeds up access to them. The goals that motivate the use of alignas are as follows:

  • You want to avoid unnecessary invalidation of your data in cache lines.
  • You want to optimize CPU reads so that you don't waste CPU cycles.

How aligning to cache lines with alignas helps
Use 1 - Avoid unnecessary invalidation of data in a cache line. You can use alignas to place addresses or objects used by separate threads (running on separate cores) on separate cache lines, so that one thread does not unintentionally invalidate a cache line held by another core.

How it happens: Consider the case where one thread of your process runs on core 0 and writes to some address xxxx. That address is now loaded into the L1 cache of core 0. A second thread, running on another core, accesses address xxxx + n bytes. If both addresses fall in the same cache line, any write by the second thread needlessly invalidates core 0's copy of that cache line. The thread on core 0 is then delayed until the invalidated cache line has been loaded again. This effect, known as false sharing, hurts performance in a multithreaded environment.
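To make the scenario above concrete, here is a minimal sketch of the layout fix. The struct SharedCounters, its members, and the helper functions are hypothetical names chosen for illustration, and the 64-byte line size is an assumption that should be checked for your target CPU.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, the common size mentioned in this answer.
constexpr std::size_t kCacheLine = 64;

// Hypothetical pair of counters: `produced` is written only by thread A,
// `consumed` only by thread B. alignas puts each one on its own cache line,
// so a write to one does not invalidate the line holding the other.
struct SharedCounters {
    alignas(kCacheLine) std::atomic<std::uint64_t> produced{0};
    alignas(kCacheLine) std::atomic<std::uint64_t> consumed{0};
};

static_assert(alignof(SharedCounters) == kCacheLine,
              "the struct takes the strictest alignment of its members");
static_assert(sizeof(SharedCounters) == 2 * kCacheLine,
              "one full cache line per counter");

// Index of the cache line that an address falls into.
inline std::size_t cache_line_index(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kCacheLine;
}

// True when the two counters of `s` live on different cache lines.
inline bool on_separate_lines(const SharedCounters& s) {
    return cache_line_index(&s.produced) != cache_line_index(&s.consumed);
}
```

Without the two alignas specifiers, both 8-byte counters would normally sit next to each other in the same cache line, which is exactly the situation described above. The price is 112 bytes of padding for 16 bytes of data.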

Use 2 - Align objects to cache lines so that they are not spread across more cache lines than necessary. This saves CPU cycles. For example, if the size of your object is, say, 118 bytes, it is better to align it to 64 bytes, since on most processors today the cache line size is 64 bytes.



If you don't, your object might be laid out like this across 64-byte cache lines. (For this example, assume that the object has an actual size of 118 bytes and that natural alignment rounds its size up to a multiple of 4, i.e. 120 bytes.)

Cache line 1 <----- Another object 60 bytes -----> <--- Your object 4 bytes ---->
Cache line 2 <--------------------- Your object 64 bytes ---------------------->
Cache line 3 <----- Your object 52 bytes -----> <--- Another object 12 bytes --->

Since the processor reads whole cache lines, reading your object means loading 3 cache lines. If you want to optimize this, consider alignas(64). With it, your object always starts on a cache-line boundary and spans exactly 2 cache lines.
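The arithmetic above can be checked at compile time. The following is a rough sketch, not a definitive layout: Record, AlignedRecord and lines_touched are hypothetical names, and the sizes assume a mainstream ABI where std::int32_t is 4-byte aligned and cache lines are 64 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;

// 118 bytes of data; natural (4-byte) alignment rounds the size up to 120.
struct Record {
    std::int32_t id;
    char payload[114];
};

// Same members, but every instance starts on a cache-line boundary.
struct alignas(kCacheLine) AlignedRecord {
    std::int32_t id;
    char payload[114];
};

static_assert(sizeof(Record) == 120, "118 bytes rounded up to a multiple of 4");
static_assert(alignof(AlignedRecord) == kCacheLine, "starts on a cache-line boundary");
static_assert(sizeof(AlignedRecord) == 128, "padded to a multiple of 64");

// Number of cache lines covered by `size` bytes starting at byte `offset`.
constexpr std::size_t lines_touched(std::size_t offset, std::size_t size) {
    return (offset + size - 1) / kCacheLine - offset / kCacheLine + 1;
}

// The layout from the diagram: the object starts 60 bytes into a cache line.
static_assert(lines_touched(60, sizeof(Record)) == 3, "unaligned: 3 cache lines");
// With alignas(64) the object starts at offset 0 of a cache line.
static_assert(lines_touched(0, sizeof(AlignedRecord)) == 2, "aligned: 2 cache lines");
```

Note that the aligned version occupies 128 bytes instead of 120, so an array of such objects uses more memory in exchange for the predictable two-line footprint.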

Caveats: Please note that you need to examine your objects carefully before applying alignas. Using it carelessly leads to more padding and therefore more L2 cache misses. There are simple techniques for ordering data members so that padding is not wasted.
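As an illustration of both points in the caveat, here is a small sketch with hypothetical structs. The exact sizes depend on the ABI, so only the relationships that hold on mainstream platforms are asserted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Members in an unfortunate order: padding is inserted after `flag`
// and after `side` to keep `price` and the struct suitably aligned.
struct Wasteful {
    char flag;
    double price;
    char side;
};

// Same members, largest first: the two chars share one padded tail.
struct Compact {
    double price;
    char flag;
    char side;
};

// Typically 24 vs 16 bytes on 64-bit ABIs; the inequality is what matters.
static_assert(sizeof(Compact) < sizeof(Wasteful),
              "reordering members removes padding");

// Over-aligning a small type: 4 bytes of data occupy a whole cache line.
struct alignas(64) Tiny {
    std::int32_t value;
};

static_assert(sizeof(Tiny) == 64, "60 of the 64 bytes are padding");
```

Reordering members costs nothing at run time, so it is usually worth trying before reaching for alignas. A type like Tiny only pays off when the isolation on its own cache line is the actual goal.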

Hope this helps and good luck!
