Mprotect multi-thread behavior

Question

For the purposes of a concurrent / parallel GC, I am wondering what kind of memory ordering guarantee is provided by the mprotect syscall (i.e. mprotect's multi-thread behavior, or mprotect's memory model). My questions are (assuming no compiler reordering, or a sufficient compiler barrier):

1. If thread 1 segfaults on an address because of an mprotect call on thread 2, can I be sure that everything that happened on thread 2 before the syscall can be observed on thread 1 in the segfault signal handler? What if a full memory barrier is placed in the signal handler before performing the load on thread 1?
2. If thread 1 performs a volatile load on an address that thread 2 sets to PROT_NONE, and the load does not trigger a segfault, is that enough to establish a happens-before relationship between the two? In other words, if the two threads do the following (*ga starts as 0, p is a page-aligned address that starts out read-only)
```
// thread 1
*ga = 1;
*(volatile int*)p; // no segfault happens

// thread 2
mprotect(p, 4096, PROT_NONE); // Or replace 4096 by the real userspace-visible page size
a = *ga;
```
is there a guarantee that a on thread 2 will be 1? (Assuming no segfault is observed on thread 1 and no other code modifies *ga.)
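A minimal sketch of how the scenario above could be turned into a repeatable litmus test on Linux. All helper names (`run_litmus`, `thread1`, `gate`, `rounds`, `t1_faulted`, `on_segv`) are made up for illustration. It deviates from the snippet in two ways: `ga` is an `atomic_int` accessed with relaxed operations (to avoid a formal data race without adding hardware ordering), and `atomic_signal_fence` stands in for the assumed compiler barrier. Recovery from the segfault uses `sigsetjmp` / `siglongjmp`, which is only one possible approach.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;    /* used only by thread 1 */
static pthread_barrier_t gate;  /* lines both threads up around each round */
static char *p;                 /* page that starts each round readable */
static atomic_int ga;           /* the flag from the question */
static int rounds;              /* hypothetical knob: number of rounds */
static int t1_faulted;          /* handed to thread 2 through the barrier */

/* SIGSEGV handler: jump back into thread1 instead of re-running the load. */
static void on_segv(int sig) { (void)sig; siglongjmp(fault_env, 1); }

static void *thread1(void *arg) {
    (void)arg;
    for (volatile int i = 0; i < rounds; i++) {
        pthread_barrier_wait(&gate);                    /* round starts */
        if (sigsetjmp(fault_env, 1) == 0) {
            atomic_store_explicit(&ga, 1, memory_order_relaxed);
            atomic_signal_fence(memory_order_seq_cst);  /* compiler barrier only */
            (void)*(volatile char *)p;                  /* the load under test */
            t1_faulted = 0;
        } else {
            t1_faulted = 1;                             /* arrived from on_segv */
        }
        pthread_barrier_wait(&gate);                    /* round ends */
    }
    return NULL;
}

/* Plays thread 2. Returns the number of rounds in which thread 1 did not
 * fault and yet a != 1, or -1 if setup failed. */
static int run_litmus(int n) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;

    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    p = mmap(NULL, pagesz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    rounds = n;
    if (pthread_barrier_init(&gate, NULL, 2) != 0) return -1;

    pthread_t t;
    if (pthread_create(&t, NULL, thread1, NULL) != 0) return -1;

    int violations = 0;
    for (int i = 0; i < n; i++) {
        atomic_store_explicit(&ga, 0, memory_order_relaxed);
        mprotect(p, pagesz, PROT_READ);                 /* reset the page */
        pthread_barrier_wait(&gate);                    /* start the round */
        mprotect(p, pagesz, PROT_NONE);
        int a = atomic_load_explicit(&ga, memory_order_relaxed);
        pthread_barrier_wait(&gate);                    /* thread 1 is done */
        if (!t1_faulted && a != 1) violations++;
    }
    pthread_join(t, NULL);
    pthread_barrier_destroy(&gate);
    munmap(p, pagesz);
    return violations;
}
```

Calling `run_litmus(200)` runs 200 rounds. A result of zero is consistent with the guarantee being asked about but does not prove it, since the race window in each round is narrow.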

I'm mostly interested in Linux behavior, in particular on x86(_64), arm / aarch64 and ppc, although information on other architectures and OSes is welcome (for Windows, replace mprotect with VirtualProtect or whatever it is called). So far, my tests on x64 and aarch64 Linux have not shown any violations, although I'm not sure whether my tests are conclusive or whether the behavior can be relied on in the long run.

Some searching suggests that mprotect may trigger a TLB shootdown on all threads that have the address mapped when a permission is removed, which could provide the guarantee described here (or, put differently, providing this guarantee seems to be the purpose of such an operation), although it is not clear to me whether future optimizations of the kernel code could break this guarantee.

Ref: the LKML post where I asked about this a week ago; I am still waiting for a response.

Edit: clarifying the question. I knew that a TLB shootdown should provide the guarantee I was looking for, but I would like to know whether it can be relied upon. In other words, why does the kernel issue such requests, given that they would not be needed if not to provide some sort of ordering guarantee?

c garbage-collection multithreading virtual-memory

yuyichao 01 May '17 at 16:18

1 answer

yuyichao · Accepted Answer · 2017-05-04T01:34:19+0000

So I asked about this on the mechanical-sympathy group a day after posting here and got a response from Gil Tene. With his permission, here is my summary of his answers. The full thread is available here in case anything I have not included is unclear.

For the general behavior one can expect from the OS (as in "it would be surprising if an OS did not meet this"):

The call to mprotect() is fully ordered with respect to loads and stores that happen before and after the call. This is usually trivially achieved at the CPU and OS level, since mprotect is a system call that involves a trap, which in turn involves full ordering. [In unusual implementations without a ring transition (e.g. code running in the kernel), the protect call would presumably be responsible for emulating this ordering assumption.]

The mprotect call will not return until the protection request has semantically taken hold across the whole process. If a call to mprotect() sets a protection that would cause a fault, any operation on any thread that happens after this call to mprotect() is required to fault. Likewise, if a call to mprotect() sets a protection that would prevent a fault, any operation on any thread that happens after this call to mprotect() is required to NOT fault.
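The second point can be illustrated with a minimal single-threaded sketch, assuming Linux (where a protection violation is reported as SIGSEGV). The helpers `probe_read`, `on_segv` and `protection_takes_effect` are hypothetical names, and recovering from the fault with `siglongjmp` is a testing trick rather than anything the thread prescribes. It only shows that a load issued after mprotect() returns faults, or does not fault, according to the new protection.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* SIGSEGV handler: jump back to the probe instead of re-running the load. */
static void on_segv(int sig) {
    (void)sig;
    siglongjmp(fault_env, 1);
}

/* Returns 1 if reading one byte at p faults, 0 if the read succeeds. */
static int probe_read(const void *p) {
    if (sigsetjmp(fault_env, 1) != 0)
        return 1;                        /* arrived here from on_segv */
    (void)*(const volatile char *)p;     /* the load under test */
    return 0;
}

/* Returns 1 if each protection change was in effect right after
 * mprotect() returned, 0 if not, and -1 if setup failed. */
static int protection_takes_effect(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;

    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, pagesz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);

    int before = probe_read(p);                      /* readable: no fault */
    if (mprotect(p, pagesz, PROT_NONE) != 0) return -1;
    int after_none = probe_read(p);                  /* must fault */
    if (mprotect(p, pagesz, PROT_READ) != 0) return -1;
    int after_read = probe_read(p);                  /* must not fault */

    munmap(p, pagesz);
    return before == 0 && after_none == 1 && after_read == 0;
}
```

`protection_takes_effect()` says nothing about other threads; it only checks the calling thread's view of the page.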

This essentially means that memory operations on the affected pages on other threads are synchronized with the thread calling mprotect. In particular, both of the cases mentioned in the original question can be expected to be guaranteed. I.e.

If a load on one thread on the affected pages is observed to fault because of the call to mprotect, the fault happens after the call to mprotect() and therefore happens after, and can observe, all memory operations that happened before the mprotect.
If a load on one thread on the affected page is observed to not fault despite the call to mprotect, the load happens before the call to mprotect, so the mprotect call and any code after it happen after the load and will be able to observe any memory operations that happened before the load.
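The first of these two cases can be sketched as follows, assuming Linux with pthreads. The names `reader`, `payload`, `page` and `fault_sees_prior_store` are invented for this example, and `payload` is an `atomic_int` with relaxed accesses so that the program has no formal data race while adding no hardware ordering of its own. The reader spins on the page until a load faults, then reads the value the main thread stored before calling mprotect().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;   /* only the reader thread uses it */
static char *page;             /* page whose read permission gets revoked */
static atomic_int payload;     /* stored by the main thread before mprotect */

/* SIGSEGV handler: leave the spinning loop in reader(). */
static void on_segv(int sig) {
    (void)sig;
    siglongjmp(fault_env, 1);
}

/* Reader: load from the page until a load faults, then record payload. */
static void *reader(void *arg) {
    int *seen = arg;
    if (sigsetjmp(fault_env, 1) == 0) {
        for (;;)
            (void)*(volatile char *)page;
    }
    /* Reached only after the fault caused by the mprotect call. */
    *seen = atomic_load_explicit(&payload, memory_order_relaxed);
    return NULL;
}

/* Returns the payload value the reader saw after faulting, or -1 on
 * setup failure. */
static int fault_sees_prior_store(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;

    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    page = mmap(NULL, pagesz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(page != MAP_FAILED);

    int seen = -1;
    pthread_t t;
    atomic_store_explicit(&payload, 0, memory_order_relaxed);
    if (pthread_create(&t, NULL, reader, &seen) != 0) return -1;

    /* Store first, then revoke access: the fault is ordered after both. */
    atomic_store_explicit(&payload, 42, memory_order_relaxed);
    int rc = mprotect(page, pagesz, PROT_NONE);
    assert(rc == 0);
    (void)rc;

    pthread_join(t, NULL);
    munmap(page, pagesz);
    return seen;
}
```

Under the ordering described above, `fault_sees_prior_store()` is expected to return the stored value on every run, since the reader cannot fault before mprotect() has changed the protection.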

It was also pointed out that transitivity might not hold, i.e. a faulting load on one thread is not necessarily ordered after a non-faulting load on another thread. This can (effectively) be caused by the non-atomicity of the TLB flush, which lets different threads / CPUs observe the change in access permission at different times.
