How do I force a program to use unaligned addresses?

Question

I have heard that reads and writes of aligned ints are atomic and safe. I wonder when the system makes non-malloc'd globals unaligned, other than with packed structs and with casts or pointer arithmetic on byte buffers?

[x86-64 Linux] In all my normal cases, the system always chooses integer locations that do not get torn across words, for example two bytes in one word and the other two bytes in the next word. Can anyone post a program or snippet (C or assembly) that forces a global variable to an unaligned address, so that the integer gets torn and the system has to use two reads to load one integer value?

When I run the program below, the addresses are close to each other, so that several variables fall within 64 bits, but word tearing is never seen (smartness in the system or the compiler?).

#include <stdio.h>
#include <stdint.h>

int a;
char b;
char c;
int d;
int e = 0;

/* Returns 1 if the address p is a multiple of N, 0 otherwise. */
int isaligned(void *p, int N)
{
    return ((uintptr_t)p % N) == 0;
}

int main(void)
{
    printf("processor is %zu byte mode \n", sizeof(int *));
    printf("a=%p/b=%p/c=%p/d=%p/f=%p\n",
           (void *)&a, (void *)&b, (void *)&c, (void *)&d, (void *)&e);

    printf(" check for 64bit alignment of test result of 0x80 = %d \n", isaligned((void *)0x80, 64));
    printf(" check for 64bit alignment of a result = %d \n", isaligned(&a, 64));
    printf(" check for 64bit alignment of d  result = %d \n", isaligned(&d, 64));

    return 0;
}

Output:

processor is 8 byte mode 
a=0x601038/b=0x60103c/c=0x60103d/d=0x601034/f=0x601030
 check for 64bit alignment of test result of 0x80 = 1 
 check for 64bit alignment of a result = 0 
 check for 64bit alignment of d  result = 0

How does a read of a char happen in the above case? Does the processor read from the 8-byte aligned boundary (in my case 0x601030) and then go to 0x60103c?
Memory access granularity is always the word size, right?

Thanks.

+3

c multithreading linux

resultsway Jan 31. 13 at 22:52

5 answers

Outside of those cases (packed structs, casts, and pointer arithmetic), the system probably never does that.

In assembly, forcing an unaligned global is trivial. Something like:

     .org 0x2
myglobal:
     .word SOME_NUMBER

But on an Intel processor, the processor can safely read unaligned memory. The access may not be atomic, but that may not be apparent from the generated code.

Intel, right? The Intel ISA has single-byte read/write opcodes. Disassemble your program and see what it uses.
As for granularity: not necessarily. You may have a mismatch between the memory word size and the processor word size.

+2

Carl norum Jan 31. 13:03 pm

1) This answer is platform specific. In general, however, the compiler will align variables unless you specifically tell it to do otherwise.

2) The following takes two reads to load one variable when run on a 32-bit CPU:

uint64_t huge_variable;

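The variable is larger than a register, so multiple operations are required to access it. You can also do something similar using packed structs (GCC/Clang syntax; the attribute goes right after the struct keyword or after the closing brace):

struct __attribute__ ((packed)) unaligned
{
    char buffer[2];
    int  unaligned;    /* starts at byte offset 2, so it is not 4-byte aligned */
    char buffer2[2];
} sample_struct;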

3) This answer is platform specific. Some platforms may behave as you describe. Some platforms have instructions to fetch half-register or quarter-register data. I recommend examining the assembly emitted by your compiler for more details (make sure you turn off all compiler optimizations first).

4) The C language allows you to access memory with byte-level granularity. How this is implemented under the hood, and how much data your processor fetches to read a single byte, depends on the platform. For many processors, this is the same as the size of a general-purpose register.
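
As a minimal sketch of the packed-struct idea in point 2), assuming a GCC- or Clang-compatible compiler and a 4-byte int (the struct and function names here are hypothetical, not from the answer), the following compares where the int member lands with and without packing:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* GCC/Clang extension: packed removes the padding in front of `value`. */
struct __attribute__((packed)) packed_sample {
    char pad[2];
    int  value;       /* lands at byte offset 2 */
    char tail[2];
};

/* Same members without packing, for comparison. */
struct plain_sample {
    char pad[2];
    int  value;       /* the compiler inserts padding so this is aligned */
    char tail[2];
};

/* Compile-time check of the packed layout. */
static_assert(offsetof(struct packed_sample, value) == 2,
              "packed layout should leave no padding before value");

/* Byte offset of `value` inside each layout. */
size_t packed_value_offset(void) { return offsetof(struct packed_sample, value); }
size_t plain_value_offset(void)  { return offsetof(struct plain_sample, value); }
```

Reading `value` through the packed struct itself is handled by the compiler; taking its address and dereferencing it as a plain `int *` is where misaligned access can occur.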

+2

bta Jan 31. 13:05 pm

The C standards guarantee that malloc(3) returns a memory region that satisfies the strictest alignment requirements, so this simply cannot happen in that case. If there is unaligned data, it is probably read and written in pieces (this depends on the exact guarantees the architecture provides).
On some architectures unaligned access is allowed; on others it is a fatal error. When allowed, it is usually much slower than aligned access; when not allowed, the compiler has to fetch the pieces and splice them together, which is slower still.
Characters (really bytes) are normally allowed to have any byte address. Byte instructions just load or store the single byte in that case.
No, memory access depends on the data width. But real memory access happens in terms of cache lines (read up on CPU caches for this).
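
To illustrate what "fetch the pieces and splice them together" can look like, here is a hedged sketch with two hypothetical helpers. The first uses memcpy, which has no alignment requirement; the second assembles the value byte by byte and assumes the bytes are stored in little-endian order. Neither is claimed to be what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from any byte address. memcpy works on bytes,
   so the source pointer does not need to be aligned. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Explicit splice: four single-byte loads combined with shifts.
   Assumption: the value is stored least significant byte first. */
uint32_t load_u32_le_bytes(const unsigned char *p)
{
    assert(p != NULL);
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Both helpers can be pointed at an odd offset into a char buffer, for example `load_u32(buf + 1)`, without ever forming a misaligned `uint32_t *`.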

0

vonbrand 01 Feb 13 at 0:06

Unaligned objects can never come into existence unless you invoke undefined behavior. In other words, there is no sequence of actions, all with well-defined behavior, that a program can perform to end up with an unaligned pointer. In particular, there is no portable way to force the compiler to give you misaligned objects. The closest thing is the "packed structure" that many compilers offer, but that only applies to structure members, not to independent objects.

Also, there is no way to test alignment in portable C. You can use the implementation-defined conversions from pointers to integers and inspect the low bits, but there is no fundamental requirement that "aligned" pointers have zeros in the low bits, or that the low bits after conversion to an integer even correspond to the "least significant" bits of the pointer, whatever that would mean. In other words, conversions between pointers and integers are not required to commute with arithmetic operations.

If you really want to make some misaligned pointers, the easiest way to do it, assuming alignof(int)>1, is something like:

char buf[2*sizeof(int)+1];
int *p1 = (int *)buf, *p2 = (int *)(buf+sizeof(int)+1);

It is not possible for both buf and buf+sizeof(int)+1 to be simultaneously aligned for int if alignof(int) is greater than 1. Thus, at least one of the two (int *) casts is applied to a misaligned pointer, which invokes undefined behavior, and the typical result is a misaligned pointer.
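
The argument can be checked without performing either (int *) cast. The sketch below uses hypothetical helper names and relies on the implementation-defined pointer-to-integer conversion, assuming a flat address space where that conversion preserves byte offsets (true on typical x86-64 Linux, not guaranteed by the C standard):

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Remainder of an address modulo alignof(int).
   Assumption: converting a pointer to uintptr_t yields a flat byte address. */
unsigned addr_mod_int_align(const void *p)
{
    return (unsigned)((uintptr_t)p % alignof(int));
}

/* Returns nonzero if at least one of buf and buf + sizeof(int) + 1
   is misaligned for int. No (int *) cast is performed here.
   buf must point to at least sizeof(int) + 2 bytes. */
int one_is_misaligned(const char *buf)
{
    assert(buf != NULL);
    return addr_mod_int_align(buf) != 0
        || addr_mod_int_align(buf + sizeof(int) + 1) != 0;
}
```

Because sizeof(int) is a multiple of alignof(int), the two addresses differ by 1 modulo alignof(int), so their remainders cannot both be zero when alignof(int) is greater than 1.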

0

R .. 01 Feb 13 at 0:56

Mats Petersson · Accepted Answer · 2013-02-01T00:01:49+0000

1) Yes, there is no guarantee that unaligned accesses are atomic, because [at least sometimes, on certain types of processors] the data can be written as two separate writes - for example, if you cross a memory page boundary [I'm not talking about 4 KB pages for virtual memory, I'm talking about DDR2/3/4 pages, which are some fraction of the total memory size, typically 16 Kbit times the width of the actual memory chip, which varies depending on the memory stick itself]. Likewise, on processors other than x86, you may get a trap for reading unaligned memory, which will either abort the program or cause the read to be emulated in software as multiple reads to "fix" the unaligned read.

2) You can always make an unaligned memory region with something like this:

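/* one extra element of room so the 2-byte offset does not run past the end */
char *ptr = malloc(sizeof(long long) * (number + 1));
long long *unaligned = (long long *)&ptr[2];   /* deliberately misaligned */

for (i = 0; i < number; i++)
   temp = unaligned[i];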

By the way, your alignment check tests whether the address is aligned to 64 bytes, not 64 bits. You need to divide by 8 to check that it is 64-bit aligned.
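
A minimal sketch of the corrected check, using hypothetical function names and assuming 8-bit bytes, keeps the two units apart so that a 64-bit check really divides by 8:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of `bytes`, 0 otherwise. */
int is_aligned_bytes(uintptr_t addr, size_t bytes)
{
    assert(bytes != 0);
    return addr % bytes == 0;
}

/* Same check with the alignment given in bits.
   Assumption: a byte is 8 bits wide. */
int is_aligned_bits(uintptr_t addr, size_t bits)
{
    assert(bits != 0 && bits % 8 == 0);
    return is_aligned_bytes(addr, bits / 8);
}
```

With the addresses printed in the question, `is_aligned_bits(0x601038, 64)` returns 1 (the address of a is 64-bit aligned), while `is_aligned_bytes(0x601038, 64)` returns 0, which is what the original program reported.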

3) A char is a single-byte read, and the address is the actual address of the byte itself. The actual memory read is probably for a full cache line, starting at the target address and then wrapping around, like this:

0x60103d is the target address, so the processor will read a 32-byte cache line starting at the 64-bit word we want: 0x601038 (as soon as that has arrived, the processor moves on to the next instruction, while the following reads fill the rest of the cacheline), then the cacheline is filled with 0x601020, 0x601028, 0x601030. If, however, the cache is disabled [if you want your 3GHz x86 processor to be slightly slower than a 66MHz 486, disabling the cache is a good way to achieve this], the processor will read only the one byte at 0x60103d.

4) Not on x86 processors; they are byte-addressable. But for normal memory, reads are done on a cacheline basis, as explained above.
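
The fill order described in point 3) can be reproduced with a little address arithmetic. This is only a sketch under the answer's own assumptions (a 32-byte cache line fetched as 8-byte words, wanted word first, then wrapping around); the names are hypothetical and real processors may use other line sizes and fill policies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    LINE_BYTES     = 32,                       /* cache line size assumed in the answer */
    WORD_BYTES     = 8,                        /* one 64-bit transfer */
    WORDS_PER_LINE = LINE_BYTES / WORD_BYTES
};

/* Writes the word addresses of the cache line containing `target`
   into out[], in fetch order: the wanted word first, then wrapping
   around within the same line. */
void fill_order(uintptr_t target, uintptr_t out[WORDS_PER_LINE])
{
    uintptr_t line   = target & ~(uintptr_t)(LINE_BYTES - 1);  /* start of the line */
    uintptr_t first  = target & ~(uintptr_t)(WORD_BYTES - 1);  /* word holding target */
    uintptr_t offset = first - line;

    assert(out != NULL);
    for (int i = 0; i < WORDS_PER_LINE; i++)
        out[i] = line + (offset + (uintptr_t)i * WORD_BYTES) % LINE_BYTES;
}
```

For the target 0x60103d this yields 0x601038, 0x601020, 0x601028, 0x601030, matching the sequence in the answer.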

Note also that "may not be atomic" is not at all the same as "will not be atomic" - so you will probably have a hard time making it go wrong on purpose. You really need to get the timing of two different threads just right, and straddle cachelines, straddle memory page boundaries, and so on to make it go wrong. It will happen when you don't want it to happen, but trying to make it go wrong can be very hard [trust me, I've been there, done that].
