Data exchange between processor and memory

I'm a newbie programmer, but I want to understand things a little deeper. I've done some research and read quite a lot of text, but I still have a few things to understand.

Take this basic piece of code (in C):

int myNumber;
myNumber = 3;
printf("Here my number: %d", myNumber);

      

I found out that (mostly on a 32 bit CPU) an integer is 32 bits = 4 bytes. So on the first line of my code, the processor goes into memory. The memory is byte-addressable, so the CPU fetches 4 contiguous bytes for my variable and stores the address for the first (or last) byte.

In the second line of my code, the CPU uses its stored address in the myNumber variable, navigates to that address in memory, and finds the 32 bits of reserved space there. Now its task is to store the number "3" there, so it fills these four bytes with the sequence 00000000-00000000-00000000-00000011. On the third line, it does the same: the processor goes to that address in memory and loads the number stored at that address.

(First question - did I get it right?)

Here is what I don't understand: The size of this address (a pointer to this variable) is 4 bytes on a 32-bit CPU. (This is why a 32-bit processor can use a maximum of 4 GB of memory - because there are only 2 ^ 32 different addresses of binary length 32)
Now where does the CPU store these addresses? Does it have some sort of memory or cache of its own to store this? And why does it store a 32-bit address for a 32-bit integer? Wouldn't it be better to just store the actual number in its cache rather than a pointer to it, when the sizes are the same?

And the last one: if it stores the addresses for all these integers somewhere in its cache, and the lengths are the same (4 bytes), it will need exactly as much space to store the addresses as to store the actual variables. But variables can take up to 4 GB of space, so the processor must have 4 GB of its own space to store the addresses of these variables. And that sounds strange.

Thank you for your help!
I'm trying to figure it out, but it's so hard... :-[

+3




4 answers


(First question - did I get it right?)

The first thing to know is that the value may not be stored in main memory at all. The compiler may decide to keep it in a register instead, since that is faster. [1]

The memory is byte-addressable, so the CPU fetches 4 contiguous bytes for my variable and stores the address for the first (or last) byte.

Assuming the compiler chose to store it in main memory, then yes, on a 32-bit machine, int is usually 4 bytes, so 4 bytes will be allocated for storage.

The size of this address (a pointer to this variable) is 4 bytes on a 32-bit CPU. (This is why a 32-bit processor can use a maximum of 4 GB of memory - because there are only 2 ^ 32 different addresses of binary length 32)

Note that the width of int and the width of a pointer do not have to be the same, so there is no necessary connection between the size of int and the size of the address space.
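A quick way to check this on a given machine is to print both widths. The sketch below is only an illustration: the helper names are made up here, and the results depend on the compiler and platform (on many 64-bit systems, for example, int is 4 bytes while a pointer is 8).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers: report the two widths for the current platform. */
size_t int_width_bytes(void)
{
    return sizeof(int);
}

size_t pointer_width_bytes(void)
{
    return sizeof(void *);
}

/* Prints both widths; the numbers differ between platforms. */
void print_widths(void)
{
    printf("int:     %zu bytes\n", int_width_bytes());
    printf("pointer: %zu bytes\n", pointer_width_bytes());
}
```

Calling print_widths() from main on two different targets is usually enough to see that the two numbers are chosen independently.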

Now where does the CPU store these addresses?

In the case of local variables, the address is effectively hard-coded into the executable itself, usually as an offset from the stack pointer.



In the case of dynamically allocated objects (i.e. stuff that was malloc-ed), the programmer usually maintains an appropriate pointer variable (otherwise a memory leak has occurred!). This pointer can itself be dynamically allocated (in the case of a complex data structure), but if you go back far enough, you will eventually reach something that is a local variable. In this case, the above rule applies.
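As a rough illustration of that chain, here is a minimal sketch with a hypothetical two-node list. The only pointer that is a local variable is head; the pointer to the second node lives inside the first node, which is itself on the heap.

```c
#include <stdlib.h>

/* Hypothetical node type, used only for this illustration. */
struct node {
    int value;
    struct node *next;   /* this pointer is stored in heap memory */
};

/* Builds a two-node chain, walks it starting from a local pointer,
   frees it, and returns the sum of the stored values.
   Returns 0 if an allocation fails. */
int sum_two_node_chain(int first, int second)
{
    struct node *head = malloc(sizeof *head);   /* 'head' is a local variable */
    if (head == NULL)
        return 0;
    head->value = first;

    head->next = malloc(sizeof *head->next);
    if (head->next == NULL) {
        free(head);
        return 0;
    }
    head->next->value = second;
    head->next->next = NULL;

    /* Every node is reached by starting from the local variable. */
    int sum = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        sum += p->value;

    /* Release the nodes; losing 'head' before this point would leak them. */
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
    return sum;
}
```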

But variables can take up to 4 GB of space, so the processor must have 4 GB of its own space to store the addresses of these variables.

If your program consists of independently malloc-ing millions and millions of ints, then yes, you will end up needing as much memory for the pointers as for the values themselves. But most programs don't look like this. You usually allocate much larger objects (like an array or a large structure).
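The difference can be put into numbers. The sketch below uses made-up helper names and counts only the pointers the program itself has to keep; it ignores whatever bookkeeping the allocator adds per allocation, which varies by implementation.

```c
#include <stddef.h>

/* Bytes of actual int data for n values. */
size_t payload_bytes(size_t n)
{
    return n * sizeof(int);
}

/* Pointer storage needed if every int gets its own malloc call:
   one pointer per value. */
size_t pointer_bytes_separate(size_t n)
{
    return n * sizeof(int *);
}

/* Pointer storage needed if all n ints live in one array:
   a single pointer to the first element, whatever n is. */
size_t pointer_bytes_array(size_t n)
{
    (void)n;   /* the count does not matter here */
    return sizeof(int *);
}
```

With one allocation per int, the pointer storage grows in step with the data; with a single array it stays constant.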

cache

The specifics of where things are stored depend on the architecture. On modern x86, there are usually two or three levels of cache between the processor and main memory. But the cache is not independently addressable; the CPU can't decide to store an int in cache instead of main memory. Rather, the cache is effectively a redundant copy of a subset of main memory.

Another thing to consider is that the compiler will usually deal with virtual addresses when allocating storage for objects. On modern x86, these are mapped to physical addresses (that is, addresses that correspond to actual bytes of main memory) using specialized hardware along with OS support.


[1] In addition, the compiler may optimize the variable away entirely.
+5




In the second line of my code, the CPU uses its stored address in the myNumber variable, navigates to that address in memory, and finds the 32 bits of reserved space there.

Almost correct. Memory is mostly unstructured. The CPU cannot see that there are 32 bits of "reserved space". But the CPU was instructed to read 32 bits of data, so it reads 32 bits of data starting at the specified address. Then you just have to hope / assume that those 32 bits actually contain something meaningful.
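To make this concrete, here is a small sketch that treats memory as a plain array of bytes. The function names are hypothetical; memcpy stands in for the CPU's 32-bit store and load instructions. Nothing in the byte array marks where a value starts, so reading from the wrong address simply yields a different number.

```c
#include <stdint.h>
#include <string.h>

/* Writes 32 bits of data into the 4 bytes starting at 'address'. */
void store_u32(unsigned char *address, uint32_t value)
{
    memcpy(address, &value, sizeof value);
}

/* Reads 32 bits of data from the 4 bytes starting at 'address'.
   It cannot tell whether a meaningful value was ever stored there. */
uint32_t load_u32(const unsigned char *address)
{
    uint32_t value;
    memcpy(&value, address, sizeof value);
    return value;
}
```

For example, after store_u32(memory, 3) on a zeroed buffer, load_u32(memory) gives back 3, while load_u32(memory + 1) gives some other value that depends on the byte order of the machine.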

Now where does the CPU store these addresses? Does it have some sort of memory or cache of its own to store this? And why does it store a 32-bit address for a 32-bit integer? Wouldn't it be better to just store the actual number in its cache rather than a pointer to it, when the sizes are the same?



A processor has a small number of registers that it can use to store data (typical CPUs have 8, 16, or 32 registers, so they can only hold the few variables you are working with right now). So to answer the last part, yes, the compiler certainly could (and probably will) generate code to just store your int in a register instead of storing it in memory and telling the CPU to load it from a specified address. As for the other part of the question: Ultimately, every part of the program is stored in memory. Some of it is a stream of instructions, and some of it is chunks of data scattered across memory.

There are several tricks to help find the data the CPU needs. A portion of the program's memory contains the stack, which usually stores local variables while they are in scope. The CPU always maintains a pointer to the top of the stack in one of its registers, so it can easily locate data on the stack simply by adding a fixed offset to the stack pointer. Instructions can directly contain offsets like this, so to handle your int, the compiler could, for example, generate code that writes the int to the top of the stack when you enter a function, and then, when you need to access that variable, code that reads the data found at the address pointed to by the stack pointer, plus the small offset needed to find your variable.
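The stack-pointer-plus-offset idea can be modelled in a few lines. This is a toy simulation, not what a compiler actually emits: the struct, the function names, the 64-byte memory, and the downward-growing stack are all assumptions made for the illustration, and there are no bounds checks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 64u

/* Toy machine: a block of bytes plus one stack-pointer "register". */
struct toy_machine {
    unsigned char memory[STACK_SIZE];
    size_t sp;   /* index of the current top of the stack */
};

void machine_init(struct toy_machine *m)
{
    memset(m->memory, 0, sizeof m->memory);
    m->sp = STACK_SIZE;   /* empty stack; it grows toward lower indexes */
}

/* Function entry: reserve room for locals by moving the stack pointer. */
void enter_function(struct toy_machine *m, size_t frame_bytes)
{
    m->sp -= frame_bytes;
}

/* Function exit: give the space back. */
void leave_function(struct toy_machine *m, size_t frame_bytes)
{
    m->sp += frame_bytes;
}

/* Models "store 32 bits at [sp + offset]"; the offset is a constant
   that would be baked into the instruction. */
void store_local(struct toy_machine *m, size_t offset, int32_t value)
{
    memcpy(&m->memory[m->sp + offset], &value, sizeof value);
}

/* Models "load 32 bits from [sp + offset]". */
int32_t load_local(const struct toy_machine *m, size_t offset)
{
    int32_t value;
    memcpy(&value, &m->memory[m->sp + offset], sizeof value);
    return value;
}
```

Note that no address for the variable is stored anywhere as data: the only state is the stack pointer, and the offset is a fixed number that is part of the code.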

+3




And also remember that the addresses your program sees are rarely (if ever) physical addresses counting from "memory start" or 0. They are basically offsets into a block of memory; the memory manager knows the real address and accesses the actual data store via base + offset.
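A minimal sketch of the base + offset idea follows. It is a deliberately simplified model with hypothetical names; real systems such as modern x86 translate addresses through page tables rather than a single base per program.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the block of physical memory
   handed to one program. */
struct mapping {
    uint64_t base;   /* physical address where the block starts */
    size_t size;     /* size of the block in bytes */
};

/* Translates an address as the program sees it (an offset into its
   block) into a physical address. Returns false if the offset lies
   outside the block. */
bool translate(const struct mapping *map, size_t offset, uint64_t *physical)
{
    if (offset >= map->size)
        return false;
    *physical = map->base + offset;
    return true;
}
```

Two programs can both use the address 0x20 and still end up at different physical locations, because each has its own base.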

And we need memory, since caches are limited ;-)

Mario

0




Inside the CPU, there is one register that contains the address of the next instruction to be executed. The instructions themselves contain information about where the variable is located. If the variable is optimized, the instruction may point to a register, but in general the instruction will have the address of the variable being accessed. Your code, once compiled and loaded into memory, has everything built in! I recommend looking at assembly language to better understand all of this. Good luck!
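Short of reading real assembly, a toy interpreter can show the same idea. Everything here is invented for the illustration (the three opcodes, the struct layout, the single accumulator register); the point is only that the program counter selects the next instruction and each instruction carries the address of the variable it touches.

```c
#include <stddef.h>

/* Made-up instruction set for the illustration. */
enum opcode { OP_STORE_CONST, OP_LOAD, OP_HALT };

struct instruction {
    enum opcode op;
    size_t address;   /* where in data memory the variable lives */
    int constant;     /* value used by OP_STORE_CONST */
};

struct toy_cpu {
    size_t pc;        /* program counter: index of the next instruction */
    int accumulator;  /* the single working register */
};

/* Executes 'program' until OP_HALT and returns the accumulator.
   The program must end with OP_HALT and use valid addresses. */
int run(const struct instruction *program, int *data_memory)
{
    struct toy_cpu cpu = { 0, 0 };
    for (;;) {
        struct instruction in = program[cpu.pc];
        cpu.pc++;   /* point at the following instruction */
        switch (in.op) {
        case OP_STORE_CONST:
            data_memory[in.address] = in.constant;    /* like myNumber = 3; */
            break;
        case OP_LOAD:
            cpu.accumulator = data_memory[in.address];
            break;
        case OP_HALT:
            return cpu.accumulator;
        }
    }
}
```

A program of three instructions, { OP_STORE_CONST, 2, 3 }, { OP_LOAD, 2, 0 }, { OP_HALT, 0, 0 }, stores 3 at data address 2 and reads it back. The address 2 appears only inside the instructions, never in a separate table of addresses.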

0

