Storing literal constants in C++

I would like to know where literal constants are stored in memory.

Example:

int i = 5;
char* data = (char*)&("abcdefgh");

      

The storage sections of i and data depend on where they are declared. But does the compiler store 5 and "abcdefgh" somewhere before copying them into the variables?

And here I can get the address where "abcdefgh" is stored, but why can't I get the address of 5?

+3




3 answers


Integer literals such as 5 can be part of machine instructions. For example:

 LD A, 5

      



will load the value 5 into processor register A on some imaginary architecture. Since the 5 is actually part of the instruction, it has no address. Few (if any) architectures can embed string literals in machine instructions, so they must be stored elsewhere in memory and accessed via pointers. Exactly where "elsewhere" is, the C++ standard does not specify.

+8




At the language level, string literals and numeric literals are different beasts.

The C and C++ standards essentially say that a string literal is treated as if you had defined a constant character array of the appropriate size and content and then used its name in place of the literal. In other words, when you write

const char *foo = "hello";

      

it is as if you had written

// in global scope
const char hello_literal[6] = {'h', 'e', 'l', 'l', 'o', '\0'};

...
const char *foo = hello_literal;

      

(There are backward-compatibility exceptions that allow writing char *foo = "hello"; without const, but this is deprecated, and no longer valid as of C++11. Writing through such a pointer is undefined behavior anyway.)

Given this equivalence, it is natural that you can take the address of a string literal. Integer literals, on the other hand, are rvalues, and the standard says you cannot take their address. You can roughly think of them as values that the standard does not expect to have backing storage in the usual sense.
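To make the "as if it were a const array" point concrete, here is a minimal sketch. The names HelloArray, address_of_hello and check_hello_literal are made up for illustration; the checks only rely on the literal being an lvalue array of 6 const char.

```cpp
#include <cassert>
#include <type_traits>

// "hello" has 5 characters plus the terminating '\0'.
static_assert(sizeof("hello") == 6, "5 characters plus the terminator");

// The literal is an lvalue of array type, not a pointer.
static_assert(std::is_same<decltype("hello"), const char (&)[6]>::value,
              "string literal is an lvalue of type const char[6]");

// Hypothetical alias and helper, introduced only for this example.
using HelloArray = const char[6];

HelloArray* address_of_hello() {
    return &"hello";   // fine: the array object has storage, so it has an address
    // return &5;      // would not compile: 5 is not an lvalue
}

// Reads the array back through the pointer obtained above.
void check_hello_literal() {
    HelloArray* p = address_of_hello();
    assert((*p)[0] == 'h');
    assert((*p)[5] == '\0');
}
```

Where the array actually lives (read-only section or elsewhere) is still up to the implementation; the sketch only shows that the language lets you take its address.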




This difference stems from the fact that, at the machine level, the two are usually implemented differently.

A string literal is usually stored as data somewhere in memory, typically in a read-only data section that is mapped into memory directly from the executable. When the compiler needs its address, that is easy to provide: the literal is data that already sits in memory, so it has an address.

Instead, when you do something like

int a = 5;

      

the 5 doesn't really have a separate memory location the way an array like "hello" does; it is usually embedded in the machine code as an immediate value.

Having a pointer to it would be tricky: the pointer would point into the middle of an instruction, and generally at data in a different format from what is expected of a regular int variable. Think of x86, where small numbers use more compact encodings, or of PowerPC, ARM and other RISC architectures, where some values are built from a small immediate that is implicitly shifted or rotated and some values cannot be encoded as an immediate at all, so they have to be composed from multiple instructions. Or think of Harvard architectures, where data and code live in different address spaces.

For this reason, you cannot take the address of numeric literals (or of numeric expressions, evaluation results, and many other temporaries). If you want the address of a number, you must first assign it to a variable (which can provide storage in memory) and then take the address of that variable.
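A small sketch of that last point, assuming nothing beyond standard C++ (the function name increment_through_pointer is made up for the example):

```cpp
#include <cassert>

// Hypothetical helper: gives 5 a home in memory, then modifies it via a pointer.
int increment_through_pointer() {
    int a = 5;        // the literal 5 is typically an immediate in the generated code
    int* p = &a;      // a is a variable with storage, so it has an address
    // int* q = &5;   // would not compile: & requires an lvalue operand
    *p += 1;          // writes to a through its address
    assert(a == 6);
    return a;
}
```

Note that an optimizing compiler may still keep a in a register when its address never escapes; the language only guarantees that &a behaves as the address of an int.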

+6




While the C and C++ standards don't define where literals are stored, common practice is to store them in one of two places: in the code (see @NeilButterworth's answer) or in a "constants" segment.

Typical executables have a code section and a data section. The data section can be divided into read-only, uninitialized read/write, and initialized read/write parts. Literals are often placed in the read-only part of the executable.

Some tools can also put literals in a separate data file. This data file can be used to program data in read-only memory devices (ROM, PROM, Flash, etc.).

Thus, the placement of literals is implementation-dependent. The C and C++ standards state that writing to the storage of a string literal is undefined behavior. The preferred practice with string literals is to declare the pointer variable as const, so that the compiler can issue warnings or errors on any attempt to write to the literal.
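As a rough illustration of that practice (the function name edit_copy_not_literal is hypothetical): keep pointers to literals const, and if you need something writable, initialize a char array from the literal, which makes a copy.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: edits a writable copy and leaves the literal alone.
bool edit_copy_not_literal() {
    const char* literal = "hello";  // points at the literal's (possibly read-only) storage
    // literal[0] = 'H';            // rejected at compile time because of const

    char copy[] = "hello";          // array initialized from the literal: a separate, writable copy
    copy[0] = 'H';                  // fine: only the copy changes

    assert(std::strcmp(literal, "hello") == 0);
    return std::strcmp(copy, "Hello") == 0;
}
```

Without the const, the commented-out write might compile on older compilers and then crash or misbehave at run time, depending on where the implementation placed the literal.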

+3








