C array initialization - some basics

I read in a book that a string literal such as "blabla" means there is a hidden char array, that the expression yields the address of its first element, and that the array behaves like a const array.

This confuses me about 2 scenarios:

  • char a[7] = "blabla" should not be possible: if "blabla" yields the address of the first element of the array, how can that address end up in a instead of the actual elements?

  • The book says that "blabla" means a const char array, which would mean I cannot change a at all (which is not correct).

I think that something really basic here is not clear to me.

+3




4 answers


According to the C Standard (6.3.2.1 Lvalues, arrays, and function designators)

3 Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

So, in this declaration

char a[7] = "blabla";

      

the elements of the string literal, which has the character array type char[7] because it includes the terminating null character, are used to initialize the elements of the character array a.

In fact, this declaration is equivalent to the declaration

char a[7] = { 'b', 'l', 'a', 'b', 'l', 'a', '\0' };
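If you want to convince yourself of this equivalence, a minimal sketch along these lines should do. The function name same_initialization is made up for illustration; it simply compares the two arrays byte by byte.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if both forms of initialization give identical arrays. */
int same_initialization(void)
{
    char a[7] = "blabla";
    char b[7] = { 'b', 'l', 'a', 'b', 'l', 'a', '\0' };

    assert(sizeof a == sizeof b);           /* both occupy 7 bytes */
    return memcmp(a, b, sizeof a) == 0;     /* byte-for-byte equal */
}
```

Calling same_initialization() from any program should return 1.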

      

Note that in C string literals have non-constant character array types. Nevertheless, the literals themselves may not be modified.

From the C standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

So you can write for example



char *s = "blabla";

      

In this case, according to the first quotation from the C Standard, the string literal is converted to a pointer to its first element, and the value of that pointer is assigned to the variable s.

That is, an unnamed character array is created in static memory, and the address of its first element is assigned to the pointer s. You may not use the pointer to modify the literal; that is, you may not write, for example

char *s = "blabla";
s[0] = 'B';
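The conversion rule from the first quotation can be observed with sizeof. The following is only a sketch (the function names are hypothetical): as the operand of sizeof the literal stays an array, while in an expression such as "blabla" + 0 it has already been converted to a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Operand of sizeof: no conversion, so this is the size of the whole
   char[7] array (six letters plus the terminating null character). */
size_t literal_size(void)
{
    return sizeof "blabla";
}

/* In "blabla" + 0 the literal is first converted to char *, so sizeof
   sees a pointer, not an array. */
size_t converted_size(void)
{
    return sizeof("blabla" + 0);
}

/* Aborts if either observation does not hold on this implementation. */
void check_conversion(void)
{
    assert(literal_size() == 7);
    assert(converted_size() == sizeof(char *));
}
```

The second value depends on the platform (it is whatever sizeof(char *) is there), which is why the check compares against sizeof(char *) rather than a fixed number.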

      

In C++, string literals do have constant character array types. So in a C++ program you have to write

const char *s = "blabla";

      

In C, you can also write

char a[6] = "blabla";
     ^^^^

      

In this case, the terminating null character of the string literal is not used to initialize the character array a. Thus, the array will not contain a string.

Such a declaration is not allowed in C++.
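Because such an array holds no terminator, it must not be passed to functions like strlen or strcpy. A minimal sketch of handling it by its size instead (copy_unterminated is a made-up name, and the caller is assumed to supply a buffer of at least 7 bytes):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the six characters of a into out (at least 7 bytes) and adds the
   terminator by hand, because a itself has none. Returns sizeof a. */
size_t copy_unterminated(char *out)
{
    char a[6] = "blabla";       /* valid C: no room for '\0', so none is stored */

    assert(sizeof a == 6);
    memcpy(out, a, sizeof a);   /* copy by size, never with strcpy */
    out[sizeof a] = '\0';       /* now out holds a proper string */
    return sizeof a;
}
```

Some compilers may warn about the initializer of a; that warning is exactly about the missing terminator.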

+4




First case,

char a[7] = "blabla", not possible [...]

Yes, it is possible, it is initialization.

Quoting C11, chapter §6.7.9/P14, Initialization,

An array of character type may be initialized by a character string literal or UTF−8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

Second case,



it says when you see "blabla" it means the array is const char and that means I can't change a at all (which is not correct).

[In terms of directly trying to modify a string literal]

You can try, but you MUST NOT.

From chapter §6.4.5

[...] If the program attempts to modify such an array, the behavior is undefined.

However, in your case a is not a pointer to a string literal; it is an array whose elements are initialized with the content of the string literal. You are allowed to modify the contents of the array a.
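To illustrate that the array is a separate object from the literal, here is a small sketch (the function name is hypothetical). It modifies the array and then checks that the literal, reached through a pointer that is only read, still has its original content.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if changing the array leaves the string literal untouched. */
int array_is_independent_copy(void)
{
    const char *p = "blabla";   /* points at the literal; only read here */
    char a[7] = "blabla";       /* separate array with its own storage */

    a[0] = 'B';                 /* well defined: a is an ordinary array */
    assert(strcmp(a, "Blabla") == 0);
    return strcmp(p, "blabla") == 0;   /* the literal is unchanged */
}
```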

+3


"blabla" is, as the book says, an array of 7 characters, the last of which is '\0', placed in read-only data space (when possible).

(1) When you write:

 char a[7] = "blabla";

      

You tell the compiler to create a mutable 7-character array (on the stack, for a local variable) and copy the read-only array into it. Note that you can also write:

 char a[] = "blabla";

      

... it's safer because the compiler will count the characters for you.
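As a quick sketch of what the compiler counts (deduced_length is just an illustrative name): the deduced size covers the six letters plus the terminating '\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the element count the compiler deduced for a. */
size_t deduced_length(void)
{
    char a[] = "blabla";                  /* size taken from the initializer */

    assert(sizeof a == strlen(a) + 1);    /* characters plus '\0' */
    return sizeof a / sizeof a[0];        /* number of elements */
}
```

Here deduced_length() should return 7, the same size you would otherwise have to write by hand.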

(2) Given that a[] is a copy of "blabla", you can write to it without problems. If you want to keep the read-only property, you can write:

const char *a = "blabla";

      

This time a is a pointer to constant characters, so the string cannot be changed through it. The pointer itself is not const, so you can still reassign it:

const char *a = "blabla";
a = "blublu";
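A small sketch of what const does and does not forbid here (the function name and the extra pointer b are only for illustration): reassigning the pointer compiles, writing through it does not, and a second const after the * would also forbid reassignment.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the pointer could be redirected to another literal. */
int reassign_pointer(void)
{
    const char *a = "blabla";
    assert(strcmp(a, "blabla") == 0);

    a = "blublu";               /* allowed: the pointer itself is not const */
    /* a[0] = 'B'; */           /* rejected by the compiler: *a is const char */

    const char *const b = a;    /* b can be neither reassigned nor written through */
    return strcmp(a, "blublu") == 0 && b == a;
}
```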

      

+2




A bit late, but here is my answer anyway.

Let's look at the difference between

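int main(void)
{
  char *a = "blabla";   /* a points at the string literal itself */
  a[3] = 'x';           /* undefined behavior: writes into the literal */
}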

      

and this one, yours.

int main(void)
{
  char a[7] = "blabla"; /* a is an array holding a copy of the literal */
  a[3] = 'x';           /* fine: modifies the copy */
}

      

So there is a big difference between the two.

In the first case, the object a is a pointer whose value is the address of the beginning of the string blabla.

Disassembling the compiled code, we see:

  4004aa:       48 c7 45 f8 54 05 40    movq   $0x400554,-0x8(%rbp)
  4004b1:       00 
  4004b2:       48 8b 45 f8             mov    -0x8(%rbp),%rax
  4004b6:       48 83 c0 03             add    $0x3,%rax
  4004ba:       c6 00 78                movb   $0x78,(%rax)

      

So it sets the pointer to the address 0x400554.

objdump reports that this address is in the .rodata section.

Disassembling the .rodata section:

0000000000400550 <_IO_stdin_used>:
  400550:       01 00                   add    %eax,(%rax)
  400552:       02 00                   add    (%rax),%al
  400554:       62                      (bad)  
  400555:       6c                      insb   (%dx),%es:(%rdi)
  400556:       61                      (bad)  
  400557:       62                      .byte 0x62
  400558:       6c                      insb   (%dx),%es:(%rdi)
  400559:       61                      (bad)  

      

So the compiler has placed the string blabla in .rodata at this address, and the program then tries to modify the .rodata section, ending up with a segmentation fault.

readelf reports no W (write) flag for .rodata:

[13] .rodata           PROGBITS         0000000000400550  00000550
     000000000000000b  0000000000000000   A       0     0     4

      

On the other hand, what you are trying to do (second program) is compiled like this:

00000000004004a6 <main>:
  4004a6:       55                      push   %rbp
  4004a7:       48 89 e5                mov    %rsp,%rbp
  4004aa:       c7 45 f0 62 6c 61 62    movl   $0x62616c62,-0x10(%rbp)
  4004b1:       66 c7 45 f4 6c 61       movw   $0x616c,-0xc(%rbp)
  4004b7:       c6 45 f6 00             movb   $0x0,-0xa(%rbp)
  4004bb:       c6 45 f3 78             movb   $0x78,-0xd(%rbp)

      

In this case, the object a is an array of 7 bytes allocated on the stack, from %rbp-0x10 to %rbp-0xa.

When the program executes a[3] = 'x', it modifies the stack at %rbp-0xd. The stack is writable, so everything is fine.

For more information, I suggest you read https://en.wikipedia.org/wiki/Identity_and_change

+1








