Is it valid to take the address of a string literal or iterate over it?

Of course this is valid:

char src[] = "Allie has a cat.";
char buff[20];
strcpy(buff, src);
printf("%s\n", buff);

      

Or that:

printf("Allie has a cat.\n");

      

But is it valid to shorten the first snippet to this?

char buff[20];
strcpy(buff, "Allie has a cat.");
printf("%s\n", buff);

      

Or write:

printf("%s", "Allie has a cat.");

      

The reason I am asking is that, AFAIK, string literals in C are a slightly different animal from character arrays. Both examples seem to work (ideone #1, ideone #2), but when it comes to C, I don't want to rely on anything that I only assume is valid or that merely seems to work. The language has plenty of pitfalls, and it is easy to write a construct that looks innocent but is nonetheless UB.

+3




2 answers


This is absolutely valid. String literals are arrays of characters:

6.4.5 String literals

[...]

  5. In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...]

  6. It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

(Source: ISO 9899:1999 ("C99"))

So something like



strcpy(buff, "Allie has a cat.");

      

is effectively equivalent to

static char __s0[] = "Allie has a cat.";
strcpy(buff, __s0);

      

(except that modifying the string literal has undefined behavior).
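As a rough illustration of what the quoted paragraphs imply, here is a minimal sketch assuming a hosted C99 implementation. The function names (pet_sentence, literal_size, copy_sentence) are made up for this example. Because the array behind a literal has static storage duration, returning a pointer to it is fine, and sizeof applied to the literal counts the appended zero byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the literal's array has static storage duration,
   so the returned pointer stays valid after the function returns. */
const char *pet_sentence(void)
{
    return "Allie has a cat.";
}

/* Hypothetical helper: sizeof sees the whole array, including the
   zero byte appended in translation phase 7. */
size_t literal_size(void)
{
    return sizeof "Allie has a cat.";
}

/* Copies the literal into a caller-supplied buffer, as in the question.
   Returns 0 on success, -1 if the buffer is too small. */
int copy_sentence(char *buff, size_t size)
{
    if (size < sizeof "Allie has a cat.")
        return -1;
    strcpy(buff, "Allie has a cat.");
    return 0;
}
```

The sentence is 16 characters long, so the literal occupies 17 bytes and fits in the 20-byte buffer from the question.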

+6




For historical reasons, string literals have type array of char, not array of const char, but the standard guarantees neither that they can be modified nor that they cannot. This is because by 1989 there were already implementations that did it both ways. In practice, treat them as if they were const char[] and never try to modify them.

Compilers are allowed to reuse the same memory for multiple string literals, so don't assume that identical strings or substrings will or won't overlap.
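A small sketch of the practical consequence, with hypothetical helper names (same_text, same_storage): compare the contents of strings, not their addresses, because whether two identical literals share storage is up to the compiler.

```c
#include <stdbool.h>
#include <string.h>

/* Compares contents, which is always well-defined for string literals. */
bool same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Compares addresses. For two identical literals the result is
   unspecified: the compiler may or may not share one array. */
bool same_storage(const char *a, const char *b)
{
    return a == b;
}
```

same_text("cat", "cat") is always true, while same_storage("cat", "cat") may be true or false depending on the compiler and its options.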

Taking the address of a string literal, iterating over it, and reading its contents up to the terminating zero are all completely safe.
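The read-only operations mentioned above could look like this. It is only a sketch, and count_char and size_via_address are illustrative names, not anything from the question.

```c
#include <stddef.h>

/* Iterates over a string up to (not including) the terminating zero
   and counts the occurrences of ch. Only reads, never writes. */
size_t count_char(const char *s, char ch)
{
    size_t n = 0;
    for (const char *p = s; *p != '\0'; ++p) {
        if (*p == ch)
            ++n;
    }
    return n;
}

/* Takes the address of a literal. The result has type
   "pointer to array of 4 char", so *p is the array itself. */
size_t size_via_address(void)
{
    char (*p)[4] = &"cat";
    return sizeof *p;
}
```

For instance, count_char("Allie has a cat.", 'a') walks the literal directly and returns 3 (the capital A is not counted).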

Postscript



The wording of the standard, which Melpomene quoted in the other answer, says that modifying a string literal is undefined behavior. Since this inspired a discussion in the comments, here is what that means.

If you change a character in a string literal, you void any guarantees the standard gave you about what the program will do. The compiler has the blessing of ANSI and ISO to do absolutely anything. So, if you declare char *hello = "hello"; and then write hello[4] = 'p';, here are some things that can happen:

  • The compiler may give you an error and reject the program.
  • puts("hello"); can print hellp.
  • puts("Say hello"); can print Say hellp.
  • puts(hello); can print hello.
  • puts(hello); can print hellp.
  • Either hello or "hello" may read as hello or hellp, depending on what the optimizer is doing.
  • The program may crash on the line hello[4] = 'p';
  • The superintelligent AI compiling your programs might decide that human stupidity is jeopardizing the mission and try to kill you.

There are examples of each of these behaviors, although the last one may be from a movie.
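If a modifiable string is what you actually need, one well-defined alternative is to copy the literal into an array you own and change the copy. The following is a minimal sketch. copy_and_patch is a hypothetical helper, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst (capacity size) and replaces its last character.
   dst is an ordinary array owned by the caller, so writing is defined.
   Returns 0 on success, -1 if src is empty or does not fit. */
int copy_and_patch(char *dst, size_t size, const char *src, char last)
{
    size_t len = strlen(src);
    if (len == 0 || len + 1 > size)
        return -1;
    memcpy(dst, src, len + 1);   /* includes the terminating zero */
    dst[len - 1] = last;         /* modifies the copy, not the literal */
    return 0;
}
```

With char buf[8];, calling copy_and_patch(buf, sizeof buf, "hello", 'p') leaves hellp in buf, and the literal itself is never written to.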

+3








