Confusion About Array and Strings in C

What is the difference between S1, S2 and S3?

char S1[6];
S1[0] = 'A';
S1[1] = 'r';
S1[2] = 'r';
S1[3] = 'a';
S1[4] = 'y';

char S2[6] = {'A','r','r','a','y'};

string S3 = "Array";

      

When I run the program using if (strcmp(a,b) == 0), where a and b are any two of S1, S2, S3, it shows that S2 and S3 are the same, but S1 and S2 are different. Why is this so? Why aren't all three equivalent?

And when I add '\0' back to S1 and S2, all 3 are the same. This is clear.

BUT why are S2 and S3 the same in my first test, then? I didn't include '\0' there either. I would have expected S1 and S2 to be the same, but not S2 and S3.

Can anyone tell me why my thought is wrong?

Thank you for your responses. I tried changing the declarations like this:

char S1[5];
S1[0] = 'A';
S1[1] = 'r';
S1[2] = 'r';
S1[3] = 'a';
S1[4] = 'y';

char S2[5] = {'A','r','r','a','y'};

string S3 = "Array";

      

And now it is clear that S2 and S3 do not match, since they differ by '\0'. However, I'm still a little confused as to why S1 and S2 don't match this time if I use strcmp to compare the two.



4 answers


Compare the actual memory values in the arrays:

  • S1 is 6 elements long, but you only assign values to indices 0-4. The 6th element (index 5) is never explicitly set, so it retains whatever value was in that memory before allocation.
  • S2 is similar to S1 in that only 5 values are provided; however, when the {,} initializer syntax is used, any remaining elements are zeroed. So char foo[5] = { 1, 2 } is identical to char foo[5] = { 1, 2, 0, 0, 0 }.
  • S3 uses the string-literal syntax for initializing an array, which creates an array of char (or wchar_t) with an additional element set to \0 (the null terminator).

Visually:



S1 = 0x41, 0x72, 0x72, 0x61, 0x79, 0x??
S2 = 0x41, 0x72, 0x72, 0x61, 0x79, 0x00
S3 = 0x41, 0x72, 0x72, 0x61, 0x79, 0x00

      

Note that strcmp has a safety problem: it takes no length parameter, so it keeps scanning until it encounters \0, which may never happen (i.e. it runs until a segfault or access violation). Use a safer function such as strncmp or (if using C++) the std::string type instead.



It shows that S2 and S3 are the same, and S1 and S2 are different.

S3 contains a null terminator that S1 does not. The declaration string S3 = "Array"; means:

| A | r | r | a | y | \0 |

      

Likewise for S2:

| A | r | r | a | y | \0 |

      

Whereas S1 is:



| A | r | r | a | y | Garbage |

      

Comparing S1 and S2 could result in UB (I suppose), because S1 is not null-terminated and we pass no length to strcmp.

#include <stdio.h>
#include <string.h>

int main(void) 
{
    char S1[6];
    S1[0] = 'A';
    S1[1] = 'r';
    S1[2] = 'r';
    S1[3] = 'a';
    S1[4] = 'y';
    S1[5] = 0;

    char S2[6] = {'A','r','r','a','y', 0};
    printf("%d" ,strcmp(S1,S2));
    return 0;
}

      

Outputs:

0

      



The function strcmp() starts by comparing the first character of each string. If they are equal, it continues with the following pairs until the characters differ or a terminating null character is reached.

I don't think it's safe to compare S1 and S2 this way. The input to strcmp is the address of the first character, and S1 does not end with a zero. Although 6 bytes are allocated in both cases, S1[5] is not initialized, so it most likely holds some garbage value. The risk is that strcmp keeps reading past the end of the array, into memory that does not belong to it, while looking for a differing or null character. That can even lead to a crash or an access error.

The memory layout of S1, S2 and S3 could look something like this:

S1 = A | r | r | a | y | ?
S2 = A | r | r | a | y | 0
S3 = A | r | r | a | y | 0

      

Any comparison between S2 and S3 is safe. Comparing S1 with S2 or S3 may not be.



Just an addition to existing answers

char S2[6] = {'A','r','r','a','y'};

string S3 = "Array";

      

Both are null-terminated, hence strcmp() works well and says they are the same. Although the assignments to S1 are explicit, there is no null terminator in that array, so it is not a valid string in C. Therefore, using strcmp() on it can lead to undefined behavior.

The point about S3 is that it refers to a read-only string literal (assuming string here is a pointer type such as char *). Basically, these values are stored in read-only locations, so if you try to write something into S3 after initialization, you might see a crash. We must keep this in mind when using assignments like the one for S3.
