How are arrays and pointer types handled internally by C compilers? (int *a; vs. int a[];)

I need a language lawyer with reputable sources.

Take a look at the following test program, which is compiled with gcc:

#include <stdio.h>


void foo(int *a) {
    a[98] = 0xFEADFACE;
}

void bar(int b[]) {
    *(b+498) = 0xFEADFACE;
}

int main(int argc, char **argv) {

int a[100], b[500], *a_p;

*(a+99) = 0xDEADBEEF;
*(b+499) = *(a+99);

foo(a);
bar(b);

printf("a[98] == %X\na[99] == %X\n", a[98], a[99]);
printf("b[498] == %X\nb[499] == %X\n", b[498], b[499]);

a_p = a+98;
*a_p = 0xDEADFACE;

printf("a[98] == %X\na[99] == %X\n", a[98], a[99]);

}

      

It produces the expected output:

anon@anon:~/study/test_code$ gcc arrayType.c -o arrayType
anon@anon:~/study/test_code$ ./arrayType 
a[98] == FEADFACE
a[99] == DEADBEEF
b[498] == FEADFACE
b[499] == DEADBEEF
a[98] == DEADFACE
a[99] == DEADBEEF

      

Are a and b the same type? Is int *a treated as the same type as int a[] inside the compiler?

From a practical point of view, int a[100], b[500], *a_p, b_a[]; all seem to be the same. I find it hard to believe that the compiler is constantly adjusting these types under the various circumstances in my example above. I'm happy to be proven wrong.

Can anyone settle this question for me definitively and in detail?

+2




11 replies


Are a and b the same? Is int *a treated as the same type as int a[] inside the compiler?

From the comp.lang.c FAQ:

... whenever an array appears in an expression, the compiler implicitly generates a pointer to the array's first element, just as if the programmer had written &a[0]. (The exceptions are when the array is the operand of a sizeof or & operator, or is a string literal initializer for a character array ...)

... Given an array a and a pointer p, an expression of the form a[i] causes the array to decay into a pointer, following the rule above, and then to be subscripted just as a pointer variable would be in the expression p[i] (although the eventual memory accesses will be different) ...

Given the declarations



char a[] = "hello";
char *p = "world";

      

... when the compiler sees the expression a[3], it emits code to start at the location a, move three past it, and fetch the character there. When it sees the expression p[3], it emits code to start at the location p, fetch the pointer value there, add three to the pointer, and finally fetch the character pointed to. In other words, a[3] is three places past (the start of) the object named a, while p[3] is three places past the object pointed to by p.

Emphasis mine. The biggest difference seems to be that a pointer value is fetched from memory when the operand is a pointer, while there is no pointer to fetch when it is an array.
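A minimal sketch of that difference, using the same two declarations. The function name is made up for illustration. For the array there is no separate pointer object, so the array and its first element share an address; p is an object of its own that stores the address of the string literal.

```c
#include <assert.h>

/* Hypothetical demo; returns the difference of the two fetched characters. */
int compare_array_and_pointer_access(void)
{
    char a[] = "hello";        /* six chars stored in a itself */
    const char *p = "world";   /* one pointer; the chars live elsewhere */

    /* No pointer object exists for a: the array and its first element
       share an address (only the pointer types differ). */
    assert((void *)&a == (void *)&a[0]);

    /* p is a separate object holding an address, so &p and p differ. */
    assert((const void *)&p != (const void *)p);

    /* a[3]: start at a, move three chars.
       p[3]: load the value of p, then move three chars. */
    return a[3] - p[3];
}
```

Both subscripts happen to fetch the letter 'l' here, so the function returns 0; what differs is how the address is computed.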

+9




One difference is that int a[x][y] and int **a are not interchangeable.

http://www.lysator.liu.se/c/c-faq/c-2.html



2.10

An array of arrays (i.e. a two-dimensional array in C) decays to pointer to array, not pointer to pointer.
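A short sketch of what that means in practice. The names and sizes (ROWS, COLS, sum_grid, demo_grid_sum) are invented for this example. A function that receives a two-dimensional array has to take a pointer to an array, not an int **.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };   /* illustrative sizes, not from the FAQ */

/* The parameter type is "pointer to array of COLS int",
   which is what int[ROWS][COLS] decays to. */
int sum_grid(int (*grid)[COLS], size_t rows)
{
    int total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}

int demo_grid_sum(void)
{
    int a[ROWS][COLS] = { {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12} };

    /* Stepping the decayed pointer by one skips a whole row of COLS ints. */
    assert((char *)(a + 1) - (char *)a == (ptrdiff_t)(COLS * sizeof(int)));

    /* int **pp = a;  would draw a diagnostic: int (*)[COLS] is not int **. */
    return sum_grid(a, ROWS);
}
```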

+3




a and b are both int arrays. a[0] is not a memory location containing an address; it is a memory location containing an int.

Arrays and pointers are neither identical nor interchangeable. They are equivalent only in the sense that an lvalue of type array-of-T that appears in an expression decays (with three exceptions) to a pointer to its first element; the type of the resulting pointer is pointer-to-T. This becomes clear when you look at the assembly generated for such code. The three exceptions, for reference, are when the array is the operand of sizeof, the operand of the unary & operator, or a string literal used to initialize a character array.

A picture helps here. The declarations

char a[] = "hello";
char *p = "world";

      

will result in the creation of data structures that can be represented as follows:

   +---+---+---+---+---+---+
a: | h | e | l | l | o |\0 |
   +---+---+---+---+---+---+

   +-----+     +---+---+---+---+---+---+
p: |  *======> | w | o | r | l | d |\0 |
   +-----+     +---+---+---+---+---+---+

      

It is important to realize that a reference such as x[3] generates different code depending on whether x is an array or a pointer. To the compiler, a[3] means: start at location a, move three past it, and fetch the char there. p[3] means: go to location p, fetch the pointer value stored there, move three past the address it holds, and fetch the char there.
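The picture can be checked with sizeof, which reports the size of each box. This is only a sketch and show_layout is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo; returns sizeof a so the caller can inspect it. */
size_t show_layout(void)
{
    char a[] = "hello";        /* the array is the six bytes h e l l o \0 */
    const char *p = "world";   /* p is one pointer-sized box */

    assert(sizeof a == 6);
    assert(sizeof p == sizeof(char *));

    /* Both still name strings of length 5. */
    assert(strlen(a) == 5 && strlen(p) == 5);
    return sizeof a;
}
```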

+3




From the C language standard:

6.3.2.1.3 Except when it is the operand of the sizeof operator or the
          unary & operator, or is a string literal used to initialize
          an array, an expression that has type "array of type" is
          converted to an expression with type "pointer to type" that
          points to the initial element of the array object and is not
          an lvalue. If the array object has register storage class, the
          behavior is undefined.

Consider the following code:

#include <stdio.h>
#include <string.h>
int main(void)
{
  char foo[10] = {0};
  char *p = foo;
  foo[0] = 'b';
  *(foo + 1) = 'a';
  strcat(foo, "t");
  printf("foo = %s, &foo = %p, &p = %p, sizeof foo = %lu, sizeof p = %lu\n", 
    foo, &foo, &p, (unsigned long) sizeof foo, (unsigned long) sizeof p);
  return 0;
}

      

foo is declared as a 10-element char array with all elements initialized to 0. p is declared as a pointer to char and is initialized to point to foo.

In the line

char *p = foo;

      

the expression foo is of type "10-element char array"; since foo is not the operand of sizeof or &, and is not a string literal used to initialize an array, it is implicitly converted to an expression of type pointer to char that points to the first element of the array. This pointer value is copied to p.

In the lines

foo[0] = 'b';
*(foo + 1) = 'a';

      

the expression foo is again of type "10-element char array"; since foo is not the operand of sizeof or &, and is not a string literal used to initialize an array, it is implicitly converted to an expression of type pointer to char that points to the first element of the array. The subscript expression foo[0] is interpreted as "*(foo + 0)".

In the line

strcat(foo, "t");

      

foo is of type "10-element char array" and the string literal "t" is of type "2-element char array"; since neither is the operand of sizeof or &, and since "t", although a string literal, is not being used to initialize an array, both are implicitly converted to pointer to char, and the pointer values are passed to strcat().

In the line

  printf("foo = %s, &foo = %p, &p = %p, sizeof foo = %lu, sizeof p = %lu\n", 
    foo, &foo, &p, (unsigned long) sizeof foo, (unsigned long) sizeof p);

      

the first instance of foo is converted to a pointer to char as described above. The second instance of foo is the operand of the & operator, so its type is not converted to "pointer to char", and the type of the expression "&foo" is "pointer to 10-element array of char", or "char (*)[10]". Compare this with the type of the expression "&p", which is "pointer to pointer to char", or "char **". The third instance of foo is the operand of the sizeof operator, so again its type is not converted, and sizeof returns the number of bytes allocated to the array. Compare this with sizeof p, which returns the number of bytes allocated to the pointer.

Whenever someone tells you that "an array is just a pointer", they are garbling the section of the standard quoted above. Arrays are not pointers, and pointers are not arrays; however, in many cases, you can treat an array as if it were a pointer, and you can treat a pointer as if it were an array. "p" could be substituted for "foo" on lines 6, 7, and 8. However, they are not interchangeable as operands of sizeof or &.

Edit: by the way, as function parameters,

void foo(int *a);

      

and

void foo(int a[]);

      

are equivalent. "a[]" is interpreted as "*a". Note that this is true only for function parameters.
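One way to see this equivalence, as a sketch with invented names (sum_ints, demo_sum_first_three): a prototype written with pointer syntax and a definition written with array syntax are accepted as the same function, and inside the function the parameter behaves like any other pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Prototype written with pointer syntax ... */
int sum_ints(int *a, size_t n);

/* ... and the definition with array syntax. The compiler accepts this as
   the same function because the parameter type is adjusted to int *. */
int sum_ints(int a[], size_t n)
{
    int total = 0;
    while (n-- > 0)
        total += *a++;   /* a is a pointer here, so it can be incremented */
    return total;
}

int demo_sum_first_three(void)
{
    int values[5] = {1, 2, 3, 4, 5};

    /* Outside the parameter list, values is still a real array. */
    assert(sizeof values == 5 * sizeof(int));
    return sum_ints(values, 3);
}
```

Incrementing an array declared as a local variable would not compile; it works here only because the parameter is really an int *.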

+3




I agree with sepp2k's answer and Rssakoff's quote from the comp.lang.c FAQ. Let me add some important differences between the two declarations and a common pitfall.

  • When you define a as an array (in a context other than a function parameter, which is a special case), you cannot write a = 0; or a++; because a is not a modifiable lvalue (a value that can appear on the left of the assignment operator).

  • An array definition reserves space for its elements, but a pointer definition reserves space only for the pointer itself. Therefore sizeof(array) returns the memory needed to store all the elements of the array (for example, 10 times four bytes for an array of 10 integers on a 32-bit architecture), while sizeof(pointer) returns only the memory needed to store the pointer (for example, 8 bytes on a 64-bit architecture).

  • When you add further levels of pointer or array declarators, the two diverge completely. For example, int **a is a pointer to a pointer to an integer. It can be used as a two-dimensional array (with rows of different sizes) by allocating an array of row pointers and making each one point to memory that stores integers. To access a[2][3], the compiler fetches the pointer stored in a[2] and then moves three elements past the location it points to in order to access the value. Compare that with int b[10][20], which is an array of 10 elements, each of which is an array of 20 integers. To access b[2][3], the compiler offsets from the start of the array's memory by 2 times the size of 20 integers, plus the size of 3 more integers.
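The two access patterns from the last bullet can be sketched as follows. The helper names and the row contents are made up for illustration; the row table is built from local arrays rather than heap allocations to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of b[2][3] from the start of a true two-dimensional array. */
size_t offset_in_2d_array(void)
{
    int b[10][20];
    return (size_t)((char *)&b[2][3] - (char *)b);
}

/* Reads a[2][3] through an int **, where each row is a separate object. */
int read_through_pointer_table(void)
{
    int row0[3] = {0, 1, 2};
    int row1[2] = {10, 11};
    int row2[4] = {20, 21, 22, 23};
    int *table[3] = { row0, row1, row2 };   /* rows may differ in length */
    int **a = table;

    /* Load the pointer a[2], then index three ints past where it points. */
    return a[2][3];
}
```

The first function involves pure address arithmetic on one block of memory; the second needs an extra memory load to find the row.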

Finally, consider this trap. If you have this in one C file

int a[10];

      

and in another

extern int *a;
a[0] = 42;

      

then both files will compile and link without error, but the code will not do what you expect; it will probably crash with a null pointer dereference. The reason is that in the second file a is treated as a pointer whose value is read from the start of the first file's array, that is, from the contents of a[0], which is initially 0.
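The mismatch itself is undefined behavior, so it cannot be demonstrated directly in a well-defined program. The sketch below only simulates what the second file would read, by copying the leading bytes of the array into a pointer object. It assumes, as on mainstream platforms, that a null pointer is represented by all-zero bytes; the helper name is hypothetical.

```c
#include <assert.h>
#include <string.h>

int a[10];   /* what the first file really defines; zero-initialized */

/* Hypothetical helper: returns the pointer value the second file would
   believe it sees, by reading the first sizeof(int *) bytes of the array
   as if they were a pointer object. */
int *pointer_the_other_file_would_see(void)
{
    int *p;
    assert(sizeof a >= sizeof p);
    memcpy(&p, a, sizeof p);   /* reinterpret the array's leading bytes */
    return p;                  /* assumption: all-zero bytes mean NULL */
}
```

Under that assumption the result is a null pointer, which is why the assignment through it in the second file would typically crash.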

+2




Take a look here:

2.2: But I heard that char a[] was identical to char *a.

http://www.lysator.liu.se/c/c-faq/c-2.html

+2




In your example, there are two a's and two b's.

As parameters

void foo(int *a) {
    a[98] = 0xFEADFACE;
}

void bar(int b[]) {
    *(b+498) = 0xFEADFACE;
}

      

a and b are of the same type: a pointer to int.

As variables

int *a;
int b[10];

      

they do not have the same type. The first is a pointer, the second is an array.

Array behavior

An array (variable or not) is implicitly converted in most contexts to a pointer to its first element. The two contexts in C where this does not happen are as the operand of sizeof and as the operand of & (plus a string literal used to initialize an array); C++ has a few more, related to reference parameters and templates.

I wrote "variable or not" because the conversion is not limited to variables. Some examples:

int foo[10][10];
int (*bar)[10];

      

  • foo is an array of 10 arrays of 10 int. In most contexts it is converted to a pointer to its first element, of type pointer to array of 10 int.

  • foo[0] is an array of 10 int. In most contexts it is converted to a pointer to its first element, of type pointer to int.

  • *bar is an array of 10 int. In most contexts it is converted to a pointer to its first element, of type pointer to int.
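These conversions can be checked by assigning each expression to a pointer of the stated type, which compiles cleanly only when the types match. This is a sketch; check_decayed_types and the local names are invented for the example.

```c
#include <assert.h>

int check_decayed_types(void)
{
    int foo[10][10] = {{0}};
    int (*bar)[10] = foo;      /* foo decays to pointer to array of 10 int */
    int *first = foo[0];       /* foo[0] is an array of 10 int, decays to int * */
    int *also_first = *bar;    /* *bar is the same array, so the same decay applies */

    /* sizeof is one of the contexts where no conversion happens. */
    assert(sizeof foo == 100 * sizeof(int));
    assert(sizeof foo[0] == 10 * sizeof(int));
    assert(sizeof *bar == 10 * sizeof(int));

    return first == also_first;
}
```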

Some history

In B, the direct ancestor of C, the equivalent of

int x[10];

      

had the effect of what in current C we would write as

int _x[10];
int *x = &_x[0];

      

i.e. it allocated memory and initialized a pointer to it. Some people have the misconception that this is still true in C.

In NB (when C was no longer B but not yet called C) there was a time when a pointer was declared as

int x[];

      

but

int foo[10];

      

already had its current meaning. The adjustment of array-typed function parameters is a remnant of that time.

+1




Are a and b the same type?

Yes. [Edit: I must clarify: the parameter a of the function foo has the same type as the parameter b of the function bar. Both are pointers to int. The local variable a in main is basically the same type as the local variable b in main. Both are arrays of int (well, strictly speaking they are not the same type, because they do not have the same size, but both are arrays).]

Is int *a treated as the same type as int a[] inside the compiler?

Usually not. The exception is when you write foo bar[] as a function parameter (as you do here): it automatically becomes foo *bar.

When declaring non-parameter variables, however, there is a big difference.

int *a;    /* pointer to int; points nowhere in particular right now */
int b[10]; /* array of int; memory for 10 ints has been allocated on the stack */
foo(a);    /* calls foo with an argument of type int * */
foo(b);    /* also calls foo with an argument of type int *, because here b
              decays to a pointer to the first element of the array */

      

0




No, they are not the same! One is a pointer to int, the other is an array of 100 ints. So yes, they are the same!

Okay, I'll try to explain this nonsense.

*a and a[100] are basically the same for what you are doing. But if we take a closer look at the compiler's memory handling logic, what we are saying is:

  • *a: compiler, I need memory, but I'll tell you how much later, so chill for now!
  • a[100]: compiler, I need memory now, and I know I need 100, so make sure we have it!

Both can be used like pointers, and your code can treat them the same way and trample all over the memory near them as much as you like. But a[100] is contiguous memory reserved at compile time, while *a reserves only the pointer itself, because the compiler does not know when you will need the memory (run-time memory nightmares).

So who cares, right? Well, some operators like sizeof care. sizeof(a) returns a different answer for *a than for a[100]. Here the compiler knows the difference, so you can use it to your advantage in your code too: for loops, memcpy, and so on. Go on, try it.

This is a huge question, but the answer I am giving here is this: the compiler knows the subtle difference, and it will generate code that looks the same most of the time but differs when it matters. It's up to you to find out what *a or a[100] means to the compiler and where it will treat them differently. They may be almost the same, but they are not the same. And to make things worse, you can change the whole game by calling a function, as you do.

Phew... No wonder managed code like C# is so hot right now?!

Edit: I should also add that you can do a_p = X, but try doing that with one of your arrays! Arrays give access to memory in the same way as pointers do, but they cannot be moved or re-pointed. Pointers like a_p can point to different things.
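A small sketch of both points, reusing the names a, b and a_p from the question; the function name is hypothetical. sizeof yields an element count only when applied to the array itself, and only the pointer can be made to refer to something else.

```c
#include <assert.h>
#include <stddef.h>

int demo_sizeof_and_reseating(void)
{
    int a[100] = {0};
    int b[500] = {0};
    int *a_p;

    /* The element count comes from sizeof applied to the array itself;
       the same expression on a pointer to it would not give 100. */
    size_t count = sizeof a / sizeof a[0];
    assert(count == 100);

    for (size_t i = 0; i < count; i++)
        a[i] = (int)i;

    a_p = a + 98;        /* a pointer can point into a ... */
    int from_a = *a_p;
    a_p = b;             /* ... and later be re-pointed at b */
    *a_p = 7;

    /* a = b;  does not compile: an array cannot be assigned to. */
    return from_a + b[0];
}
```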

0




I'll throw my hat into the ring for a simple explanation of this:

  • An array is a series of contiguous storage locations for the same type

  • A pointer is the address of one storage location

  • Taking the address of an array yields the same address as that of its first element (although the resulting pointer has the type pointer-to-array rather than pointer-to-element).

  • Array elements can be obtained through a pointer to the first array element. This works because the index operator [] is defined on pointers in a way to facilitate this.

  • An array can be passed where a pointer parameter is expected, and it is automatically converted to a pointer to its first element (although this is not recursive for multiple levels of pointers or multidimensional arrays). Again, this is by design.

Thus, in many cases, the same piece of code can work on arrays and on contiguous blocks of memory that were not allocated as an array, because of the deliberately special relationship between an array and a pointer to its first element. However, they are different types and in some cases they behave differently. In particular, a pointer to an array is not at all the same as a pointer to a pointer.
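The difference between a pointer to the first element and a pointer to the whole array shows up in pointer arithmetic. A minimal sketch, with invented names:

```c
#include <assert.h>
#include <stddef.h>

ptrdiff_t stride_ratio(void)
{
    int arr[4] = {1, 2, 3, 4};

    int *first = arr;          /* pointer to the first element */
    int (*whole)[4] = &arr;    /* pointer to the whole array */

    /* Same address, different types. */
    assert((void *)first == (void *)whole);

    /* Adding 1 moves by one int in one case and by one whole array in the other. */
    ptrdiff_t step_elem  = (char *)(first + 1) - (char *)first;
    ptrdiff_t step_array = (char *)(whole + 1) - (char *)whole;
    assert(step_elem == (ptrdiff_t)sizeof(int));

    return step_array / step_elem;   /* number of elements in arr */
}
```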

Here is a recent SO question addressing the problem of pointer to array versus pointer to pointer: What is the difference between "abc" and {"abc"} in C?

0




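If you have a pointer to a character array (and you want to get the size of that array), you cannot use sizeof(ptr); you must use strlen(ptr) + 1 instead! Note that this gives the size of the string stored in the array, terminator included, which may be smaller than the array's full capacity.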
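A sketch of the three different numbers involved. The function names and the 32-byte buffer are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Bytes needed to hold the string ptr points to, terminator included. */
size_t string_storage_size(const char *ptr)
{
    /* sizeof ptr here would be sizeof(char *), unrelated to the string. */
    return strlen(ptr) + 1;
}

size_t demo_sizes(void)
{
    char buf[32] = "hello";

    assert(sizeof buf == 32);                 /* the array: full capacity */
    assert(string_storage_size(buf) == 6);    /* the string: 5 chars + '\0' */

    return string_storage_size(buf);
}
```

If the capacity of the array matters, it has to be passed alongside the pointer, since it cannot be recovered from the pointer alone.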

0

