Is it UB to access an element one past the end of a row of a 2D array?

Question

Is the behavior of the following program undefined?

#include <stdio.h>

int main(void)
{
    int arr[2][3] = { { 1, 2, 3 },
                      { 4, 5, 6 }
    };

    int *ptr1 = &arr[0][0];      // pointer to first elem of { 1, 2, 3 }
    int *ptr3 = ptr1 + 2;        // pointer to last elem of { 1, 2, 3 }
    int *ptr3_plus_1 = ptr3 + 1; // pointer to one past last elem of { 1, 2, 3 }
    int *ptr4 = &arr[1][0];      // pointer to first elem of { 4, 5, 6 }
//    int *ptr3_plus_2 = ptr3 + 2; // this is not legal

    /* It is legal to compare ptr3_plus_1 and ptr4 */
    if (ptr3_plus_1 == ptr4) {
        puts("ptr3_plus_1 == ptr4");

        /* ptr3_plus_1 is a valid address, but is it legal to dereference it? */
        printf("*ptr3_plus_1 = %d\n", *ptr3_plus_1);
    } else {
        puts("ptr3_plus_1 != ptr4");
    }

    return 0;
}

According to §6.5.6 ¶8:

Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object.... If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

From this, it would seem that the behavior of the above program is undefined; ptr3_plus_1 points to an address one past the end of the array object from which it is derived, and dereferencing that address causes undefined behavior.

Further, Annex J.2 suggests that this is undefined behavior:

An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).

This issue has been discussed in the question "One-dimensional access to a multidimensional array: well-defined C?" ... The consensus there is that this kind of access to arbitrary elements of a two-dimensional array through one-dimensional subscripts is indeed undefined behavior.

The issue, as I see it, is that it is not even legal to form the pointer ptr3_plus_2, so it is not legal to access arbitrary elements of a two-dimensional array this way. But it is legal to form the pointer ptr3_plus_1 using this pointer arithmetic. Further, it is legal to compare the two pointers ptr3_plus_1 and ptr4, according to §6.5.9 ¶6:

Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

So, if both ptr3_plus_1 and ptr4 are valid pointers that compare equal and must point to the same address (the object pointed to by ptr4 must be adjacent in memory to the object pointed to by ptr3 anyway, since array storage must be contiguous), it would seem that *ptr3_plus_1 is as valid as *ptr4.

Is this undefined behavior, as described in §6.5.6 ¶8 and Annex J.2, or is this an exceptional case?

To clarify

It seems unambiguous that attempting to access the element one past the end of the final row of a two-dimensional array is undefined behavior. My interest is in whether it is legal to access the first element of an intermediate row by forming a new pointer from a pointer to an element of the previous row and pointer arithmetic. It seems to me that a different example in Annex J.2 could have made this clearer.

Is it possible to reconcile the clear statement in §6.5.6 ¶8 that an attempt to dereference a pointer to the location one past the end of an array leads to undefined behavior with the idea that the pointer one past the end of the first row of a two-dimensional array of type T[][] is also a pointer of type T * that points to an object of type T, namely the first element of an array of type T[]?

+3

c language-lawyer

David Bowling · 8 Jul 2017 at 15:04

1 answer

hvd · Answer 1 · 2017-07-08T16:53:03+0000

So if both ptr3_plus_1 and ptr4 are valid pointers that compare equal and must point to the same address

They are.

it would seem that *ptr3_plus_1 is as valid as *ptr4.

This is not true.

The pointers are equal, but not equivalent. The trivial, well-known example of the distinction between equality and equivalence is negative zero:

double a = 0.0, b = -0.0;
assert (a == b);
assert (1/a != 1/b);

Now, to be fair, there is a difference between the two cases: positive and negative zero have different representations, whereas ptr3_plus_1 and ptr4 have the same representation on typical implementations. This is not guaranteed, and on implementations where they have different representations, it should be clear that your code might fail.

Even on typical implementations, although there are good arguments that identical representations should imply equivalent values, to my knowledge the official interpretation is that the standard does not guarantee this. Programs therefore cannot rely on it, and implementations may assume that programs do not rely on it and optimize accordingly.
