c - Are two dimensional arrays in C required to make all elements contiguous?

Question

I heard from a friend that two dimensional arrays in C are only supported syntactically.

He told me I should use float arr[M * N] instead of float arr[M][N], because C compilers like gcc can't guarantee that the data lies contiguously in memory on every system/platform.

I want to use this as an argument in my master's thesis but I don't have any reference.

So first question:

Is what he's saying right?

Second question:

Do you know of a book or an article where I can find this statement?

Thanks + Regards

score 13 · Accepted Answer

No, he's wrong.
Look at the C standard. Some relevant bits:

6.2.5 Types ¶20

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.

6.7.6.2 Array declarators ¶3 (note 142)

When several "array of" specifications are adjacent, a multidimensional array is declared.

6.5.2.1 Array subscripting ¶3

Successive subscript operators designate an element of a multidimensional array object. ... It follows from this that arrays are stored in row-major order (last subscript varies fastest).

And perhaps most explicitly, the example in 6.5.2.1 Array subscripting ¶4:

EXAMPLE Consider the array object defined by the declaration

int x[3][5];

Here x is a 3 × 5 array of ints; more precisely, x is an array of three element objects, each of which is an array of five ints. In the expression x[i], which is equivalent to (*((x)+(i))), x is first converted to a pointer to the initial array of five ints. Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects. The results are added and indirection is applied to yield an array of five ints. When used in the expression x[i][j], that array is in turn converted to a pointer to the first of the ints, so x[i][j] yields an int.

Multidimensional arrays in C are just "arrays of arrays". They work fine and are 100% defined by the standard.
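
As a small illustration of the quoted example, here is a sketch (not taken from the standard) that checks the layout of int x[3][5]. The names ROWS, COLS and byte_offset are made up for this example, and static_assert assumes a C11 compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

enum { ROWS = 3, COLS = 5 };   /* hypothetical names for the 3 x 5 example */

/* The whole array is exactly ROWS * COLS ints: no padding between rows. */
static_assert(sizeof(int[ROWS][COLS]) == ROWS * COLS * sizeof(int),
              "rows must be packed back to back");

/* Byte distance from the start of x to the element x[i][j]. */
size_t byte_offset(size_t i, size_t j)
{
    static int x[ROWS][COLS];
    /* Any object may be inspected as a sequence of bytes. */
    return (size_t)((const unsigned char *)&x[i][j] - (const unsigned char *)x);
}
```

Calling byte_offset(1, 0) gives 5 * sizeof(int), so the second row starts exactly where the first one ends.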

You may also find it helpful to read Section 6, Arrays and Pointers in the comp.lang.c FAQ.

score 7 · Answer

The issue is a bit more subtle than the other answers make it sound:

While multi-dimensional arrays are (semantically, possibly not physically) contiguous, pointer arithmetic is only defined as long as you stay within the bounds of the array your pointer originally referenced (actually, you can go one element past the upper bound, but only if you don't dereference that pointer).

This means that language semantics forbid walking through a multi-dimensional array from start to end with a single element pointer, and a bounds-checking implementation of the C language (such implementations are possible in principle but rarely seen in the wild for performance reasons) could raise a segfault, print a diagnostic or make demons fly from your nose whenever you cross a sub-array's boundary.

I'm not sure if compilers use this information for optimization purposes, but in principle, they could. For example, if you have

float *p = &arr[2][3];
float *q = &arr[5][9];

then p + x and q + y should never alias when dereferenced, regardless of the values of x and y, because each pointer has to stay within its own row.
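
For comparison, here is a minimal sketch of a traversal that stays within these rules: the element pointer is re-derived for every row and never moves past that row's one-past-the-end position. The sizes M and N and the function names are assumptions chosen so that arr[5][9] from the snippet above exists.

```c
#include <assert.h>
#include <stddef.h>

enum { M = 6, N = 10 };   /* assumed sizes, large enough for arr[5][9] */

/* Sums all elements; p only ever walks inside the row it was derived from. */
float sum_all(float arr[M][N])
{
    assert(arr != NULL);   /* precondition */
    float total = 0.0f;
    for (size_t i = 0; i < M; i++) {
        const float *row = arr[i];   /* points at arr[i][0] */
        /* row + N is one past the end of this row: compared, never dereferenced */
        for (const float *p = row; p != row + N; p++)
            total += *p;
    }
    return total;
}

/* Usage: fill an M x N array with ones and sum it. */
float demo_sum(void)
{
    float arr[M][N];
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < N; j++)
            arr[i][j] = 1.0f;
    return sum_all(arr);
}
```

Plain arr[i][j] indexing in a nested loop is equally fine; the pointer version is shown only to make the per-row bounds visible.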

score 4 · Answer

Section 6.2.5 ¶20 requires that arrays be contiguously allocated. This applies as much to an array of arrays as it does to a single-dimensional array.

Your friend is simply wrong.

score 3 · Answer

Built-in multi-dimensional arrays in C are implemented through index translation. This means that, for example, a 3D array T a[M][N][K] is implemented as a 1D array T a_impl[M * N * K], with the multi-dimensional access a[i][j][k] being implicitly translated into the single-dimensional access a_impl[((i * N) + j) * K + k]. The language specification does not explicitly describe this implementation; however, its requirements mandate it pretty much directly.
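
The translation can be checked with a short sketch like the one below. It copies a 3D array byte for byte into a flat array and compares both views; the concrete sizes and the function name are made up for illustration, with T taken to be int.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcpy */

enum { M = 2, N = 3, K = 4 };   /* arbitrary sizes for the illustration */

/* Both types occupy the same number of bytes. */
static_assert(sizeof(int[M][N][K]) == sizeof(int[M * N * K]),
              "3D array and flat array must have the same size");

/* Returns 1 if a[i][j][k] equals a_impl[((i * N) + j) * K + k] for every index. */
int index_translation_holds(void)
{
    int a[M][N][K];
    int a_impl[M * N * K];
    int next = 0;

    /* Give every element a distinct value. */
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < K; k++)
                a[i][j][k] = next++;

    /* Copy the object representation into the flat array. */
    memcpy(a_impl, a, sizeof a);

    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < K; k++)
                if (a_impl[((i * N) + j) * K + k] != a[i][j][k])
                    return 0;
    return 1;
}
```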

Taking this into account, it is not clear why your friend would tell you to use float arr[M * N] explicitly instead of relying on the implicit implementation of the same thing by the compiler.

The situation that might make you consider the float arr[M * N] approach is when both M and N are run-time values and your compiler does not support variable-length arrays (or you for some reason do not want to use them). In such cases the built-in support for multidimensional arrays is no longer applicable, since it relies on all sizes (except the first one) being compile-time constants. Maybe this is what your friend had in mind.
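
A minimal sketch of that run-time case, assuming heap allocation and manual row-major indexing. The helper names grid_alloc, grid_at and grid_demo are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc, free */

/* Allocates rows * cols uninitialized floats; returns NULL on failure. */
float *grid_alloc(size_t rows, size_t cols)
{
    if (cols != 0 && rows > SIZE_MAX / sizeof(float) / cols)
        return NULL;   /* rows * cols * sizeof(float) would overflow */
    return malloc(rows * cols * sizeof(float));
}

/* Address of element (i, j) in a row-major grid with `cols` columns. */
float *grid_at(float *grid, size_t cols, size_t i, size_t j)
{
    assert(grid != NULL);   /* precondition */
    return &grid[i * cols + j];
}

/* Usage: write through (i, j) indexing, read back through the flat index.
   Returns 1 on success, 0 on allocation failure or mismatch. */
int grid_demo(size_t rows, size_t cols)
{
    float *grid = grid_alloc(rows, cols);
    if (grid == NULL)
        return 0;

    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            *grid_at(grid, cols, i, j) = (float)(i * cols + j);

    int ok = 1;
    for (size_t n = 0; n < rows * cols; n++)
        if (grid[n] != (float)n)
            ok = 0;

    free(grid);
    return ok;
}
```

Here the index arithmetic that the compiler would otherwise generate for float arr[M][N] is simply written out by hand.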
