Conceptually, an M-by-N RGB image is a 2D matrix where each matrix element is a vector of 3 values.
There are many different ways you could represent this in physical memory. For example:
Using an MxN array where each element is a 24-bit integer. Each integer is formed from the red, green and blue values (each an 8-bit integer), for example as red<<16 | green<<8 | blue (or equivalently red*256*256 + green*256 + blue).
Using 3 separate MxN arrays, one for each color channel.
Using an MxNx3 array, where the 3rd dimension is the "color dimension". You would index this as img[i,j,k], with k being 0, 1 or 2. Thus, one pixel is formed by 3 array elements.
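As a rough illustration of the first option in the list above, here is a minimal Python sketch of packing three 8-bit channel values into one integer and unpacking them again. The function names pack_rgb and unpack_rgb are made up for this example, and the red-in-the-high-bits order simply follows the expression given above; other orders are equally possible.

```python
def pack_rgb(red, green, blue):
    """Pack three 8-bit values (0..255) into one 24-bit integer (hypothetical helper)."""
    return (red << 16) | (green << 8) | blue


def unpack_rgb(packed):
    """Recover the (red, green, blue) values from a packed 24-bit integer."""
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF


packed = pack_rgb(255, 128, 0)

# The shift/OR form and the multiply/add form give the same number.
assert packed == 255 * 256 * 256 + 128 * 256 + 0
assert unpack_rgb(packed) == (255, 128, 0)
```

With this layout a pixel is a single array element, whereas in the MxNx3 option it is spread over 3 elements.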
This last format (the MxNx3 array) is the one described in the question. Such a 3D array is typically stored as a flat 1D array, with the 3D index converted to a 1D index like this:
index = i + j * M + k * N*M;
or like this:
index = i * N*3 + j * 3 + k;
or in yet another order; which one is used does not matter, as long as it is used consistently (we're assuming 0-based indexing here). Either way, the array has M*N*3 elements, and three of those elements together represent one pixel.
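To make the two index formulas concrete, here is a small Python sketch that evaluates both for a tiny image. The function names and the 2-by-3 image size are assumptions chosen only for illustration; the formulas themselves are the ones given above.

```python
def index_planar(i, j, k, M, N):
    """First formula: i varies fastest, so each color channel is one contiguous M*N block."""
    return i + j * M + k * N * M


def index_interleaved(i, j, k, M, N):
    """Second formula: k varies fastest, so the 3 values of a pixel sit next to each other."""
    return i * N * 3 + j * 3 + k


M, N = 2, 3  # hypothetical tiny image: 2 rows, 3 columns

all_ijk = [(i, j, k) for i in range(M) for j in range(N) for k in range(3)]
planar = [index_planar(i, j, k, M, N) for i, j, k in all_ijk]
interleaved = [index_interleaved(i, j, k, M, N) for i, j, k in all_ijk]

# Each mapping visits every one of the M*N*3 slots exactly once.
assert sorted(planar) == list(range(M * N * 3))
assert sorted(interleaved) == list(range(M * N * 3))

# The three 1D positions that make up pixel (i=1, j=2):
pixel_planar = [index_planar(1, 2, k, M, N) for k in range(3)]
pixel_interleaved = [index_interleaved(1, 2, k, M, N) for k in range(3)]
print(pixel_planar)       # [5, 11, 17]
print(pixel_interleaved)  # [15, 16, 17]
```

In the first layout the three elements of a pixel are M*N positions apart; in the second they are adjacent. In both cases the same three values make up the pixel, only their positions in memory differ.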