The original MIX architecture features 6-bit bytes and memory is addressed as 31-bit words (5 bytes and a sign bit). As a thought exercise I'm wondering how the C language can function in this environment, given:
- char has at least 8 bits (annex E of C99 spec)
- C99 spec section 6.3.2.3 ("Pointers") paragraph 8 says "When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object." My interpretation of this requirement is that it underpins "memcpy(&dst_obj, &src_obj, sizeof(src_obj))".
Approaches that I can think of:
- Make char 31 bits, so indirection through "char*" is simple memory access. But this makes strings wasteful (and means it isn't POSIX-compliant as that apparently requires 8 bit chars)
- Pack three 8 bit chars into one word, with 7 ignored bits: "char*" might be composed of word address and char index within it. However this seems to violate 6.3.2.3, i.e. memcpy() would necessarily skip the ignored bits (which are probably meaningful for the real object type)
- Fully pack chars into words, e.g. the fourth 8 bit char would have 7 bits in word 0 and one bit in word 1. However this seems to require that all objects are sized in 8 bit chars, e.g. a "uint31_t" couldn't be declared to match the word length since this again has the memcpy() problem.
So that seems to leave the first (wasteful) option of using 31-bit chars with all objects sized as multiples of char - am I correct in reading it this way?