First things first. Disassembling the compiler output will most probably not help you understand any of the issues you have. The output of the compiler is no longer a C++ program but plain assembly, and that is really hard to read if you do not know the memory model being used.
As for why the definition of base is required when you declare it to be a base class of derived, there are a few different reasons (and probably more that I am forgetting):
- When an object of type derived is created, the compiler must reserve memory for the full instance, including all of its base class subobjects: it must know the size of base.
- When you access a data member, the compiler must know its offset from the implicit this pointer, and that offset requires knowledge of the size taken by the base subobject.
- When an identifier is parsed in the context of derived and is not found in the derived class itself, the compiler must know whether it is defined in base before looking for it in the enclosing namespaces. Without the definition of base, the compiler cannot know whether foo(); is a valid call inside derived::function() when foo() is declared in the base class.
- The number and signatures of all virtual functions declared in base must be known when the compiler defines the derived class. It needs that information to build the dynamic dispatch mechanism (usually a vtable), and even to know whether a member function in derived is bound for dynamic dispatch or not: if base::f() is virtual, then derived::f() will be virtual regardless of whether the declaration in derived has the virtual keyword.
- Multiple inheritance adds a few other requirements, like the relative offset of each baseX subobject, which is used to adjust the this pointer before the final overriders of the member functions are called. A pointer to base2 that points to an object of type multiplyderived does not point to the beginning of the instance but to the beginning of the base2 subobject in the instance, which might be offset by other bases declared before base2 in the inheritance list.
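A minimal sketch of the points in the list, assuming hypothetical definitions for base, derived, base1, base2 and multiplyderived (only the names come from the discussion; the members and the helpers call_through_base and base2_offset are made up for illustration). Every fact the compiler relies on here comes from the complete definition of the base classes:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical classes, named after the ones used in the text.
struct base {
    int a;
    void foo() {}
    virtual int f() { return 1; }
    virtual ~base() {}
};

struct derived : base {
    int b;
    // No 'virtual' keyword here, yet this overrides base::f() and is virtual.
    int f() { return 2; }
    // foo is not declared in derived; lookup continues into base and
    // finds base::foo() before any enclosing namespace is searched.
    void function() { foo(); }
};

// Calls f() through the base interface; the dynamic type picks the overrider.
int call_through_base(base& b) { return b.f(); }

// Multiple inheritance: base2 is declared after base1 in the base list.
struct base1 { int x; };
struct base2 { int y; };
struct multiplyderived : base1, base2 { int z; };

// Distance in bytes from the start of the complete object to its base2 subobject.
std::ptrdiff_t base2_offset(multiplyderived& d) {
    base2* as_base2 = &d;  // the implicit conversion adjusts the pointer
    return reinterpret_cast<char*>(as_base2) - reinterpret_cast<char*>(&d);
}

// Both checks need the full definition of base to be evaluated.
static_assert(sizeof(derived) >= sizeof(base),
              "derived contains a complete base subobject");
static_assert(std::is_polymorphic<derived>::value,
              "derived inherits the virtual functions of base");
```

Passing a derived object to call_through_base returns 2, since derived::f() is the final overrider. On common ABIs base2_offset is expected to return a nonzero value (typically sizeof(base1)), but the exact layout is implementation-defined.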
To the last question in the comments:
So doesn't instantiation of objects (except for global ones) can wait until runtime and thus the size and offset etc could wait until link time and we shouldn't necessarily have to deal with it at the time we are generating object files?
void f() {
    derived d;
    //...
}
The previous code allocates an object of type derived on the stack. The compiler will emit instructions to reserve some amount of stack memory for the object. After the compiler has parsed the code and generated the assembly, there is no trace of the object. In particular (assuming a trivial constructor for a POD type, i.e. nothing is initialized), that code and void f() { char array[ sizeof(derived) ]; } will produce exactly the same assembly. When the compiler generates the instruction that reserves the space, it needs to know how much to reserve.
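As a rough illustration, assuming hypothetical definitions of base and derived and a made-up incomplete type called opaque: sizeof is evaluated while the translation unit is being compiled, so anything that needs storage for an object needs the complete type at that moment, whereas a pointer to an incomplete type is fine because the size of the pointer is already known.

```cpp
#include <cstddef>

// Hypothetical definitions; only the names come from the discussion above.
struct base { int a; };
struct derived : base { int b; };

struct opaque;  // declared but never defined: an incomplete type

// Fine: a pointer has a known size regardless of what it points to.
opaque* pass_through(opaque* p) { return p; }

// Would not compile: the compiler cannot tell how much stack space to reserve.
// void g() { opaque o; }

// sizeof(derived) is a compile-time constant, fixed before linking.
constexpr std::size_t derived_size = sizeof(derived);

// Reserves the same number of bytes a local 'derived d;' would need.
std::size_t bytes_reserved_for_array() {
    char array[sizeof(derived)];
    return sizeof(array);
}

static_assert(derived_size >= sizeof(base),
              "the base subobject is part of every derived object");
```

The linker only sees the already generated instructions, so it is too late by then to decide how many bytes f() should reserve.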