As everybody knows, a "VTable" (short for "virtual table") is a
data structure used to lookup methods when they are called using dynamic
binding (which in Java, C++ and .NET is called doing "virtual calls").
In C++ they are generated by the compiler, while in Java and .NET they are
managed by the runtime.
Whet is less known is that Java and .NET, allowing multiple inheritance from interfaces (in the sense that every class can implement several interfaces), make building VTables a bit tricky.
The problem is not listing all the methods which a class implements and can be called with a CALLVIRT opcode, which is easy. The real issue is that the CALLVIRT opcode, as implemented in the JIT, must find the required method fast: nobody wants method call operations to be inherently slow.
It's here that the multiple inheritance issue shows up: typically, a VTable is just
an array of function pointers, and each method has its slot. The CALLVIRT opcode must
produce a call instruction that picks the right slot. If the this pointer is
typed (statically, at compile time) as a class, things are easy, because the slot
positions are kept constant along the inheritance hierarchy (just like the layout of
fields), thanks to the single inheritance between classes.
For instance, if method M of class C has slot 3, it will have slot 3 in every class
derived from C, so the CALLVIRT opcode just needs to access the VTable at that slot
and the correct method will be called.
Now, consider interfaces: method M in interface I could be implemented at slot S1 in class C1, slot S2 in class C2... in practice, it could be in a different slot in every class that implements I. Therefore, the CALLVIRT opcode must use some sort of lookup mechanism to find the right slot for any given instance of a class that implements I.
In this paper the issue is explored in detail, and several solutions are explained. In any case, the problem is the tradeoff between time and space: you can have very fast VTable access (constant time, like array accesses), or you can have small VTables, but you cannot have both.
The solution used in Mono until now is conceptually similar to the "Selector Indexed Tables" described in the paper (even if very different in practice). For each interface we generate a unique numeric ID at runtime (let's call it UIID). In the VTable, beside the space for the class methods, we reserve one "region" for each implemented interface, where the pointers to the interface methods are replicated; inside each of these regions the pointers have exactly the same order as the method declarations in the interface itself. Then we have, beside the proper VTable, a table that maps, for each UIID, the offset of the corresponding "interface region" inside the VTable (we call this table "interface_offsets").
This way, the CALLVIRT opcode must access interface_offsets using the UIID to find the right VTable region (simple array access), displace it with a constant corresponding to the method slot in the interface (a simple addition), and the method is found.
The downside is that "interface_offsets" is an array, and it must be as large as the highest UIID of the interfaces implemented by the class (and is then typically full of "holes", unused slots corresponding to UIIDs of unimplemented interfaces).
As Miguel noted, the use of generic types makes the number of interfaces in the runtime exceptionally high. Every class which implements even just one interface is then forced to have a very large "interface_offsets" table, only because its interface happens to have a high UIID. The more interfaces are generated instantiating generic types, the higher the UIIDs... In an application like Banshee, the runtime ends up allocating one Mbyte of RAM only for tables which are mostly empty, to support the execution of one Mbyte of jitted code.
And the situation is even worse than that!
Actually, the runtime maintains two kinds of VTables: one for "metadata handling",
inside MonoClass, which is an array of MonoMethod pointers, and another for the JIT,
which is MonoVTable and which contains the actual pointers to the machine code (and which
for this reason is usually specific to each application domain).
Needless to say, both tables need their related "interface offset mapping" table,
even if the "mono_stats" only keep track of the memory allocated for the second kind of
VTables (the "proper" ones, MonoVTable).
So interface offset tables waste their space twice.
To put a remedy to this, we decided to implement IMT (Interface Method Tables), as described
in the paper linked above.
Briefly, IMT replaces "interface_offsets" with a small fixed size hash table.
Note that the key hash functions are computed at JIT time, and not at run time,
so method calls, in the absence of collisions, are roughly just as fast as with the current
implementation.
Of course the tricky part is handling collisions.
About this, one must realize that the IMT contains pointers to code, not data, and it is
used as a jump table by the JIT. So, while an hash table, as a data structure, handles
collisions with lists, the IMT handles them with thunks of code which have
the capability of resolving the collision, and which the runtime must generate when it
fills up the IMT slots.
In a preparation for implementing this, in current SVN the "interface_offsets" field is gone from MonoClass (which means, the "metadata handling" virtual tables do not waste memory anymore). It has been replaced by a pair of compact arrays (which have the same function), and a bitmap used for "ininst" checks by the JIT.
The next step is actually implementing IMT in MonoVTable, which I'll do in the following weeks...