The problem with “IL_0009: callvirt instance void class [mscorlib]System.Collections.Generic.List`1::set_Item(int32, !0/*int32*/)”

Consider a program that uses System.Collections.Generic.List<>:

List<int> x = new List<int>();
x.Add(1);
x[0] = 2;

After compiling the program, we find CIL instructions to create the generic List<int>, add 1 to the list, and reset the first element in the list to be 2:

IL_0001: newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
IL_0009: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0/*int32*/)
IL_0012: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<int32>::set_Item(int32, !0/*int32*/)

Notice that each call references a generic instance. The method signature in the instruction contains the uninstantiated generic parameter, e.g., “!0”, not the actual generic argument. You may ask: “Why isn’t the generic parameter replaced with the actual argument, System.Int32?” The only reason I can think of is so that the name/signature encoding can be found in the assembly mscorlib. It also allows the system to JIT the CIL code with various generic arguments when executed. Using dotPeek, you can see that the MemberRef table entry for the set_Item method has three fields that define the method called: (1) the declaring type System.Collections.Generic.List`1<System.Int32>, which is an instantiated generic; (2) the name of the method, set_Item; and (3) the signature blob System.Void (System.Int32, !0). In order to find the CIL for the method in mscorlib, a compiler needs to find a method with the same name and the same signature. It’s easier to get a match when the signature uses the generic parameter rather than the generic argument.
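To see the parameter/argument difference without a metadata viewer, here is a minimal sketch using System.Reflection. It is not Mono.Cecil and not Campy’s code; the class SignatureDump and its method names are made up for illustration. It prints generic type parameters in the CIL style “!n” (and method-level ones as “!!n”).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class SignatureDump
{
    // Render "Name(param, param, ...)", writing a generic parameter as
    // "!n" (type-level) or "!!n" (method-level), n being its position.
    public static string Describe(MethodInfo method)
    {
        var ps = method.GetParameters().Select(p =>
            p.ParameterType.IsGenericParameter
                ? (p.ParameterType.DeclaringMethod != null ? "!!" : "!")
                  + p.ParameterType.GenericParameterPosition
                : p.ParameterType.FullName);
        return method.Name + "(" + string.Join(", ", ps) + ")";
    }

    // set_Item as declared on the open generic definition List<T>.
    public static string OpenSetItem() =>
        Describe(typeof(List<>).GetMethod("set_Item"));

    // set_Item as seen on the instantiated type List<int>.
    public static string ClosedSetItem() =>
        Describe(typeof(List<int>).GetMethod("set_Item"));
}
```

SignatureDump.OpenSetItem() gives set_Item(System.Int32, !0), the shape stored in the signature blob, while SignatureDump.ClosedSetItem() gives set_Item(System.Int32, System.Int32).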

The problem with these incomplete signatures is that the type of the generic parameter is already known at the call site, yet the signature does not say so. Campy fixes this problem by creating new MethodReference values that fully type the method parameters. It performs unification of signatures instead of a simple string comparison for matching. Thus, System.Collections.Generic.List`1<>::set_Item(Int32, !0) matches System.Collections.Generic.List`1<Int32>::set_Item(Int32, Int32). This change required quite a bit of jumping through hoops because Mono.Cecil’s assembly and metadata resolvers could not be used; I had to write new ones. The next release of Campy will include this new code.
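The matching idea can be sketched with plain strings. This is a toy model, not Campy’s resolver and not the Mono.Cecil API: type names are just strings, only type-level placeholders of the form “!n” are handled, and substitution runs in one direction, which is a simplification of full unification. The class SignatureUnifier and its methods are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy model, for illustration only.
public static class SignatureUnifier
{
    // Replace a "!n" placeholder with the n-th generic argument of the
    // declaring type; any other type name is returned unchanged.
    public static string Substitute(string type, IReadOnlyList<string> typeArgs)
    {
        if (type.Length > 1 && type[0] == '!' && type[1] != '!'
            && int.TryParse(type.Substring(1), out int index)
            && index >= 0 && index < typeArgs.Count)
        {
            return typeArgs[index];
        }
        return type;
    }

    // True when the definition's parameter list, after substitution,
    // equals the fully typed parameter list of the reference.
    public static bool Matches(IEnumerable<string> definitionParams,
                               IReadOnlyList<string> typeArgs,
                               IEnumerable<string> typedParams)
    {
        return definitionParams
            .Select(p => Substitute(p, typeArgs))
            .SequenceEqual(typedParams);
    }
}
```

With the generic argument list { "Int32" }, the definition parameters { "Int32", "!0" } match the typed parameters { "Int32", "Int32" } but not { "Int32", "String" }.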


Waz up?

After a year plus some, Campy is starting to work on some practical examples. But when things go sour in an executing kernel, there’s not much I can do but single-step and look at disassemblies and the registers of the GPU. I know what things should look like because you’d expect that from a compiler writer. The average user, though, is not going to understand much of it. Before I get LLVM debugging information really working, the first step is good ol’ WriteLine() calls. What I should be able to do is this little ditty:

using System;
namespace test {
    class Program {
        static void Main(string[] args) {
            Campy.Parallel.For(4, i => { System.Console.WriteLine(i); });
        }
    }
}

This simple kernel does quite a bit. The first thing to note is the generated code:

Node: 1 
    Method System.Void ConsoleApp4.Program/<>c::b__1_0(System.Int32) ConsoleApp4.exe C:\Users\kenne\Documents\Campy2\ConsoleApp4\bin\Debug\ConsoleApp4.exe
    Method System.Void ConsoleApp4.Program/<>c::b__1_0(System.Int32) ConsoleApp4.exe .\ConsoleApp4.exe
    HasThis   False
    Args   0
    Locals 0
    Return (reuse) False
        IL_0000: nop    
        IL_0001: ldarg.1    
        IL_0002: call System.Void System.Console::WriteLine(System.String)    
        IL_0007: nop    
        IL_0008: ret    

In this example, there is no explicit call to “ToString()” on the value after the ldarg.1, so Campy must know to convert the integer to a string. I’m a little surprised when I see crap like this coming out of the C# compiler; it would have made my life a little easier if it generated code to convert to the appropriate parameter type. It’s likely there are many other such implicit type conversions: the rules for implicit argument coercion are in ECMA-335 (page 305), although they do not mention int-to-string conversion. Does anyone know where this is in the spec?
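One way to narrow the question down is to list the one-argument WriteLine overloads that a given library exposes, since the C# compiler binds a call to whichever overload fits the argument type. The sketch below uses System.Reflection against the standard System.Console; the class name WriteLineOverloads is made up, and whether the corlib that Campy compiles kernels against exposes the same set is an assumption that would need checking.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class WriteLineOverloads
{
    // Full type names of the parameter of every public static
    // Console.WriteLine overload that takes exactly one argument.
    public static string[] SingleParameterTypes()
    {
        return typeof(Console)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Where(m => m.Name == "WriteLine" && m.GetParameters().Length == 1)
            .Select(m => m.GetParameters()[0].ParameterType.FullName)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }
}
```

On the standard .NET libraries the result includes both System.Int32 and System.String, so a dump showing WriteLine(System.String) after an integer ldarg is worth comparing against the method reference in the original assembly.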

Second, while a lot of the infrastructure for compiling this test works, there are still a number of problems preventing it from working. Looking through the output of the compiler, the generated LLVM code isn’t correct for newarr:

        IL_0040: newarr System.Char    

This will be fixed. I’m hoping the next release will have WriteLine finally working.

Third, I’m noticing that there are lots of try-catch-finally blocks in the .NET runtime to compile. I’ve been holding off on this, as it appears that CUDA does not allow try/catch exception handling whatsoever. For the moment, I can try to string together basic blocks so that the finally clauses are executed at least from the try clause. I might be able to implement some sort of exception handling, but it’s not at all clear at the moment.

BTW, does anyone else hate how Windows ignores case in file and directory names? I just found out that there’s both a “Corlib” and a “corlib” in the Campy Git repository. Undoubtedly I added files using the Git CLI and mistakenly typed the name both ways. Unfortunately, to correct it, I’ll have to use Linux, lest I repeat the same mistake on Windows.