I’ve a feeling we’re not in Kansas anymore

Now that Campy can compile simple value and reference types, the next issue is the runtime. Even on simple examples, when Campy decompiles a kernel in order to JIT it for the GPU, it rapidly discovers that the kernel references the NET runtime in most C# code.

In all NET runtimes, while most of the source is written in C#–which can be JIT’ed by Campy–there is a small portion of the Base Class Library  (see ECMA-334, Appendix D) that is also written in C source code, and therefore cannot be JIT’ed by Campy. Furthermore, depending on how a program that uses Campy is run, Campy decompiles different runtimes for the same kernel source code.

Let’s take a simple string search example, which searches for all occurrences of a pattern in another using a brute-force algorithm.

string text = ...;
string pattern = ...;
int[] found = ...;
Campy.Parallel.For(n, i =>
{
    int j;
    for (j = 0; j < m && i + j < n; j++)
        if (text[i + j] != pattern[j])
            break;
    // mismatch found, break the inner loop
    if (j == m)
    {
        // match found
        found[i] = 1;
    }
});

When built with NET Framework 4.7, Campy discovers the kernel method and the BCL method String.get_Chars(). However, get_Chars() has no CIL body as it is a native C function, which we also know because it is tagged “Is Internal” when using reflection.

If the program is run under Mono, Campy discovers a different implementation, all being in CIL, which can be JIT’ed.

The result is a richly textured landscape of different runtimes that need to be accommodated. That said, much of the runtime just doesn’t make any sense on the GPU: networking, graphics drawing, etc.

After reading some of the source code for Coreclr, Corert, Mono, NET Micro Framework, Dot Net Anywhere, I’ve come to the conclusion that I cannot simply replace the function calls to C functions of the runtime with unsafe C# code, which I have been doing so far. This cannot work as there must be an agreement between the implementation of the data structures in the BCL presumed between C# and C source.

Therefore, Campy must implement a BCL specific for it. Fortunately, there are plenty of examples to start from. Unfortunately, while CUDA C++ on Windows can compile C++11/14, it requires all device-runnable functions to be tagged with the __device__ qualifier, a rather annoying requirement in CUDA.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This blog is kept spam free by WP-SpamFree.