Just a note: Campy is now compiling System.Console.WriteLine() in the GPU runtime library. Not counting the native CUDA code, which is quite sizable, there are 162 methods involved. That amounts to 1065 basic blocks and ~7000 CIL instructions. It’s damn slow, but I haven’t really started working on performance yet.