After a lot of work on the metadata subsystem, I decided to release a new version of Campy. This release fixes many issues with programs that reference Net Core and Net Standard, which affected the closure of kernels. The memory allocation subsystem was also improved, although it is still just a first-fit free block allocator. There are corrections for various CIL instructions, such as ldlen, ldnull, and newobj. Generics still do not work. After some thought, I concluded that rewriting a generic instance like “List<int>” into a non-generic Mono.Cecil.TypeDefinition named “List<int>”, and rewriting every damn CIL instruction that references a generic argument, isn’t going to work once System.Reflection is added. FFT finally works again, although I did find out that System.Numerics.dll in Net Core (C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.0.7\System.Numerics.dll) does not contain CIL, which you can verify yourself using DotPeek. It looks like System.Numerics.dll, as with netstandard.dll, forwards types to System.Runtime.Numerics.dll, which does contain the CIL for things like “Complex operator +(Complex, Complex)”! Unfortunately, I found out just as I was about to release Campy that DNA does not handle any x64 Net Core assemblies on Ubuntu, while still working on Windows. It turns out that DNA does not read PE files with machine type 0x8664, which are what is produced on Ubuntu. So, there were many last-minute changes to get the Ubuntu platform working. It all means that there is still a lot to change in DNA to bring it up to snuff with respect to Mono, Net Core, Net Standard, and Net Framework.
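To make the 0x8664 issue concrete, here is a minimal sketch of how the machine type can be read from a PE header. It is written in Python for brevity (DNA itself is C) and is not DNA's code. The names `pe_machine`, `fake_pe`, and `MACHINE_NAMES` are made up for this illustration, and the header it reads is synthetic rather than taken from a real assembly.

```python
import struct

# Two Machine values from the PE/COFF specification.
MACHINE_NAMES = {0x014C: "x86 (I386)", 0x8664: "x64 (AMD64)"}

def pe_machine(image: bytes) -> int:
    """Return the COFF Machine field of a PE image held in memory."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew: 32-bit file offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The COFF header follows the signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    return machine

def fake_pe(machine: int) -> bytes:
    """Build a minimal synthetic header (hypothetical, for demonstration)."""
    image = bytearray(0x80 + 6)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)   # PE signature at 0x80
    image[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", image, 0x84, machine)
    return bytes(image)

print(MACHINE_NAMES[pe_machine(fake_pe(0x8664))])
```

A reader that only accepts the x86 value would reject the second kind of file outright, which is consistent with the Ubuntu behavior described above.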
After using the DNA code for a while, I’ve identified some of the problems with the implementation that need to be corrected. Other problems were noted in Matt Warren’s article, and in the original DNA Git repository. Several problems mentioned have already been fixed.
- DNA does not conform to ECMA 335. There are missing table types, described below. The problem is that if any PE/assembly is read that contains one of these missing table types, DNA will not work, and likely you won’t even know! For example, in the original code, when reading a table that followed the missing table input, I recall it would segv because null would be passed to strlen. The following table illustrates the current state of DNA.
| Table number (base 10) | Type name | In ECMA 335 6th Ed. June ‘12 | In CodeProject 12585 | In original DNA | In Blazor DNA | In GPU DNA so far |
| --- | --- | --- | --- | --- | --- | --- |
- The parser for signatures is just terrible. At first glance it resembles an LL-style parser, which is what it should be, but it actually isn’t one. For example, MetaData_DecodeSigEntry() is used to decode the signature entry field. But it is also called in many other places just to get a 32-bit unsigned integer. IT SHOULD NOT! That’s not how parsers should ever be written! The parser should follow the syntax descriptions in the ECMA 335 spec, section II.23.2, and from those, using the Dragon Book, a clean implementation can be written. This code needs to be completely rewritten.
- There is no tool for a human-readable printout of the PE file metadata tables for debugging. I have added “CampyPeek” to fix this problem.
- Old Blazor code changed MetaData_DecodeSigEntry() in metadata.c, but it isn’t clear why. I will need to chase this down.
- Assembly resolution in DNA is a problem for the GPU. In DNA, assembly “resolution” is sort of done by the function CLIFile_Load() in CLIFile.c. “Probing” occurs here, and amounts to just opening the file in the current directory. Unfortunately, probing can only work if the files are pre-loaded into the GPU file system. So, assembly resolution doesn’t follow the standard sense of the term. For the moment, I will assume that all assemblies are placed in the directory of the executable. For Net Core programs, this is already done with a “publish”. I will need to figure out a good solution for Net Framework programs.
- DNA does not seem to handle a number of Net Standard and Net Core assemblies: netstandard.dll (contains table type ExportedType), System.Numerics.dll (machine type 0x8664). This is the most critical problem, since it blocks execution of Net Core programs, and hence an important use of Campy.
- DNA does not handle type forwarding from Net Standard to the referenced assemblies. In other words, a type may be referenced, but the metadata says it’s defined in netstandard.dll. That DLL is a front for the real implementations of the framework used. I’ve identified MetaData_GetTypeDefFromName() in MetaData_Search.c as the function that should be modified.
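The type-forwarding problem in the last item can be sketched with a toy model. This is only an illustration of the lookup logic, not DNA's implementation. The `Assembly` class and `resolve_type` function are hypothetical, and the forwarding entries below are made-up stand-ins for ExportedType rows; only the System.Numerics to System.Runtime.Numerics hop comes from the observation above.

```python
from dataclasses import dataclass, field

@dataclass
class Assembly:
    """Toy stand-in for a loaded assembly's metadata (hypothetical)."""
    name: str
    type_defs: set = field(default_factory=set)    # types defined here
    forwarders: dict = field(default_factory=dict) # type -> target assembly

def resolve_type(assemblies: dict, assembly_name: str, type_name: str) -> str:
    """Follow forwarders until an assembly that defines the type is found."""
    seen = set()
    current = assembly_name
    while current not in seen:
        seen.add(current)
        asm = assemblies[current]
        if type_name in asm.type_defs:
            return current
        if type_name not in asm.forwarders:
            raise LookupError(f"{type_name} not found via {assembly_name}")
        current = asm.forwarders[type_name]
    raise LookupError(f"forwarding cycle while resolving {type_name}")

# Illustrative data only; the netstandard entry is an assumption.
complex_t = "System.Numerics.Complex"
loaded = {
    a.name: a
    for a in [
        Assembly("netstandard", forwarders={complex_t: "System.Numerics"}),
        Assembly("System.Numerics",
                 forwarders={complex_t: "System.Runtime.Numerics"}),
        Assembly("System.Runtime.Numerics", type_defs={complex_t}),
    ]
}
print(resolve_type(loaded, "netstandard", complex_t))
```

The cycle guard matters because a forwarder chain is data read from files, and a lookup that trusts it blindly could loop forever on a malformed set of assemblies.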
Back in October 2017, which seems so long ago but was only 8 months, I was looking around for a NET runtime to use for Campy. It was apparent that in order to support C# on a GPU beyond value types, I was going to need a NET framework runtime. Why? It turned out there were many calls into C code, which depended on what runtime the program was compiled against. Even if you ignore this, you still need metadata on the C# side of Campy in order to get the size and alignment of fields in value and reference types when you allocate and copy objects from the CPU to the GPU. The JIT compiler has this sort of baked into the code already, but it still needs to be formally added.
So, like any good programmer, I looked around. What I found were big, bloated packages: Mono, CoreCLR, etc. I assumed that the NET framework Campy needed would be a very small layer substituting for only the lowest layer of classes. Understand that GPUs don’t have file IO, don’t have threads in the classic OS sense, and lack many other things. So, the assumption here is that the lowest-level layer isn’t changing, and hasn’t changed for a long time. Therefore, any class that uses the lowest-level layer isn’t going to have problems calling into that layer, because it is probably the same everywhere. Whether this assumption remains valid, only time will tell. And I can always use one of those bloated frameworks if my assumption is incorrect. But there were greater problems, like writing a compiler for CIL, so I went fishing.
I came across an article in CodeProject, DotNetAnywhere: An Alternative .NET Runtime. Despite it not having been modified for six years, I was heartened to learn that another project called Blazor was using DNA. (I learned a few months ago that Blazor switched to Mono two weeks after the CodeProject post.) So, I decided to port DotNetAnywhere (DNA) to CUDA. That turned out to be not terribly hard, but then I discovered the really big problems: DNA does not work in 64-bit, and there are quite a few bugs in reading the metadata tables. While I congratulate Chris Bacon for writing a good tool, DNA has a lot of problems. I fixed the code so that it runs on a 64-bit target. But if an assembly contains metadata tables that aren’t supported by DNA, it craps out. And I just found out that if I declare a field as an array of System.Numerics.Complex, DNA says the type of the field is an SByte!
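For comparison, here is a minimal sketch of what a grammar-driven decoder for a field signature could look like, following the FieldSig and Type productions of ECMA 335 section II.23.2. It is not DNA's code and does not diagnose the SByte bug. It handles only a small subset of element types and ignores custom modifiers. The names `SigReader`, `parse_type`, and `parse_field_sig` are made up for this example, and the TypeRef row number in the sample blob is hypothetical.

```python
# Element type codes from ECMA 335 Partition II (subset).
PRIMITIVES = {0x02: "BOOLEAN", 0x03: "CHAR", 0x04: "I1", 0x05: "U1",
              0x06: "I2", 0x07: "U2", 0x08: "I4", 0x09: "U4",
              0x0A: "I8", 0x0B: "U8", 0x0C: "R4", 0x0D: "R8",
              0x0E: "STRING", 0x1C: "OBJECT"}
VALUETYPE, CLASS, SZARRAY, FIELD = 0x11, 0x12, 0x1D, 0x06
TABLES = ("TypeDef", "TypeRef", "TypeSpec")

class SigReader:
    """Cursor over a signature blob; the only place bytes are consumed."""
    def __init__(self, blob: bytes):
        self.blob, self.pos = blob, 0

    def byte(self) -> int:
        if self.pos >= len(self.blob):
            raise ValueError("truncated signature")
        b = self.blob[self.pos]
        self.pos += 1
        return b

    def compressed_uint(self) -> int:
        """Decode a compressed unsigned integer (II.23.2)."""
        b0 = self.byte()
        if (b0 & 0x80) == 0:                      # 1 byte, 7 bits
            return b0
        if (b0 & 0xC0) == 0x80:                   # 2 bytes, 14 bits
            return ((b0 & 0x3F) << 8) | self.byte()
        if (b0 & 0xE0) == 0xC0:                   # 4 bytes, 29 bits
            b1, b2, b3 = self.byte(), self.byte(), self.byte()
            return ((b0 & 0x1F) << 24) | (b1 << 16) | (b2 << 8) | b3
        raise ValueError("invalid compressed integer")

def parse_type_def_or_ref(r: SigReader):
    """TypeDefOrRefOrSpecEncoded: low 2 bits pick the table, rest is the row."""
    value = r.compressed_uint()
    if value & 3 == 3:
        raise ValueError("invalid TypeDefOrRef tag")
    return (TABLES[value & 3], value >> 2)

def parse_type(r: SigReader):
    """One function per grammar production; recursion mirrors the grammar."""
    code = r.byte()
    if code in PRIMITIVES:
        return (PRIMITIVES[code],)
    if code in (VALUETYPE, CLASS):
        kind = "VALUETYPE" if code == VALUETYPE else "CLASS"
        return (kind, parse_type_def_or_ref(r))
    if code == SZARRAY:                           # custom modifiers not handled
        return ("SZARRAY", parse_type(r))
    raise ValueError(f"unsupported element type 0x{code:02X}")

def parse_field_sig(blob: bytes):
    r = SigReader(blob)
    if r.byte() != FIELD:
        raise ValueError("not a field signature")
    result = parse_type(r)
    if r.pos != len(blob):
        raise ValueError("trailing bytes in signature")
    return result

# A field of type SomeValueType[]; TypeRef row 0x12 is a made-up example.
print(parse_field_sig(bytes([0x06, 0x1D, 0x11, 0x49])))
```

The point of the structure is that each production has exactly one function, and the compressed-integer reader is a separate primitive, so nothing needs to call a signature-entry decoder just to fetch an integer.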
At this point, I’m kind of committed to using DNA for Campy. I will be fixing the code that reads PE files, including code to read all tables in the ECMA 335 spec, and parsing the signature blobs robustly. I will also be writing a tool to read NET assemblies and output them in a human-readable format, similar to DotPeek, but with output to stdout so it can be used as a regression tool. As an old coworker said long ago about software: sometimes you just have to pound it into submission.