PE, metadata, signatures, blobs, oh my!

After using the DNA code for a while, I’ve identified some problems with the implementation that need to be corrected. Other problems were noted in Matt Warren’s article and in the original DNA Git repository. Several of the problems mentioned there have already been fixed.

DNA does not conform to ECMA 335: several metadata table types are missing, as shown in the table below. If DNA reads any PE/assembly that contains one of these missing table types, it will not work, and likely you won’t even know why. For example, in the original code, reading a table that followed a missing table type would, as I recall, segfault because a null pointer was passed to strlen. The following table illustrates the current state of DNA.

Table number (base 10)	Type name	In ECMA 335 6th Ed., June 2012	In CodeProject 12585	In original DNA	In Blazor DNA	In GPU DNA so far
00	Module	x	x	x	x	x
01	TypeRef	x	x	x	x	x
02	TypeDef	x	x	x	x	x
03	FieldPtr					x
04	Field	x	x	x	x	x
05	MethodPtr					x
06	MethodDef	x	x	x	x	x
07
08	Param	x	x	x	x	x
09	InterfaceImpl	x	x	x	x	x
10	MemberRef	x	x	x	x	x
11	Constant	x	x	x	x	x
12	CustomAttribute	x	x	x	x	x
13	FieldMarshal	x	x
14	DeclSecurity	x	x	x	x	x
15	ClassLayout	x	x	x	x	x
16	FieldLayout	x	x			x
17	StandAloneSig	x	x	x	x	x
18	EventMap	x	x	x	x	x
19
20	Event	x	x	x	x	x
21	PropertyMap	x	x	x	x	x
22
23	Property	x	x	x	x	x
24	MethodSemantics	x	x	x	x	x
25	MethodImpl	x	x	x	x	x
26	ModuleRef	x	x	x	x	x
27	TypeSpec	x	x	x	x	x
28	ImplMap	x	x	x	x	x
29	FieldRVA	x	x	x	x	x
30
31
32	Assembly	x	x	x	x	x
33	AssemblyProcessor	x	x
34	AssemblyOS	x	x
35	AssemblyRef	x	x	x	x	x
36	AssemblyRefProcessor	x	x
37	AssemblyRefOS	x	x
38	File	x	x
39	ExportedType	x	x			x
40	ManifestResource	x	x			x
41	NestedClass	x	x	x	x	x
42	GenericParam	x	x	x	x	x
43	MethodSpec	x		x	x	x
44	GenericParamConstraint	x	x	x	x	x
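
One way to avoid the silent failure described above is to check the assembly's table-presence bitmask before reading any rows. In ECMA 335 the `#~` stream header carries a 64-bit `Valid` field in which bit n is set when table n is present. The sketch below compares that field against a mask of the tables a reader supports. The function names and the idea of a `supported` mask are my own assumptions for illustration, not existing DNA code.

```c
#include <stdint.h>

/* Hypothetical helper, not part of DNA.
   valid:     the 64-bit Valid field from the #~ stream header
              (bit n set means table n is present in the assembly).
   supported: bit n set means the reader knows how to parse table n.
   Returns a mask of tables that are present but cannot be parsed. */
uint64_t md_unsupported_tables(uint64_t valid, uint64_t supported)
{
    return valid & ~supported;
}

/* Returns the lowest table number that is present but unsupported,
   or -1 if every present table has a reader. */
int md_first_unsupported_table(uint64_t valid, uint64_t supported)
{
    uint64_t missing = md_unsupported_tables(valid, supported);
    for (int i = 0; i < 64; i++) {
        if (missing & ((uint64_t)1 << i))
            return i;
    }
    return -1;
}
```

A loader could call `md_first_unsupported_table()` right after reading the header and report the table number, rather than crashing later while walking rows whose sizes it cannot compute.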

The parser for signatures is just terrible. It should be an LL-style parser; at first glance it seems to resemble one, but it actually isn’t. For example, MetaData_DecodeSigEntry() is used to decode the signature entry field, but it is also called in many other places just to get a 32-bit unsigned integer. IT SHOULD NOT! That’s not how parsers should ever be written! The parser should follow the syntax descriptions in the ECMA 335 spec, section II.23.2, with a clean implementation derived from that grammar using the techniques in the Dragon Book. This code needs to be completely rewritten.
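
To make the point concrete, here is a minimal sketch of how the lowest layer of such a parser might look. The compressed unsigned integer encoding of ECMA 335 II.23.2 gets its own function, and a grammar production (here, only the leading part of MethodDefSig from II.23.2.1) calls it where the grammar says an integer appears. The type and function names (`SigReader`, `sig_read_compressed_u32`, `sig_read_method_header`) are hypothetical and do not exist in DNA.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a signature blob; not a DNA type. */
typedef struct {
    const uint8_t *p;    /* next unread byte */
    const uint8_t *end;  /* one past the last byte of the blob */
} SigReader;

/* Decode one compressed unsigned integer (ECMA 335 II.23.2).
   Returns 1 and advances the cursor on success; returns 0 on
   truncated or malformed input and leaves the cursor unchanged. */
int sig_read_compressed_u32(SigReader *r, uint32_t *out)
{
    size_t avail = (size_t)(r->end - r->p);
    if (avail < 1) return 0;
    uint8_t b0 = r->p[0];
    if ((b0 & 0x80) == 0) {            /* 0xxxxxxx: 1 byte */
        *out = b0;
        r->p += 1;
        return 1;
    }
    if ((b0 & 0xC0) == 0x80) {         /* 10xxxxxx: 2 bytes */
        if (avail < 2) return 0;
        *out = ((uint32_t)(b0 & 0x3F) << 8) | r->p[1];
        r->p += 2;
        return 1;
    }
    if ((b0 & 0xE0) == 0xC0) {         /* 110xxxxx: 4 bytes */
        if (avail < 4) return 0;
        *out = ((uint32_t)(b0 & 0x1F) << 24) | ((uint32_t)r->p[1] << 16)
             | ((uint32_t)r->p[2] << 8) | r->p[3];
        r->p += 4;
        return 1;
    }
    return 0;                          /* 111xxxxx is not a valid prefix */
}

/* Leading part of MethodDefSig: calling convention byte, optional
   GenParamCount (only when the GENERIC bit 0x10 is set), ParamCount. */
typedef struct {
    uint8_t  callconv;
    uint32_t gen_param_count;
    uint32_t param_count;
} MethodSigHeader;

int sig_read_method_header(SigReader *r, MethodSigHeader *h)
{
    if (r->p >= r->end) return 0;
    h->callconv = *r->p++;
    h->gen_param_count = 0;
    if (h->callconv & 0x10) {
        if (!sig_read_compressed_u32(r, &h->gen_param_count)) return 0;
    }
    return sig_read_compressed_u32(r, &h->param_count);
}
```

Each remaining production (RetType, Param, Type, and so on) would get its own function in the same style, so that callers needing a plain integer use the integer decoder and nothing else.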
There is no tool for producing a human-readable printout of the PE file metadata tables for debugging. I have added “CampyPeek” to fix this problem.
Old Blazor code changed MetaData_DecodeSigEntry() in metadata.c, but it isn’t clear why. I will need to chase this down.
Assembly resolution in DNA is a problem for the GPU. In DNA, assembly “resolution” is sort of done with the function CLIFile_Load() in CLIFile.c. “Probing” occurs here, and it amounts to just opening the file in the current directory. Unfortunately, probing can only work if the files are pre-loaded into the GPU file system. So, this isn’t assembly resolution in the standard sense of the term. For the moment, I will assume that all assemblies are placed in the directory of the executable. For Net Core programs, this is already done with a “publish”. I will need to figure out a good solution for Net Framework programs.
DNA does not seem to handle a number of Net Standard and Net Core assemblies, for example netstandard.dll (contains table type ExportedType) and System.Numerics.dll (machine type 0x8664). This is the most critical problem, since it blocks execution of Net Core programs, and hence an important aspect of Campy.
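
For reference, the machine type mentioned above is the `Machine` field of the COFF file header, which sits right after the 4-byte `PE\0\0` signature; the offset of that signature is stored at 0x3C in the DOS header. The value 0x8664 means AMD64 and 0x014C means I386. The following is only a sketch of reading that field from an in-memory image; `pe_read_machine` is a hypothetical name, not a DNA function, and which machine values a loader should accept is a policy decision I leave open.

```c
#include <stddef.h>
#include <stdint.h>

#define PE_MACHINE_I386  0x014C
#define PE_MACHINE_AMD64 0x8664

/* Little-endian reads from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: extract the COFF Machine field from a PE image.
   Returns 1 and stores the value on success, 0 if the buffer does not
   look like a PE file. */
int pe_read_machine(const uint8_t *buf, size_t len, uint16_t *machine)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    uint32_t pe_off = rd32(buf + 0x3C);      /* e_lfanew */
    if (pe_off > len || len - pe_off < 6)    /* signature + Machine */
        return 0;
    const uint8_t *pe = buf + pe_off;
    if (pe[0] != 'P' || pe[1] != 'E' || pe[2] != 0 || pe[3] != 0)
        return 0;
    *machine = rd16(pe + 4);
    return 1;
}
```

A reader that only expects 0x014C could use a check like this to report the unexpected machine type explicitly instead of failing somewhere further along.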
DNA does not implement type forwarding within its metadata reader. So, a Net Standard library may reference a type in netstandard.dll, but DNA cannot resolve the type to its implementation in a referenced assembly. I’ve identified MetaData_GetTypeDefFromName() in MetaData_Search.c as the function that should be modified.
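
The lookup I have in mind can be sketched as follows: search the assembly's own TypeDef rows first, and if the type is not there, look for a matching forwarder entry (an ExportedType row whose implementation points at another assembly) and continue the search in that assembly. The structures below are deliberately simplified stand-ins and are not DNA's tMetaData or tMD_TypeDef; all names are hypothetical, and the hop limit is my own guard against forwarding cycles.

```c
#include <stddef.h>
#include <string.h>

typedef struct Assembly Assembly;

/* Simplified stand-ins for metadata rows; not DNA structures. */
typedef struct {
    const char *ns;
    const char *name;
} TypeDefRow;

typedef struct {
    const char *ns;
    const char *name;
    const Assembly *target;  /* already-resolved assembly the type forwards to */
} ForwarderRow;

struct Assembly {
    const TypeDefRow *typedefs;
    size_t n_typedefs;
    const ForwarderRow *forwarders;
    size_t n_forwarders;
};

#define MAX_FORWARD_HOPS 8   /* assumed limit to stop forwarding cycles */

static const TypeDefRow *resolve_type_at(const Assembly *a, const char *ns,
                                         const char *name, int hops)
{
    if (a == NULL || hops > MAX_FORWARD_HOPS)
        return NULL;
    /* 1. Types defined directly in this assembly. */
    for (size_t i = 0; i < a->n_typedefs; i++) {
        const TypeDefRow *t = &a->typedefs[i];
        if (strcmp(t->ns, ns) == 0 && strcmp(t->name, name) == 0)
            return t;
    }
    /* 2. Types forwarded to another assembly. */
    for (size_t i = 0; i < a->n_forwarders; i++) {
        const ForwarderRow *f = &a->forwarders[i];
        if (strcmp(f->ns, ns) == 0 && strcmp(f->name, name) == 0)
            return resolve_type_at(f->target, ns, name, hops + 1);
    }
    return NULL;
}

/* Resolve a type by namespace and name, following type forwarders. */
const TypeDefRow *resolve_type(const Assembly *a, const char *ns,
                               const char *name)
{
    return resolve_type_at(a, ns, name, 0);
}
```

In a real reader the forwarder's target would come from the ExportedType row's Implementation column, which refers to an AssemblyRef that still has to be loaded; the sketch assumes that step has already happened.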