Private Reflection, Whidbey, MetadataTables, and getting previously hard-to-get data
A newsgroup member asked whether Whidbey has any mechanism for converting metadata tokens to actual reflected types. My response was that a series of awesome classes appears to be almost ready to go in Beta 1, but they are all marked internal. That could mean we'll never get access to them, or that the security concerns were far too great to make them available. It could also mean a more user-friendly set of APIs will eventually surface. For me? I want the data today, so let's take a quick look at a sample that enumerates all of the strings in a given assembly.
For those not familiar with reflection, this is going to seem fairly cryptic. I'll walk through the code one or two lines at a time and try to explain each portion. To start, we are going to grab a Type object for MetadataImport and privately invoke the OpenScope method. This is a static method, and it takes the path to a file on disk.
Type foo = Type.GetType( "System.Reflection.MetadataImport" );
object metadataImport = foo.InvokeMember(
    "OpenScope", BindingFlags.Static | BindingFlags.InvokeMethod | BindingFlags.Public,
    null, null, new object[] { "c:\\MaxTest.exe" } );
I say private reflection, but you don't see the NonPublic flag. Well, the MetadataImport type itself is marked internal to the assembly, so even though the static method is public, we still don't have normal visibility of it. At this stage we don't care about binders or instance types and simply call the method with the path of the assembly we are going to inspect. In the next step, we need to get a MetadataTables instance by doing an explicit cast operation. This is reflection, and doing this is harder than you might think. The type defines two explicit casts that differ only by return type, so you have to write a custom binder (unless someone wants to clue me in on how to differentiate by return type using the default binder) to pick the appropriate one. This is nasty, and I'm not including the binder code.
object metadataTables = foo.InvokeMember(
    "op_Explicit", BindingFlags.Public | BindingFlags.Static | BindingFlags.InvokeMethod,
    new SillyBinder(), null, new object[] { metadataImport } );
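For what it's worth, one way to sidestep a custom binder is to skip InvokeMember for this step, enumerate the static methods yourself, and filter on the return type. The sketch below is not SillyBinder. It uses a made-up stand-in type, Handle, with two explicit casts that differ only by return type, and the names Handle, ExplicitCastPicker, and Find are all hypothetical.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in: two explicit casts that differ only by return type.
public sealed class Handle
{
    private readonly int value;
    public Handle( int value ) { this.value = value; }
    public static explicit operator int( Handle h ) { return h.value; }
    public static explicit operator string( Handle h ) { return "handle:" + h.value; }
}

public static class ExplicitCastPicker
{
    // Returns the op_Explicit overload on 'owner' that converts 'from' to 'to',
    // or null if no such overload exists.
    public static MethodInfo Find( Type owner, Type from, Type to )
    {
        foreach ( MethodInfo m in owner.GetMethods( BindingFlags.Public | BindingFlags.Static ) )
        {
            if ( m.Name != "op_Explicit" || m.ReturnType != to ) continue;
            ParameterInfo[] p = m.GetParameters();
            if ( p.Length == 1 && p[0].ParameterType == from ) return m;
        }
        return null;
    }

    public static void Main()
    {
        // Pick the Handle -> string cast, ignoring the Handle -> int one.
        MethodInfo cast = Find( typeof( Handle ), typeof( Handle ), typeof( string ) );
        object result = cast.Invoke( null, new object[] { new Handle( 7 ) } );
        Console.WriteLine( result );
    }
}
```

Applied to the case above, the call would presumably look something like Find( foo, foo, Type.GetType( "System.Reflection.MetadataTables" ) ) followed by Invoke( null, new object[] { metadataImport } ), assuming the cast really is declared on MetadataImport and returns MetadataTables.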
I called my binder SillyBinder because I think it is pretty damn silly that you can't easily pick which explicit cast you want to call. SillyBinder could easily take the type I'm looking for, but I just hard-coded everything in for now. With our casting done, you can start working on MetadataTables directly. Construct a Type so you have an implementation to hang your instance off of, set your initial string index offset, and start enumerating the strings. We'll be calling two methods, GetString and GetNextString. The first returns the string at a given index, while the second returns the offset of the next string, which we use to advance our string pointer.
int i = 0;
Type metaTables = Type.GetType( "System.Reflection.MetadataTables" );
Console.WriteLine( metaTables.InvokeMember( "GetStringHeapSize",
    BindingFlags.Public | BindingFlags.InvokeMethod | BindingFlags.Instance,
    null, metadataTables, null ) );
do {
    string s = (string) metaTables.InvokeMember( "GetString",
        BindingFlags.InvokeMethod | BindingFlags.Instance | BindingFlags.Public,
        null, metadataTables, new object[] { i } );
    Console.WriteLine( "{0}: {1}", i, s );
    i = (int) metaTables.InvokeMember( "GetNextString",
        BindingFlags.Public | BindingFlags.InvokeMethod | BindingFlags.Instance,
        null, metadataTables, new object[] { i } );
} while ( i > 0 );
I didn't choose 0 arbitrarily; I'm pretty sure the first string is always going to live there. If there aren't any strings then GetStringHeapSize should return 0. The loop stops as soon as GetNextString hands back 0, because that appears to be what it returns once it runs off the end of the heap. If you don't write your loop correctly, the string pointer will wrap around to the beginning of the heap and just keep going, printing out the same strings over and over again.
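To make the wrap-around concrete, here is a small simulation of the loop against a fake heap. Everything in it is an assumption for illustration: FakeStringHeap is a hypothetical class, the layout (null-terminated UTF-8 strings with an empty string at offset 0) is my reading of how the string heap is organized, and GetNextString returning 0 past the last string mirrors the behavior I observed rather than anything documented.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the string heap: null-terminated UTF-8 strings,
// with an empty string at offset 0.
public sealed class FakeStringHeap
{
    private readonly byte[] heap;

    public FakeStringHeap( params string[] strings )
    {
        StringBuilder sb = new StringBuilder( "\0" );
        foreach ( string s in strings ) sb.Append( s ).Append( '\0' );
        heap = Encoding.UTF8.GetBytes( sb.ToString() );
    }

    // Offset of the terminator that ends the string starting at 'offset'.
    private int EndOf( int offset )
    {
        int end = offset;
        while ( end < heap.Length && heap[end] != 0 ) end++;
        return end;
    }

    public string GetString( int offset )
    {
        return Encoding.UTF8.GetString( heap, offset, EndOf( offset ) - offset );
    }

    // Assumed behavior: offset of the next string, or 0 once we pass the end.
    public int GetNextString( int offset )
    {
        int next = EndOf( offset ) + 1;
        return next < heap.Length ? next : 0;
    }
}

public static class Program
{
    public static void Main()
    {
        FakeStringHeap heap = new FakeStringHeap( "Main", "Foo" );
        int i = 0;
        do {
            Console.WriteLine( "{0}: {1}", i, heap.GetString( i ) );
            i = heap.GetNextString( i );
        } while ( i > 0 );   // stops when the pointer wraps back to 0
    }
}
```

With the two sample strings, the loop visits offsets 0, 1, and 6 and then exits. Change the condition to while ( true ) and it cycles through the same three offsets forever, which is the failure mode described above.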
This doesn't fully answer the question of whether you can go from tokens to reflection information, but it is a start in the right direction. There are at least a hundred different methods that can be examined using this technique to get different bits and pieces of information. Most of them seem to mimic the actual metadata interfaces that have been available to the C++ crowd for so long. Looking there for documentation on how each of the methods works, and then using that to drive these managed APIs, might be nice. You have to eat the cost of reflection and late binding, but there may be some avenues to get around even that cost.
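One avenue that might be worth trying is to look up the MethodInfo once and bind it to a delegate, so repeated calls skip the InvokeMember lookup. I haven't tried this against MetadataTables itself. The sketch below uses a hypothetical private nested class, Hidden, as a stand-in for an internal type, and the delegate type GetStringFn is likewise made up to match the shape of a GetString( int ) method.

```csharp
using System;
using System.Reflection;

public static class CachedCall
{
    // Hypothetical stand-in for a type we can only reach through reflection.
    private sealed class Hidden
    {
        public string GetString( int index ) { return "string#" + index; }
    }

    // Delegate shape matching the method we want to call repeatedly.
    public delegate string GetStringFn( int index );

    public static void Main()
    {
        // Reach the non-public type and create an instance of it.
        Type hidden = typeof( CachedCall ).GetNestedType( "Hidden", BindingFlags.NonPublic );
        object instance = Activator.CreateInstance( hidden, true );

        // Resolve the method once, then bind it to the instance as a delegate.
        MethodInfo method = hidden.GetMethod( "GetString",
            BindingFlags.Public | BindingFlags.Instance );
        GetStringFn getString = (GetStringFn) Delegate.CreateDelegate(
            typeof( GetStringFn ), instance, method );

        // Later calls go through the delegate rather than InvokeMember.
        for ( int i = 0; i < 3; i++ )
            Console.WriteLine( getString( i ) );
    }
}
```

Whether this works against the real internal classes depends on their method signatures and on the permissions your code runs with, so treat it as something to experiment with rather than a known fix.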