GSoC 2011 is now officially over, and I figure I should write some sort of status post.
First of all, the project is not completely done. I had anticipated this early on, and discussed with JB which parts I should focus on for the deadline. I did get a fair bit done for the deadline, but I wouldn’t quite say it’s ready for production use yet. I did, however, pass the final evaluation, as JB agreed to let me finish the project outside of GSoC (which I fully intend to do).
The project ended up being more involved than originally planned. The idea was simply to swap the PEAPI back end for Cecil, but I quickly realized that this was not as trivial as it had seemed. I ended up ripping out the old code generation completely, emptying every last parser production of its code, updating the parser to .NET 4.0, and finally adding the Cecil code generation back end. You might think all of this took up most of the GSoC period, but a lot of time also went into reading the spec, not to mention figuring out weird quirks and corner cases in the Microsoft implementation of ILAsm. It’s scary just how much of the ILAsm language is poorly documented or not documented at all.
I’d like to extend my thanks to everyone in #cecil @ irc.gnome.org for helping me understand ILAsm, CIL, and the CLI in general.
I’ve also made some progress on the ILDasm front. It’s still missing a lot of features, but it does disassemble types and methods decently at this point. I’ll work more on it later; ILAsm comes first.
I recently switched ILAsm to a deferred type resolution model. This was JB’s idea, as a way to solve the generic parameter issue I blogged about earlier, and it’s worked nicely so far. This was probably the last major issue in the new ILAsm, so I expect to be doing more ILDasm work soon (there are only a couple of known ILAsm issues left).
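For readers unfamiliar with the term, here is a rough Python sketch of what deferred type resolution means in general. It is not the actual ILAsm code (which is C# on top of Cecil), and every name in it (`TypeRef`, `Module`, `resolve_all`) is made up for illustration: the parser only records type references by name, and they are bound to definitions once everything has been declared.

```python
class TypeRef:
    """Placeholder recorded by the parser instead of a resolved type (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.resolved = None  # filled in later by Module.resolve_all()


class Module:
    """Toy stand-in for an assembler's symbol table (hypothetical)."""

    def __init__(self):
        self.types = {}    # type name -> definition
        self.pending = []  # references still waiting to be resolved

    def define(self, name, definition):
        self.types[name] = definition

    def reference(self, name):
        # Do not look the type up yet; it may be declared later in the file.
        ref = TypeRef(name)
        self.pending.append(ref)
        return ref

    def resolve_all(self):
        # Runs once parsing is finished, when all definitions are known.
        unresolved = []
        for ref in self.pending:
            ref.resolved = self.types.get(ref.name)
            if ref.resolved is None:
                unresolved.append(ref.name)
        self.pending.clear()
        return unresolved


module = Module()
ref = module.reference("Foo")            # used before it is declared
module.define("Foo", {"kind": "class"})
missing = module.reference("Bar")        # never declared
unresolved = module.resolve_all()
```

After `resolve_all()`, `ref.resolved` points at the definition of `Foo` even though the reference came first, and `unresolved` lists the names that were never defined, so errors can be reported in one place.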
During GSoC I also wrote a command line interface to the Mono soft debugger (SDB). It’s available here. I wrote it primarily because using MonoDevelop (which was the only existing interface to SDB at the time) as a debugger for a command line application was not very convenient. This debugger also has a few cool features, such as decompiling code that has no source available (through ICSharpCode.Decompiler).
Overall, it’s been a very fun and educational program for me. I now know a lot of things about the CLI that I didn’t have the slightest clue about before, and I finally managed to contribute something really significant to Mono. Working with JB and the Mono community has been an awesome experience, as everyone’s very helpful and easily approachable. I would definitely do this again if I get the chance.
I’ll probably make a final blog post whenever I get this entire project merged into mainline Mono.
On ILDasm:
I hope you realize that ICSharpCode.Decompiler contains not only the decompiler, but also a Mono.Cecil-based disassembler.
It’s certainly not perfect, but I did some comparisons with the output of Microsoft’s ILDasm and fixed all the issues I could find.
It would be great if you could use that code as a starting point.
And I have to fully agree with you on the unexpected complexity of IL. When I started to write the disassembler for ILSpy, I didn’t expect the language to have syntax like “int32[] marshal(bool[64 + 1]) p”
And what is and isn’t a keyword and needs escaping seems to not be documented at all – the keyword list in the spec is not complete; the MS implementation has some keywords that aren’t mentioned in the spec at all.
For keyword reference, Expert .NET 2.0 IL Assembler is a great book. You will also want to have a peek into the Microsoft-annotated ECMA 335. These are generally how I built the keyword and directive tables in ILAsm/ILDasm.
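To make the escaping problem concrete, here is a minimal Python sketch of a keyword-table lookup. It is an illustration only, not the table from ILAsm/ILDasm: `SAMPLE_KEYWORDS` is a tiny hand-picked subset, `escape_identifier` is a hypothetical helper, and the identifier pattern and backslash escaping follow my reading of ECMA 335 rather than the Microsoft implementation's actual behaviour.

```python
import re

# Tiny illustrative subset; a real table is far larger (assumption).
SAMPLE_KEYWORDS = frozenset({"class", "method", "marshal", "int32", "bool"})

# Simple identifier: a letter or one of _ $ @ ` ? first, then the same
# characters plus digits (assumed from the ECMA 335 definition of ID).
SIMPLE_ID = re.compile(r"[A-Za-z_$@`?][A-Za-z0-9_$@`?]*\Z")


def escape_identifier(name):
    """Return name unchanged if it is safe, else wrap it in single quotes."""
    if name not in SAMPLE_KEYWORDS and SIMPLE_ID.match(name):
        return name
    # Escape backslashes first, then single quotes, before quoting.
    quoted = name.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + quoted + "'"
```

With this sketch, a parameter called `p` is emitted as-is, while one called `marshal` or `my name` comes out single-quoted. The hard part in practice is not the function but getting the keyword set right, which is exactly what the spec leaves incomplete.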
I’ll have a peek at ILSpy’s disassembler. I’m sure it could be a good reference. Just keep in mind that we can’t take a direct dependency on it due to packaging complications.
Thanks for the input! :)