Language parsing and compiler design don't have to be hard, but boy, this book really sucks!
How'd you like that for an opening title? Did it grab your attention? Hell, you're reading this far, so I guess it did. The book I'm focusing on here is Build Your Own .NET Language and Compiler, and please, don't click the link and then go buy it. I don't care about the 50 cents' worth of referral money I'll get if you do. I wouldn't even recommend the book if I got 50 bucks of referral money (well, money talks, so maybe I would).
The book starts out with the basics of parsing and regular expressions and all that jazz. But the extent of the code is a bunch of screenshots. We are writing a parser/compiler, dang it; we aren't WYSIWYGing our way through life at this point, so you have to show some real friggin' code. What you end up with is a bunch of screenshots of many tools for writing a compiler, but not really the code, unless of course you go grab the CD and wade through all of the code without a lick of explanation from the book. God, I hope the code is well documented with comments, or you just bought an issue of Compiler's Illustrated, and this isn't the Swimsuit edition. I'll include some of my own links at the bottom, where I give actual code for many of these processes.
OK, so you get to see a bunch of tools, and what do you get? Well, you get a bunch of half-assed tools (sorry for the language if your kid is reading my highly technical blog... In fact, if he/she is, I could use some interns: must type 50+ WPM and be proficient at C, C++, or C#). A mathematical expression evaluator is the first. I think it is always the first. People always trivialize math. So make sure you look at all the pretty pictures and try to glean some wisdom from the text. I have a mathematical expression evaluator, by the way. It's called calc.exe, and from what I can tell it has shipped since 16-bit Windows. He also makes an attempt at a regular expression workbench. You can't have enough of those (actually I'm not being sarcastic here, I always appreciate a new regex tool), but then he never writes or demonstrates any compiler technology that uses regular expressions. Does he go into NFA/DFA technology? Well, he does talk about it for a few sentences. BNF format? Again, a few sentences here and there. But wait, another tool is what you get, and this time it is a picture of a drop-down menu with all sorts of really tantalizing names (convert from BNF to XML, display a BNF parse tree, display formatted docs, etc.). At this point, use one of the pages to catch the drool coming off your lip, because that is as close as you'll get in this book to anything cool.
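For the record, here is roughly what "compiler technology that uses regular expressions" could look like: a tiny lexer. This is my own minimal sketch in Python, not anything from the book or its CD. The token names (NUMBER, IDENT, OP, SKIP) and their patterns are assumptions I picked for illustration.

```python
import re

# Hypothetical token kinds and patterns, chosen only for this illustration.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/()]"),
    ("SKIP", r"\s+"),
]

# One alternation of named groups; the group that matched names the token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise on characters no pattern accepts."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("3 + 4.5*x")` yields a NUMBER, an OP, another NUMBER, another OP, and an IDENT. The regex engine does the NFA/DFA-style matching work for you, which is presumably why a book would bring up regular expressions in the first place.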
OK, so forget the tools. At some point he actually starts talking about real compiler technology. I think around chapter 7 maybe? I really should dig up the TOC on Amazon, but I'm only going to waste enough time on this book to finish this posting. Anyway, they start talking about the various parsing techniques. Recursive descent (RD), Top-Down, Bottom-Up... I think there are some other odd names they throw in there to mystify the reader. After reading all of the major compiler design books I shouldn't be mystified by something that could classify as a 4 Dummies book (unless it is something like Cross Dressing 4 Dummies, I could probably use that after my Halloween party)... Anyway, they really don't do the entire process justice, and I think at some point some more tools are used, Yacc might be mentioned, and bam, back to the pictures.
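Since recursive descent came up, here is a hedged sketch of how small one can be. It is my own illustration, not the book's code, and the grammar is an assumption: expr is terms joined by + or -, term is factors joined by * or /, and factor is a number, a parenthesized expr, or a negated factor. Each grammar rule becomes one function.

```python
import re


def evaluate(text):
    """Evaluate an arithmetic expression with a hand-written recursive descent parser."""
    # Simplification for this sketch: characters that match nothing are silently skipped.
    tokens = re.findall(r"\d+(?:\.\d+)?|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        # factor := NUMBER | '(' expr ')' | '-' factor
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "-":
            return -factor()
        if token is None or token in ("+", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return float(token)

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return result
```

For example, `evaluate("2 + 3 * (4 - 1)")` works out to 11.0, with precedence falling out of which function calls which. That is the whole trick of the technique, and it fits on one page without a single screenshot.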
At this point I want to identify the worst problem I found throughout the entire book. Apparently the author didn't have time to finish the code, so they left a bunch of exercises for the reader. Nah, nah... You don't leave the compiler as an exercise in a book on how to write a compiler. You leave bits and pieces, but not the important stuff. Going through my Knuth books, I'm actually surprised when he leaves problems as exercises that require more know-how than what has been provided in the chapter. I don't mind exercises for the reader, but there is a limit, people. Imagine getting back from Home Depot with a 300-page picture book on building a house, one that had a bunch of pictures of completed homes and some text offering that the building of the house will be left as an exercise for the reader. Doh!
At the end of the book, when it is apparent I'm not going to get anything of use, it starts talking about code generation. Oooh, something with some meat. In reality, they've been naming their nodes for the calculator in such a way that the name of the node was pretty much the name of the op code that was going to be called. They may have some Quick Basic implementation code spits as well, but I'm confused at this point (and mystified) because I've been thumbing this book for an hour. In reality, the act of spitting IL is probably worth an entire book of its own (oh wait, it is: Inside Microsoft .NET IL Assembler, and you really should buy this one so I get 50 cents). That isn't fair, because that book is actually about how IL functions and not how to spit it. But I'd think one does precede the other, since eventually you're going to run out of node names to match to IL op-codes, and when opComplexOperation isn't mirrored by OpCodes.ComplexOperation, I just don't know what you'll do.
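To make the opComplexOperation complaint concrete, here is a minimal sketch of the alternative: an explicit lowering step instead of a naming convention. It is my own illustration in Python and only produces a text listing, not a real assembly. The tuple-shaped AST and the "square" node are hypothetical; the mnemonics it prints (ldc.r8, add, sub, mul, div, dup) are real IL instructions.

```python
# Hypothetical AST shape for this sketch: ("num", value), (op, left, right),
# or ("square", operand). None of this comes from the book.
BINARY_OPCODES = {"add": "add", "sub": "sub", "mul": "mul", "div": "div"}


def emit(node, out=None):
    """Post-order walk that appends IL-style stack instructions to a list."""
    if out is None:
        out = []
    kind = node[0]
    if kind == "num":
        out.append(f"ldc.r8 {float(node[1])}")  # push a float64 constant
    elif kind in BINARY_OPCODES:
        emit(node[1], out)  # left operand ends up on the stack first
        emit(node[2], out)  # then the right operand
        out.append(BINARY_OPCODES[kind])
    elif kind == "square":
        # No single opcode squares a value, so lower the node to several.
        emit(node[1], out)
        out.append("dup")
        out.append("mul")
    else:
        raise ValueError(f"no lowering for node kind {kind!r}")
    return out
```

With this, `emit(("square", ("num", 5)))` produces a constant load followed by dup and mul. The point is that the node name and the opcode name are free to disagree, because the mapping lives in code rather than in the spelling of the node.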
How fair of a review is this? Well, I've read actual compiler books, quite a few of them. I've implemented my own parsers and compilers many times for many different circumstances. I don't think it is a hard process, and I think extending the process to a more general development audience is important. There should be a relatively accessible book on writing your own .NET languages, but this book is certainly not it. I'll keep looking around. I hear there is another book focused on .NET language generation, and I'll have to search it out. Maybe an O'Reilly publication? Can you get an accurate review from something in about an hour's time? Well, I read fast, the words were quite large, most of the content was entirely familiar, and only about 30% of the page material was text, so I'd hope so. Take this for what it is worth, but if I see any referral money for that book, I'll know someone is going to be laughing hysterically when they get that book in 2-3 days from Amazon.
PS: I didn't and won't buy the book. I spent a couple of hours at Borders today running through two books that caught my eye when I was really looking for a great .NET Localization book. I need to dig up Michael Kaplan, since I'm sure he has written something somewhere.
Lexer/Parser/Compiler - Code and articles for different types of parsers
Lexer, Parser, Compiler, Oh My! - Postings, with code, on even more lexer/parser stuff
ftp://ftp.cs.vu.nl/pub/dick/PTAPG/BookBody.pdf - A more hard-core text on parser technologies