March 2005 - Posts

Oleg "XML MVP" Tkachenko nailed it. I'm a "resource waster" and proud of it.

I think the time has come where we can actually afford to trade off performance for maintainability. Although many Über-experts (Yourdon, Jackson, etc.) have been saying this for thirty years, I wasn't a true believer myself until a few years ago when I realized exactly how fast hundreds of millions of clock cycles per second is. It's absurdly fast.

But nowadays we don't deal in 100,000,000's of clock cycles. Heck, you can't even give a few-hundred-megahertz computer to charity anymore. No, we're in the age of billions of cycles per second, pipelines twenty or more stages deep (actually, I think that was old news as of the Pentium 4), dual- and multi-core processors, and so on. A new consumer desktop computer from Dell can perform a good five, maybe six billion floating point operations per second.

Let's put that in perspective. If you were to print out 5,000,000,000 floating point problems (482.224 x 127.8038) at fifty per page, the stack of paper would be 6.2 miles (10km) tall*. That's the equivalent of twenty-six Empire State Buildings floor to tip. All computed in a single second. Is that fast enough for ya?

Yet some folks, like Mr. Tkachenko, still worry about speed. In response to my post about how exceptions really aren't bad to use, Oleg countered, explaining exactly what happens when you throw an exception:

  • Grab a stack trace by interpreting metadata emitted by the compiler to guide our stack unwind.
  • Run through a chain of handlers up the stack, calling each handler twice.
  • Compensate for mismatches between SEH, C++ and managed exceptions.
  • Allocate a managed Exception instance and run its constructor. Most likely, this involves looking up resources for the various error messages.
  • Probably take a trip through the OS kernel. Often take a hardware exception.
  • Notify any attached debuggers, profilers, vectored exception handlers and other interested parties

I have to admit, that does sound rather intimidating. I honestly had no idea that it did all that for just a lousy exception. But let's put it into perspective: billions of operations per second. Yes, this sounds like a lot of stuff for an exception. Yes, it is a lot. Yes, your computer can handle all that ten times over so freakin' fast you wouldn't know it happened.

Exceptions are the ideal way of dealing with exceptional circumstances. If your database stored procedure yells "Documents status may not be altered while an approval is pending," exceptions are the simplest way to wrap that up, jump across your tiers, and display the message directly to the client. Any other way of doing it would be needlessly more complicated.
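
To make the "jump across your tiers" part concrete, here is a minimal sketch in C#. Everything in it is hypothetical: the TierDemo class and its three methods are made-up stand-ins for a data tier, a business tier and a presentation tier, and the "stored procedure" is simulated with a plain throw rather than a real database call.

```csharp
using System;

public static class TierDemo
{
    // Data tier (hypothetical): stands in for a stored procedure that raises an error.
    public static void UpdateDocumentStatus(bool approvalPending)
    {
        if (approvalPending)
            throw new InvalidOperationException(
                "Documents status may not be altered while an approval is pending.");
    }

    // Business tier: contains no error-handling code; the exception passes straight through.
    public static void ChangeStatus(bool approvalPending)
    {
        UpdateDocumentStatus(approvalPending);
    }

    // Presentation tier: the single place that catches and decides what the user sees.
    public static string HandleRequest(bool approvalPending)
    {
        try
        {
            ChangeStatus(approvalPending);
            return "Status updated.";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(HandleRequest(false));
        Console.WriteLine(HandleRequest(true));
    }
}
```

The middle tier is the point of the sketch: with return codes, ChangeStatus would need its own check-and-forward logic for every call it makes.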

So next time someone tells you "it's faster this way," put it in the "Gigahertz Perspective" and counter with "but, is it better?"

(*) 5,000,000,000 FLOPS
  / 50 lines per page
  * 0.1mm per page (standard for 20lb)
  / 1,000,000 mm per km
  = 10 km

 Empire State Building = 1250 feet
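
For anyone who wants to check the footnote, the same arithmetic can be sketched in C#. The PaperStack class and its method names are made up for illustration, and the 3280.84 feet-per-kilometer conversion factor is my addition; the other figures are the ones given above. It works out to 10 km and roughly 26 buildings.

```csharp
using System;

public static class PaperStack
{
    const double MillimetersPerKilometer = 1000000.0;
    const double FeetPerKilometer = 3280.84;   // conversion factor, not from the footnote
    const double EmpireStateFeet = 1250.0;     // figure used in the footnote

    // Height of the printed stack in kilometers.
    public static double HeightKm(double problems, double problemsPerPage, double mmPerPage)
    {
        double pages = problems / problemsPerPage;
        return pages * mmPerPage / MillimetersPerKilometer;
    }

    // How many Empire State Buildings fit into a stack of the given height.
    public static double EmpireStateBuildings(double km)
    {
        return km * FeetPerKilometer / EmpireStateFeet;
    }

    public static void Main()
    {
        double km = HeightKm(5e9, 50, 0.1);
        Console.WriteLine("{0:F1} km, {1:F1} Empire State Buildings",
            km, EmpireStateBuildings(km));
    }
}
```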

A while back, I went to a local .NET event which had a number of presentations given on a variety of topics. I attended an intermediate-level talk presented by an out-of-town MVP that was entitled "Advanced .Net Programming," or something like that. One of the sub-topics discussed was error handling, to which our MVP had some rather simple advice: don't throw exceptions. This seemed to be some rather peculiar advice, especially considering how exception handling was such an integral part of the .NET Framework, so I interrupted the speaker to ask for some clarification. He explained a bit further saying essentially that exceptions kill application performance and that you should use return codes because they are faster.

For most attendees, such reasoning would suffice. But not this one. No, it still didn't make sense. After all, I've used exceptions quite extensively to pass messages from the database all the way to the client, and I've never noticed a performance problem. Could it be that I perceive things in fast-motion? Maybe I'm oblivious to the fact that all of my applications are sluggish because an hour to me is what a minute is to you?

Something's not right. Either our MVP cares a bit too much about speed or my perception of time is completely out of whack. What better than a few objective tests to figure this out? So I created a console app and started coding ...

First, I needed some code that did nothing. Well, not nothing, but nothing important.

Sub DoNothingImportant()
 Dim x, y, z As Integer
 y = 8432
 x = 17751
 z = (y + 112) * (x - 2040) / (Math.PI)
 Dim s As String
 s = "blah blah blooh"
 s = s & "beep"
End Sub

This shouldn't take too long to run, right? Let's find out:

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
DoNothingImportant()
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:35:33.5930062
10:35:33.5930062

Sheesh, it goes so fast I can't even measure it. For fun, I thought I'd see how fast I could do the simple arithmetic in DoNothingImportant(). Unfortunately, manual decimal long division is not like riding a bike. As it turns out, I have no idea how to go about dividing 3.1416 into 134,234,784 by hand. I'm ashamed and I'm embarrassed. For this very reason, I have decided not to share with you how long the multiplication portion took me. Did I mention I have a minor in mathematics?

To make myself feel a little better, let's watch the computer choke on doing this arithmetic 100 times in a row!

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
For i As Integer = 1 To 100
 DoNothingImportant()
Next
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:37:05.5096764
10:37:05.5096764

Damn you gigahertz! Back to the topic at hand though. Let's go kill our performance with exceptions:

Sub ThrowException()
 Try
  Throw New Exception
 Catch ex As Exception
 Finally
 End Try
End Sub

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
ThrowException()
DoNothingImportant()
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:40:18.5456990
10:40:18.5456990

Hmm. Weird. It looked instant to me and the computer. Ok, how about if we throw 10 exceptions?

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
For i As Integer = 1 To 10
 ThrowException()
 DoNothingImportant()
Next
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:46:11.7123974
10:46:11.7123974

Sheesh. This computer is frickin amazing. Ok, maybe it's not exceptions that kill performance, but nested exceptions. Let's find out:

Sub ThrowRecursiveExceptions(ByVal count As Integer)
 If count < 1 Then Return
 Try
  Throw New Exception
 Catch ex As Exception
  ThrowRecursiveExceptions(count - 1)
 Finally
 End Try
End Sub

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
DoNothingImportant()
ThrowRecursiveExceptions(10)
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:48:45.8648346
10:48:45.8648346

Ok this is getting ridiculous. I was told by an expert that exceptions would kill performance. What gives? Let's increase the magnitude:

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
For i As Integer = 1 To 100
 ThrowException()
 DoNothingImportant()
Next
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:51:56.5876694
10:51:56.6377384

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
DoNothingImportant()
ThrowRecursiveExceptions(100)
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:52:34.6477522
10:52:35.2385664

Finally! A measurable difference! Granted, it's still only measured in hundredths and tenths of a second respectively, but this is progress. Let's kick it up a notch:

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
For i As Integer = 1 To 1000
 ThrowException()
 DoNothingImportant()
Next
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:55:17.5446078
10:55:18.0152564

Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))
DoNothingImportant()
ThrowRecursiveExceptions(1000)
Console.WriteLine(Now().ToString("hh:mm:ss.fffffff"))

Output:
10:57:38.0252702
10:58:39.9205680

Ok. Throwing 1000 nested exceptions takes a long frickin time. Maybe this is what our MVP was talking about?

Now, let's review what we learned:

  1. Modern computers are fast. Really fast. Really, really, really, really, really fast.
  2. Long division is hard. Really hard.
  3. Throwing one exception won't affect performance.
  4. Throwing ten exceptions (nested or otherwise) won't affect performance.
  5. Throwing one hundred exceptions (nested or otherwise) probably won't affect performance.
  6. Throwing one thousand nested exceptions will most definitely cause your application to perform slowly.
  7. The call stack actually supports 1000 levels of recursion.
  8. Some people don't believe Lessons #1, #3, and #4.
  9. An individual's title does not automatically mean they have any clue what they're talking about.
  10. If someone ever says "because it's faster," think of Lessons #1 and #9 and laugh.

Note: I "primed" each method before running it in order to not have JIT compilation included in the time tests.
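
A note on the identical timestamps: on Windows, the clock behind Now() typically only advances every ten to fifteen milliseconds or so, so anything quicker than that shows up as no time at all. Here is a sketch of the same kind of test with a finer-grained timer. It is C# rather than VB, and it assumes a runtime that has System.Diagnostics.Stopwatch (.NET 2.0 or later) and the Action delegate; the ExceptionTiming class and its method names are made up for illustration. I'm not quoting numbers from it, since they depend entirely on the machine and on whether a debugger is attached.

```csharp
using System;
using System.Diagnostics;

public static class ExceptionTiming
{
    // Throws and immediately catches one exception, like ThrowException() above.
    public static void ThrowAndCatch()
    {
        try { throw new Exception(); }
        catch (Exception) { }
    }

    // Runs the action the given number of times and returns the total elapsed time.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // prime once so JIT compilation is not part of the measurement
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        foreach (int n in new[] { 1, 10, 100, 1000 })
        {
            TimeSpan elapsed = Measure(ThrowAndCatch, n);
            Console.WriteLine("{0,5} exceptions: {1:F4} ms", n, elapsed.TotalMilliseconds);
        }
    }
}
```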

I've made a handful of modifications to the Community Server software that I use as a back-end to a website of mine (TheDailyWTF.com). Since quite a few people seem to be using this software, I thought I'd share what I've done. If you find this post valuable, let me know in the comments and I'll post more in the future.

One of the first things I noticed when switching from .TEXT to CommunityServer::Blogs Beta 2 was that it lacked the ability to sort threads by the date of the first post. Since I have a custom RSS feed that displays the ten most recent posts, this definitely would not do. So I'll explain how I added this very easy change ...

Step one is to edit the Enumerations\SortThreadsBy.cs (in CommunityServerComponents) and add the PostDate member:

public enum SortThreadsBy {
PostDate, // new
LastPost,
ThreadAuthor,
TotalReplies,
TotalViews,
TotalRatings,
Subject }

Next, we need to change the data access code. This is in ForumsSqlDataProvider.cs in the SqlDataProvider class. Go to the overloaded "public override ThreadSet GetThreads" method and locate the region "#region Order By and Active Topics". Just add the following block under "switch (sortBy)":

case SortThreadsBy.PostDate:
  if (sortOrder == SortOrder.Ascending)
    orderClause.Append("PostDate");
  else
    orderClause.Append("PostDate DESC");
  break;

Finally, we need to add the option to the picklist. This change is appropriately in Controls\Utility\ThreadSortDropDownList.cs in the CommunityServerForums project. The items are added in the constructor, so you can easily modify it to this:

public ThreadSortDropDownList() {
// Add countries //
Items.Add(new ListItem(CommunityServer.Components.ResourceManager.GetString("ThreadSortDropDownList_LastPost"), ((int) SortThreadsBy.LastPost).ToString()));
Items.Add(new ListItem("Post Date", ((int) SortThreadsBy.PostDate).ToString()));
Items.Add(new ListItem(CommunityServer.Components.ResourceManager.GetString("ThreadSortDropDownList_StartedBy"), ((int) SortThreadsBy.ThreadAuthor).ToString()));
Items.Add(new ListItem(CommunityServer.Components.ResourceManager.GetString("ThreadSortDropDownList_Ratings"), ((int) SortThreadsBy.TotalRatings).ToString()));
Items.Add(new ListItem(CommunityServer.Components.ResourceManager.GetString("ThreadSortDropDownList_Views"), ((int) SortThreadsBy.TotalViews).ToString()));
Items.Add(new ListItem(CommunityServer.Components.ResourceManager.GetString("ThreadSortDropDownList_Replies"), ((int) SortThreadsBy.TotalReplies).ToString()));
}

Note that I hard-coded the "Post Date" label instead of adding it to the resource file and looking it up through ResourceManager like the other items. Since I don't plan on supporting more than one language on the site, there's really no need to do this.

And that's that. Told ya it was an easy modification ;-).

I think I may have been a bit harsh yesterday in my review of Practical Guidelines and Best Practices for Microsoft Visual Basic and Visual C# Developers. Although I stand by my statements, I wanted to expand on and clarify things based on some of the feedback I received (especially one from the author).

First and foremost, there are lots of good tips in this book on doing a whole variety of things from remoting to threading. In fact, there are many more good tips than there are bad. But when you title your book "Best Practices," you really can’t afford to have one bad paragraph (let alone the number I found) in your book. With the countless books to choose from, readers will look to a "Best Practices" book as the standard for development. Therefore, such a book must be judged on a higher standard than others.

One of the first things I touched on was credibility (of which I said the authors had none). Now, I didn’t intend for this to be demeaning or insulting; it’s simply a statement of fact. There are few (if any) among us who have the credibility to make statements of fact without any backing. Just as your professors would say, you need to reference and back up your facts. Yes, "MSDN Regional Director" is a prestigious title, but it certainly is nowhere near "Turing Award Recipient." And from reading some of their books, even those folks qualify their statements of fact.

In a number of cases, the authors did provide a reason for the "magic number" they chose. For example (and I do not have the book in front of me), they said to not have more than 64 local variables per method. The reason they gave was that the JIT compiler has to use a less-efficient method of allocating memory. Fair enough, but this is the wrong primary reason to give for not having more than sixty-four local variables per method.

Let’s think about that for a minute. Do we really want a developer to think, gee, I’d love to add a sixty-fifth variable to my method, but that’ll just kill JIT performance? What the authors should have said was that managing sixty-four variables in a method makes code hard to fricken’ follow. Harder to follow means more bugs. Had the authors bothered to look at nearly forty years of computer science research, they would have discovered that maintainability vs. bugs vs. speed is the subject of many, many studies.

Francesco Balena’s response to my criticism of his focus on speed was:

"[M]ost of the techniques you consider as questionable can make your code run faster by at least 50%, or more. If the offending statement appears in a tight loop they can save you a significant amount of time, not just a few CPU cycles. In a server-side component this sort of optimization makes the difference and can positively affect scalability - I am surprised you missed the point."

I respectfully disagree, Francesco. You missed my point. The book is targeted not towards the gurus and experts, but towards the beginner and intermediate level developers. For this I will quote the two rules of optimization from M.A. Jackson (Principles of Program Design, 1975):

Rule 1. Don't do it

Rule 2. (for experts only) Don't do it yet.

Catch the date on there? Even in 1975 experts understood the problems with the authors’ line of thinking. Believe me when I say your customer would prefer code that works to code that saves forty nanoseconds out of three milliseconds. Yes, tuning and optimization are important, but they are not an appropriate theme for a book with this target audience.

This is exactly what I was thinking when I described the book as "dangerous." If it had a different title, I’d give the book a "C" rating and say it’s "okay." But as a "Best Practices" book, it will encourage developers to think about "is it good for the JIT" instead of "is it good for my successor." I’ve seen code like that. I’ve seen the cost of code like that. Believe me when I say it’s not pretty.

As you may or may not know, I generally criticize things on a daily basis. But this post is much different than my normal (hopefully witty) commentary on bad code. Today I'm pointing out some serious flaws in a book I came across called Practical Guidelines and Best Practices for Microsoft Visual Basic and Visual C# Developers that make the book not only bad (which I can live with) but dangerous for an inexperienced developer. Not only does it fall short of "best practices," but it actually advocates bad practices. In summary, DO NOT BUY THIS BOOK. And if an enlarged & emboldened statement isn't enough, read on ...

Although Balena and Dimauro are MSDN Regional Directors (a prestigious title, I think), they are nowhere near "IT legend" status (unlike Don Knuth, Andrew Tanenbaum, Ed Yourdon, etc.). You would think they would supplement their lack of credibility with cold hard facts and research, but no, they just seem to make up arbitrary figures (pp 139, "avoid methods longer than 50 executable statements") and shove them in the book. And what's with 50? Why not 60? Why not 45? Look at a good book (Code Complete by Steve McConnell) and see that Steve backed up his best practice on method length with six different studies. SIX!! You won't find a single reference in this book.

On to the content: the authors use the words "always" and "never" way too much. The architects behind the .NET Framework didn't put in "forbidden fruit" that looks tempting but should never be used. For example, on page 325, they say that you should never catch the ThreadAbortException. What they don't mention is that catching ThreadAbortException (and ThreadInterruptedException for that matter) is the *only* way to gracefully clean up and exit from a sleeping background thread using unmanaged resources. This happens more often than you'd think; if you're interfacing directly with specialized hardware that takes a long time to process a job (like a printer), you would put that in a background thread that waits for the printer to finish. If the user closes the application (and the background thread doesn't catch the Thread exceptions), your hardware could be hung up.
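
Here's a minimal sketch of that pattern in C#, using ThreadInterruptedException. I picked the interrupt flavor on purpose: Thread.Abort only works on the .NET Framework (newer runtimes throw PlatformNotSupportedException instead), while Thread.Interrupt is available everywhere. The PrinterWait class, the DeviceReleased flag and the infinite sleep are all made-up stand-ins for real hardware code.

```csharp
using System;
using System.Threading;

public static class PrinterWait
{
    // Hypothetical flag standing in for "the unmanaged device handle was released".
    public static bool DeviceReleased;

    // Background work: block until the (imaginary) device finishes.
    static void WaitForDevice()
    {
        try
        {
            Thread.Sleep(Timeout.Infinite); // stands in for a long hardware wait
        }
        catch (ThreadInterruptedException)
        {
            // We were asked to stop while sleeping: release the device and exit quietly.
            DeviceReleased = true;
        }
    }

    // Starts the worker, then shuts it down the way an application exit might.
    public static void Run()
    {
        Thread worker = new Thread(WaitForDevice);
        worker.IsBackground = true;
        worker.Start();

        worker.Interrupt(); // takes effect as soon as the worker is (or becomes) blocked
        worker.Join();      // wait for the cleanup code to finish
    }

    public static void Main()
    {
        Run();
        Console.WriteLine("Device released: " + DeviceReleased);
    }
}
```

Without the catch block, the interrupt would surface as an unhandled exception on the worker thread and the release code would never run.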

Another example, on page 191: "All events must define two parameters. The first parameter is an Object named sender; the second parameter is an instance of the EventArgs type (or a type that derives from EventArgs) and must be named e." Wrong, wrong, wrong! An event can have any number of parameters of any type with any name. That may be the way *most* events are handled, but it's certainly not how you *must* handle events. A lot of times there is no need for a sender reference or event arguments.
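
To show that the language really does allow this, here is a small C# sketch with one event that takes a single int and another that takes nothing at all. The Worker class and the ProgressHandler and FinishedHandler delegates are invented for this example; the sender/e shape remains a convention, not something the compiler enforces.

```csharp
using System;

// Delegate types with custom signatures: no sender, no EventArgs.
public delegate void ProgressHandler(int percentComplete);
public delegate void FinishedHandler();

public class Worker
{
    public event ProgressHandler Progress;
    public event FinishedHandler Finished;

    // Reports progress in 25% steps, then signals completion.
    public void Run()
    {
        for (int percent = 25; percent <= 100; percent += 25)
        {
            if (Progress != null) Progress(percent);
        }
        if (Finished != null) Finished();
    }
}

public static class EventDemo
{
    public static void Main()
    {
        Worker worker = new Worker();
        worker.Progress += delegate(int percent) { Console.WriteLine(percent + "% done"); };
        worker.Finished += delegate { Console.WriteLine("finished"); };
        worker.Run();
    }
}
```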

They also seem to really like the words "right"/"wrong" and "correct"/"incorrect". Let's take a look at page 140:
  ' *** Wrong: uses incorrect casing, type is embedded in parameter name.
  Sub PerformTask(ByVal UserName as String, ByVal boolIsAdmin as Boolean)
  ' *** Correct
  Sub PerformTask(ByVal userName as String, ByVal isAdmin as Boolean)
Their way is neither correct nor incorrect. It's simply a preference. It also happens to be the same preference that Microsoft spells out in MSDN, but they actually use the term "preference." Ironically, on the very next page (pp 141), they use a class named "frmMain." This is one of many examples where their preference doesn't line up with Microsoft's preference. MSDN prefers MainForm.

Next, the authors focus on "speed and efficiency" as if it's 1963 and a few CPU hours cost as much as a car. In the world of BILLIONS (1,000,000,000's) of cycles per second (GHz), it really is ok to "waste" a few cycles on making sure your code is easier to read and maintain. You wouldn't get that impression from reading this book as a beginner:
pp 127: "the as operator ... speeds up execution"
pp 149: "Return keyword enables the compiler to optimize your code more efficiently."
pp 251: "language specific functions perform worse"
pp 253: "the ChrW function is faster"
pp 257: "the CompareOrdinal static method ... is typically faster"
They never drive home the point that maintainability matters more than speed. 100 clock cycles will make ZERO difference on a 10 millisecond request.

I've also found that some parts of the book are hopelessly out of date. It would appear that the authors have not yet fully entered the world of .NET themselves. Take for example page 408, where they provide a rather "Classic ASP" method of setting the page title: <title><%=GetTitle()%></title>. The ".NET" way of doing this would be either using a literal (<title><asp:literal id=pageTitle runat=server /></title>) or making the title tag a server tag (<title runat=server id=Title/>).

It would also appear that the authors have little to no knowledge of relational database design. On page 380 they say "Don't use primary keys that have meaning for the end user, such as invoice number or the ISBN value." This is beyond absurd and defeats the whole purpose of "relational databases". Don't take my word for it, of course; ask any "Database 101" student or read it straight from the creators of relational databases (EF Codd and CJ Date). The authors' technique is as close to a COBOL-mainframe-flatfile method of development as you can possibly get and is, ironically, what relational databases were designed to fix.

And on top of all of this, they're inconsistent in following their own "best practices." On page 251 they say "don't use Visual Basic-specific string functions, such as Len, Left, and Mid." Flip the page (pp 253) to another tip and see "favor the ChrW function over the Chr function." No mention of using "Convert.ToInt16," which they suggested as a replacement just on the previous page. And speaking of using "Int16", they didn't follow their own advice of "Use 32-bit integer variables" as suggested on page 240. And why Int32? Because they're faster!

I really do hate to be so negative about this book. Yes, it does have some decent tips that will help an inexperienced developer. But at the same time it has some absolutely horrendous tips and there is absolutely no way that an inexperienced developer could distinguish between the good and bad. Steer clear of this book. Buy McConnell's Code Complete instead.
