Gunnar Kudrjavets

Paranoia is a virtue

July 2004 - Posts

Giving a personal touch to this "all about bugs" blog ;-)

Sometimes it’s good to associate people's faces with the blog entries they compose. When you see them walking in the cafeteria, you can say “Hey, you're the person who got it all wrong in your blog ;-)” Here are two pictures of me from last week’s vacation in San Francisco. My error in judgment (not using any sunblock) meant that after the second day my face was in constant pain and I looked very unhappy ;-) That’s what happens when you’re used to Seattle and only occasionally visit California.

San Francisco Bay Area rules!

Posted: Jul 31 2004, 09:12 PM by gunnarku | with no comments
Measuring the value of preventing the bugs

One of the things I’m constantly thinking about is how to measure the value of someone's actions which prevented bugs from reaching our customers. How do I quantify something like "If Alice hadn’t fixed this buffer overflow which Bob discovered, then three months from now we would have had to issue a security bulletin and spend $X as a result of it all"? It’s very hard (or rather impossible) to prove that if some bug hadn’t been discovered by us, then somebody in the outside world would have discovered it and there would have been a disaster on our hands. Well, when your utility accidentally overwrites the boot sector, it’s quite clearly a bad thing ;-) But what about bugs which don't have such a clear impact? Not every memory leak is a showstopper. Not every buffer overflow is a security hole.

A quite large percentage of bugs have a root cause which is relatively easy to fix: variables not initialized properly, a missing call to release some kind of resource, a string not terminated properly, etc. Here are a couple of situations we ran into while developing both external and internal tools:

  • A missing call to closesocket(), and the resulting leak of socket descriptors, caused us and our partners to spend days diagnosing the root cause of the issue. The fix was very simple, just one line of code.
  • Occasionally some of our BVTs crashed for no apparent reason. Every time we spent hours troubleshooting the problem, unable to understand why it happened. Finally we had an opportunity to capture the crash information and found out that the problem was a string which wasn’t properly terminated.
  • We had to spend a significant amount of time while shipping our first version because one of the components we were using had a tiny memory leak. Just a couple of bytes, but over a couple of days it accumulated so much that the OS started running out of virtual memory.
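
To make the first two situations concrete, here is a minimal C++ sketch of two small helpers that make these mistakes harder to commit. It is only an illustration: ScopeGuard, CopyTerminated, and CountReleasesDemo are hypothetical names, not code from the tools mentioned above, and a counter stands in for a real closesocket() call so the example stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>

// Runs a cleanup action when the guard goes out of scope, so the release
// call cannot be forgotten on an early return or an error path.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> cleanup)
        : m_cleanup(std::move(cleanup)) {}
    ~ScopeGuard() {
        if (m_cleanup) {
            m_cleanup();
        }
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> m_cleanup;
};

// Copies src into dst and always terminates the result, truncating if
// needed. Returns true only when the whole source string fit.
inline bool CopyTerminated(char* dst, std::size_t dstSize, const char* src) {
    if (dst == nullptr || src == nullptr || dstSize == 0) {
        return false;
    }
    const std::size_t srcLen = std::strlen(src);
    const std::size_t toCopy = srcLen < dstSize - 1 ? srcLen : dstSize - 1;
    std::memcpy(dst, src, toCopy);
    dst[toCopy] = '\0';  // the terminator is written on every path
    return srcLen == toCopy;
}

// Hypothetical demo: the counter increment stands in for closesocket(s).
inline int CountReleasesDemo() {
    int released = 0;
    {
        ScopeGuard guard([&released] { ++released; });
        assert(released == 0);  // resource is still "open" inside the scope
    }
    return released;  // the guard has run its cleanup by now
}
```

With a real socket the guard body would simply call closesocket() on the descriptor. The point is that the one-line fix lives next to the code that acquires the resource instead of at every exit path.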

Probably every programmer can continue this list with hundreds of examples. There’s nothing new here, and note that I’m being very modest when talking about the cost of bugs: I’m not talking about bugs which caused products to be late or things like Code Red or Nimda. If you have some time on your hands, check out "Collection of Software Bugs".

What tends to happen quite often IRL is that after we’ve hit some kind of blocking issue, somebody spends day and night chasing down a bug in his code and fixing it, and we pat him on the back and say "Good work, that’s the spirit!" instead of asking the question "If Trent had asked Eve to review his code before checking it in, would it still have happened?" Or let’s take a simplified example with two hypothetical development teams: team A and team B. Team A stores all their string constants in resource files because they think it’s the right thing to do going forward. Team B thinks "We’re US English only, let’s not bother." At some point the decision is made that the product needs to ship in international markets. Team A doesn’t have to do much when it comes to these string constants. Team B spends the weekend fixing the code, testers test it during the night, and finally it’s ready. Guess who looks like a hero in the public’s eye? Team B, of course, when the actual prize should have gone to team A because they followed proper engineering practices from the beginning.

But how do you effectively measure this? Do you take notes about everything during the entire year and later analyze it all? Can you even compare one person’s commitment to spend the entire weekend fixing bugs against another person’s thorough approach of using proper engineering practices to prevent these bugs from happening? Lots of things to philosophize about ;-)

Posted: Jul 31 2004, 08:16 PM by gunnarku | with 1 comment(s)
When does opening bugs become counterproductive?

Here’s one everlasting real-life problem that my colleagues and I are constantly trying to solve efficiently. The problem statement is very simple: when is the right time to stop looking for bugs in a specific component if we know that the component will be obsolete in a few months? Yes, I know that all software becomes obsolete after some period of time, but not all releases are ground-up rewrites, i.e., a noticeable amount of the code base stays the same. Also, please note that this is different from the classical question about when it's the right time to stop testing at all ;-)

Let me give you a manifestation of this situation IRL. Assume that you have a component C which you’re trying to test, and you know that during the next milestone it’ll be rewritten, or that major design changes affecting the majority of its code base will be applied to it. At the same time you’re also shipping this component to customers with the current release and you’ll need to support it for years to come.

The question is simple. Should you either:

  • Continue to hammer at the current implementation even after shipping and try to find all the places where access violations, race conditions, resource leaks, etc. may occur. Being informed and knowing about the existing bugs and their impact is always better than sailing in the dark. The downside here is that probably all the bugs you open will be resolved as "Won’t Fix" (unless it’s something groundbreaking), the triage team’s time will be wasted, and people will start questioning whether you’re working on the right things.
  • Use the following rationale: "The code will change anyway, resources are always scarce, and we should test the component so that it’s good enough for this release (Well, how do you quantify this, and how is "good enough for this release" any different from our usual quality bar?). If customer data indicates that they’re having problems with the old implementation, then we’ll fix the bugs which are discovered." The downside here is that if everyone testing the component knows that they’re working on something which will be obsolete in a couple of months, then it’s very clear that nobody will be spending long hours trying to discover every little bug.

There are no easy answers to this question. In my experience every situation is pretty much unique and there are lots of factors to consider. Anyone care to share how they approach this particular decision and what things are involved in making it?

Posted: Jul 22 2004, 05:55 PM by gunnarku | with 2 comment(s)
Basic sources to get answers about Speech Server related issues

Suddenly the realization came to me that I’ve been composing posts for this blog for the last 6 months, but I haven’t mentioned Speech Server much. The main reason is that we’re a brand new product, only now appearing in the price lists. Therefore, compared to other servers, we don’t have books written about us, knowledge base entries, tons of traffic in newsgroups, established user groups, etc. Being a new product also means that we don’t have any security vulnerabilities which are/were discovered outside Microsoft. Yet ;-)

A couple of newsgroup related things you need to know about Speech Server:

  • The best newsgroup to ask any questions about Speech Server is microsoft.public.netspeechsdk. Don’t let the name SDK confuse you: this is a perfectly valid place to ask questions, report bugs, and post feature requests. A number of people on our team monitor this newsgroup on a daily basis, so you should get a relatively fast response. If you don’t, then please use my "Contact" link and I’ll see what I can do to speed things up ;-)
  • There are two additional newsgroups: microsoft.public.speech_tech and microsoft.public.speech_tech.sdk. These newsgroups are mainly for SAPI and Speech SDK related questions.

Everyone can use Google, but here’s the PC Magazine (an authoritative enough source?) review of Microsoft Speech Server to save you some time. Hopefully the probability of somebody reading this post and using the freshly released Speech Server is greater than zero ;-)

Posted: Jul 13 2004, 10:50 AM by gunnarku | with no comments
Some interesting quotes about assertions

Each of us probably has our own passions when it comes to software engineering. Mine are assertions, design by contract, root cause analysis, and static source code analysis. It’s my sad belief that this is as close to a Silver Bullet as we’ll get during the next decade.

Based on that I also have to admit that I enjoy pretty much everything John Robbins has ever written, especially the first part of "Debugging Applications for Microsoft® .NET and Microsoft Windows®". Robbins has this thrilling style of writing where he mixes technical content with his peculiar sense of humor, untypical of most of the technical books I’ve ever read. Here are a couple of quotes from him (quotes without context are sometimes pretty hard to understand, so if you're looking for more, pick up the book):

To avoid bugs, however, I verify everything. I verify the data that others pass into my code, I verify my code’s internal manipulations, I verify every assumption I make in my code, I verify data my code passes to others, and I verify data coming back from calls my code makes. If there’s something to verify, I verify it. (Page 84)

My stock answer when asked what to assert is to assert everything. (Page 86)

Without assertions, I felt like I was programming naked, and I knew I had to do something about it. (Page 104)

Well, I think that John Robbins is definitely on my list of cool people ;-)

Posted: Jul 11 2004, 08:45 PM by gunnarku | with 1 comment(s)
New resolution type for bugs – "Not a bug"

[GK, 07/10/2004] Please apply 's/Not a bug/Invalid/g' while reading this post. The resolution type is meant to describe the quality of the bug report, not the correctness of the application's behavior (Thanks, Larry Osterman!) Lesson learned: read and reread the stuff you post ;-)

We’re currently in the process of restructuring our bug database and here’s one thing I have wanted to do for a very long time: add a new resolution type called “Not a bug”. I seriously doubt that this proposal will be accepted, but there’s no harm in trying ;-) The current set of values for resolution we’re using is the following:

  • By Design
  • Duplicate
  • External
  • Fixed
  • Not Repro
  • Postponed
  • Won’t Fix

When I put on my triage hat, I’m quite passionate about making a very clear distinction between different resolution types. The only way to get any meaningful statistics and take action based on them is to make sure that your data is correct. For years we’ve been encountering bug entries which we just can’t classify under the existing resolution types. They are bug entries which state some basic facts without specifying the expected and actual results; bug entries with content which doesn’t make sense to anyone in the room; general suggestions which are too broad to classify under the ‘Suggestion’ type; etc.

Typically we just assign the active bug we don’t understand back to the person who opened it and send a follow-up e-mail asking for additional information. There are a couple of problems with that:

  • People working in the test organization tend to care more about resolved bugs than active bugs (your test organization may vary of course). Based on my experience it takes more time to get a response regarding an active bug than a resolved bug.
  • Development leads and managers constantly monitor active bugs, and then we run into the "Why does an SDE/T or STE have an active product bug assigned to him? Is he/she going to fix it?" discussion.
  • If we used any other resolution like "Won’t Fix" or "Not Repro", this would imply that there actually is a bug in the product which we decided not to fix, or that we desperately tried to reproduce the problem but couldn’t. This would of course start playing tricks with my beloved figures ;-)

Personally I would like to resolve any bug the triage team doesn’t understand as “Not a bug” because this will keep everyone honest. Also I would assume that the term ‘Not a bug’ will have a bigger psychological impact than “Other” or something like that ;-)

The main justification I’ve been using for this is efficiency. Let’s say that there are 20 people in a triage meeting and in every meeting we spend 3 minutes discussing bugs nobody understands: deciding whether we should resolve them, whether we should ask for additional information, to whom to assign them, etc. In practice we just spent 1h of people’s time in total instead of making a quick decision and moving on.
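
The arithmetic behind that estimate can be written down as a tiny C++ sketch. TriagePersonMinutes is a hypothetical helper, and the third parameter (number of meetings) is an assumption added only to show how the cost scales; the post itself talks about a single meeting.

```cpp
#include <cassert>

// Total person-minutes spent when `attendees` people sit through
// `minutesPerMeeting` minutes of unproductive discussion in each of
// `meetings` triage meetings.
constexpr int TriagePersonMinutes(int attendees, int minutesPerMeeting,
                                  int meetings) {
    return attendees * minutesPerMeeting * meetings;
}

// The figures from the text: 20 people x 3 minutes = 60 person-minutes (1h).
static_assert(TriagePersonMinutes(20, 3, 1) == 60,
              "one meeting costs one person-hour");
```

If the team held, say, five such meetings a week (an assumed number), the same three minutes would add up to five person-hours.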

It’s a bug world and I think a good way to mock my post is to say that we should also be using tabs instead of spaces, because this will help us save hard disk space ;-) This is what I call self-critical cynicism ;-)

Posted: Jul 10 2004, 02:59 PM by gunnarku | with 14 comment(s)
Borland blogs at http://blogs.borland.com/

If you know this already then feel free to ignore this post, but I discovered just today that Borland has their Blog Central at http://blogs.borland.com/. This makes me very nostalgic. I spent the first two years of my career as a professional programmer writing applications mainly with Delphi 1.0 and Delphi 2.0 ;-)

Posted: Jul 06 2004, 06:18 PM by gunnarku | with 1 comment(s)
No more .plan files, public keys go to blogs

In the spirit of trying to be modern, here’s my GnuPG public key. .plan files and home pages are obsolete. Trendy people push their public keys right into the RSS feed ;-)

-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.2.2 (MingW32)

mQGiBEDmB38RBACrjvQf49zzmpW3zda8MliydaohciaUjd5JR4dQJOvY+4lkoLwI
BI2+y8fH3+TwQ0DWtWLxNO6o1n6Ps0T3+U++p1aSYTmi0pxBarxiDoyLqtbDPdLd
yXCNq8FMrrT9JW9CVGhl5q9fNvWVj9gkLpKch5vDD56exxdi/y+ZKM5ENwCgyBU0
BD33nDPhQHJaPqIBXj29dN0EAInPQ9LzNN8MyRtzexMloS4nv6hjpR6+kNrD6DPt
gvcja/eCYH5yuTmOdjV9MwvJdSoDDxAPJPQRMPdtKT2NVYA6J6at2dTJdu78AwY+
/mJXLBAovkKJyqf5R8UyHSsd6kXIOMN+cHwTsPUdz7Bqcb69toenWuwt5X6OhvI4
iC6HA/9cTZ+a0Wf7Fn9PgkrqX73GXwiqjjp6UtuvZGf+ldXKCBLijxpAlq/Cz5/Q
OwJaoYQcaAxlsuxS4LmC7w8SgLmLW8SM0oKwtyO12+FTdTCTYkpB0/x01ROwWNBv
7v/BNEDORcIe4QsyPpOcmoUYEUxX4q2KWuZ97EStdzwp6UA7i7QmR3VubmFyIEt1
ZHJqYXZldHMgPGd1bm5hcmt1QGdtYWlsLmNvbT6IWwQTEQIAGwUCQOYHfwYLCQgH
AwIDFQIDAxYCAQIeAQIXgAAKCRBrFn2QxwL4U8VxAJ0R1rfzq0MWpQB4zr08YNH9
xdeo5wCfSmIjscAltyBE88XE0NUY6nP0Xkq5Ag0EQOYHkxAIAIXkNhB5xNgYilc4
ANdxsIpufVFJZl4woTpOqfTTuYtmOTJ40/ZbKHXeRyFdNOZ3zD2iMh5hutL8rx6Q
ztLjMtBF93lr/TT/R5PQtqCImi3dwTtLMVuRkeY1uA3u9RShpYSxv41AGfrAZmx6
uyOeRNMqsUH3MGdGVW4GsDAAIQj/OOirf5XudanpPyVtKID8+A9VwY0qyuLqkM7t
5FdQA39yFLQc7BK1y3OEmv0FcXl9kvoUfRQQMILuM53ewWdBF20TXGgQfz7f39zW
DzqO1UzpfRkTsloBs2osVJWAJzyhRbxt7woCCA3LncA73JkjxIRxyMc/FbKHJoYa
Rqq116sAAwUH/0HnZN7pKeXNvYb1uQMy10GYPz9totW5H2ijHWHGh5+96AKytW0T
0tu3DkbWyFhZes2RXNoXJIl3GqxWv8s4aYWpUCACn96E+f4hIRNcHC9+8Du62c3P
ZK1pDArcFTW0ofpG8eD4aMtFleS2FDKeUEgzEQam9w/aNGuC15an86zYHgg+EtuU
UYnqrbHNC8a5JkQRe/neoxg+lKfwoeOvLi5xUj1JCfQRsTTuy65sKXBHkOXDh/4X
tttiOhj7PYDyOtj7WOom/ZN+PN5mhCCfj1VNO8aYgTAhd/TPQy/Qv2Ble/eroZYq
i47HP9iOLEsrEE66A22KJLnxPAzL3mUwRHiIRgQYEQIABgUCQOYHkwAKCRBrFn2Q
xwL4U/GnAKCt2UpSPlBQB9DGnPPD2b25570CLwCcCnVsHwBweemYy7NY0BpatmZP
vqM=
=i4in
-----END PGP PUBLIC KEY BLOCK-----

This is the key's fingerprint: E31F 29FE 74F6 60C9 9774 041A 6B16 7D90 C702 F853.

Posted: Jul 03 2004, 09:38 PM by gunnarku | with 10 comment(s)
Software Entropy: Don't Live with Broken Windows

When it comes to books, IMHO "The Pragmatic Programmer: From Journeyman to Master" by Andrew Hunt and David Thomas is one of the masterpieces anyone taking software development seriously must read. I’m pretty sure that every one of us has a personal mandatory reading list somewhere, and this one is definitely on mine. Hunt and Thomas describe the "Don't Live with Broken Windows" attitude. You can read the entire article, so I don’t need to retell the contents. Believe me, it’s worth reading.

My personal story with broken windows is the following: the first task I had shortly after starting at Microsoft in October 2000 was to develop an execution environment which gave the user a way to define different scenarios involving speech recognition engines and text-to-speech engines and execute them. In practice it was a silly little XML-based programming language and its interpreter. Needless to say, I was full of energy and excitement and took the entire thing very seriously. I tried to religiously follow some basic rules like:

  1. All member variables are prefixed with m_.
  2. When it came to indenting the code, I used only spaces.
  3. Every method had to have assertions as pre- and postconditions and also a number of invariants. I used only one specific macro for assertions. If you know something about the Microsoft Speech SDK then SPDBG_ASSERT should sound familiar.
  4. ...
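
For readers who haven't seen this style, here is a minimal C++ sketch of what rules 1 and 3 look like in practice. BoundedStack is a hypothetical class made up for this illustration, and the standard assert macro from <cassert> is used as a stand-in for SPDBG_ASSERT, whose exact behavior is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bounded stack, used only to illustrate the pattern.
class BoundedStack {
public:
    explicit BoundedStack(std::size_t capacity) : m_capacity(capacity) {
        assert(capacity > 0);  // precondition
        assert(Invariant());
    }

    void Push(int value) {
        assert(Invariant());
        assert(!IsFull());  // precondition: there is room for one more item
        const std::size_t oldSize = m_items.size();
        m_items.push_back(value);
        assert(m_items.size() == oldSize + 1);  // postcondition
        assert(m_items.back() == value);        // postcondition
        assert(Invariant());
        (void)oldSize;  // avoids an unused-variable warning when NDEBUG is set
    }

    int Pop() {
        assert(Invariant());
        assert(!IsEmpty());  // precondition: there is something to pop
        const std::size_t oldSize = m_items.size();
        const int value = m_items.back();
        m_items.pop_back();
        assert(m_items.size() == oldSize - 1);  // postcondition
        assert(Invariant());
        (void)oldSize;
        return value;
    }

    bool IsEmpty() const { return m_items.empty(); }
    bool IsFull() const { return m_items.size() == m_capacity; }

private:
    // Class invariant: the stack never holds more items than its capacity.
    bool Invariant() const { return m_items.size() <= m_capacity; }

    std::vector<int> m_items;
    std::size_t m_capacity;
};
```

Every public method checks the invariant on entry and exit, checks its own preconditions before doing any work, and verifies its postconditions before returning. All member variables carry the m_ prefix.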

Occasionally some people changed the code (added some features, fixed some bugs) and then we always had the following interesting situation: whenever somebody checked in code indented with tabs, I changed it to spaces; whenever somebody used ATLASSERT, I went and changed it to SPDBG_ASSERT; whenever somebody just commented out some block of code, I went ahead and removed it from the code base to keep it clean. It’s been three years since I touched this code base, but some of my old coworkers occasionally still remind me of my obsessive-compulsive behavior when they see me in the hallway ;-)

For me these little things represented the act of "breaking the window". To this day I haven’t figured out whether the effort I threw into this was worth it. Sometimes I comfort myself with the thought that "Yeah, RefactorMercilessly, go-go-go, I was so cool (and young and clueless)!" Sometimes I wonder whether these little things really matter and whether I wasted my time. Common sense, most of my colleagues, and all the articles and books written about good-enough software tell me that of course this was wasted effort, perfection is impossible in commercial software, focus on things in priority order, etc.

For me this is like a programmer's Zen koan. Hopefully one beautiful day enlightenment will come to me.

Posted: Jul 02 2004, 12:12 AM by gunnarku | with 2 comment(s)