Archives / 2004 / February
  • Paying money to development/QA team for finding bugs ;-)

    The title of one of the previous posts reminded me that I did a relatively unusual thing during the last milestone – starting from a specific date in the milestone, I offered $1 for every bug people could find in the product I’m working on (Telephony Application Services). The success criterion was that the bug would be accepted and fixed by the end of the milestone.

    As we got closer to the official milestone exit dates, I increased the sum to $5 per bug. In the end I wrote checks for ~$40 (I don’t remember the precise sum and am too lazy to look it up now). Some members of my team still have their checks hanging on the wall ;-)

    A couple of weeks later, while looking at my bank statement, I noticed that one person had actually cashed one $1 check and one $5 check. I was pretty amused by this ;-) Isn’t it great? Want to buy a shiny new BMW? All you have to do is find enough quality bugs ;-)


  • What is BVT (Basic/Build Verification Test)? (Part III)

    Grand finale. If I had to summarize everything I have to say in one sentence, it would be: paranoia is a virtue. Yes, repeat after me: paranoia is a virtue ;-) So, here’s our problem – how do you write code that is responsible for verifying the correctness of some other piece of code? I don’t have a good answer for this; if I did, I would be retired by now, living off the profits of “GK's patented technique for developing high-quality software”, spending my days reading good literature, writing some cool code, working on a PhD, raising kids, and going to the gym in the evenings ;-) Or maybe not, I'm too much of a workaholic anyway. Some people say that the only possible way to go is formal methods and languages like Z. Unfortunately, I’ve never met anyone (possibly I just socialize with the wrong people ;-)) who has successfully done anything with formal methods outside academic circles. The cost is too high, and there aren’t many knowledgeable people around either. Anyhow, here are a couple of pragmatic things we’ve learned during the last couple of years in our group:

    1. Do your absolute best while writing the BVT code. Pick your best people, give them more time than you usually would, and if somebody starts complaining about the time spent, think about this: how much time (which can be directly converted to money) is spent every time a false alarm happens? Somebody from the test organization needs to investigate the failure, somebody from the development organization needs to investigate as well, somebody needs to explain what happened and why, etc.
    2. Review, review, and review once more. Anyone who has ever read Steve McConnell’s “Code Complete” or any major papers by Capers Jones knows that formal code inspections and personal desk-checking of the code are among the best ways to detect defects. For references, see “The Software-Quality Landscape” in “Code Complete, 2nd Ed.”
    3. Check every single return value of every single API call that can fail (basics again, but nobody bothers to do it). Here’s the difference between writing production code and writing BVT code: in the BVT case you can afford to check for every possible error – performance isn’t as important as it would be for, say, a speech recognition engine. Believe me, Murphy's Law works; everything that can go wrong will fail ;-)
    4. Log information as detailed as possible for every failure (basics again). Timestamps, thread IDs, expected return value, actual return value, breaking automatically into the debugger if necessary, etc. I’ve seen my share of horribly written code which just says “Test failed” and that's it. That doesn’t get you any closer to the root cause or to resolving it quickly.
    5. Use automated check-in systems. In our group we use a system called "Gauntlet". To summarize shortly: "Gauntlet" is an automated way of building all the versions of your product and running a designated subset of test cases for every change to your codebase. Jonathan Caves has written a longer post about how "Gauntlet" is used in the Visual C++ team. We’ve been using "Gauntlet" for some time now, and it has proved extremely useful for the following reasons:
      • Build breaks are practically non-existent. Nothing gets changed in the main codebase unless all flavors of the build succeed.
      • We greatly reduced the number of BVT breaks due to actual problems in the product. Yes, it still happens, but the main causes are now infrastructure problems, hardware issues, and problems with the automation.
      • Psychologically, the stress level in the team has been reduced. While making major changes to some interface, for example, lots of people used to be concerned about how it would affect the build and BVT results the next day. Yes, you can do peer reviews, buddy builds, and all the other possible countermeasures, but there’s only so much a human can do in the time allocated for the task. Penalties for build and BVT breaks are quite severe, so nobody wants to break anything a second time ;-)
    6. Use a "rolling build" 24 hours a day. If you don’t have an automated check-in system, here’s what a basic implementation may look like. Set up an automated process which does the following:
      • Make sure that the source tree is synchronized and up to date.
      • Build everything and if there are any problems then send e-mail to your team.
      • Run selected set of tests on this fresh build and if there are any problems then send e-mail to your team.
      • Optional step: you may also want to send e-mail every time the “rolling build” succeeds. This indicates that nothing is broken yet. Lots of e-mail? Yes, but that's what e-mail filters are for.
      • Go back to the first step.
    7. Use a two-phase process to declare test cases as BVTs. The first phase is just letting the BVT be part of the daily processes for n days without treating the results it produces as official. After the team decides that the code looks solid enough – it has been passing for a number of days on all editions under a number of different configurations – the test case is "promoted" to official status. This approach also proved extremely useful in ironing out all possible issues and increasing the trustworthiness of the BVT code.
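    To make items 3 and 4 concrete, here is a minimal sketch of a failure-logging helper in portable, modern C++ (the function name and message format are my own invention, not from any real BVT framework):

    ```cpp
    #include <chrono>
    #include <iostream>
    #include <sstream>
    #include <string>
    #include <thread>

    // Hypothetical helper: format one failure with enough context to start
    // diagnosing it without re-running the test.
    std::string FormatFailure(const std::string& check, long expected, long actual)
    {
        const auto now = std::chrono::system_clock::now().time_since_epoch();
        std::ostringstream msg;
        msg << "FAIL: " << check
            << " | expected=" << expected
            << " | actual=" << actual
            << " | thread=" << std::this_thread::get_id()
            << " | ts_ms="
            << std::chrono::duration_cast<std::chrono::milliseconds>(now).count();
        return msg.str();
    }

    int main()
    {
        // Pretend some API call just returned an unexpected value.
        std::cout << FormatFailure("SomeApiCall() return value", 0, -1) << '\n';
        return 0;
    }
    ```

    A line like this in the nightly log tells the investigator what was checked, what was expected, on which thread, and when – instead of a bare “Test failed”.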

    Notice that I’m not talking about TDD (Test-Driven Development), unit testing, check-in tests, and other related subjects. That's mainly because my knowledge in these areas is mostly theoretical, and without seeing these techniques applied IRL over a couple of milestones, I don’t feel competent to talk about them. You may check Jay Bazuzi’s blog for TDD and Marco Dorantes's blog for interesting material on agile software development and design.


  • Using "Bug of the Month" column to determine somebody's C/C++ skills

    Those of us who regularly read either "Dr. Dobb’s Journal" or "C/C++ Users Journal" have definitely noticed the "Bug of the Month" column, where various bugs are given for the curious reader to track down. After remembering Eric Lippert’s article "Six out of ten ain't bad", I had this strange idea – what about picking 10 (or any other number of) random problems from the "Bug of the Month" column and using them as some kind of indicator of the C/C++ skills of the person being interviewed? Would it give better/broader/more random coverage than EveryonesFavouriteSetOfC/C++Questions? It’s definitely a good self-assessment tool ;-) Please note that I'm talking about specific technical skills only, not general qualities such as intellectual horsepower, problem-solving skills, creativity, etc. My main rationale is based on the idea that one should be able to prove that his/her knowledge is practical, applicable, and can be used to solve real-life problems. Fred Brooks once said: "A scientist builds in order to learn; an engineer learns in order to build."


  • Directing ATL/CRT assertions to file or debug output

    In real life the test organization runs overnight simulations, performance tests, or stress tests on your codebase. In some cases this is done with a debug build, which means the assertions in your code will be enabled. Quite often somebody starts the test run at 08:00 PM, returns the next morning, and discovers that after 30 minutes of execution a nice message-box appeared saying that an assertion failed ;-( As a net result your entire run is blocked and no actionable metrics are gathered. Sure, you need to investigate every assertion and understand what’s wrong, but sometimes you may not care. People assert all kinds of different things, and some invariants, pre-, and post-conditions may not be relevant to what you’re trying to accomplish. To paraphrase Orwell: all assertions are equal, but some assertions are more equal than others.

    The second common scenario IRL is that really paranoid (or should we say diligent?) people tend to write code like this:

    HRESULT SomeFunc(const CComBSTR& strFoo, int *pBar)
    {
        ATLASSERT(0 != strFoo.Length()); // Make some noise in debug builds.
        ATLASSERT(0 != pBar);

        if (0 == strFoo.Length())        // Handle the error properly in release builds.
            return E_INVALIDARG;
        else if (0 == pBar)
            return E_POINTER;

        // ...
    }

    Before adding code like this to a live codebase, a decent developer would really like to step through every code path in the debugger and make sure that all the error conditions are handled properly even without assertions enabled. While doing so, I personally get very annoyed by these message-boxes ;-) Well, how do you still make sure that assertions are triggered when appropriate, but without any visual side effects? Fortunately, the C run-time library provides everything you need: the _Crt* functions are the key, especially _CrtSetReportMode and _CrtSetReportFile, along with the documentation about the CRT assertion macros. Enough talk; let’s see how to direct assertions to the debug output:

    #include <crtdbg.h>
    #include <cstdlib>

    int main()
    {
        // The line which does all the magic.
        _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_DEBUG);

        int i = 0;
        _ASSERTE(0 != i);

        return EXIT_SUCCESS;
    }

    This is actually one of those features I always needed but lived for years without knowing already existed ;-( Moral of the story: one should read more MSDN and/or CRT source code. What about the ATL mentioned in the title? You’re lucky again; as MSDN says, “The ATLASSERT macro performs the same functionality as the _ASSERTE macro found in the C run-time library.” Looking at the file “atldef.h” in your nearest Visual Studio installation will confirm this.
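    For the “file” half of the title, here is a sketch (MSVC-specific, debug build; I'm using the predefined _CRTDBG_FILE_STDERR pseudo-handle for brevity, which I believe behaves the same as a real file handle obtained from CreateFile()):

    ```cpp
    #include <crtdbg.h>
    #include <cstdlib>

    int main()
    {
        // Send assertion reports to a file handle instead of a message-box.
        // _CRTDBG_FILE_STDERR is a predefined pseudo-handle; pass a handle
        // from CreateFile() here to log to an actual file on disk.
        _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE);
        _CrtSetReportFile(_CRT_ASSERT, _CRTDBG_FILE_STDERR);

        int i = 0;
        _ASSERTE(0 != i); // The report ends up on stderr; no dialog appears.

        return EXIT_SUCCESS;
    }
    ```

    With this in place an overnight run keeps going, and the next morning you can read all the assertion reports from the log instead of finding a blocked machine.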


  • What is BVT (Basic/Build Verification Test)? (Part II)

    In my previous post about testing I covered the basic definition of a BVT. Now let’s talk about how to make the entire process work in real life, with real people, taking into account the different agendas everyone has. Initially we practically consulted the Oracle (a politer way of saying that semi-random decisions were made ;-)) to pick some set of tests which “looked like BVTs” and mark them as BVTs in our internal test case management system. What eventually happens is that somebody writes a piece of code which breaks one of these test cases, and then you reach a situation where you’re forced to justify why this test case is very important and the product code must be fixed ASAP.

    After going through this situation a number of times and receiving inconsistent messages from all three disciplines (development, program management, and test), we decided it was time to become adults and put some processes in place. And by processes I don’t mean some horrendous amount of pointless work, just a few simple steps. The KISS principle rules. We ended up doing the following:

    1. Agreeing on the daily quality bar for the product across three different disciplines.
    2. Negotiating the set of test cases which will be used to measure this quality bar.
    3. Communicating widely what’s expected from everyone involved in the product development and testing.

    Reaching consensus on the daily quality bar. I would consider this the most important step. It’s vital that all three disciplines agree on what is a reasonable quality bar for the product. The quality bar depends on a number of factors: what milestone you’re in (is it three years till shipping or four months?), what part of the milestone you’re in (is it free coding season, or are only fixes for showstopper bugs accepted?), and a number of other conditions. As much as I hate to say it, “it depends” is the universal answer. It depends on what type of software you’re trying to build, the level of maturity in your team, the timetable for your internal and external deliverables, etc. To set your quality bar, you really need to take the time to analyze the major features of the product and the main user scenarios. The following simple criterion was used in our decision-making process: “If this feature/scenario doesn’t work, will the product still be usable for daily development and testing?” For example, in the case of our product (Microsoft Speech Server), one of the scenarios which can be classified as a BVT is: "The product must be able to start up, answer a phone call, play a prompt to the user, perform some speech-related operations (recognize speech, generate speech from text), and hang up the phone call." If the product can do all this every day, then developers and testers can probably use it to perform their daily duties. Similar examples can be constructed for a text editor, an operating system, an API for dealing with network sockets, etc. Initially this may seem like a very basic requirement, but one has to take into account that it’s very strongly desired that BVT tests pass every single day with every single edition of the product on every single platform you support. If you look at what happens in a typical software development process – interfaces between components change, the design changes, bugs are introduced and fixed, new features are added, etc. – it actually isn’t the easiest task to accomplish (as many of my esteemed colleagues can witness).

    Negotiating the usage of selected test cases. I won't go into the details of how to successfully negotiate or reach consensus within a group. Every good textbook on game theory, management essentials, or basic psychology can provide the knowledge needed. Our experience showed that in most cases the decision was an actual consensus; in some instances the majority voted. Again, it’s crucial that everyone understands why some scenario is important and must work every day. Somebody may even propose a new scenario which is necessary for that individual to get his/her work done.

    Communicating very widely what the expectations are for all disciplines when there’s a BVT break. One example set of rules for the development and test organizations may look like this:

    • By 09:00 AM the e-mail with daily BVT results is sent out to all interested parties.
    • In case of any failures, by 10:00 AM the person who wrote the test replies with the initial diagnosis: is this a bug in the product, a problem with the test case, or something else? All the relevant test case logs are handed over to the development team, and the machine is provided for the developer(s) to perform the initial investigation.
    • By 11:00 AM the developer(s) investigating the problem respond with their initial diagnosis (is this a bug, how far along is the diagnosis, what's the ETA for a full diagnosis and possibly a fix).
    • ...

    The key here is that some protocol must be in place, and all participants must accept the protocol and follow it.

    BVT implementation process. “Quis custodiet ipsos custodes?” - “Who will guard the guards themselves?” Here’s the worst thing that can happen: you write a piece of code which is later declared a BVT; one beautiful day this BVT fails, and the error logs from your code indicate that this is in fact a serious problem with the product; the entire development team is put under immense pressure – high-importance e-mails flying all over the place; people investigate the possible root cause, trying to understand what change may have caused this major feature to stop working. Two hours later somebody determines that the actual problem is that the code testing the product uses a hard-coded constant to wait for some operation to complete, instead of waiting for the proper event to be received. In most cases the test passed, but today it failed. Believe me, you don’t want to be the person known for developing non-deterministically behaving code which is responsible for certifying the correctness of other components in the system ;-) How do you do your best to prevent these kinds of situations from happening? Let’s see next week.


  • North America Wadokai Summer camp

    I'm going to Hilo, Hawaii this July. Wrong – not for a vacation, but for sweat, blood, and tears ;-) The North America Wadokai Summer Camp has been officially set for July 18th through the 24th. This event is sponsored by Wado Guseikai USA and the Canadian Zenkuren Wadokai Association and hosted by the Hawaiian Wadokai. An invitation is extended to any Wado practitioner on this planet. The following people will be teaching:

    • Dr. Hideho Takagi, 8th dan
    • Mr. Shinji Kohata, 6th dan
    • Mr. Koji Okumachi, 6th dan
    • Mr. Reza Salmani, 5th dan

    If anyone is interested, just send me a piece of e-mail and I’ll give you more details. And remember, there are only five months left to get into decent physical shape ;-)

    Updated information: the Summer Camp will actually be held in Lakeland, Florida, not Hawaii.


  • What is BVT (Basic/Build Verification Test)? (Part I)

    During the upcoming weeks I'm planning to compose a series of posts describing (a) what BVTs are; (b) how the different disciplines (development, program management, and test) come up with BVTs and achieve consensus; (c) how BVTs are executed, how results are interpreted, and what the impact of BVT failure(s) is; (d) what the characteristics of a good BVT are. The reader should note that I’m not prophesying the absolute truth; I’ll be talking about the way we do things in the Microsoft Speech Server team. Your mileage may vary. Different teams have different practices across Microsoft, and even outside Microsoft you should not take some specific methodology/process, bend your needs to fit it, and then follow it blindly. Blindly following a process reminds me of one software design project about 5 years ago, back in Estonia, when we were doing high-level software design using OMT and trying to squeeze out just three more state diagrams for the sake of having state diagrams in the final design document ;-) You don’t have to make this mistake.

    Here’s the definition we use: “A build acceptance test (sometimes also called build verification test a.k.a. BVT, smoke test, quick check, or the like) is a set of tests run on each new build of a product to verify that the build is testable before the build is released into the hands of the test team. The build acceptance test is generally a short set of tests, which exercises the mainstream functionality of the application. Any build that fails the build verification test is rejected, and testing continues on the previous build (provided there has been at least one build that has passed the acceptance test). So build acceptance tests are a type of regression testing that is done every time a new build is taken. Build acceptance tests are important because they let developers know right away if there is a serious problem with the build, and they save the test team wasted time and frustration.”

    BVTs and the concept of the daily build are very tightly tied together; therefore one may want to read Steve McConnell’s article “Daily Build and Smoke Test” from IEEE Software as a memory refresher. Next time we'll take a look at how to decide, based on your product's feature set and major user scenarios, which test cases are suitable as BVTs – and how to keep everyone happy while doing so.


  • How to cut your couch into pieces? (Part I)

    I have to start from the year 2000, which is when I joined Microsoft and my friend Targo proposed that we all – me and his family – rent a house together. So we rented a house until July of 2002. Targo’s family was planning a second child, so we decided to split up (that’s why I only know how to deal with kids who are less than two years old ;-)). During the move we divided the furniture and I inherited an old couch (which we’d gotten from somebody else, who got it from some other person, etc.). Two weeks ago I had a problem: I buy lots of books due to my addiction to reading; books need to be stored somewhere; therefore I need bookshelves; unfortunately, I had run out of room in my apartment for bookshelves.

    After some thinking I decided to get rid of the couch. It was old, one leg was slightly broken, and I wasn’t using it anyway. Removing a big couch from your apartment is actually quite a problem. To do it properly, you need to (a) make sure there’s somebody who can help you move the couch; (b) rent a truck from U-Haul or a similar service; (c) load the couch onto the truck and deliver it to SomeLocationWhereTheOldCouchesGo; (d) return the truck; (e) buy dinner for whoever helped you.

    This all takes time (and money). As I had recently reread George Pólya’s “How to Solve It”, I tried to come up with some (elegant?) way to solve this particular problem. One of the things that came to my mind was: “If you can’t solve the problem itself, divide it into smaller pieces, solve each piece separately, and see if that solves the entire problem.” Therefore I decided to cut the couch into pieces and remove the pieces from my apartment. The first thing I did was call my girlfriend in Estonia (she has an MSc in psychology) and ask her to assess the state of my mental health, given this idea of cutting a couch into pieces. After we determined that there aren’t voices telling me to cut my couch into pieces, that I’m not doing it because They are after me, that I’m not afraid of my couch, and ... (a number of other things; check the DSM-IV for more), I decided to proceed with the plan.

    First I needed some tools; Targo was able to provide a saw and cutting pliers. I had a cut-off knife and scissors. One beautiful evening after my martial arts practice, I came home, gathered my courage, and started on the awful plan. It was about 09:00 PM; fortunately my neighbors either didn’t hear me or just didn’t pay attention to all the noise. Deconstructing the couch into three pieces was extremely tricky. There was some wood I needed to saw, some wires I needed to cut, and lots of cotton, for which I simply used brute force and the cut-off knife as applicable. It took about 1.5 hours to cut the couch into three roughly equal pieces which I was able to carry out of the apartment myself, and 0.5 hours to clean up the apartment itself. The entire experience was pretty surrealistic and reminded me of both Dalí's paintings and the way Freud's theory explains dreams ;-) I actually think somebody should paint a picture called “Software engineer cutting couch into pieces” ;-)

    That’s not the end of it: the next week one of my friends called and told me he was moving, and asked me to come help move his big couch. That turned out to be an entirely different story. I’ll write about it if I have some time next week.


  • My programming mistakes: const BSTR

    One evening I was reading through Eric Lippert’s blog, noticed that his guide to using BSTRs had been published, and suddenly remembered a programming mistake I made about 3 years ago. I had just been hired at Microsoft and was writing my first piece of C++ code, which had to do some stuff and use the MSXML APIs while doing it. The only things I knew about ATL and COM were what I’d read in articles and the code samples I’d skimmed. Anyway, I ended up writing the following variable declaration:

    const BSTR g_bstrSomething = L"FooBar";

    ... and later on passed this variable to some MSXML API. Of course I had no clue that something might be wrong until OneResidentGuruOfEverything pointed out that I was just asking for trouble. Occasionally I still spot a similar problem in somebody else’s code, and then I pass on the wisdom ;-) Do you know what the problem is?

    When it comes to prevention, one can just compose a regular expression which searches for similar patterns in the source code. But of course you’ll also have to deal with situations like this:

    WCHAR *szFoo = L"Foo";
    BSTR bstrBar = szFoo;

    ... and with multiple variations of this case. I find it useful to read either Eric Lippert's guide or the relevant sections of MSDN once every couple of months.
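    As a rough sketch of that regular-expression idea (using std::regex, which postdates this post; the pattern is my own, purely illustrative, and will both miss cases and produce false positives):

    ```cpp
    #include <regex>
    #include <string>

    // Hypothetical checker: flag lines that initialize a BSTR straight from
    // a wide string literal instead of going through SysAllocString().
    bool LooksLikeSuspectBstr(const std::string& line)
    {
        // Matches e.g. `const BSTR g_bstrSomething = L"FooBar";`
        static const std::regex pattern(R"(\bBSTR\b[^=;]*=\s*L")");
        return std::regex_search(line, pattern);
    }

    int main()
    {
        return LooksLikeSuspectBstr("const BSTR g_bstr = L\"FooBar\";") ? 0 : 1;
    }
    ```

    A line such as `BSTR bstrBar = ::SysAllocString(L"FooBar");` is left alone, while a direct literal assignment is flagged for review.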


  • PREfast for driver developers

    In my last post I forgot to mention that Microsoft has actually made PREfast available for driver developers. There's also a pointer to a white paper which contains more detailed information: PREfast for Drivers. BTW, I don't know anything about driver development; I'm just pointing out the availability of the tool ;-)

    It would also be interesting to know how many people out there are using other products like PC-lint/FlexeLint or Insure++ or YourOtherFavouriteTool. If you're reading this and have experience with any (static) source code analysis tools, it would be useful to know. I'm just curious about the final conclusions. Were they helpful during the development process? Do you think it was worthwhile to dig through all the warning messages and fix them? Did you incorporate them into your product cycle?


  • Static source code analysis - prevention is a key!

    Actually, I've been fascinated with source code quality ever since I was a narcissistic UN*X programmer ;-) who really enjoyed writing the following code to keep my C programs lint-free:

    (void) printf(...);

    When it comes to writing code, my personal philosophy can be characterized as: “Don’t ever make the same mistake twice.” People who know me may notice that this is a significant improvement over my previous motto, “Mistakes are not an option.” ;-) Static source code analysis is one of many ways to prevent bugs. Well, that's what this is all about – prevention, researching the root cause, and making sure we find all the bugs as early as possible. When you work in software development, every day is like fighting the Lernaean Hydra: you fix one bug, another surfaces; you fix the new bug, and there's yet another waiting for you ...

    Fortunately Reliability group in Programmer Productivity Research Center at Microsoft Research is working at really cool stuff which helps to prevent bugs (search for PREfast and PREfix). There are number of public presentations about PREfast and PREfix which should give everyone a general idea what these tools are about and how Microsoft uses them internally to improve the software quality:

    If you have more time on your hands, feel free to also read the following article: W. R. Bush, J. D. Pincus, and D. J. Sielaff, “A static analyzer for finding dynamic programming errors,” Software – Practice and Experience, 2000; 30: 775-802. Dawson Engler's home page also has a number of pointers to useful and educational papers. So does Jonathan Pincus's.

    Of course, if you’re a .NET programmer, you should be feeling lucky, because there’s FxCop. Though FxCop has been popularized for quite some time now, here's a short summary: "FxCop is a code analysis tool that checks .NET managed code assemblies for conformance to the Microsoft .NET Framework Design Guidelines. It uses reflection, MSIL parsing, and callgraph analysis to inspect assemblies for more than 200 defects in the following areas: naming conventions, library design, localization, security, and performance." And yes, the FxCop team has their own blog.


  • My programming mistakes: ExitProcess()

    Time for self-bashing! I was writing some code over the weekend (for fun) and made one very immature mistake which I'm probably not the last person to make ;-) The pseudo-code looked like this:

    if (!::SomeWin32APICall(...))
    {
        ::ExitProcess(EXIT_FAILURE);
        LogError(...); // Never executed: ExitProcess() does not return.
    }

    After finally understanding why the error message wasn't being logged, I was actually pretty ashamed of myself ;-( The lesson learned from this story is that, time permitting, one should probably inspect all occurrences of the following function calls:

    ... and make sure that there's no code in the current scope which you accidentally expect to be executed after one of these functions returns (which won't happen). People who're good with awk, Perl, or any other scripting language can probably produce a little script in 5 minutes which performs this sanity check on your code base.


  • Yes, you can just delete the pointer

    One must write good code. One must write perfect code. One must write code as efficient as possible. One thing that amazes me is that there are a number of people out there in the C++ world who still check whether a pointer is NULL (0) before deleting it. Here's the pseudo-code:

    void Foo()
    {
        CBar *pBar = new CBar();

        // Do something with pBar (validate, use) ...

        if (0 != pBar)
            delete pBar;
    }

    Please don't do this; you can omit the if-statement and just delete the pointer. Everything will be fine – deleting a null pointer is guaranteed to be a no-op. The additional lines of code only: a) make life harder for the people who have to review your code; b) decrease your code coverage numbers; c) demonstrate that you don't know the language basics.

    The obvious places for answers are:

    For C people – yes, the same applies to free(); there's no need to check for NULL. Check out also the draft rationale for the C99 standard. One thing to keep in mind, though, is that this information is Microsoft-centric; please check the documentation for your compiler first.
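    The advice boils down to this (CBar here is a minimal stand-in of my own with a destruction counter, added only to make the effect of delete observable):

    ```cpp
    #include <cstddef>
    #include <cstdlib>

    // Minimal stand-in class; counts destructions so we can see `delete` work.
    struct CBar
    {
        static int destroyed;
        ~CBar() { ++destroyed; }
    };
    int CBar::destroyed = 0;

    // The shortened version: no NULL check before delete.
    void Foo()
    {
        CBar *pBar = new CBar();
        // Do something with pBar (validate, use) ...
        delete pBar;         // Destroys the object; no if-statement needed.

        CBar *pNothing = NULL;
        delete pNothing;     // Deleting a null pointer is a well-defined no-op.
        std::free(NULL);     // free(NULL) carries the same guarantee.
    }

    int main()
    {
        Foo();
        return EXIT_SUCCESS;
    }
    ```

    Both the C++ delete-expression and C's free() are defined to do nothing when handed a null pointer, so the guard adds no safety whatsoever.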


  • Who is GK?

    I've been with Microsoft Speech Technologies for almost 3.5 years. Most of my time has been spent working on Microsoft Speech Server (a new and proud member of the Microsoft server products community). My place of birth is a small country called Estonia. You can check the CIA World Factbook for more information. There are actually very few people from Estonia working in Redmond. Most of us are proud graduates of the University of Tartu (Institute of Computer Science). If you ever want to visit the country, I'll be happy to give you some guidance ;-)

    My non-existent free time consists of the following major activities:

    • Reading both fiction and technical books (lately I've become a big fan of Robert A. Heinlein).
    • Learning Japanese (no, it's not easy ;-)).
    • Practicing martial arts, running, and lifting free weights with my friend Targo.
    • Reading and writing interesting code.

    I also like to keep my life complex by maintaining long distance relationships ;-)