Your API Fails, Who is at Fault?

I decided to stay on the Design by Contract side for just a little bit longer.  Recently, Raymond Chen posted "If you pass invalid parameters, then all bets are off", in which he digs into parameter validation and basic defensive programming.  Many of the conversations on that post take me back to my C++ and early Java days of checking for null pointers, buffer lengths, and so on.  It also brings me back to some recent conversations I've had about making my expectations explicit.  Typical defensive behavior looks something like this:

public static void Foreach<T>(this IEnumerable<T> items, Action<T> action)
{
    if (items == null)
        throw new ArgumentNullException("items");
    if (action == null)
        throw new ArgumentNullException("action");

    foreach (var item in items)
        action(item);
}
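
For illustration, here's what a caller sees when it violates the precondition.  This is a hypothetical usage of the Foreach extension above; the point is that the guard clause fails fast with an exception naming the offending parameter, rather than a NullReferenceException somewhere deep inside the loop.

```csharp
var names = new[] { "Ada", "Grace" };

// Happy path: the action runs once per item.
names.Foreach(name => Console.WriteLine(name));

// Violated precondition: fails fast at the call site.
try
{
    names.Foreach(null);
}
catch (ArgumentNullException ex)
{
    Console.WriteLine(ex.ParamName); // "action"
}
```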

After all, how many times have you had no idea what the preconditions of a given method are, thanks to non-intuitive naming and a lack of documentation, XML comments or otherwise?  At that point, it's time to break out .NET Reflector and dig deep.  Believe me, I've done it quite a bit lately.

The Erlang Way

The Erlang crowd takes an interesting approach to this issue, one that has really intrigued me.  Joe Armstrong calls it "let it crash": you code only the sunny day scenario, and if a call does not conform to the spec, you just let it crash.  You can read more about that on the Erlang mailing list here.

Some paragraphs stuck out in my mind.

Check inputs where they are "untrusted"
    - at a human interface
    - a foreign language program

What this basically states is that the only time you should do such checks is at the boundaries, where you have potentially untrusted input: bounds overflows, unexpected nulls and the like.  He goes on to say about letting it crash:

specifications always say what to do if everything works - but never what to do if the input conditions are not met - the usual answer is do something sensible - but what? you're the programmer - In C etc. you have to write *something* if you detect an error - in Erlang it's easy - don't even bother to write code that checks for errors - "just let it crash".

So what Joe advocates is not checking at all: if the inputs don't conform to the spec, just let it crash, no null checks needed.  But how would you recover from such a thing?  Joe goes on to say:

Then write an *independent* process that observes the crashes (a linked process) - the independent process should try to correct the error, if it can't correct the error it should crash (same principle) - each monitor should try a simpler error recovery strategy - until finally the error is fixed (this is the principle behind the error recovery tree behaviour).

It's an interesting approach, and it proves to be a valuable one for parallel processing systems.  As I dig further into functional programming languages, I'm finding such constructs useful.
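
The supervisor idea above can be sketched in C# as well.  This is a rough, hypothetical analogy, not Erlang's actual supervision trees: the worker codes only the sunny day path with no defensive checks, while a separate piece of code observes crashes, attempts a recovery, and escalates when it runs out of options.  The names (RunSupervised and so on) are mine.

```csharp
// Hypothetical sketch of a supervising process in C#.
static void RunSupervised(Action work, Action recover, int maxRestarts)
{
    for (var attempt = 0; attempt <= maxRestarts; attempt++)
    {
        try
        {
            work();       // no defensive checks inside the worker
            return;
        }
        catch (Exception)
        {
            if (attempt == maxRestarts)
                throw;    // escalate: let the next monitor up handle it
            recover();    // try to correct the error, then restart
        }
    }
}
```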

Design by Contract Again and DDD

Defensive programming is a key part of Design by Contract, but it differs in one important way.  With defensive programming, the callee is responsible for determining whether the parameters are valid and, if not, throwing an exception or otherwise handling it.  DbC, with the help of the language, helps the caller understand the contract up front and cope with a violation if it can.

Bertrand Meyer wrote a bit about this in the Eiffel documentation here.  But let's go back to basics.  DbC asserts that the contracts (what we expect, what we guarantee, what we maintain) are such a crucial piece of the software that they are part of the design process.  That means we should write these contract assertions FIRST.

What do these contract assertions contain?  Normally the following:
  • Acceptable/Unacceptable input values and the related meaning
  • Return values and their meaning
  • Exception conditions and why
  • Preconditions (may be weakened by subclasses)
  • Postconditions (may be strengthened by subclasses)
  • Invariants (may be strengthened by subclasses)

So, in effect, I'm still doing TDD/BDD, but an important part of it is identifying my preconditions, postconditions and invariants.  These ideas mesh pretty well with my understanding of BDD, and we should be testing those behaviors in our specs.  Some readers of my previous posts were afraid I was de-emphasizing TDD/BDD, and that couldn't be further from the truth.  I'm just using another tool in the toolkit to express my intent for my classes, methods and so on.  I'll explain further below.
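
In plain C#, those contract assertions can at least be approximated with guard clauses for the preconditions and debug-only checks for the postconditions and invariants.  A minimal sketch, assuming a hypothetical Account class of my own invention:

```csharp
using System;
using System.Diagnostics;

public class Account
{
    private decimal _balance; // invariant: _balance >= 0

    public void Withdraw(decimal amount)
    {
        // Preconditions: acceptable input values and exception conditions.
        if (amount <= 0)
            throw new ArgumentOutOfRangeException("amount");
        if (amount > _balance)
            throw new InvalidOperationException("Insufficient funds.");

        _balance -= amount;

        // Postcondition and invariant, checked in debug builds only.
        Debug.Assert(_balance >= 0, "Invariant violated: negative balance");
    }
}
```

The weakness, of course, is that these checks run at runtime and live in the method body; nothing in the signature tells the caller about them, which is exactly the gap Spec# addresses below.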

My heavy use of Domain Driven Design patterns helps as well.  I mentioned those previously when I talked about Side Effects being Code Smells.  Intention revealing interfaces express to the caller what I intend to do, and I back them with assertions not only in the code but in the documentation as well.  This usually includes using the <exception> XML tag in my code comments.  Something like this is usually pretty effective:

/// <exception cref="T:System.ArgumentNullException"><paramref name="action"/> is null.</exception>

If you haven't read Eric's book, I suggest you take my advice and Peter's advice and do so.

Making It Explicit

Once again, using Spec# to enforce these as part of the method signature makes sense to me.  It puts the burden back on the client to conform to the contract or else not continue, and having static checking to enforce that is pretty powerful as well.

But what are we testing here?  Remember that DbC and Spec# can ensure your preconditions, your postconditions and your invariants hold, but they cannot determine whether your code is correct and conforms to the specs.  That's why I think BDD plays a pretty good role alongside my use of Spec#.

DbC and Spec# can also play a role in enforcing things that are harder with BDD, such as invariants.  BDD does great things by emphasizing behaviors, which I'm really on board with.  But what I mean by harder is that your invariants may involve only private member variables which you are not going to expose to the outside world.  If you don't expose them, it is hard for your specs to verify that behavior.  DbC and Spec# can fill that role.  Let's look at an ArrayList written in Spec#.

public class ArrayList
{
    private object[] _items;
    private int _size;

    invariant 0 <= _size && _size <= _items.Length;
    invariant forall { int i in (_size : _items.Length); _items[i] == null };  // all unused slots are null

    [NotDelayed]
    public ArrayList (int capacity)
      requires 0 <= capacity otherwise ArgumentOutOfRangeException;
      ensures _size/*Count*/ == 0;
      ensures _items.Length/*Capacity*/ == capacity;
    {
      _items = new object[capacity];
      base();
    }

    public virtual void Clear ()
      ensures Count == 0;
    {
      expose (this) {
        Array.Clear(_items, 0, _size); // clear the elements so the GC can reclaim the references
        assume forall { int i in (0 : _size); _items[i] == null };  // postcondition of Array.Clear
        _size = 0;
      }
    }

    // Rest of code omitted
}

What I've been able to do is set the inner array to the requested capacity while ensuring that only the capacity, not the count, goes up.  When I call the Clear method, I need to keep the object peer consistent: all slots beyond _size must be null, and the size must be reset.  The expose block exposes the object to the runtime so the verifier can analyze the code; by the end of the block we should be peer consistent again, or else we have issues.  How would we test these scenarios in BDD?  Since they are not exposed to the outside world, it's pretty difficult; they would be left as black box artifacts that are harder to prove.  If I were to expose them instead, it would break encapsulation, which is not something I want to do.  Spec# gives me the opportunity to enforce this through the DbC constructs afforded by the language.
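
Without Spec#, the closest plain C# gets is a private invariant check called at the end of each mutating method.  This is only a sketch of that workaround, with names of my own choosing, and it is a runtime check in debug builds, not the static proof the Spec# verifier gives you:

```csharp
using System;
using System.Diagnostics;

public class ArrayListSketch
{
    private object[] _items;
    private int _size;

    public ArrayListSketch(int capacity)
    {
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("capacity");
        _items = new object[capacity];
        CheckInvariant();
    }

    public void Clear()
    {
        Array.Clear(_items, 0, _size);
        _size = 0;
        CheckInvariant(); // runtime check only; Spec# proves this statically
    }

    [Conditional("DEBUG")]
    private void CheckInvariant()
    {
        Debug.Assert(0 <= _size && _size <= _items.Length);
        for (var i = _size; i < _items.Length; i++)
            Debug.Assert(_items[i] == null, "Unused slots must be null");
    }
}
```

Note that CheckInvariant only ever runs on the states you happen to exercise, which is exactly the black box problem described above.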

The Dangers of Checked Exceptions

But with this comes a cost, of course.  I recently spoke with a colleague about Spec#, and the instant thought of checked exceptions in Java came to mind.  Earlier in my career, I was a Java guy who had to deal with people who put large try/catch blocks around methods with checked exceptions and were guilty of catching and swallowing, or catching and rethrowing as RuntimeExceptions.  Worse yet, I saw checked exceptions break encapsulation by forcing methods to declare exceptions the outside world didn't need to know about.  I was glad this feature wasn't brought to C#, because I saw rampant abuse for little benefit.  What people forgot during the early days of Java is that exceptions are meant to be exceptional, not control flow.

Where I see Spec# being different is that we have a static verification tool, Boogie, to verify whether those exceptional conditions can actually occur.  The green squigglies warn about possible null values, arguments out of range, and so on.  This gives me further insight into what I can control and what I cannot.  ReSharper has some of these features as well, but I've found Boogie to be a bit more helpful with more advanced static verification.

Conclusion

Explicit DbC constructs give us a pretty powerful tool for expressing our domain and the behaviors of our components.  Unfortunately, C# has no real implementation that enforces DbC constructs on both the caller and the callee, and hence Spec# is an important project to come out of Microsoft Research.

Scott Hanselman just posted his interview with the Spec# team on his blog, so if you haven't heard it yet, go ahead and download it now.  It's a great show, and if you find Spec# useful, it's important to press Microsoft to ship it as a full feature.
