Contents tagged with Spec#
-
Spec# and Boogie Released on CodePlex
You may have noticed that in the past I've talked extensively about Spec#, an object-oriented .NET language based upon C# with contract-first features as well as a non-null type system. This project has been covered not only by myself, but also by my CodeBetter compatriot, Greg Young, and by the illustrious Tony Hoare at QCon London during his "Null References: The Billion Dollar Mistake" presentation. After gaining momentum in the .NET world, this project has now been made part of .NET 4.0 as Code Contracts for .NET.
-
How Would the CLR Be Different?
UPDATED: Added improved generics with higher-kinded polymorphism
-
Code Contracts - TDD in a DbC World
Lately, I've been talking about the new feature coming to .NET 4.0, Code Contracts, which brings Design by Contract (DbC) idioms to all .NET languages as part of the base class library. Last week, I attended QCon, where Greg Young, one of my CodeBetter cohorts, gave a talk titled "TDD in a DbC World" in which he argued that the two are not in conflict, but instead complement a test-first mentality; each improves the use of the other.
-
.NET Code Contracts and TDD Are Complementary
After my recent post on the introduction of Code Contracts in .NET 4.0, I got some strong reactions from those who would rather rely on TDD, or perhaps the better term, Example Driven Development, to ensure correctness. Instead, it's my intent to talk about how the two can complement each other in some rather powerful ways, such as binding contracts to interfaces instead of class instances to ensure uniform interaction. When we combine these two things, the edge cases of our behaviors quickly melt away.
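For the curious, here's a rough sketch of what binding a contract to an interface looks like with the Code Contracts library. The `ICalculator` and `CalculatorContracts` names are made up for illustration: the interface points at a buddy class that carries the `Contract.Requires` calls, and every implementer inherits them.

```csharp
using System.Diagnostics.Contracts;

// The interface carries no method bodies, so its contract lives in a
// buddy class referenced via [ContractClass]; all implementers inherit it.
[ContractClass(typeof(CalculatorContracts))]
public interface ICalculator
{
    int Divide(int dividend, int divisor);
}

[ContractClassFor(typeof(ICalculator))]
internal abstract class CalculatorContracts : ICalculator
{
    public int Divide(int dividend, int divisor)
    {
        Contract.Requires(divisor != 0); // precondition for every implementer
        return default(int);             // dummy body, never executed
    }
}
```

Any class implementing `ICalculator` then gets the divisor check applied uniformly, which is exactly the "uniform interaction" point above.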
-
Code Contracts for .NET 4.0 - Spec# Comes Alive
As I've said in many posts before, I'm a big fan of stating your preconditions, postconditions, invariants and so on explicitly in your code through the use of contracts, in the Design by Contract parlance. Spec# is a project from Microsoft Research: a language based upon C# that adds Design by Contract features. I've talked extensively on the subject on this blog and previous blogs of mine, especially around the time of the ALT.NET Open Spaces, Seattle event back in April of this year. The importance of making side effects well known is something I mentioned briefly during my Approaching Functional Programming talk at KaizenConf. Being able to express and statically verify behavior matters greatly when it comes to side effects and method purity.
-
Static versus Dynamic Languages - Attack of the Clones
Very recently there has been an ongoing debate between static and dynamically typed languages. Since there have already been some Star Wars references, I thought I'd add my own. I originally wanted to cover this as part of the future of C#, but I think it deserves its own topic. There have been many voices in the matter; I've read all sides and thought I'd weigh in. I find myself right now with my feet in the statically typed community. I do appreciate dynamic typing and it definitely has its uses, but to me static verification is a key aspect. Of course I appreciate dynamic languages too, especially established ones such as Lisp and Erlang.
Here are some of the salvos that have been fired so far:
- Dynamic Languages Strike Back - Steve Yegge
- Return of Statically Typed Languages - Cedric Beust
- Guide You the Force Should - Ted Neward
- Revenge of the Statically Typed Languages - Greg Young
- A New Hope - Polyglotism - Ola Bini
The Salvos Fired
First, Steve Yegge posted a transcript of his talk at Stanford called "Dynamic Languages Strike Back". In it, he covers the history of dynamic languages, their performance, what can be done about it, and the politics of it all; at the end of the day, he argues, it comes down to the tools. It was a pretty interesting talk, but of course it dredged up some pretty strong feelings. In turn, there were responses from Cedric Beust coming out in favor of statically typed languages, and from Ted Neward, Ola Bini and Greg Young analyzing the two positions. I won't pile on with a "me too"; instead I encourage you to read the posts, and the responses as well.
Cedric lost me, though, when he brought Scala into the argument. To me, it was nonsensical to mention it in this case. And calling pattern matching a leaky abstraction is unfortunate and, I think, very wrong. What functional languages give us is the ability to express what we want, not necessarily how to get it. Whether the compiler turns that into a switch statement, an if statement, or anything else doesn't matter, as long as the decision tree is followed. I don't see any leakiness there. So, that was a bad aside. I'm not a huge fan of Scala either, but for entirely different reasons: the type inference isn't as strong as it should be, and the syntax doesn't feel as functional as I'd like. F# and Scala tackle these problems in vastly different ways.
Ola Bini, who has been advocating the polyglot programmer for some time, summed up the Steve versus Cedric posts very concisely in these two paragraphs:
So let's see. Distilled, Steve thinks that static languages have reached the ceiling for what's possible to do, and that dynamic languages offer more flexibility and power without actually sacrificing performance and maintainability. He backs this up with several research papers that point to very interesting runtime performance improvement techniques that really can help dynamic languages perform exceptionally well.
On the other hand Cedric believes that Scala is bad because of implicits and pattern matching, that it's common sense to not allow people to use the languages they like, that tools for dynamic languages will never be as good as the ones for static ones, that Java generics isn't really a problem, that dynamic language performance will improve but that this doesn't matter, that static languages really hasn't failed at all and that Java is still the best language of choice, and will continue to be for a long time.
It seems that many of the modern dynamic languages are pretty flexible, but also not as performance oriented as those of the past. Why is this, and what can be done about it? Those are good questions to ask. Ola takes the tack, correctly I think, that the tooling for dynamic languages won't be the same as, or as rich as, that for statically typed ones. It simply can't be. But that doesn't mean those tools won't exist; they'll just be different. In the end, Ola argues for the polyglot programmer, using each language to its strength. He talks a bit more about this with Mike Moore on the Rubiverse podcast here.
Impedance Mismatch?
There was a fishbowl session at the ALT.NET Open Spaces, Seattle event on the polyglot programmer which touched on the impedance mismatch between statically typed languages and dynamic ones. What's great is that Greg Young put together a session with Rustan Leino and Mike Barnett from the Spec# team at Microsoft Research, John Lam from the IronRuby team, and me. It was a great discussion, revolving around the flexibility that dynamic languages give you versus the static verification you give up in exchange. There is a balance to be had: the flexibility that Ruby and other dynamic languages offer also creates a bit more responsibility for ensuring correctness. It's a great conversation and well worth the time invested. One of the benefits we're seeing from the CLR, and in turn the DLR, is the interop story: your front end could be Ruby, your service layer C#, your rules engine F#, your configuration Boo, and so on.
Anders Hejlsberg on C# And Statically Typed Languages
As I noted earlier, Anders Hejlsberg was on Software Engineering Radio Episode 97 to discuss the future of C#. Although Anders has his feet firmly in the statically typed camp, he sees the value of dynamic dispatch. The phrase used, which I find quite apt, was "Static Programming but Dynamically Generated". I think the metaprogramming story in C# needs to improve for this to happen; Reflection.Emit isn't the strongest story for doing this, and certainly not an easy one.
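To see why, here is about the smallest useful Reflection.Emit sketch I can write: hand-emitting the IL for a two-argument add via `DynamicMethod`. Every opcode is spelled out manually, which is exactly the friction being alluded to.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Builds (a, b) => a + b at runtime by emitting raw IL.
    // Four opcodes just to add two ints shows how low-level this is.
    public static Func<int, int, int> CreateAdder()
    {
        var method = new DynamicMethod("Add", typeof(int),
                                       new[] { typeof(int), typeof(int) });
        var il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push first argument
        il.Emit(OpCodes.Ldarg_1); // push second argument
        il.Emit(OpCodes.Add);     // add them
        il.Emit(OpCodes.Ret);     // return the result
        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

Compare that with writing `(a, b) => a + b` directly, and the gap between "metaprogramming exists" and "metaprogramming is pleasant" in C# becomes obvious.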
Where I think C# can go, however, is toward making DSL creation much easier. Boo, F# and others on the .NET platform are statically typed, yet go well beyond what C# can do in this arena. Ayende has been doing a lot with Boo, making the language, although statically typed, very flexible and readable. Ruby has a pretty strong story here, and C# and other languages have lessons they can learn from it.
Another example is Erlang: a dynamic language, yet very concurrent and pretty interesting. C# and other .NET languages can learn a bit from it. I'm not sure Erlang itself will take off, as it would need some sort of sponsorship and better frameworks first. F# has learned some of those lessons in terms of messaging patterns, but not yet in terms of recovery and process isolation. I covered a bit of that in my previous post.
Wrapping It Up
It's a pretty interesting debate, and at the end of the day, it really comes down to what language meets your needs. The .NET CLR has a pretty strong story of allowing other languages to interoperate, which nicely complements the polyglot programmer. But I don't think static typing is going the way of the dodo, and I also don't think dynamic typing will win the day. Both have their places. Sounds like a copout, I know, but deal with it. I have a bit more to discuss on this matter, especially about learning lessons from Erlang, one of the more interesting languages to see a resurgence lately.
-
What Is the Future of C# Anyways?
During some of my presentations on F# and Functional C#, I was often asked about the future direction of C# and where I think it's going. Last night I was pinged about this again at my F# talk at the Philly ALT.NET meeting. The question was: why bother learning F# when eventually I'll get these things for free once they're brought over to C#? Being the language geek that I am, I'm pretty interested in this question as well. Right now the language keeps evolving at a rather quick pace compared to C++ and Java, and many developers are struggling to keep up with its evolution toward a more functional style with LINQ, lambda expressions, lazy evaluation, etc. There are plenty of places the language could go, and a few questions to ask along the way.
An Interview With Anders Hejlsberg
Recently on Software Engineering Radio, Anders Hejlsberg was interviewed about the past, present and future of C# on Episode 97. Of course there are interesting aspects of the history of his involvement with languages such as Turbo Pascal and Delphi, and some great commentary on Visual Basic and dynamic languages as well. But the real core of the discussion focused on which problems we need to solve next, and how C# will handle some of these features. Will they be language constructs or built into the framework itself?
Let's go through some of the issues discussed.
Concurrency Programming
Concurrent programming is hard; let's not mince words about it. Once we get into multiple processors and multiple cores, it becomes even more of an issue: are we using the machine effectively? With standard locks and mutexes, it's extremely difficult to scale shared-memory parallelism across many processors without at some point blocking and serializing.
Most of our current frameworks and languages simply weren't designed to make concurrency easy. The Erlang guys would of course disagree, since they started with that idea from the very beginning. Because Erlang processes are sandboxed to a particular thread, each is free to mutate its own state to its heart's content; when it needs to talk to another process, the data is copied over in its entirety, so there is a penalty for doing so. Joe Armstrong, the creator of Erlang, covers a lot of this in his book "Programming Erlang: Software for a Concurrent World".
Mutable State
Part of the issue with concurrency is mutable state. As far back as I can remember, we were taught in CS classes that we could feel free to mutate state as needed. That only really works when you've got a nicely serial application where A calls B calls C calls D, all on the same thread. It becomes a fairly limiting idea as we scale out to multiple threads, machines and so on. Instead, we need to focus on mutability and control it in a meaningful way, not only through our language constructs, but through our design patterns as well.
In the C# world, we can opt in to immutability through the readonly keyword. This is really helpful for fields we don't need or want to modify, and it also helps the JIT reason about how a field is used. I'm not sure about performance gains, but that's not really the point anyway. Take the canonical example of a 2D point:
public class Point2D
{
    private readonly double x;
    private readonly double y;

    public Point2D() { }

    public Point2D(double x, double y)
    {
        this.x = x;
        this.y = y;
    }

    public double X { get { return x; } }
    public double Y { get { return y; } }

    public Point2D Add(Size2D size)
    {
        return new Point2D(x + size.Width, y + size.Height);
    }
}
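To make the payoff concrete, here is a self-contained sketch: a condensed copy of the Point2D above plus a hypothetical Size2D (the post doesn't define one), showing that Add hands back a brand new point while the original is never touched, so instances can be shared freely across threads.

```csharp
// Size2D is a made-up companion type for this sketch.
public sealed class Size2D
{
    private readonly double width;
    private readonly double height;

    public Size2D(double width, double height)
    {
        this.width = width;
        this.height = height;
    }

    public double Width { get { return width; } }
    public double Height { get { return height; } }
}

// Condensed copy of the immutable Point2D above.
public class Point2D
{
    private readonly double x;
    private readonly double y;

    public Point2D(double x, double y)
    {
        this.x = x;
        this.y = y;
    }

    public double X { get { return x; } }
    public double Y { get { return y; } }

    // Returns a new point; never mutates this one.
    public Point2D Add(Size2D size)
    {
        return new Point2D(x + size.Width, y + size.Height);
    }
}
```

So `var q = p.Add(new Size2D(5.0, 5.0));` leaves `p` exactly as it was and yields a fresh `q`; no caller can ever observe a half-updated point.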
We've created this class so as not to allow mutable state; instead, each operation returns a new object that you are free to work with. This of course is a positive thing. But can a language go further than this? I think so, and I think Anders does too. Spec# and design by contract can take this a bit further. What if I could state that my object, as a whole, is immutable? That would certainly help the compiler optimize. Take, for example, Value Objects in the Domain Driven Design world: how would something like that look? Let's follow the Spec# example and mark the class as immutable, meaning that once I initialize it, it cannot change for any reason:
[Immutable]
public class Point2D
{
// Class implementation the same
}
This makes it transparent to both the caller and the callee that what you have cannot be changed, and it enforces the behavior of my member variables in a pretty interesting way. Let's take a look at the actual C# that Spec# generates for the above code. I'll only paste the relevant parts of what it did to the properties, and only look at X; the same happens for Y.
public double X
{
[Witness(false, 0, "", "0", ""), Witness(false, 0, "", "1", ""), Witness(false, 0, "", "this@ClassLibrary1.Point2D::x", "", Filename=@"D:\Work\SpecSharpSamples\SpecSharpSamples\Class1.ssc", StartLine=20, StartColumn=0x21, EndLine=20, EndColumn=0x22, SourceText="x"), Ensures("::==(double,double){${double,\"return value\"},this@ClassLibrary1.Point2D::x}", Filename=@"D:\Work\SpecSharpSamples\SpecSharpSamples\Class1.ssc", StartLine=20, StartColumn=20, EndLine=20, EndColumn=0x17, SourceText="get")]
get
{
double return value = this.x;
try
{
if (return value != this.x)
{
throw new EnsuresException("Postcondition 'get' violated from method 'ClassLibrary1.Point2D.get_X'");
}
}
catch (ContractMarkerException)
{
throw;
}
double SS$Display Return Local = return value;
return return value;
}
}
What I like about F# and functional programming is the opt-out mutability: my classes, lists, structures and so on are immutable by default. This makes you think long and hard about any mutability you introduce into your program. It's not that there can be no mutability in your application; rather, you have to think about it and isolate it in a meaningful manner. Haskell takes a more hardline stance: mutation can only occur inside monadic expressions. If you're not aware of what those are, check out F# workflows, which are closely analogous. By default, we get code that looks like this and is immutable:
type Point2D = class
    val x : double
    val y : double

    new() = { x = 0.0; y = 0.0 }

    new(x, y) =
        {
            x = x
            y = y
        }

    member this.X
        with get() = this.x

    member this.Y
        with get() = this.y
end
So, as you can see, I don't have to express the immutability, only the mutability if I so choose. That's a very important differentiator.
Method Purity
Method purity is another important topic when we talk about concurrent programming. By purity I mean that a method does not modify its incoming parameters or cause other side effects; it produces a new object instead. This has lasting effects if I'm going to be doing things on other threads. Eric Evans touched on this in the Supple Design patterns of his Domain Driven Design book: keep functions side effect free as much as you can, and carefully control where you mutate state through intention revealing interfaces and so on.
But how do you communicate this? Command-Query Separation gets us part of the way there: commands perform mutation and side effects but return nothing, while queries return data but do not modify state. Spec# can enforce this behavior as well. Being able to mark functions as pure is quite helpful in communicating whether to expect a change in state, and therefore whether I have to manage mutation in some special way. To communicate this in Spec#, all I have to do is something like:
[Pure]
public Point2D Add(Size2D size)
    requires size != null;
{
    return new Point2D(x + size.Width, y + size.Height);
}
This becomes part of the method contract and some good documentation as well for your system.
Asynchronous Communication and Messaging
Another piece of interest is messaging and process isolation. The Erlang guys figured out a while ago that you can have mutation alongside massive concurrency and fault tolerance through process isolation. An important distinction here is between the two concurrency models: shared memory versus message passing. Messaging and asynchronous communication are key foundations for concurrent programming, and two ideas from other .NET languages come to mind.
F# supports mailbox-style message processing, an idea popularized by Erlang and probably borrowed from it. A mailbox is a message queue you can listen to for messages relevant to the agent you've defined; it's implemented by the MailboxProcessor class in the Microsoft.FSharp.Control.Mailboxes namespace. A simple receive loop looks like this:
#light
#nowarn "57"

open Microsoft.FSharp.Control.CommonExtensions
open Microsoft.FSharp.Control.Mailboxes

let incrementor =
    new MailboxProcessor<int>(fun inbox ->
        let rec loopMessage(n) =
            async {
                do printfn "n = %d" n
                let! message = inbox.Receive()
                return! loopMessage(n + message)
            }
        loopMessage(0))
Robert Pickering has more information about the Erlang style message passing here.
Now, let's step back for a second. Erlang also introduced another concept that Sing# and the Singularity OS took up: the Software Isolated Process (SIP). The idea is to isolate each process in a little sandbox; if you load a bad driver or the like, that process can die and be restarted without taking down the entire system. That's a really key part of Singularity and frankly one of the most intriguing. Galen Hunt, the main researcher behind it, talked about this on Software Engineering Radio Episode 88. He also discusses it more here on Channel9, and it's well worth a look. You can also download the source on CodePlex and check it out.
Dynamic C#?
As you can probably tell, Anders is pretty much a static typing fan, and I'd have to say I'm firmly in that camp as well. But there are intriguing possibilities, such as metaprogramming and creating DSLs, which are pretty weak in C# as of now. Sure, people are bending C# in all sorts of interesting ways, but it's not a natural fit as the language stands. So I think there can be improvements in these areas.
Metaprogramming
Metaprogramming was mentioned as another particularly interesting aspect. As of right now, it's not an easy thing to do in C#. Once again, F# has many of these features built in, such as quotations, because that's what it was created to do: it is a language built to create other languages. Tomas Petricek is by far one of the authorities on the subject, having leveraged it in interesting ways to create AJAX applications. You can read his introduction to metaprogramming here and about his AJAX toolkit here. Don Syme has also written a paper about leveraging meta-programming with F#, which you can find here. But I have to ask: does C# need this, or shouldn't we just use F# for what it's really good at instead of shoehorning yet another piece onto the language? The same could be said of Ruby and its metaprogramming power: why not use the best language for the job?
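To be fair, C# does have one metaprogramming foothold today: expression trees, a code-as-data model in the spirit of F# quotations, if far more limited. A minimal sketch:

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    public static Func<int, int> Square()
    {
        // Build the code model for x => x * x by hand, then compile
        // it to a real delegate at runtime.
        var x = Expression.Parameter(typeof(int), "x");
        var body = Expression.Multiply(x, x);
        var lambda = Expression.Lambda<Func<int, int>>(body, x);
        return lambda.Compile();
    }
}
```

The tree can be inspected, rewritten, or translated (this is exactly what LINQ providers do) before it is ever compiled, which is the essence of metaprogramming, though well short of full quotation support.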
Dynamic Dispatch
Dynamic dispatch is an interesting idea as well: invoking a method on an object where the method doesn't statically exist, and letting the system figure out where to send the call. In Ruby, the method_missing hook lets us define what happens when an invoked method is not found. Anders thought it was an intriguing idea worth looking at. It might help in the creation of DSLs, where you want to define behavior even though a given method may not exist at all.
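This shape did eventually ship in C# 4.0 as System.Dynamic.DynamicObject. A minimal sketch of a method_missing-style catch-all (MethodMissingCatcher is an invented name for illustration):

```csharp
using System;
using System.Dynamic;

// Any method invoked through a dynamic reference is routed to
// TryInvokeMember, which answers with the method name and arguments
// instead of throwing a missing-method exception.
public class MethodMissingCatcher : DynamicObject
{
    public override bool TryInvokeMember(InvokeMemberBinder binder,
                                         object[] args, out object result)
    {
        result = string.Format("called {0} with {1} argument(s)",
                               binder.Name, args.Length);
        return true; // true = "we handled it"
    }
}
```

Used as `dynamic obj = new MethodMissingCatcher(); obj.AnyName(1, 2);`, the call succeeds even though `AnyName` is never declared, which is precisely the DSL-friendly behavior described above.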
In the Language or the Framework?
Another good question is whether these features belong in the language itself or in the framework. The argument is that if you put too many constraints into the language syntax, you might prematurely age the language and, as a result, see its usage decline. Instead, the idea is to surface these things through libraries. For example, bringing the MailboxProcessor functionality to all languages might not be a bad idea; concepts around process isolation feel more like framework concepts than language concepts. It's an interesting debate as to what belongs where, because at the end of the day you need some language differentiation between C#, F#, Ruby, Python, C++ and the rest, or else what's the point of having all of them? To that point, I've been dismayed that VB.NET and C# have mirrored each other so closely, and I wish they would stop. Let VB find its niche and let C# find its own.
Conclusion
Well, I hope this little discussion got you thinking about the future of C# and of the .NET Framework. What does C# need in order to better express the problems we are trying to solve? Is it language specific, or does it belong in the framework? Shouldn't we just use the best language for the job instead of defaulting to C# for everything? Good questions to answer, so now discuss...
-
Your API Fails, Who is at Fault?
I decided to stay on the Design by Contract side for just a little bit longer. Recently, Raymond Chen posted "If you pass invalid parameters, then all bets are off", in which he goes into parameter validation and basic defensive programming. Many of the conversations on that blog take me back to my C++ and early Java days of checking for null pointers, buffer lengths, etc. It also brings me back to some recent conversations I've had about how to make explicit what I expect. Typical defensive code looks something like this:
public static void Foreach<T>(this IEnumerable<T> items, Action<T> action)
{
    if (action == null)
        throw new ArgumentNullException("action");

    foreach (var item in items)
        action(item);
}
After all, how many times have you had no idea what the preconditions of a given method are, thanks to non-intuitive naming and a lack of documentation, XML comments or otherwise? At that point, it's time to break out .NET Reflector and dig deep. Believe me, I've done it quite a bit lately.
The Erlang Way
The Erlang crowd takes an interesting approach to the issue that I've really been intrigued by. Joe Armstrong calls this approach "Let it crash" in which you only code to the sunny day scenario, and if the call to it does not conform to the spec, just let it crash. You can read more about that on the Erlang mailing list here.
Some paragraphs stuck out in my mind.
Check inputs where they are "untrusted"
- at a human interface
- a foreign language program
What this basically states is that the only time you should do such checks is at the boundaries, where you have untrusted input: bounds overflows, unexpected nulls and such. He goes on to say about letting it crash:
specifications always say what to do if everything works - but never what to do if the input conditions are not met - the usual answer is something sensible - but what you're the programmer - In C etc. you have to write *something* if you detect an error - in Erlang it's easy - don't even bother to write code that checks for errors - "just let it crash".
So, what Joe advocates is not checking at all, and if they don't conform to the spec, just let it crash, no need for null checks, etc. But, how would you recover from such a thing? Joe goes on to say:
Then write a *independent* process that observes the crashes (a linked process) - the independent process should try to correct the error, if it can't correct the error it should crash (same principle) - each monitor should try a simpler error recovery strategy - until finally the error is fixed (this is the principle behind the error recovery tree behaviour).
It's an interesting approach, and it proves to be a valuable one for parallel processing systems. As I dig further into functional programming languages, I'm finding such constructs useful.
Design by Contract Again and DDD
Defensive programming is a key part of Design by Contract, but in a way it differs. With defensive programming, the callee is responsible for determining whether its parameters are valid, and throws an exception or otherwise handles it if not. DbC, with help from the language, tells the caller up front what is expected, so the caller can better understand how to avoid the violation, or cope with it when it can.
Bertrand Meyer wrote about this in the Eiffel documentation here. But let's go back to basics. DbC asserts that contracts (what we expect, what we guarantee, what we maintain) are such a crucial piece of the software that they're part of the design process. That means we should write these contract assertions FIRST.
What do these contract assertions contain? It normally contains the following:
- Acceptable/Unacceptable input values and the related meaning
- Return values and their meaning
- Exception conditions and why
- Preconditions (may be weakened by subclasses)
- Postconditions (may be strengthened by subclasses)
- Invariants (may be strengthened by subclasses)
So, in effect, I'm still doing TDD/BDD, but an important part of it is identifying my preconditions, postconditions and invariants. These ideas mesh well with my understanding of BDD, and we should be testing those behaviors in our specs. Some readers of my previous posts were afraid I was de-emphasizing TDD/BDD, and that couldn't be further from the truth. I'm just using another tool in the toolkit to express my intent for my classes, methods, etc. I'll explain further below.
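As a sketch of what those assertions look like in plain C# with the .NET 4.0 Code Contracts library (BankAccount is a made-up example, and note that runtime enforcement of Ensures and Invariant requires the contracts binary rewriter to be enabled at build time):

```csharp
using System.Diagnostics.Contracts;

public class BankAccount
{
    private decimal balance;

    [ContractInvariantMethod]
    private void Invariant()
    {
        Contract.Invariant(balance >= 0); // must hold after every public call
    }

    public void Deposit(decimal amount)
    {
        // Precondition: the caller must supply a positive amount.
        Contract.Requires(amount > 0);
        // Postcondition: the balance grows by exactly that amount.
        Contract.Ensures(balance == Contract.OldValue(balance) + amount);
        balance += amount;
    }

    public decimal Balance
    {
        get { return balance; }
    }
}
```

The precondition, postcondition and invariant from the list above each map to one declarative line, which is exactly the "write the assertions first" discipline in code form.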
My heavy use of Domain Driven Design patterns helps as well; I mentioned those previously when I talked about side effects being code smells. I combine intention revealing interfaces, which express to the caller what I intend to do, with assertions in both the code and the documentation. On the documentation side, that usually means the <exception> XML tag in my code comments. Something like this is usually pretty effective:
/// <exception cref="T:System.ArgumentNullException"><paramref name="action"/> is null.</exception>
If you haven't read Eric's book, I suggest you take my advice and Peter's advice and do so.
Making It Explicit
Once again, the use of Spec# to enforce these as part of the method signature makes sense to me: it puts the burden back on the client to conform to the contract or else it cannot continue. Having static checking to enforce that is pretty powerful as well.
But what are we testing here? Remember that DbC and Spec# can ensure your preconditions, postconditions and invariants hold, but they cannot determine whether your code is correct and conforms to its specs. That's why I think BDD plays an important role alongside my use of Spec#.
DbC and Spec# can also play a role in enforcing things that are harder to reach with BDD, such as invariants. BDD does great things by emphasizing behaviors, which I'm really on board with. But your invariants may involve only private member variables that you are not going to expose to the outside world, and if they're not exposed, it's harder for your specs to exercise them. DbC and Spec# can fill that role. Let's look at an ArrayList written in Spec#.
public class ArrayList
{
    invariant 0 <= _size && _size <= _items.Length;
    invariant forall { int i in (_size : _items.Length); _items[i] == null }; // all unused slots are null

    [NotDelayed]
    public ArrayList (int capacity)
        requires 0 <= capacity otherwise ArgumentOutOfRangeException;
        ensures _size/*Count*/ == 0;
        ensures _items.Length/*Capacity*/ == capacity;
    {
        _items = new object[capacity];
        base();
    }

    public virtual void Clear ()
        ensures Count == 0;
    {
        expose (this) {
            Array.Clear(_items, 0, _size); // clear the elements so that the gc can reclaim the references
            assume forall{int i in (0: _size); _items[i] == null}; // postcondition of Array.Clear
            _size = 0;
        }
    }

    // Rest of code omitted
}
In the constructor, I set the inner array to the requested capacity while ensuring that the count stays at zero; only the capacity changes. When I call Clear, I must leave the object peer consistent: every slot beyond the size must be null and the size must be reset. The expose block opens the object up so the verifier can analyze the mutation; by the end of the block we must be peer consistent again, or we have issues. How would we test these scenarios with BDD? Since they are not exposed to the outside world, it's pretty difficult; they would remain black box artifacts that are hard to prove. Exposing them would break encapsulation, which isn't something I want to do. Instead, Spec# gives me the opportunity to enforce this through the DbC constructs afforded by the language.
The Dangers of Checked Exceptions
But with this comes a cost, of course. I recently spoke with a colleague about Spec#, and thoughts of Java's checked exceptions instantly came to mind. Earlier in my career, I was a Java guy who dealt with code that wrapped methods in large try/catch blocks, guilty of catching and swallowing, or catching and rethrowing as RuntimeExceptions. Worse, I saw checked exceptions break encapsulation by forcing methods to declare exceptions the outside world didn't need to know about. I was rather glad the feature wasn't brought to C#, having seen rampant abuse for little benefit. What people forgot in the early days of Java is that exceptions are meant to be exceptional, not control flow.
Where I see Spec# being different is that we have a static verification tool, Boogie, to verify whether those exceptional conditions can actually occur. The green squigglies warn about possible null values, arguments out of range, and so on, which gives me further insight into what I can control and what I cannot. ReSharper has some of these features as well, but I've found Boogie to be a bit more helpful for advanced static verification.
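For comparison, the Code Contracts flavor of this in .NET 4.0 expresses the same preconditions as library calls rather than language keywords, and its static checker emits similar warnings at call sites it cannot prove safe. A minimal sketch (the class and method names here are mine, purely for illustration):

```csharp
using System;
using System.Diagnostics.Contracts;

public class ClaimValidator
{
    // The static checker warns at call sites that cannot prove these
    // preconditions hold, much like Boogie's green squigglies in Spec#.
    public static int ScoreClaim(string claimId, int lineCount)
    {
        Contract.Requires(claimId != null);
        Contract.Requires(lineCount > 0);
        Contract.Ensures(Contract.Result<int>() >= 0);

        return claimId.Length * lineCount;
    }

    public static void Main()
    {
        // These arguments provably satisfy the preconditions, so no warning.
        Console.WriteLine(ScoreClaim("xyz", 2)); // prints 6
    }
}
```

Without the contract rewriter enabled the Contract calls compile away, so the annotations cost nothing at runtime while still feeding the checker.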
Conclusion
Explicit DbC constructs give us a pretty powerful tool for expressing our domain and the behaviors of our components. Unfortunately, C# has no real implementation that enforces DbC constructs on both the caller and the callee, and that is what makes Spec# such an important project to come out of Microsoft Research.
Scott Hanselman just posted his interview with the Spec# team on his blog, so if you haven't heard it yet, go download it now. It's a great show, and if you find Spec# useful, it's important that you press Microsoft to ship it as a full feature.
-
Command-Query Separation and Immutable Builders
In one of my previous posts, about Command-Query Separation (CQS) and side effecting functions being code smells, immutable builders were pointed out to me again. For the most part, this has been the one area of CQS that I've been willing to let break. I've been following Martin Fowler's advice on method chaining, and it has worked quite well. But revisiting an item like this never hurts. Immutability is something you'll see me harping on time and time again, now and in the future. The standard rule I follow is: immutable and side effect free where you can, mutable state where you must. I like the opt-in mutability of functional languages such as F#, which I'll cover at some point in the near future, over the opt-out mutability of imperative/OO languages such as C#.
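To make the opt-out point concrete: in C# a field is mutable unless you explicitly say otherwise, so immutability is something you have to ask for. A small sketch (type names are illustrative, not from any real codebase):

```csharp
public class MutableByDefault
{
    public int Count;            // mutable unless you state otherwise
}

public class OptedIn
{
    private readonly int count;  // opting in to immutability via readonly

    public OptedIn(int count) { this.count = count; }
    public int Count { get { return count; } }
}
```

In F# the default is flipped: a plain `let` binding is immutable, and you must write `let mutable` to opt in to mutation.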
Typical Builders
The idea of the standard builder is pretty prevalent in most applications we see today with fluent interfaces. Take for example most Inversion of Control (IoC) containers when registering types and so on:
UnityContainer container = new UnityContainer();
container
.RegisterType<ILogger, DebugLogger>("logger.Debug")
.RegisterType<ICustomerRepository, CustomerRepository>();
Let's take a naive medical claims processing system and build up an aggregate root of a claim. This claim contains such things as the claim information, the claim lines, the provider, the recipient and so on. This is a brief sample and not meant to be the real thing, just a quick example; after all, I'm missing things such as eligibility.
public class Claim
{
public string ClaimId { get; set; }
public DateTime ClaimDate { get; set; }
public List<ClaimLine> ClaimLines { get; set; }
public Recipient ClaimRecipient { get; set; }
public Provider ClaimProvider { get; set; }
}
public class ClaimLine
{
public int ClaimLineId { get; set; }
public string ClaimCode { get; set; }
public double Quantity { get; set; }
}
public class Recipient
{
public string RecipientId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
public class Provider
{
public string ProviderId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
Now, our standard builder uses method chaining, as shown below. As you'll note, we return the instance each and every time.
public class ClaimBuilder
{
private string claimId;
private DateTime claimDate;
private readonly List<ClaimLine> claimLines = new List<ClaimLine>();
private Provider claimProvider;
private Recipient claimRecipient;
public ClaimBuilder() {}
public ClaimBuilder WithClaimId(string claimId)
{
this.claimId = claimId;
return this;
}
public ClaimBuilder WithClaimDate(DateTime claimDate)
{
this.claimDate = claimDate;
return this;
}
public ClaimBuilder WithClaimLine(ClaimLine claimLine)
{
claimLines.Add(claimLine);
return this;
}
public ClaimBuilder WithProvider(Provider claimProvider)
{
this.claimProvider = claimProvider;
return this;
}
public ClaimBuilder WithRecipient(Recipient claimRecipient)
{
this.claimRecipient = claimRecipient;
return this;
}
public Claim Build()
{
return new Claim
{
ClaimId = claimId,
ClaimDate = claimDate,
ClaimLines = claimLines,
ClaimProvider = claimProvider,
ClaimRecipient = claimRecipient
};
}
public static implicit operator Claim(ClaimBuilder builder)
{
return new Claim
{
ClaimId = builder.claimId,
ClaimDate = builder.claimDate,
ClaimLines = builder.claimLines,
ClaimProvider = builder.claimProvider,
ClaimRecipient = builder.claimRecipient
};
}
}
What we have above is a violation of the CQS because we're mutating the current instance as well as returning a value. Remember, that CQS states:
- Commands - Methods that perform an action or change the state of the system should not return a value.
- Queries - Methods that return a result and do not change the state of the system (aka side effect free)
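A quick illustration of the two categories using a simple stack (my example, not from the original post): Push is a command, Peek is a query, and the classic Pop both mutates and returns, which is exactly the shape CQS asks us to split apart.

```csharp
using System;
using System.Collections.Generic;

public class ValueStack
{
    private readonly List<int> items = new List<int>();

    // Command: changes state, returns nothing.
    public void Push(int value) { items.Add(value); }

    // Query: returns a result, no side effects.
    public int Peek() { return items[items.Count - 1]; }

    // Pop mutates and returns a value; strict CQS would split it
    // into Peek() followed by a void-returning Remove().
    public int Pop()
    {
        int top = Peek();
        items.RemoveAt(items.Count - 1);
        return top;
    }
}
```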
Immutable Builders or ObjectMother or Cloning?
When we're looking to reuse our builders, the last thing we want is mutation of shared state. If I'm working against the same provider and somehow change his eligibility, that change would be reflected in everything using the same built-up instance. That would be bad. We have a couple of options here. One is an ObjectMother approach: build up shared builders and request a new one each time. Another is to enforce that we never return this when we add something to the builder. Or we can take a builder at a given state and just clone it. Let's look at each.
public static class RecipientObjectMother
{
public static RecipientBuilder RecipientWithLimitedEligibility()
{
RecipientBuilder builder = new RecipientBuilder()
.WithRecipientId("xx-xxxx-xxx")
.WithFirstName("Robert")
.WithLastName("Smith");
// More built up stuff here for setting up eligibility
return builder;
}
}
This allows me to share state through pre-built builders; when I've finalized one, I just call the Build method or assign it to the appropriate type. Or I could make the builders immutable instead and not have to worry about such things at all. Let's modify the above example to see what that looks like.
public class ClaimBuilder
{
private string claimId;
private DateTime claimDate;
private readonly List<ClaimLine> claimLines = new List<ClaimLine>();
private Provider claimProvider;
private Recipient claimRecipient;
public ClaimBuilder() {}
public ClaimBuilder(ClaimBuilder builder)
{
claimId = builder.claimId;
claimDate = builder.claimDate;
claimLines.AddRange(builder.claimLines);
claimProvider = builder.claimProvider;
claimRecipient = builder.claimRecipient;
}
public ClaimBuilder WithClaimId(string claimId)
{
ClaimBuilder builder = new ClaimBuilder(this) {claimId = claimId};
return builder;
}
public ClaimBuilder WithClaimDate(DateTime claimDate)
{
ClaimBuilder builder = new ClaimBuilder(this) { claimDate = claimDate };
return builder;
}
public ClaimBuilder WithClaimLine(ClaimLine claimLine)
{
ClaimBuilder builder = new ClaimBuilder(this);
builder.claimLines.Add(claimLine);
return builder;
}
public ClaimBuilder WithProvider(Provider claimProvider)
{
ClaimBuilder builder = new ClaimBuilder(this) { claimProvider = claimProvider };
return builder;
}
public ClaimBuilder WithRecipient(Recipient claimRecipient)
{
ClaimBuilder builder = new ClaimBuilder(this) { claimRecipient = claimRecipient };
return builder;
}
// More code here for building
}
So, what we've had to do is provide a copy constructor to initialize the object in the right state; and here I thought I'd left those behind with my C++ days. On each With* call, I create a new ClaimBuilder and pass in the current instance, copying over the old state before applying the change. This makes the class suitable for sharing. Side effect free programming is the way to do it if you can. It does allocate a few extra objects on the heap as you initialize your aggregate root, but for testing purposes I haven't much cared.
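To see why the copy-constructor approach makes builders safe to share, here is a stripped-down, self-contained stand-in for the ClaimBuilder above (fields reduced to two for brevity), where two claims branch from one shared prefix without affecting each other:

```csharp
using System;

public class MiniClaimBuilder
{
    private string claimId;
    private string providerId;

    public MiniClaimBuilder() { }

    // Copy constructor: the new builder starts from the old state.
    private MiniClaimBuilder(MiniClaimBuilder other)
    {
        claimId = other.claimId;
        providerId = other.providerId;
    }

    public MiniClaimBuilder WithClaimId(string id)
    {
        return new MiniClaimBuilder(this) { claimId = id };
    }

    public MiniClaimBuilder WithProviderId(string id)
    {
        return new MiniClaimBuilder(this) { providerId = id };
    }

    public string Build() { return claimId + "/" + providerId; }

    public static void Main()
    {
        // A shared, pre-built prefix (the ObjectMother scenario).
        MiniClaimBuilder shared = new MiniClaimBuilder().WithProviderId("P-1");

        // Each branch gets its own copy; the shared builder is untouched.
        Console.WriteLine(shared.WithClaimId("C-1").Build()); // C-1/P-1
        Console.WriteLine(shared.WithClaimId("C-2").Build()); // C-2/P-1
    }
}
```

Because every With* method returns a fresh copy, no caller can side effect a builder someone else is holding on to.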
Of course, I could throw Spec# into the picture once again to enforce immutability on these builders; being able to mark methods as [Pure] makes the intent apparent to both the caller and the callee. Another option is NDepend, as Patrick Smacchia talked about here.
The other way is just to provide a Clone method that copies the current object, so you can feel free to modify the new copy. This is a pretty easy approach as well.
public ClaimBuilder(ClaimBuilder builder)
{
claimId = builder.claimId;
claimDate = builder.claimDate;
claimLines.AddRange(builder.claimLines);
claimProvider = builder.claimProvider;
claimRecipient = builder.claimRecipient;
}
public ClaimBuilder Clone()
{
return new ClaimBuilder(this);
}
Conclusion
Obeying CQS is always an admirable thing to do, especially when managing side effects. It isn't required all of the time, as with builders, but if you plan on sharing those builders, it's a good idea to think hard about the side effects you're creating. As we move toward multi-threaded, multi-machine processing, we need to be more aware of our side effects. Still, at the end of the day I'm not entirely convinced this violates the true intent of CQS, since a builder isn't really querying, so I'm not sure how much strict adherence buys me. What are your thoughts?
-
Side Effecting Functions Are Code Smells Revisited
After talking with Greg Young for a little while this morning, I realized I missed a few points that need to be covered when it comes to side effecting functions being code smells. In the previous post, I talked about side effect free functions and Design by Contract (DbC) features in regard to Domain Driven Design. And of course I had to throw in the requisite Spec# plug for how it brings DbC features to C#.
Intention Revealing Interfaces
Let's step back a little bit from the discussion we had earlier and talk about good design for a second. How many times have you seen a method and had no idea what it did, or discovered that it went ahead and called 15 other things you didn't expect? At that point, most people take out .NET Reflector (a godsend, BTW) and dig through the internals. One of the violators I ran into when I was first learning was the ASP.NET page lifecycle: Init versus Load versus PreLoad wasn't exact about what happened where, and most people have learned to hate it.
In the Domain Driven Design world, we have the Intention Revealing Interface. What this means is that we need to name our classes, methods, properties, events, etc., to describe their effect and purpose, using the ubiquitous language of the domain. This lets other team members infer what a method does without digging in with tools such as Reflector to see what it is actually doing. In our public interfaces, abstract classes and so on, we need to specify the rules and the relationships. To me, this comes back again to DbC, which lets us specify not only names in the ubiquitous language, but behaviors as well.
Command-Query Separation (CQS)
Dr. Bertrand Meyer, the man behind Eiffel and the author of Object-oriented Software Construction, introduced a concept called Command-Query Separation. It states that we should break our functionality into two categories:
- Commands - Methods that perform an action or change the state of the system should not return a value.
- Queries - Return a result and do not change the state of the system (aka side effect free)
Of course this isn't a 100% rule, but it's still a good one to follow. Let's look at a simple code example of a good command. It's simplified, of course, but what we're doing is side effecting the number of items in the cart.
public class ShoppingCart
{
public void AddItemToCart(Item item)
{
// Add item to cart
}
}
Should we use Spec# to do this, we could also check our invariants and ensure that the number of items in our cart has increased by one.
public class ShoppingCart
{
public void AddItemToCart(Item item)
ensures ItemsInCart == old(ItemsInCart) + 1;
{
// Add item to cart
}
}
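The same postcondition carries over to the Code Contracts API coming in .NET 4.0, where `Contract.OldValue` plays the role of Spec#'s `old(...)`. A self-contained sketch (the backing list and ItemsInCart property are my own filler for the omitted cart internals):

```csharp
using System.Collections.Generic;
using System.Diagnostics.Contracts;

public class Item { }

public class ShoppingCart
{
    private readonly List<Item> items = new List<Item>();

    public int ItemsInCart { get { return items.Count; } }

    public void AddItemToCart(Item item)
    {
        // Equivalent of Spec#'s: ensures ItemsInCart == old(ItemsInCart) + 1;
        Contract.Ensures(ItemsInCart == Contract.OldValue(ItemsInCart) + 1);
        items.Add(item);
    }
}
```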
So, once again, it's very intention revealing at this point that I'm going to side effect the system by adding an item to the cart. Like I said, it's a simplified example, but it's a very powerful concept. Now let's talk about queries. Here's a simple method on a cost calculation service that takes in a customer and an item and calculates the cost.
public class CostCalculatorService
{
public double CalculateCost(Customer c, Item i)
{
double cost = 0.0d;
// Calculate cost
return cost;
}
}
What I'm not doing in this example is modifying the customer or the item. Therefore, if I'm using Spec#, I can mark this method as being [Pure]. And that's a good thing.
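In Code Contracts terms, the same intent is declared with the `[Pure]` attribute from `System.Diagnostics.Contracts`, which tells both readers and the tooling that the method doesn't mutate visible state. A sketch along the lines of the example above (the UnitPrice and DiscountRate fields are illustrative stand-ins for the omitted calculation):

```csharp
using System.Diagnostics.Contracts;

public class Customer { public double DiscountRate; }
public class Item { public double UnitPrice; }

public class CostCalculatorService
{
    // [Pure] documents that neither the customer nor the item
    // is modified by this query.
    [Pure]
    public double CalculateCost(Customer c, Item i)
    {
        return i.UnitPrice * (1.0 - c.DiscountRate);
    }
}
```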
The one thing I would hold out an exception for is fluent builders. Martin Fowler lays out an excellent case for them here. With a builder we're side effecting the system and also returning a value (the builder itself), so the rule isn't hard and fast, but it's always good to observe. Let's take a look at a builder that violates it.
public class CustomerBuilder
{
private string firstName;
public static CustomerBuilder New { get { return new CustomerBuilder(); } }
public CustomerBuilder WithFirstName(string firstName)
{
this.firstName = firstName;
return this;
}
// More code goes here
}
To wrap things up: these are not hard and fast rules, and they always come with an "It Depends", but the usual rule is that you can't go wrong with CQS.
Wrapping It Up
These rules are quite simple and reveal the true intent of your application while using the domain's ubiquitous language. As with anything in our field, they come with a big fat "It Depends", but applying them as much as you can is definitely to your advantage. They are simple, often overlooked fundamentals of application design.