Archives

The Unit Testing Story in F# Revisited

Thursday, May 29, 2008
F# TDDBDD
24 Comments

Last week I posted about some troubles I was having with the unit testing frameworks for F#. Today, Brad Wilson announced the release of xUnit.net 1.0.1 which addressed the change in the F# compiler as well as integration with ASP.NET MVC Preview 3 which was just released. As always you can find the latest bits on CodePlex. There was a change in the way F# was compiling the modules as static classes which was not expected in previous versions.

Running the Tests Again

Now, I'm able to run my functions just as before and the runner will recognize them. Below is a simple example of some unit tests to determine whether numbers are prime or not. I'm extending System.Int32 to add an instance property that determines whether the number is prime. Just to prove a point on how flexible F# really is, I'm also able to extend the Int32 class using static methods, something that you cannot do with C# and extension methods. More and more, I love the language itself and find myself trapped sometimes by the limits of C#. But, that's sidetracking, so let's get to the unit tests.

#light

#R @"D:\Tools\xunit-1.0.1\xunit.dll"

open Xunit

let isPrimeNumber(i) =
    let limit = int(sqrt(float(i)))
    // Trial division: i is prime when no j in 2..limit divides it.
    let rec check j =
        j > limit || (i % j <> 0 && check(j + 1))
    // 0, 1 and negative numbers are not prime.
    i > 1 && check 2

type System.Int32 with
    member i.IsPrime
        with get () = isPrimeNumber(i)

    static member IsPrimeNumber(x) =
        isPrimeNumber(x)

[<Fact>]
let IsPrime_WithPrimeNumber_ShouldReturnTrue() =
    Assert.True((7).IsPrime)

[<Fact>]
let IsPrime_WithNonPrimeNumber_ShouldReturnFalse() =
    Assert.False((21).IsPrime)
    Assert.False(System.Int32.IsPrimeNumber(45))

And then when I run it through the GUI runner, sure enough, I get two passing tests. I was asked last week at the Philly ALT.NET meeting about TDD with F#. I see no problem with this at all, and in fact I actively encourage it. But, you have to think about it in a different light when you move from talking about objects and behaviors to talking about functions and behaviors.

Getting Going with Gallio

As I mentioned last time, Jeff Brown has been hard at work to support the F# community as well. I was able to get the right build going of Gallio finally after there may have been some mixups with getting the latest code. Anyhow, I am now able to get these same tests to work, but using MbUnit version 3 and through the Gallio Icarus Runner. If you're not familiar with Gallio, it is an open platform of tools and runners that is extensible to all testing frameworks. Jeff Brown talked about it on Hanselminutes with Brad Wilson of xUnit.net fame, Roy Osherove and Charlie Poole of NUnit on the Past, Present and Future of Unit Testing Frameworks.

So, let's just modify the above code to migrate to Gallio with MbUnit version 3 and see how we do:

#light

#R @"D:\Program Files\Gallio\bin\Gallio.dll"
#R @"D:\Program Files\Gallio\bin\MbUnit.dll"

open MbUnit.Framework

let isPrimeNumber(i) =
    let limit = int(sqrt(float(i)))
    // Trial division: i is prime when no j in 2..limit divides it.
    let rec check j =
        j > limit || (i % j <> 0 && check(j + 1))
    // 0, 1 and negative numbers are not prime.
    i > 1 && check 2

type System.Int32 with
    member i.IsPrime
        with get () = isPrimeNumber(i)

    static member IsPrimeNumber(x) =
        isPrimeNumber(x)

[<Test>]
let IsPrime_WithPrimeNumber_ShouldReturnTrue() =
    Assert.IsTrue((7).IsPrime)

[<Test>]
let IsPrime_WithNonPrimeNumber_ShouldReturnFalse() =
    Assert.IsFalse((21).IsPrime)
    Assert.IsFalse(System.Int32.IsPrimeNumber(45))

And we can notice through the Gallio runner that it's only detecting the MbUnit tests right now, unfortunately. Hopefully that issue will be resolved soon.

BDD Specs in F#?

F# is a pretty flexible language for unit testing and even BDD style. I wonder if we could take some lessons from the specs BDD framework for Scala and apply them to F#. Just a thought...

If you're not familiar with specs, it's a BDD framework with some interesting syntax that I'm still coming to terms with. But the concept looks interesting. Take a look at a quick example and see if it speaks to you.

package podwysocki.specs

import org.specs._

object scalaSpecExample extends Specification {
  "A hello world spec" should {
    "return something" in {
      "hello" mustBe "hello"
    }
  }
}

As I've played around with Scala, this is a pretty interesting concept. I'm much more a fan of F# as a language, but still there are some interesting pieces to Scala. I'm also interested in using MSpec from Aaron Jensen at some point, but have a bit on my plate and other points of focus right now.

Wrapping It Up

In the meantime, we have another testing framework to consider. Me, personally, I prefer xUnit.net because of the functional aspects of Assert.Throws and so on. But, that choice is up to you, quite frankly. There is a good story to be told here with regards to unit testing and F# that is not to be overlooked.

Static versus Dynamic Languages - Attack of the Clones

Wednesday, May 28, 2008
C# F# Ruby Spec#
No Comments

Very recently there has been an ongoing debate between static and dynamically typed languages. Since it seems that there have been some Star Wars references, I thought I'd add my own. I originally wanted to cover this as part of the future of C#, but I think it deserves its own topic. There have been many voices in the matter; I've read all sides and thought I'd weigh in. Right now I find myself with my feet in the statically typed community. I do appreciate dynamic typing and it definitely has its uses, but to me the static verification is a key aspect. But, of course I do appreciate dynamic languages, especially those of the past including Lisp, Erlang, etc.

Here are some of the salvos that have been fired so far:

The Salvos Fired

First, Steve Yegge posted a transcript from his talk at Stanford called "Dynamic Languages Strike Back". In this talk, he talks about the history of dynamic languages, the performance, what can be done, and the politics of it all. But at the end of the day, it comes down to the tools used. It was a pretty interesting talk, but of course it dredged up some pretty strong feelings. In turn, you had responses from Cedric Beust coming out in favor of statically typed languages, and Ted Neward, Ola Bini and Greg Young analyzing the results of the two of them. I won't get into the me too aspect of it all, but I encourage you to read the posts, and the responses as well.

Where I think Cedric lost me is when he brought Scala into the argument. To me, it was kind of nonsensical to mention it in this case. And to mention that pattern matching is a leaky abstraction is unfortunate and I think very wrong. The thing that functional languages give us is the ability to express what we want, and not necessarily how to get it. Whether the compiler puts it in a switch statement, an if statement, or anything else doesn't matter, as long as the decision tree is followed. I don't see any leakiness here. So, that was a bad aside. I'm not a huge fan of Scala either, but for entirely different reasons. First off, the type inference isn't really as strong as it should be and the syntax to me just doesn't seem to be as functional as I'd like. F# and Scala tackle the problems in vastly different ways.

Ola Bini, who has been advocating the polyglot programmer for some time, summed up the Steve versus Cedric posts very concisely in these two paragraphs:

So let's see. Distilled, Steve thinks that static languages have reached the ceiling for what's possible to do, and that dynamic languages offer more flexibility and power without actually sacrificing performance and maintainability. He backs this up with several research papers that point to very interesting runtime performance improvement techniques that really can help dynamic languages perform exceptionally well.

On the other hand Cedric believes that Scala is bad because of implicits and pattern matching, that it's common sense to not allow people to use the languages they like, that tools for dynamic languages will never be as good as the ones for static ones, that Java generics isn't really a problem, that dynamic language performance will improve but that this doesn't matter, that static languages really hasn't failed at all and that Java is still the best language of choice, and will continue to be for a long time.

It seems that many of the modern dynamic languages are pretty flexible, but also not as performance oriented as the ones in the past. Why is this? It's a good question to ask. And what can be done about it? Of course Ola takes the tack, and I think correctly so, that the tooling won't be the same or as rich for dynamic languages as it is for statically typed ones. It simply can't be. But that doesn't mean that those tools won't exist, they'll just be different. But at the end, Ola argues for the polyglot programmer and each language to its strength. He talks a bit more about this with Mike Moore on the Rubiverse podcast here.

Impedance Mismatch?

There was a topic discussed at the ALT.NET Open Spaces, Seattle fishbowl on the polyglot programmer which talked about the impedance mismatch between statically typed languages and dynamic ones. What's great is that Greg Young got together a session with Rustan Leino and Mike Barnett from Microsoft Research on the Spec# team, John Lam from the IronRuby team, and me. It was a great discussion which revolved around the flexibility that dynamic languages give you versus the static verification that you give up in exchange. And there is a balance to be had. The flexibility that Ruby and other dynamic languages give you also creates a bit more responsibility for ensuring correctness. It's a great conversation and well worth the time invested. But one of the benefits we're seeing from the CLR and in turn the DLR is the interop story, so that you could have your front end be Ruby, service layer in C#, rules engine in F#, Boo for configuration and so on.

Anders Hejlsberg on C# And Statically Typed Languages

As I noted earlier, Anders Hejlsberg was on Software Engineering Radio Episode 97 to discuss the future of C#. Although Anders has his foot firmly in the statically typed camp, he sees the value of dynamic dispatch. The phrase that was used and quite apt was "Static Programming but Dynamically Generated". I think the metaprogramming story in C# needs to be improved for this to happen. Doing Reflection.Emit isn't the strongest story for doing this, and certainly not easy.

Where I think that C# can go however is more towards making DSL creation much easier. Boo, F# and others on the .NET platform are statically typed, yet go well beyond what C# can do in this arena. Ayende has been doing a lot with Boo and making the language, although statically typed, very flexible and readable. Ruby has a pretty strong story here, and C# and other languages have some lessons they can learn.

Another example is that Erlang is a dynamic language, yet very concurrent and pretty interesting. C# and other .NET languages can learn a bit from Erlang. I'm not sure Erlang itself will be taking off, as it would need some sort of sponsorship and some better frameworks before it could. F# has learned some of those lessons in terms of messaging patterns, but not in terms of recovery and process isolation just yet. I covered a bit of that in my previous post.

Wrapping It Up

It's a pretty interesting debate, and at the end of the day, it really comes down to what language meets your needs. The .NET CLR has a pretty strong story of allowing other languages to interoperate that nicely complements the polyglot. But, I don't think that static typing is going the way of the dodo and I also don't think dynamic typing will win the day. Both have their places. Sounds like a copout, I know, but deal with it. I have a bit more to discuss on this matter, especially about learning lessons from Erlang, one of the more interesting languages that has seen a resurgence lately.

DC ALT.NET May Wrapup - Common Lisp and Applying Lessons Learned

Saturday, May 24, 2008
ALT.NET Domain Specific Languages Functional Programming User Groups
1 Comment

Last night's DC ALT.NET meeting was a great success. We had Craig Andera, of PluralSight and FlexWiki fame, talk to us about Common Lisp and some of the lessons he learned. It was great to see the guys from the FringeDC group join us as well. I can definitely see a lot of overlap between the two groups as we both struggle to find new and innovative ideas for solving our hardest problems. We tend to look outside of our community to find what has worked and what hasn't worked for each community. Because at the end of the day, we're all developers with just different backgrounds. I want to thank the Motley Fool for hosting the event for us, and we'd love to come back if you'll have us.

The Presentation

Craig spent much of his summer vacation actually learning Common Lisp. The original idea was to learn Ruby, but why not go back to the grandfather of them all, Lisp. I remembered Lisp from back in the college years, but had forgotten most of it. It was very interesting to watch the presentation and learn how flexible a language it is, since we're just dealing with expression trees. I can definitely see where other languages got their heritage. All good things tend to come back to Lisp and Smalltalk!

Anyhow, the important features we covered were object oriented programming with Common Lisp Object System (CLOS), macros, defining properties and methods, and even the .NET interop story with IronScheme and others. It was a great time and I learned a lot. If you wish to grab his presentation notes, you can find them here.

Next time, I'll be presenting F#, so we'll keep on the functional programming style with some OOP mixed in, so I hope we see another great crowd for that. As always, join the mailing list here if you want to learn more.

What Is the Future of C# Anyways?

Saturday, May 24, 2008
.NET C# F# Spec#
3 Comments

I've often been asked during my presentations on F# and Functional C# about the future direction of C# and where I think it's going. Last night I was pinged about this during my F# talk at the Philly ALT.NET meeting. The question was: why bother learning F#, when eventually I'll get these things for free once they steal them and bring them to C#? Being the language geek that I am, I'm pretty interested in this question as well. Right now, the language itself keeps evolving at a rather quick pace as compared to C++ and Java. And we have many developers that are struggling to keep up with the evolution of the language to a more functional style with LINQ, lambda expressions, lazy evaluation, etc. There are plenty of places to go with the language and a few questions to ask along the way.

An Interview With Anders Hejlsberg

Recently on Software Engineering Radio, Anders Hejlsberg was interviewed about the past, present and future of C# on Episode 97. Of course there are interesting aspects of the history of his involvement with languages such as Turbo Pascal and Delphi and some great commentary on Visual Basic and dynamic languages as well. But, the real core of the discussion was focused around which problems we need to solve next. And how will C# handle some of these features? Will they be language constructs or built into the framework itself?

Let's go through some of the issues discussed.

Concurrency Programming

Concurrency programming is hard. Let's not mince words about it. Once we start getting into multiple processors and multiple cores, this becomes even more of an issue. Are we using the machine effectively? It's important because with the standard locks/mutexes it's literally impossible to have shared memory parallelism with more than two processors without at some point blocking and running serially.

The frameworks and the languages themselves, as currently designed, do not make concurrency easy. The Erlang guys of course would disagree since they started with that idea from the very start. Since things are sandboxed to a particular process, they are free to mutate state to their heart's content, and then when they need to talk to another process, they pick up the data and completely copy it over, so there is a penalty for doing so. Joe Armstrong, the creator of Erlang, covered a lot of these things in his Erlang book "Programming Erlang: Software for a Concurrent World".

Mutable State

Part of the issue concerning concurrency is the idea of mutable state. As far back as I remember, we were always taught in CS classes that you can feel free to mutate state as need be. But, that only really works when you've got a nicely serial application where A calls B calls C calls D and all on the same thread. That's a fairly limiting idea as we start to scale out to multiple threads, machines and so on. Instead, we need to focus on the mutability and control it in a meaningful way through not only the use of our language constructs, but our design patterns as well.

In the C# world, we have the ability to create opt-in immutability through the use of the readonly keyword. This is really helpful for marking those fields that we don't really need to or want to modify. This also helps the JIT better determine the use of our particular variable. I'm not sure about performance gains, but that's not really the point of it all, anyways. Take the canonical example of the 2D point such as this:

public class Point2D
{
    private readonly double x;
    private readonly double y;

    public Point2D() { }

    public Point2D(double x, double y)
    {
        this.x = x;
        this.y = y;
    }

    public double X { get { return x; } }

    public double Y { get { return y; } }

    public Point2D Add(Size2D size)
    {
        // Width offsets the x coordinate, Height the y coordinate.
        return new Point2D(x + size.Width, y + size.Height);
    }
}

We've created this class so as not to allow mutable state, instead returning a new object that you are free to work with. This of course is a positive thing. But, can we go further in a language than just this? I think so, and I think Anders does too. Spec# and design by contract can take this just a bit further in this regard. What if I can state that my object, as it is, is immutable? That would certainly help the compiler to optimize. Take for example doing Value Objects in the Domain Driven Design world. How would something like that look? Well, let's follow the Spec# example and mark my class as being immutable, meaning that once I initialize it, I cannot change it for any reason:

[Immutable]
public class Point2D
{
   // Class implementation the same
}

This helps make it more transparent to the caller and the callee that what you have cannot be changed. This enforces the behaviors for my member variables in a pretty interesting way. Let's take a look at the actual C# generated in Spec# for the above code. I'll only paste the relevant information about what it did to the properties. I'll only look at the X, but the identical happened for the Y as well.

public double X
{
    [Witness(false, 0, "", "0", ""), Witness(false, 0, "", "1", ""), Witness(false, 0, "", "this@ClassLibrary1.Point2D::x", "", Filename=@"D:\Work\SpecSharpSamples\SpecSharpSamples\Class1.ssc", StartLine=20, StartColumn=0x21, EndLine=20, EndColumn=0x22, SourceText="x"), Ensures("::==(double,double){${double,\"return value\"},this@ClassLibrary1.Point2D::x}", Filename=@"D:\Work\SpecSharpSamples\SpecSharpSamples\Class1.ssc", StartLine=20, StartColumn=20, EndLine=20, EndColumn=0x17, SourceText="get")]
    get
    {
        double return value = this.x;
        try
        {
            if (return value != this.x)
            {
                throw new EnsuresException("Postcondition 'get' violated from method 'ClassLibrary1.Point2D.get_X'");
            }
        }
        catch (ContractMarkerException)
        {
            throw;
        }
        double SS$Display Return Local = return value;
        return return value;
    }
}

What I like about F# and functional programming is the opt-out mutability, which means my classes, lists, structures and so on are immutable by default. So, this makes you think long and hard about any particular mutability you want to introduce into your program. It's not to say that there can be no mutability in your application, but you need to think about it, and isolate it in a meaningful manner. Haskell takes a more hardline stance on the issue, and mutability can only occur in monadic expressions. If you're not aware of what those are, check out F# workflows, which are analogous. But by default, we get code that looks like this and is immutable:

type Point2D = class
    val x : double
    val y : double

    new() = { x = 0.0; y = 0.0 }

    new(x, y) =
        {
          x = x
          y = y
        }

    member this.X
        with get() = this.x

    member this.Y
        with get() = this.y
end

So, as you can see, I'm not having to express the immutability, only the mutability if I so choose. Very important differentiator.
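The same opt-out idea can be sketched with an F# record instead of the class above. This is only an illustration under my own assumptions: the Point, translate and Counter names are hypothetical and do not come from the original sample.

```fsharp
// Hypothetical record type: every field is immutable unless marked otherwise.
type Point = { X: float; Y: float }

// "Changing" a point means building a new one with a copy-and-update expression.
let translate dx dy (p: Point) = { p with X = p.X + dx; Y = p.Y + dy }

let origin = { X = 0.0; Y = 0.0 }
let moved = translate 1.0 2.0 origin   // origin itself is left untouched

// Mutability has to be requested explicitly with the 'mutable' keyword.
type Counter = { mutable Count: int }

let counter = { Count = 0 }
counter.Count <- counter.Count + 1
```

Trying to assign to origin.X would be rejected by the compiler, while the Counter field can be updated because it opted in.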

Method Purity

Method purity is another important topic as we talk about concurrent programming and such. What I mean by this is that I'm not going to modify the incoming parameters or cause side effects; instead I will produce a new object. This has lasting effects if I'm going to be doing things on other threads. Eric Evans talked about this topic briefly in his Domain Driven Design book on Supple Design. The idea is to have side effect free functions as much as you can, and carefully control where you mutate state through intention revealing interfaces and so on.

But, how do you communicate this? Well, Command-Query Separation gets us part of the way there. That's the idea of having the mutation and side effects in your command functions where you return nothing, and then queries which return data but do not modify state. Spec# can enforce this behavior as well. To be able to mark our particular functions as being pure is quite helpful in communicating whether I can expect a change in state. Therefore I know whether I have to manage the mutation in some special way. To communicate something like that in Spec#, all I have to do is something like this:

[Pure]
public Point2D Add(Size2D size)
    requires size != null;
{
    // Width offsets the x coordinate, Height the y coordinate.
    return new Point2D(x + size.Width, y + size.Height);
}

This becomes part of the method contract and some good documentation as well for your system.
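Outside Spec#, nothing enforces purity for you, but the command/query split itself can still be sketched in plain F#. The Account type and balanceAfter function below are hypothetical names chosen for illustration, not part of the original example.

```fsharp
// Query: a pure function. It only reads its inputs and returns a new value,
// so it can safely be called from any thread.
let balanceAfter (deposits: int list) (opening: int) : int =
    opening + List.sum deposits

// Command: the one place where state changes. It returns unit,
// which signals to the caller that it exists only for its side effect.
type Account(opening: int) =
    let mutable balance = opening
    member this.Balance = balance                 // query
    member this.Deposit(amount: int) : unit =     // command
        balance <- balance + amount

let account = Account(100)
account.Deposit(50)
let projected = balanceAfter [10; 20] account.Balance
```

The return types do the communicating here: anything returning unit is a candidate for careful handling, anything else should be safe to call repeatedly.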

Asynchronous Communication and Messaging

Another piece of interest is messaging and process isolation. The Erlang guys figured out a while ago that you can have mutation as well as mass concurrency, fail safety and so on with process isolation. Two ideas come to mind from other .NET languages. An important distinction must be made between two concurrency models: shared-memory concurrency and message-passing concurrency. Messaging and asynchronous communication are key foundations for concurrent programming.

In F#, there is support for mailbox-style message processing. This is already popular in Erlang, hence probably where the idea came from. The idea is that a mailbox is a message queue that you can listen to for a message that is relevant to the agent you've defined. This is implemented in the MailboxProcessor class in the Microsoft.FSharp.Control.Mailboxes namespace. A simple receive looks like this:

#light

#nowarn "57"

open Microsoft.FSharp.Control.CommonExtensions
open Microsoft.FSharp.Control.Mailboxes

let incrementor =
    new MailboxProcessor<int>(fun inbox ->
        let rec loopMessage(n) =
            async {
                do printfn "n = %d" n
                let! message = inbox.Receive()
                return! loopMessage(n + message)
            }
        loopMessage(0))

Robert Pickering has more information about the Erlang style message passing here.
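The receive loop above targets the namespaces of the F# release of the time. As a rough sketch against current FSharp.Core, where MailboxProcessor is available without any extra open statements, the same agent could also answer queries about its running total. The Message type and its Add and Get cases are hypothetical names introduced for this illustration.

```fsharp
// Hypothetical message type: either add to the running total or ask for it.
type Message =
    | Add of int
    | Get of AsyncReplyChannel<int>

let incrementor =
    MailboxProcessor<Message>.Start(fun inbox ->
        // The running total lives only in the loop argument, so no locks are needed.
        let rec loop total =
            async {
                let! message = inbox.Receive()
                match message with
                | Add n -> return! loop (total + n)
                | Get reply ->
                    reply.Reply total
                    return! loop total
            }
        loop 0)

// Messages are queued and handled one at a time, in the order they were posted.
incrementor.Post(Add 5)
incrementor.Post(Add 7)
let total = incrementor.PostAndReply(fun channel -> Get channel)
```

Because the agent handles one message at a time, the state never needs to be shared; callers only ever see it through a reply.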

Now, let's come back just a second. Erlang also introduces another concept that Sing# and the Singularity OS took up. It's a concept called the Software Isolated Process (SIP). The idea is to isolate your processes in a little sandbox. Therefore if you load up a bad driver or something like that, the process can die and then spin up another process without having killed the entire system. That's a really key part of Singularity and quite frankly one of the most intriguing. Galen Hunt, the main researcher behind this talked about this on Software Engineering Radio Episode 88. He also talks about it more here on Channel9 and it's well worth looking at. You can also download the source on CodePlex and check it out.

Dynamic C#?

As you can probably note, Anders is pretty much a static typing fan and I'd have to say that I'm also firmly in that camp as well. But, there are elements that are intriguing such as metaprogramming and creating DSLs which are pretty weak in C# as of now. Sure, people are trying to bend C# in all sorts of interesting ways, but it's not a natural fit as the language stands now. So, I think there can be some improvements here in some areas.

Metaprogramming

Metaprogramming is another area that was mentioned as a particularly interesting aspect. As of right now, it's not an easy fit to do this with C#. But once again, F# has features such as quotations built in for metaprogramming, because that's what it was created to be: a language built to create other languages. Tomas Petricek is by far one of the authorities on the subject as he has leveraged it in interesting ways to create AJAX applications. You can read about his introduction to metaprogramming here and his AJAX toolkit here. Don Syme has also written a paper about leveraging Meta-programming with F# which you can find here. But I guess I have to ask the question, does C# need this or shouldn't we just use F# for what it's really good at and not shoehorn yet another piece onto the language? Or the same could be said of Ruby and its power with metaprogramming as well, why not use the best language for the job?
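To give a flavor of what quotations look like, here is a minimal sketch written against current FSharp.Core. It captures an arithmetic expression as data and walks the resulting tree with a tiny interpreter. The eval function is a hypothetical helper of my own; it handles only integer literals, addition and multiplication, and is not taken from the articles linked above.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.DerivedPatterns

// Walks a quoted expression tree and evaluates the small subset it understands.
let rec eval (expr: Expr) : int =
    match expr with
    | Int32 n -> n
    | SpecificCall <@@ (+) @@> (_, _, [left; right]) -> eval left + eval right
    | SpecificCall <@@ (*) @@> (_, _, [left; right]) -> eval left * eval right
    | _ -> failwithf "unsupported expression: %A" expr

// <@@ ... @@> captures the code as an untyped expression tree instead of running it.
let quoted = <@@ 1 + 2 * 3 @@>
let result = eval quoted
```

The same tree could just as well be translated to another language rather than evaluated, which is the kind of use the AJAX work mentioned above builds on.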

Dynamic Dispatch

The idea of dynamic dispatch is an interesting idea as well. This is the idea that you can invoke a method on an object that doesn't exist, and instead, the system figures out where to send it. In Ruby, we have the method_missing concept which allows us to define that behavior when that method that is being invoked is not found. Anders thought it was an intriguing idea and it was something to look at. This might help in the creation of DSLs as well when you can define that behavior even though that method may not exist at all.
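F# leaves room for something similar through its dynamic lookup operator, which has no built-in meaning until you define one. The sketch below is only an illustration of the method_missing idea under my own assumptions: DynamicBag and its Set and Resolve members are hypothetical names, and this is not how Ruby or any future C# feature is implemented.

```fsharp
open System.Collections.Generic

// Hypothetical container whose "members" are looked up by name at run time.
type DynamicBag() =
    let values = Dictionary<string, obj>()
    member this.Set(name: string, value: obj) = values.[name] <- value
    // Plays the role of Ruby's method_missing: every lookup lands here,
    // including names that were never defined.
    member this.Resolve(name: string) : obj =
        match values.TryGetValue(name) with
        | true, value -> value
        | _ -> box (sprintf "no member named '%s'" name)

// F# rewrites bag?Port into (?) bag "Port", so the member name is just a string.
let (?) (bag: DynamicBag) (name: string) : obj = bag.Resolve(name)

let bag = DynamicBag()
bag.Set("Port", box 8080)
let port : int = unbox (bag?Port)
let host : string = unbox (bag?Host)
```

Here the fallback simply returns a message, but it could equally forward the call somewhere else, which is what makes the technique attractive for DSLs.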

In the Language or the Framework?

Another good question though is do these features belong in the language itself or in the framework? The argument here is that if you put a lot of constraints on the language syntax, then you might prematurely age the language and, as a result, see it decline in usage. Instead, the idea is to focus on the libraries to make these things available. For example, the MailboxProcessor functionality being brought to all languages might not be a bad idea. Those sorts of concepts around process isolation would be more of a framework concept than a language concept. But, it's an interesting debate as to what belongs where. Because at the end of the day, you do need some language differentiation when you use C#, F#, Ruby, Python, C++, etc., or else what's the point of having all of them? To that point I've been dismayed that VB.NET and C# have mirrored each other pretty well and tried to make themselves equal, and I wish they would just stop. Let VB find a niche and let C# find its niche.

Conclusion

Well, I hope this little discussion got you thinking as well about the future of C# and the future of the .NET framework as well. What does C# need in order to better express the problems we are trying to solve? And is it language specific or does it belong in the framework? Shouldn't we just use the best language for the job instead of everything by default being in C#? Good questions to answer, so now discuss...

F# and Unit Testing - Some New Developments

Tuesday, May 20, 2008
F# TDDBDD
2 Comments

This past week, I've been focusing a lot of my attention on F# in terms of the presentations that I have been giving. I'm busy preparing for the Philly ALT.NET meeting tomorrow night on the very subject. An important aspect of some of the presentation has been unit testing. There is some good news and some not so good news when it comes to this. Those who have been following my pursuit of good unit tests in F# know that xUnit.net has been a good option for being able to create static unit tests inside my modules instead of the pomp and circumstance of creating a new class and having member functions.

MbUnit Support for F#

Very recently Jeff Brown announced on his blog that he's now supporting tests without the requirement for the TestFixtureAttribute to be marked on your class in MbUnit. This is quite helpful for F# tests and has joined the ranks of xUnit.net in terms of giving me another tool in my toolbelt. There were other bugs that were filed that also were hindering good unit testing in F# that have been worked out as well.

So, I should be able to do this below and everything should just work:

#light

#R @"D:\Program Files\Gallio\bin\MbUnit2\MbUnit.Framework.dll"
open MbUnit.Framework

let FilterCall protocol port =
    match (protocol, port) with
    | "tcp", _ when port = 21 || port = 23 || port = 25 -> true
    | "http", _ when port = 80 || port = 8080 -> true
    | "https", 443 -> true
    | _ -> false

[<Test>]
let FilterCall_WithHttpAndPort80_ShouldReturnTrue() =
    Assert.IsTrue(FilterCall "http" 80)

But... this is not the case. It doesn't recognize that my tests exist. Why?

The New F# Release

With the newest release of F#, version 1.9.4.15, there was a change made that took the classes that encapsulated the tests and made it a static class. So, if I were to look through .NET Reflector, it would look like this:

[CompilationMapping(SourceLevelConstruct.Module)]
public static class MbUnitTesting
{
    // Methods
    public static bool FilterCall(string protocol, int port) /// Method under test

    [Test]
    public static void FilterCall_WithHttpAndPort80_ShouldReturnTrue()
    {
        Assert.IsTrue(FilterCall("http", 80));
    }
}

This can be a problem because, through reflection, any static class shows up as abstract, since you cannot create an instance of it. This is a problem for the unit testing frameworks, which cannot process abstract classes yet. So, this is a work in progress, but there has to be some strategy to get around this, as we have no way in reflection to determine if it is a static class easily.
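Reflection has no single flag that says "static class", but one commonly used heuristic relies on the fact that static classes are emitted as both abstract and sealed. The sketch below is offered as an assumption about how a runner could tell the difference, not as what xUnit.net or MbUnit actually do; isStaticClass is a hypothetical helper name.

```fsharp
open System

// In CLR metadata a static class is emitted as both abstract and sealed,
// a combination that ordinary classes cannot have.
let isStaticClass (t: Type) = t.IsClass && t.IsAbstract && t.IsSealed

let mathIsStatic = isStaticClass (Type.GetType("System.Math"))   // a static class
let stringIsStatic = isStaticClass typeof<string>                // sealed, not abstract
let streamIsStatic = isStaticClass typeof<System.IO.Stream>      // abstract, not sealed
```

A test runner using a check like this could keep skipping genuinely abstract fixtures while still discovering tests in static classes.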

The Workaround

The workaround for the issue is pretty simple, which is to actually use classes when creating your unit tests in F#. I know it's a little bit of a pain, but the unit testing teams are aware of the issue and hopefully we'll have a fix soon enough. But, in the meantime, we'll have to create classes such as this in MbUnit:

[<TestFixture>]
type MbUnitTests = class
    new() = {}

    [<Test>]
    member x.FilterCall_WithHttpAndPort80_ShouldReturnTrue() =
        Assert.IsTrue(FilterCall "http" 80)

end

or in xUnit.net

type XUnitTests = class
    new() = {}

    [<Fact>]
    member x.FilterCall_WithHttpAndPort80_ShouldReturnTrue() =
        Assert.True(FilterCall "http" 80)

end

Then the Gallio Icarus Runner is free to pick up the tests and run them as expected. Like I said, hopefully the issue will be fixed soon.

NoVA Code Camp Wrapup and Thoughts

Monday, May 19, 2008
C# F# Functional Programming User Groups
1 Comment

This past weekend was the Northern Virginia Code Camp in Reston, Virginia. There was a pretty good turnout for my two sessions which were the first two of the day. Unfortunately, I could not stay the whole day to attend some of the other sessions, including fellow DC ALT.NET'er John Morales on NServiceBus. I'll have to catch it soon enough because the ideas around it are pretty intriguing. I've played with TIBCO and a few others, so another tool in my toolbelt is not a bad thing. I did two sessions, one on Functional C# and the other an introduction to F#. I'm not quite ready to post my slides as I have a few more presentations on the subject to give and I'm still tweaking them as I go, so they will be a bit more refined.

Lessons Learned For Me

One of the things I came away with is that I need to schedule a little better. I would have much preferred to have the F# and Foundations of Functional Programming talk come first, as it would give people more of a basis for what functional programming is and how it is expressed in a more purely functional language like F#. Next time I should be a bit more upfront about this and get the schedule changed accordingly. Giving two sessions in a row is a situation which could be improved as well.

Functional C#

The first talk I gave was on Functional C#. The goal was to take ideas from the more pure functional programming of Haskell, OCaml and F# and apply them in a C#-ish manner. Some things in functional programming languages, such as pattern matching, don't translate easily, so there are things that apply and some that don't.

Some of these lessons include:
- Immutable types
  Focus on types that are immutable by default with opt-in mutability, as in F#, instead of mutable by default with opt-in immutability, as is the case in C#. Remember, I've been talking about this in the context of multi-threaded, parallel programming, where it is absolutely crucial to mutate only in very controlled circumstances, kept in isolation. This also applies to the Domain Driven Design world where I was coming from, in regards to Value Objects and supple design.
- Side Effect Free (Pure) Functions
  The idea here is to control the side effects in your system. Ideally in the functional programming world, calling a function such as myList |> List.map (fun x -> x * x) returns another list, not the list you gave it with mutated state. This is important once again as we get into the concurrent programming paradigm, to focus on method purity and follow the Command Query Separation (CQS) principle. Once again, this has roots in Domain Driven Design as well, when following Intention Revealing Interfaces and Supple Design.
- Functions as First Class Types
  The delegate in the .NET world has made the function pointer a first class citizen. With the use of extension methods, generics and lambda expressions, we are now able to take full advantage of such critical operations as Reduce, Filter, Map and other higher-order functions. Other areas in this realm include currying and partial application of functions.
- Lazy Evaluation
  In functional programming we have the ability to specify infinite sequences, such as all Fibonacci numbers or some other number sequence. The last thing we'd want is to evaluate all of that just to get the length. Haskell takes the approach of being lazy by default, but that's not practical in a framework like .NET, where we want deterministic behavior in the execution order of our code. So, instead, languages like C# and F# are eager evaluators. That doesn't mean we cannot take advantage of laziness. In fact, in .NET 2.0 and beyond, IEnumerable<T> is a somewhat lazy execution model: each value in the collection is only calculated when we call MoveNext(). So, when you think about it, LINQ follows that delayed execution model, which is pretty powerful for working with large sequences.
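
The four lessons above can be sketched in a few lines of F#. This is only an illustration with names I made up for the occasion (numbers, addFive, fibs and so on), not code from the talk.

```fsharp
// Immutable by default: 'numbers' cannot be reassigned or modified in place.
let numbers = [1; 2; 3; 4]

// Pure function: List.map builds a new list and leaves its input untouched.
let squares = numbers |> List.map (fun x -> x * x)

// Functions as first class values, with partial application.
let add x y = x + y
let addFive = add 5

// Higher-order operations: filter, then reduce.
let sumOfEvens =
    numbers
    |> List.filter (fun x -> x % 2 = 0)
    |> List.reduce (+)

// Lazy evaluation: an infinite Fibonacci sequence, computed only on demand.
let fibs = Seq.unfold (fun (a, b) -> Some(a, (b, a + b))) (0, 1)
let firstTen = fibs |> Seq.take 10 |> Seq.toList

printfn "%A %A %d %d %A" numbers squares (addFive 10) sumOfEvens firstTen
```

Note that fibs is never evaluated in full; Seq.take pulls only the ten values that are asked for, which is the same delayed execution idea as MoveNext() on IEnumerable<T>.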

So, as you can see, there are quite a few lessons the C# developer can learn from functional programming and F#. The key really is knowing when to apply this knowledge and marry the ideas of OOP and FP in a cohesive manner. Speaking of which, Anders Hejlsberg was recently on Software Engineering Radio Episode 97 to talk about the past, present and future of C#. In there, he talked about some of the more functional programming ideas that have been incorporated into C#, a focus on immutability, and how we can make concurrent programming easier. It's definitely not time to stick a fork in C# just yet, as I think there are plenty of ideas yet to come to express some of these problems a little bit better. In my next post, I'll dive a little deeper into Anders' appearance on SE Radio and some of the interesting things going on around static versus dynamic typing.

Introduction to F#

My second talk of the day was an introduction to functional programming with F#. This was more of my bread and butter presentation on explaining functional programming, as I have with my Adventures in F# series. In this presentation, I focused on many of the 101 level aspects of functional programming and how they are implemented in F#. Of course there was some deviation as I explored some of the features that are more library based and exclusive to F# (async workflows, quotations, etc).

Often, the question comes up about the value proposition of F#. Yes, many can get behind many of the ideals of the language, but would rather have C# adopt most of these features than learn another language. It strikes me as a bit sad that many people are not stretching their wings outside the comfort zone of the MSDN help files and their language of choice. Learning a new language with a new paradigm is essential to growing as a developer. This doesn't mean learning C# coming from VB.NET, but instead, gravitating towards functional programming with a language that fully supports it (F#, OCaml, Haskell, Erlang, Lisp/Scheme, etc), or towards a dynamically typed language (Ruby, Python, etc). Then, once you have fully understood and become more fluent in said paradigms, you can apply those lessons and express your solutions to your problems in more interesting ways.

But, back to F# for a moment here. What is the value of F# and why use it?
- Concurrency Programming Is Hard
  It is hard, and don't let anyone fool you otherwise. With locks, mutexes and so on, it is extremely difficult to write a correct concurrent program, even on a dual processor machine. Instead, with a focus on immutability, side effect free functions and asynchronous workflows, concurrent programming becomes a bit easier. Without the first two, concurrency is quite difficult. Messaging is first class through the use of the mailbox pattern and lessons learned from Erlang.
- Representing Data Can be Hard
  With the ideas of tuples, records and discriminated unions, F# gives you a powerful new way of representing your data succinctly. Then to be able to use such techniques as pattern matching against them makes for an even more compelling case.
- Creating Other Languages Is Hard
  F# has a firm foundation as a language used to create other languages. With first class support for lexer generators and yacc-style parsing, tokenizing and parsing become a bit easier. Also, the inclusion of quotations as a part of the libraries makes really interesting metaprogramming constructs possible, such as Tomas Petricek's journey into AJAX and metaprogramming.
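
To give a feel for the first two points in this list, here is a small sketch. The types (Shape, Labeled, Message) and the counter agent are made up for illustration, and the agent assumes the MailboxProcessor type as it ships in current FSharp.Core.

```fsharp
// A discriminated union: each case carries its own data.
type Shape =
    | Circle of float
    | Rectangle of float * float
    | Square of float

// A record pairs a label with a shape.
type Labeled = { Name: string; Shape: Shape }

// Pattern matching takes each case apart.
let area shape =
    match shape with
    | Circle r -> System.Math.PI * r * r
    | Rectangle (w, h) -> w * h
    | Square s -> s * s

let item = { Name = "door"; Shape = Rectangle (2.0, 3.0) }

// Mailbox style messaging: the agent owns its state and only
// changes it in response to messages, so no locks are needed.
type Message =
    | Add of int
    | Get of AsyncReplyChannel<int>

let counter =
    MailboxProcessor.Start(fun inbox ->
        let rec loop total =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Add n -> return! loop (total + n)
                | Get reply ->
                    reply.Reply total
                    return! loop total
            }
        loop 0)

counter.Post(Add 5)
counter.Post(Add 7)
// Messages are handled in the order they were posted, so both adds run first.
let total = counter.PostAndReply(fun reply -> Get reply)

printfn "%s has area %.1f, counter total is %d" item.Name (area item.Shape) total
```

The running total lives only in the argument of loop, so there is no shared mutable state for two threads to fight over.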
Of course there are more than just this simple list of three areas of focus, but the idea is to download it, kick the tires and see if it feels right to you. That's the important part. Spending a good amount of time to become fluent in it will definitely help and there is a thriving community waiting to help. All you have to do is ask...

Teaching Versus Speaking

D'arcy Lussier had an interesting post which looked at a Scott Bellware tweet about teachers versus speakers. It's a pretty good post, but I enjoy the comments a bit more on the subject. So, when you get up in front of that podium, just think, are you just another speaker, or are you being a teacher? Is it a dialog or death by PowerPoint?

Wrapping It Up

It was another great experience at this code camp, but I think the one hour sessions just aren't enough sometimes to fully get into any particular subject. I sometimes leave a session wanting, not because the presentation wasn't good, but because there wasn't enough time to fully express the intent of it. I could have gone on for hours about functional programming and F#, as I barely scratched the surface. Maybe in the future, there will be a better venue for this, but I hope to get more in depth in future iterations.

Don't forget that I'll be at Philly ALT.NET this Wednesday night for an F# presentation and then Thursday night is the DC ALT.NET meeting in Alexandria on Applying Lessons Learned from Common Lisp with Craig Andera!

DC ALT.NET - 5/22/2008 - Applying Lessons Learned from Lisp

Thursday, May 15, 2008
ALT.NET Functional Programming User Groups
1 Comment

The May meeting of DC ALT.NET has been scheduled for May 22nd from 7-9PM. Check out our mailing list and site for more information as it becomes available. If you're in the Washington DC area, come check us out. This month, we're having Craig Andera, of FlexWiki fame, speak about applying lessons learned from learning Lisp and how to be a better programmer because of it. That's one of the true strengths of DC ALT.NET, or even the ALT.NET movement as a whole: we look outside our .NET community to the outside world to find better ways to solve problems and apply lessons learned from each community, and Lisp is one of those communities. Dave Laribee, Jeremy Miller and Chad Myers spoke about this on the first episode of the ALT.NET Podcast with Mike Moore. If you haven't listened to it yet, I highly recommend that you do.

Applying Lessons Learned from Lisp

There has been a lot of talk and some hype (deserved and undeserved) around functional programming lately, partly due to the search for ways to express parallel applications and multi-core scenarios. Some might find it interesting that functional programming has its roots back in the 1950s, well before Object Oriented Programming, yet has been relegated mostly to the research community.

Back in 1958, John McCarthy from MIT designed Lisp, and it has been a mainstay in the Artificial Intelligence field ever since. Since that time, quite a few Lisp dialects have popped up, due to the fact that many of the universities and labs did not share their information before everyone was connected to ARPANET. Two that have really emerged since then are Common Lisp, an attempt to standardize the Lisp variants into one, and Scheme. Lisp is a strongly typed dynamic language, meaning that checks happen at run time: if a function does not exist when the code is interpreted, an exception will be thrown. By its nature, it is a functional language with such elements as lists, lambdas and so on. One of the interesting additions to Lisp is the Common Lisp Object System (CLOS), which adds OOP functionality to the Common Lisp language. It's a bit different from what we think of as OOP in C++, C#, Java and other OO languages.

In the .NET world, we have IronLisp and IronScheme. IronLisp has been deprecated in favor of IronScheme going forward. The beauty of .NET is that we can build these dynamic languages on top of the DLR with relative ease, which truly speaks to how flexible the type system and CLR are. Making OOP and FP first class citizens within the .NET space is pretty interesting as well.

Back to Lisp: if you want to hear more, you should check out Dick Gabriel's appearance on Software Engineering Radio Episode 84 on Common Lisp. Dick has been a noted authority in the Lisp space for some time and was the organizer for OOPSLA back in 2007. It's one of their better episodes, so I'd encourage you to listen to it. I know I did, but then again, I have a pretty long commute, so I have time to listen to these things.

Who We Are

So, as I said, I run the DC ALT.NET group, which meets monthly to discuss ways of bettering ourselves. You won't find us doing what most other user groups in the area do; it's more of an intimate environment for learning and discussion. Typically we have the first hour for the topic of discussion, this month being Lisp, and then the second hour is Open Spaces, so it encourages everyone to speak and bring a topic they are passionate about. As always, we're looking for sponsors to help us out along the way. Since we're in the Washington DC area, and traffic can be bad, we tend to move from month to month to accommodate. That may change in the future as we grow, but for now, it works nicely. So, if you're in the DC area, come check us out. And, hopefully I'll get Dave Laribee to stop by before too long as well...

Where I'll Be

In addition to the meeting next week, I will be speaking at the Philly ALT.NET group meeting on May 21st on F# and an introduction to Functional programming. This should be a great session and I hope there will be a good crowd for it. Also, this weekend is the NoVA Code Camp in which I have two sessions, "Introduction to F# and Functional Programming" and "Improve Your C# with Functional Programming Ideas". Look forward to seeing everyone at those events!

Concurrency with MPI in .NET

Thursday, May 15, 2008
Concurrency F# Functional Programming
9 Comments

In my previous post, I looked at some of the options we have for concurrent programming in .NET applications. One of the interesting ones, yet specialized, is the Message Passing Interface (MPI). Microsoft took the initiative to get into the high performance computing space with the Windows Server 2003 Compute Cluster Server SKU. This allowed developers to run their given algorithms using MPI on a massively parallel scale. And now with the Windows Server 2008 HPC SKU, it is a bit improved with WCF support for scheduling and such. If you're not part of the beta and are interested, I'd urge you to go through Microsoft Connect.

When Is It Appropriate?

When I'm talking about MPI, I'm talking in the context of High Performance Computing. This consists of having the application run within a scheduler on a compute cluster which can have tens or hundreds of nodes. Note that I'm not talking about grid computing such as Folding@Home, which distributes work over the internet. Instead, you'll find plenty of need for this in the financial sector, in the insurance sector for fraud detection and data analysis, in the manufacturing sector for testing and calculating limits, thresholds and whatnot, and even in compiling computer animation in film. There are plenty of other scenarios out there, but it's not for your everyday business application.

I think the real value of .NET here is being able to read from databases and communicate with other servers through WCF or some other communication protocol, instead of being stuck in the C or Fortran world to which the HPC market has been relegated. Developers can cut down on the code necessary for a lot of these applications by using the built-in functions that we get with the BCL.

MPI in .NET

The problem has been that running these massively parallel algorithms left us limited to Fortran and C systems. This was ok for most things that you would want to do, but cobbling together class libraries wasn't my ideal. Instead, we could use a lot of the things that we take for granted in .NET, such as strong types and object oriented and functional programming constructs.

The Boost libraries for MPI in C++ were made available very recently by Indiana University. You can read more about it here. This allowed the MPI programmer to take advantage of many of the C++ constructs that you cannot use in regular C, such as OOP. Instead of dealing with functions and structs, there is a full object model for dealing with messaging.

At the same time as the Boost C++ Libraries for MPI were coming out, a .NET implementation based upon the C++ design was made available through MPI.NET. It's basically a thin veneer over msmpi.dll, which is the Microsoft implementation of MPI, based on MPICH2. For a list of all operation types supported, check the API documentation here for the raw MSMPI implementation. This will give you a better sense of the capabilities than the .NET implementation can.

The way to think of this is that several nodes will be running an instance of your program at once. So, if you have 16 nodes assigned through your scheduled job, it will spin up 16 instances of the same application. When you do this on a test machine, you'll notice 16 instances of it in your task manager. Kind of cool actually. Unfortunately, MPI.NET is missing some of the neat features in MPI, such as "Ready Sends" and "Buffered Sends", but it does include nice things such as the Graph and Cartesian communicators, which are essential in MPI.

You'll need the Windows Server 2003/2008 HPC SDK in order to run these examples, so download them now, and then install MPI.NET to follow along.

Messaging Patterns

With this, we have a few messaging patterns available to us. MPI.NET has given us a few that we will be looking at, along with how best to use them. I'll include samples in F# as it's pretty easy to do and I'm trying to get across the fact that F# is a better language than C# for expressing the messaging we're doing. But these simple examples are not hard to switch back and forth.

To execute these, just type the following:

mpiexec -n <Number of Nodes You Want> <Your program exe>

Broadcast

A broadcast is a process in which a single process (a la a head node) sends the same data to all nodes in the cluster. We want to be as efficient as possible when sending out this data for all to use, without having to loop through all sends and receives. This is good when a particular root node has a value that the rest of the cluster needs before continuing. Below is a quick example in which the head node sets the value to 42 and the rest will receive it.

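#light

#r @"D:\Program Files\MPI.NET\Lib\MPI.dll"

open System
open MPI

let main (args:string[]) =
  using(new Environment(ref args))(fun _ ->
    let commRank = Communicator.world.Rank

    // Only the head node (rank 0) sets the value to share.
    let intValue = ref 0
    if commRank = 0 then
      intValue := 42

    // After the broadcast, every node holds the root's value.
    Communicator.world.Broadcast(intValue, 0)
    Console.WriteLine("Broadcasted {0} to all nodes", !intValue)
  )

// With both System and MPI opened, qualify System.Environment explicitly.
main(System.Environment.GetCommandLineArgs())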

Blocking Send and Receive

In this scenario, we're going to use the blocking send and receive pattern. This will not allow the program to continue until I get the message I'm looking for. This is good for times when you need a particular value before proceeding to your next function from the head node or any other particular node.

#light

#r @"D:\Program Files\MPI.NET\Lib\MPI.dll"

open System
open MPI

let main (args:string[]) =
  using(new Environment(ref args))(fun _ ->
    let commRank = Communicator.world.Rank
    let commSize = Communicator.world.Size
    let intValue = ref 0
    match commRank with
    | 0 ->
      // The head node blocks until it has one message from every other node.
      [1 .. (commSize - 1)] |> List.iter (fun _ ->
        Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
        Console.WriteLine("Result: {0}", !intValue))
    | _ ->
      // Every other node computes a value and sends it to rank 0 with tag 0.
      intValue := 4 * commRank
      Communicator.world.Send(!intValue, 0, 0)
  )

main(System.Environment.GetCommandLineArgs())

What I'm doing here is letting the head node, rank 0, do all the receiving work. Note that I don't particularly care where the source was, nor what the tag was. I can, however, specify a source and a tag if I wish to receive only from a certain node and with a certain data tag. If it's a worker process, then it calculates the value and sends it back to the head node, rank 0. The head node waits until it has received a value from any node and then prints it out. The Send and Receive methods that I'm using are generic methods. Behind the scenes, in order to send, the system will serialize your object into an unmanaged memory stream and throw it on the wire. This is one of the fun issues when dealing with marshaling to unmanaged C code.

Nonblocking Send and Receive

In this scenario, we are not going to block on sending as we did before. We want the ability to continue doing other things after starting the send, even though the receivers might need that value before continuing. Eventually we can force completion of the send through the request it returned, and then at a certain point, we can set up a barrier so that nobody can continue until everyone has hit that point in the program. The sample below quickly sends a multiplied value and lets the sender continue. The head node still has to wait until each message arrives, and then everyone waits at the barrier until the job is done.

let main (args:string[]) =
  using(new Environment(ref args))(fun _ ->
    let commRank = Communicator.world.Rank
    let commSize = Communicator.world.Size

    let intValue = ref 0
    if commRank = 0 then
      [1 .. (commSize - 1)] |> List.iter (fun _ ->
        Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
        Console.WriteLine("Result: {0}", !intValue))
    else
      intValue := 4 * commRank
      // ImmediateSend returns right away; Wait blocks until the send completes.
      let request = Communicator.world.ImmediateSend(!intValue, 0, 0)
      request.Wait() |> ignore

    // No node continues past this point until all nodes have reached it.
    Communicator.world.Barrier()
  )

main(System.Environment.GetCommandLineArgs())

Gather and Scatter

The gather process takes a value from each process and then sends them all to the root process as an array for evaluation. This is a pretty simple operation for taking all values from all nodes and combining them on the head node. What I'm doing is a simple calculation of gathering all values of commRank * 3 and sending them to the head node for evaluation.

let main (args:string[]) =
  using(new Environment(ref args))(fun _ ->
    let commRank = Communicator.world.Rank
    let intValue = commRank * 3

    match commRank with
    | 0 ->
      // The root (rank 0) receives one value from every node.
      let ranks = Communicator.world.Gather(intValue, 0)
      ranks |> Array.iter (fun i -> System.Console.WriteLine(" {0}", i))
    | _ -> Communicator.world.Gather(intValue, 0) |> ignore
  )

main(System.Environment.GetCommandLineArgs())

Conversely, scatter does the opposite: it takes a row from the given head process and splits it apart to be spread out among all processes. In this exercise I will create a mutable array that only the head node will modify. From there, I will scatter it across the rest of the nodes to pick up and do with whatever they please.

let main (args:string[]) =
  using(new Environment(ref args))(fun _ ->
    let commSize = Communicator.world.Size
    let commRank = Communicator.world.Rank
    let mutable table = Array.create commSize 0

    match commRank with
    | 0 ->
      // Only the head node fills in the table that gets split up.
      table <- Array.init commSize (fun i -> i * 3)
      Communicator.world.Scatter(table, 0) |> ignore
    | _ ->
      let scatterValue = Communicator.world.Scatter(table, 0)
      Console.WriteLine("Scattered {0}", scatterValue)
  )

main(System.Environment.GetCommandLineArgs())

There is an AllGather method as well which performs a similar operation to Gather, but the results are available to all processes instead of the root process.
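
If you don't have a cluster handy, the difference between these collectives can be modeled in plain F#. The sketch below does not use MPI.NET at all; array index i simply stands in for the process with rank i, and all the names are my own.

```fsharp
// Pure F# model of the collectives, no MPI involved.
// Index i of each array represents what the process with rank i holds.
let commSize = 4
let root = 0

// Each rank contributes rank * 3, as in the gather sample.
let contributions = Array.init commSize (fun rank -> rank * 3)

// Gather: only the root ends up with the full array.
let gathered =
    Array.init commSize (fun rank ->
        if rank = root then Some contributions else None)

// AllGather: every rank ends up with the full array.
let allGathered = Array.init commSize (fun _ -> contributions)

// Scatter: the root's table is split so that rank i receives element i.
let table = Array.init commSize (fun i -> i * 3)
let scattered = Array.init commSize (fun rank -> table.[rank])

printfn "gather:    %A" gathered
printfn "allgather: %A" allGathered
printfn "scatter:   %A" scattered
```

In the real thing each process only ever sees its own slot of these arrays, which is why the non-root branches in the samples ignore the result of Gather.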

Reduce

Another collective algorithm similar to scatter and gather is the reduce function. This allows us to combine all values from each process and perform an operation on them, whether it be to add, multiply, find the maximum, minimum and so on. The value is only available at the root process though, so I have to ignore the result for the rest of the processes. The following example shows a simple sum of the ranks of all processes.

let main (args:string[]) =
  using(new Environment(ref args))(fun _ ->
    let commRank = Communicator.world.Rank

    match commRank with
    | 0 ->
      // Only the root (rank 0) receives the combined value.
      let sum = Communicator.world.Reduce(commRank, Operation<int>.Add, 0)
      Console.WriteLine("Sum of all ranks is {0}", sum)
    | _ ->
      Communicator.world.Reduce(commRank, Operation<int>.Add, 0) |> ignore
  )

main(System.Environment.GetCommandLineArgs())

There is another variation called the AllReduce which does very similar operations to the Reduce function, but instead makes the value available to all processes instead of just the root one. There are more operations and more communicators such as Graph and Cartesian, but this is enough to give you an idea of what you can do here.
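
As with gather, the Reduce versus AllReduce distinction can be modeled without MPI. This is a plain F# sketch with made-up names, where each list position stands for one rank.

```fsharp
// Pure F# model of Reduce and AllReduce, no MPI involved.
let commSize = 4
let root = 0
let ranks = [0 .. commSize - 1]

// Each rank contributes its own rank; Add combines them into one value.
let sum = ranks |> List.reduce (+)

// Reduce: only the root sees the combined value.
let reduced =
    ranks |> List.map (fun rank -> if rank = root then Some sum else None)

// AllReduce: every rank sees the combined value.
let allReduced = ranks |> List.map (fun _ -> sum)

// Any associative operation works the same way, for example maximum.
let maximum = ranks |> List.reduce max

printfn "sum %d, reduce %A, allreduce %A, max %d" sum reduced allReduced maximum
```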

LINQ for MPI.NET

During my search for MPI.NET solutions, I came across a rather interesting one called LINQ for MPI.NET. I don't know too many of the details, since the author has been pretty aloof about providing the complete design details. But it has entered a private beta if you do wish to contact them for more information.

The basic idea is to provide some scope models, which include the current scope, the world scope, root and so on. Also, it looks like they are providing some sort of multi-threading capabilities as well. It looks interesting and I'm interested in finding out more.

Pure MPI.NET?

Another implementation of MPI in .NET has surfaced through PureMPI.NET. This is an implementation of the MPICH2 specification as well, but built on WCF instead of msmpi.dll. It does not rely on the Microsoft Compute Cluster service for scheduling and instead uses remoting and such for communication purposes. There is a CodeProject article which explains it a bit more here.

More Resources

So, you want to know more, huh? Well, most of the interesting information is out there in C, so if you can read and translate it to the other APIs, you should be fine. However, there are some good books on the subject which not only provide some decent samples, but also some guidance on how to make the most of the MPI implementation. Below are some of the basic ones which will help on learning not only the APIs, but the patterns behind their usage.

Wrapping It Up

I hope you found some of this useful for learning about how MPI can help with massively parallel applications. The patterns learned here, as well as the technologies behind them, are pretty powerful in helping you think about how to make your programs a bit less linear in nature. There is more to this series on thinking about concurrency in .NET, so I hope you stay tuned.

Thinking in Concurrently in .NET

Tuesday, May 13, 2008
.NET C# F# Frameworks
5 Comments

In recent posts, you've found that I've been harping on immutability and side effect free functions. There is a general theme emerging from this and some real reasons why I'm pointing it out. One of the things that I'm interested in is concurrent programming on the .NET platform for messaging applications. As we see more and more cores and processors available to us, we need to be cognizant of this fact as we're designing and writing our applications. Most programs we write today are pretty linear in nature, except for, say, forms applications which use background worker threads to avoid freezing the user interface. But for the most part, we're not taking full advantage of the CPU and its cycles. We need not only a better way to handle concurrency, but a better way to describe it as well. This is where Pi-calculus comes into the picture... But before we get down that beaten path, let's look at a few options that I chose. Note that these aren't all of them, just a select few I chose to analyze.

Erlang in .NET?

For many people, Erlang is considered to be one of the more interesting languages to come out of the concurrent programming field. This language has received little attention until now when we've hit that slowdown of scaling our processor speed and instead coming into multi-core/multi-processor environments. What's interesting about Erlang is that it's a functional language, much like F#, Haskell, OCaml, etc. But what makes it intriguing as well is that it's not a static typed language like the others, and instead dynamic. Erlang was designed to support distributed, fault-tolerant, non-stop real-time applications. Written by Ericsson in the 1980s, it has been the mainstay of telephone switches ever since. If you're interested in listening to more about it, check out Joe Armstrong's appearance on Software Engineering Radio Episode 89 "Joe Armstrong on Erlang". If you want to dig deeper into Erlang, check out the book "Programming Erlang: Software for a Concurrent World" also by Joe Armstrong, and available on Amazon.

How does that lead us to .NET? Well, it's interesting that someone thought of trying to port the language to .NET in a project called Erlang.NET. This project didn't get too far as far as I can tell, and for obvious impedance mismatch reasons. First off, there is a bit of a disconnect between .NET processes and Erlang processes and how he wants to tackle them. Erlang processes are cheap to create and tear down, whereas .NET ones tend to be a bit heavy. Also, garbage collection runs a bit differently: instead of Erlang's per process approach, the CLR takes a generational approach. Another thing is that Erlang is a dynamic language running on its own VM, so it would probably sit on top of the DLR in the .NET space. I'm not saying it's an impossible task, but it's improbable the way he stated it.

Instead, maybe the approach to take with an Erlang-like implementation is to create separate AppDomains, since they are relatively cheap to create. This would allow process isolation and messaging constructs to fit rather nicely. We get rid of the impedance mismatch by mapping an Erlang process to an AppDomain. Then you can tear down the AppDomain after you are finished, or restart it in case of a recovery scenario. These are some of the ideas if you truly want to dig any further into the subject. I'll probably cover this in another post later.

So, where does that leave us with Erlang itself? Well, we have the option of integrating Erlang and .NET together through OTP.NET. The idea originally came from an article on TheServerSide called "Integrating Java and Erlang". This allows for the use of Erlang to do the computation on the server in a manner that best fits the Erlang style. I find it's a pretty interesting article and maybe when I have a spare second, I'll check it out a bit more. But, in terms of a full port to .NET? Well, I think .NET languages have some lessons to learn from Erlang, as it tackled concurrent programming first instead of bolting it on after the fact, as most imperative languages do.

MPI.NET

The Message Passing Interface (MPI) approach has been an interesting way of solving mass concurrency for applications. This involves using a standard protocol for passing messages from node to node through the system by the way of a compute cluster. In the Windows world, we have Windows Compute Cluster Server (WCCS) that handles this need. CCS is available now in two separate SKUs, CCS 2003 and CCS 2008 for Server 2008. The Server 2008 CCS is available in CTP on the Microsoft Connect website. See here for more information. You mainly find High Performance Computing with MPI in the automotive, financial, scientific and academic communities where they have racks upon racks of machines.

Behind the scenes, Microsoft based its implementation on MPICH2, an implementation of the MPI specification. This was then made available to C programmers and is fairly low level. Unfortunately, that leaves most C++ and .NET programmers out in the cold when it comes to taking advantage. Sure, C++ programmers could use the standard C libraries, but the Boost libraries were created to support MPI in a way that C++ could really take advantage of.

After this approach was taken, a similar approach was taken for the .NET platform with MPI.NET. Indiana University produced a .NET version which looks very similar to the Boost MPI approach but with .NET classes. This allows us to program in any .NET language against Windows CCS and take advantage of the massive scalability and scheduling services offered in the SKU. At the end of the day, it's just a thin wrapper over P/Invoking msmpi.dll with generics thrown in as well. Still, it's a nice implementation.

And since it was written for .NET, I can, for example, do a simple hello world application in F# to take advantage of MPI. The value is that most algorithms and heavy lifting you would be doing through there would probably be functional anyways. So, I can use F# to specify more succinctly what types of actions and what data I need. Here is a simple example:

#light

#R "D:\Program Files\MPI.NET\Lib\MPI.dll"

open MPI

let main (args:string[]) =
    using(new Environment(ref args))( fun e ->
        let commRank = Communicator.world.Rank
        let commSize = Communicator.world.Size
        match commRank with
        | 0 ->
            let intValue = ref 0
            [1 .. (commSize - 1)] |> List.iter (fun i ->
                Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
                System.Console.WriteLine("Hello from node {0} out of {1}", !intValue, commSize))
        | _ -> Communicator.world.Send(commRank, 0, 0)
    )

main(System.Environment.GetCommandLineArgs())

I'll go into more detail in the future as to what this means and why, but this is just to whet your appetite. What you can do with this is pretty powerful.

F# to the Rescue with Workflows?

Another topic for discussion is asynchronous workflows. This is another area in which F# excels as a language. Async<'a> values are really a way of writing continuation passing explicitly. I'll be covering this more in a subsequent post shortly, but in the meantime, there is good information from Don Syme here and Robert Pickering here.

Below is a quick example of an asynchronous workflow which fetches the HTML from each of the given web sites. I can then run each in parallel and get the results rather easily. What I'll do below is a quick retrieval of HTML by calling the Async methods. Note that these Async methods don't exist on the .NET types themselves; F# supplies them for you as extensions, which is why Microsoft.FSharp.Control.CommonExtensions is opened.

#light

open System.IO
open System.Net
open Microsoft.FSharp.Control.CommonExtensions

let fetchAsync (url:string) =
    async {
        let request = WebRequest.Create(url)
        let! response = request.GetResponseAsync()
        let stream = response.GetResponseStream()
        let reader = new StreamReader(stream)
        let! html = reader.ReadToEndAsync()
        return html
    }

let urls = ["http://codebetter.com/"; "http://microsoft.com"]
let htmls = Async.Run(Async.Parallel [for url in urls -> fetchAsync url])
print_any htmls

So, as you can see, it's a pretty powerful mechanism for retrieving data asynchronously and then I can run each of these in parallel with parameterized data.

Parallel Extensions for .NET

Another approach I've been looking at is the Parallel Extensions for .NET. The currently available version is the December CTP, which is available here. You can read more about it in two MSDN Magazine articles:
- Optimize Managed Code For Multi-Core Machines
- Running Queries On Multi-Core Processors
What I find interesting is Parallel LINQ, or PLINQ for short. The Task Parallel Library doesn't interest me as much. LINQ in general is interesting to a functional programmer in that queries are lazily evaluated. The actual execution of your LINQ query is deferred until you start enumerating it, that is, until the iterator behind GetEnumerator() is asked for its first element. That's definitely taking some lessons from the functional world. Add on top of that the ability to parallelize your heavy algorithms and you have a pretty powerful concept. I hope this definitely moves forward.
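To make the deferred execution point concrete, here is a minimal sketch using plain LINQ to Objects rather than PLINQ. The class name and the evaluated list are just illustrative helpers of mine, not part of any library. It records when the lambda actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static void Main()
    {
        // Illustrative helper: remembers which inputs the lambda has seen.
        var evaluated = new List<int>();

        // Building the query runs nothing; Select only records what to do.
        IEnumerable<int> squares = Enumerable.Range(1, 3).Select(n =>
        {
            evaluated.Add(n);
            return n * n;
        });

        Console.WriteLine(evaluated.Count); // 0: the lambda has not run yet

        // Enumerating the query (here via ToList) is what finally runs it.
        List<int> results = squares.ToList();

        Console.WriteLine(evaluated.Count);            // 3
        Console.WriteLine(string.Join(", ", results)); // 1, 4, 9
    }
}
```

Because nothing runs until enumeration, the query is just a description of work, which is what leaves room for a library to decide how, and on how many cores, to carry it out.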

Conclusion

As you can see, I've given a brief introduction to each of these areas, which I hope to dive into a bit more in the coming weeks and months. I've only scratched the surface of each. Each tackles the concurrency problem in a slightly different way and each has its own use. But I hope I whetted your appetite to look at some of these solutions today.

Your API Fails, Who is at Fault?

Friday, May 9, 2008
C# DBC Spec# TDDBDD
No Comments

I decided to stay on the Design by Contract side for just a little bit. Recently, Raymond Chen posted "If you pass invalid parameters, then all bets are off" in which he goes into parameter validation and basic defensive programming. Many of the conversations on that blog take me back to my C++ and early Java days of checking for null pointers, buffer lengths, etc. This brings me back to some recent conversations I've had about how to make explicit what I expect. Typical defensive behavior looks something like this:

public static void Foreach<T>(this IEnumerable<T> items, Action<T> action)
{
    if (action == null)
        throw new ArgumentNullException("action");

    foreach (var item in items)
        action(item);
}

After all, how many times have you had no idea what the preconditions are for a given method due to lack of documentation or non-intuitive method naming? It gets worse when they don't provide much documentation, XML comments or otherwise. At that point, it's time to break out .NET Reflector and dig deep. Believe me, I've done it quite a bit lately.

The Erlang Way

The Erlang crowd takes an interesting approach to the issue that I've really been intrigued by. Joe Armstrong calls this approach "Let it crash" in which you only code to the sunny day scenario, and if the call to it does not conform to the spec, just let it crash. You can read more about that on the Erlang mailing list here.

Some paragraphs stuck out in my mind.

Check inputs where they are "untrusted"
    - at a human interface
    - a foreign language program

What this basically states is that the only place you should do such checks is at the boundaries of the system, where input may be untrusted and you could see overflows, unexpected nulls and the like. He goes on to say about letting it crash:

specifications always say what to do if everything works - but never what to do if the input conditions are not met - the usual answer is something sensible - but what you're the programmer - In C etc. you have to write *something* if you detect an error - in Erlang it's easy - don't even bother to write code that checks for errors - "just let it crash".

So, what Joe advocates is not checking at all, and if they don't conform to the spec, just let it crash, no need for null checks, etc. But, how would you recover from such a thing? Joe goes on to say:

Then write a *independent* process that observes the crashes (a linked process) - the independent process should try to correct the error, if it can't correct the error it should crash (same principle) - each monitor should try a simpler error recovery strategy - until finally the error is fixed (this is the principle behind the error recovery tree behaviour).

It's an interesting approach, and it proves to be a valuable one for parallel processing systems. As I dig further into more functional programming languages, I'm finding such constructs useful.

Design by Contract Again and DDD

Defensive programming is a key part of Design by Contract. But, in a way it differs. With defensive programming, the callee is responsible for determining whether the parameters are valid and, if not, throws an exception or otherwise handles it. DbC, with the help of the language, makes those expectations visible to the caller, so the caller better understands what it must provide and how to cope with the exception if it can.

Bertrand Meyer wrote a bit about this in the Eiffel documentation here. But, let's go back to basics. DbC asserts that the contracts (what we expect, what we guarantee, what we maintain) are such a crucial piece of the software, that it's part of the design process. What that means is that we should write these contract assertions FIRST.

What do these contract assertions contain? They normally contain the following:
- Acceptable/Unacceptable input values and the related meaning
- Return values and their meaning
- Exception conditions and why
- Preconditions (may be weakened by subclasses)
- Postconditions (may be strengthened by subclasses)
- Invariants (may be strengthened by subclasses)
So, in effect, I'm still doing TDD/BDD, but an important part of this is identifying my preconditions, postconditions and invariants. These ideas mesh pretty well with my understanding of BDD and we should be testing those behaviors in our specs. After my previous posts, some people said they were afraid I was de-emphasizing TDD/BDD, and that couldn't be further from the truth. I'm just using another tool in the toolkit to express my intent for my classes, methods, etc. I'll explain further in a bit down below.

Also, my heavy use of Domain Driven Design patterns helps as well. I mentioned those previously when I talked about Side Effects being Code Smells. I combine intention revealing interfaces, which express to the caller what I am intending to do, with assertions not only in the code but in the documentation as well. This usually includes using the <exception> XML tag in my code comments. Something like this is usually pretty effective:

/// <exception cref="T:System.ArgumentNullException"><paramref name="action"/> is null.</exception>

If you haven't read Eric's book, I suggest you take my advice and Peter's advice and do so.

Making It Explicit

Once again, the use of Spec# to enforce these as part of the method signature makes sense to me. It puts the burden back on the client to conform to the contract, or else they cannot continue. And to have static checking to enforce that is pretty powerful as well.

But, what are we testing here? Remember that DbC and Spec# can ensure your preconditions, your postconditions and your invariants hold, but they cannot determine whether your code is correct and conforms to the specs. That's why I think that BDD plays a pretty good role with my use of Spec#.

DbC and Spec# can also play a role in enforcing things that are harder with BDD, such as enforcing invariants. BDD does great things by emphasizing behaviors which I'm really on board with. But, what I mean by being harder is that your invariants may be only private member variables which you are not going to expose to the outside world. If you are not going to expose them, it makes it harder for your specs to control such behavior. DbC and Spec# can fill that role. Let's look at the example of an ArrayList written in Spec#.

public class ArrayList
{
    invariant 0 <= _size && _size <= _items.Length;
    invariant forall { int i in (_size : _items.Length); _items[i] == null }; // all unused slots are null

    [NotDelayed]
    public ArrayList (int capacity)
      requires 0 <= capacity otherwise ArgumentOutOfRangeException;
      ensures _size/*Count*/ == 0;
      ensures _items.Length/*Capacity*/ == capacity;
    {
      _items = new object[capacity];
      base();
    }

    public virtual void Clear ()
      ensures Count == 0;
    {
      expose (this) {
        Array.Clear(_items, 0, _size); // Don't need to doc this but we clear the elements so that the gc can reclaim the references.
        assume forall{int i in (0: _size); _items[i] == null}; // postcondition of Array.Clear
        _size = 0;
      }
    }

// Rest of code omitted

What I've been able to do is set the inner array to the new capacity, but also ensure that when I do that, my count doesn't go up, only my capacity. When I call the Clear method, I need to make sure the object ends up consistent with its invariants: the size is reset and every slot outside the used part of the array is null. We use the expose block to tell the verifier that we are working on the object's internal state, where the invariants may be temporarily broken. By the end of the expose block, the invariants must hold again, else we have issues. How would we test some of these scenarios in BDD? Since they are not exposed to the outside world, it's pretty difficult. I would be left with black box artifacts that are harder to prove. If I were to expose them instead, it would break encapsulation, which is not necessarily something I want to do. Spec# gives me the opportunity to enforce this through the DbC constructs afforded in the language.

The Dangers of Checked Exceptions

But with this comes a cost, of course. I recently spoke with a colleague about Spec# and instant thoughts of checked exceptions in Java came to mind. Earlier in my career, I was a Java guy who had to deal with those who put large try/catch blocks around methods with checked exceptions and were guilty of just catching and swallowing, or catching and rethrowing RuntimeExceptions. Worse yet, I saw this as a way of breaking encapsulation by throwing exceptions that I didn't think the outside world needed to know about. I was kind of glad that this feature wasn't brought to C#, because I saw rampant abuse for little benefit. What people forgot during the early days of Java is that exceptions are meant to be exceptional and not control flow.

How I see Spec# being different is that we have a static verification tool, Boogie, to verify whether those exceptional conditions can actually occur. The green squigglies give warnings about possible null values, arguments out of range, etc. This gives me further insight into what I can control and what I cannot. ReSharper also has some of those nice features, but I've found Boogie to be a bit more helpful with more advanced static verification.

Conclusion

Explicit DbC constructs give us a pretty powerful tool in terms of expressing our domain and our behaviors of our components. Unfortunately, in C# there are no real valuable implementations that enforce DbC constructs to both the caller and the callee. And hence Spec# is an important project to come out of Microsoft Research.

Scott Hanselman just posted his interview with the Spec# team on his blog, so if you haven't heard it yet, go ahead and download it now. It's a great show and it's important that if you find Spec# to be useful, that you press Microsoft to give it to us as a full feature.

Command-Query Separation and Immutable Builders

Tuesday, May 6, 2008
C# DBC F# Spec# TDDBDD
6 Comments

After one of my previous posts about Command-Query Separation (CQS) and side effecting functions being code smells, immutable builders were pointed out to me again. For the most part, this has been one area of CQS that I've been willing to let break. I've been following Martin Fowler's advice on method chaining and it has worked quite well. But, revisiting an item like this never hurts. Immutability is something you'll see me harping on time and time again, now and in the future. The standard rule I usually follow is: immutable and side effect free when you can, mutable state where you must. I like the opt-in mutability of functional languages such as F#, which I'll cover at some point in the near future, rather than the opt-out mutability of imperative/OO languages such as C#.

Typical Builders

The idea of the standard builder is pretty prevalent in most applications we see today with fluent interfaces. Take for example most Inversion of Control (IoC) containers when registering types and so on:

UnityContainer container = new UnityContainer();
container
    .RegisterType<ILogger, DebugLogger>("logger.Debug")
    .RegisterType<ICustomerRepository, CustomerRepository>();

Let's take a naive medical claims processing system and build up an aggregate root of a claim. This claim contains such things as the claim information, the lines, the provider, recipient and so on. This is a brief sample and not meant to be the real thing, but just a quick example. After all, I'm missing things such as eligibility and so on.

    public class Claim
    {
        public string ClaimId { get; set; }
        public DateTime ClaimDate { get; set; }
        public List<ClaimLine> ClaimLines { get; set; }
        public Recipient ClaimRecipient { get; set; }
        public Provider ClaimProvider { get; set; }
    }

    public class ClaimLine
    {
        public int ClaimLineId { get; set; }
        public string ClaimCode { get; set; }
        public double Quantity { get; set; }
    }

    public class Recipient
    {
        public string RecipientId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

    public class Provider
    {
        public string ProviderId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

Now our standard builders use method chaining as shown below. As you note, we'll return the instance each and every time.

public class ClaimBuilder
{
    private string claimId;
    private DateTime claimDate;
    private readonly List<ClaimLine> claimLines = new List<ClaimLine>();
    private Provider claimProvider;
    private Recipient claimRecipient;

    public ClaimBuilder() {}

    public ClaimBuilder WithClaimId(string claimId)
    {
        this.claimId = claimId;
        return this;
    }

    public ClaimBuilder WithClaimDate(DateTime claimDate)
    {
        this.claimDate = claimDate;
        return this;
    }

    public ClaimBuilder WithClaimLine(ClaimLine claimLine)
    {
        claimLines.Add(claimLine);
        return this;
    }

    public ClaimBuilder WithProvider(Provider claimProvider)
    {
        this.claimProvider = claimProvider;
        return this;
    }

    public ClaimBuilder WithRecipient(Recipient claimRecipient)
    {
        this.claimRecipient = claimRecipient;
        return this;
    }

    public Claim Build()
    {
        return new Claim
       {
           ClaimId = claimId,
           ClaimDate = claimDate,
           ClaimLines = claimLines,
           ClaimProvider = claimProvider,
           ClaimRecipient = claimRecipient
       };
    }

    public static implicit operator Claim(ClaimBuilder builder)
    {
        return new Claim
        {
            ClaimId = builder.claimId,
            ClaimDate = builder.claimDate,
            ClaimLines = builder.claimLines,
            ClaimProvider = builder.claimProvider,
            ClaimRecipient = builder.claimRecipient
        };
    }
}

What we have above is a violation of the CQS because we're mutating the current instance as well as returning a value. Remember, that CQS states:
- Commands - Methods that perform an action or change the state of the system should not return a value.
- Queries - Return a result and do not change the state of the system (aka side effect free)
But, we're violating that because we're returning a value as well as mutating the state. For the most part, that hasn't been a problem. But what about sharing said builders? The last thing we'd want to do is have our shared builders mutated by others when we're trying to build up our aggregate roots.
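To see why sharing a mutable builder bites, here is a minimal sketch. It uses a trimmed-down builder with a single field; the names MutableClaimBuilder and SharedBuilderDemo and the claim ids are made up for illustration and are not the full classes above.

```csharp
using System;

public class Claim
{
    public string ClaimId { get; set; }
}

// Trimmed-down mutable builder: every With method changes this instance.
public class MutableClaimBuilder
{
    private string claimId;

    public MutableClaimBuilder WithClaimId(string claimId)
    {
        this.claimId = claimId; // side effect on the shared instance
        return this;            // and a return value: the CQS violation
    }

    public Claim Build()
    {
        return new Claim { ClaimId = claimId };
    }
}

public static class SharedBuilderDemo
{
    public static void Main()
    {
        MutableClaimBuilder shared = new MutableClaimBuilder().WithClaimId("A-001");

        // Someone else "derives" their own builder from the shared one...
        MutableClaimBuilder other = shared.WithClaimId("B-002");

        // ...but it is the very same object, so the shared state changed too.
        Console.WriteLine(ReferenceEquals(shared, other)); // True
        Console.WriteLine(shared.Build().ClaimId);         // B-002
    }
}
```

Whoever held the shared builder expecting claim A-001 now silently builds B-002.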

Immutable Builders or ObjectMother or Cloning?

When we're looking to reuse our builders, the last thing we'd want to do is allow mutation of the state. So, if I'm working on the same provider and somehow change his eligibility, then that would be reflected against all using the same built up instance. That would be bad. We have a couple options here really. One would be to follow an ObjectMother approach to build up shared ones and request a new one each time, or the other would be to enforce that we're not returning this each and every time we add something to our builder. Or perhaps we can take one at a given state and just clone it. Let's look at each.

public static class RecipientObjectMother
{
    public static RecipientBuilder RecipientWithLimitedEligibility()
    {
        // More built in stuff for setting up eligibility would be chained on here
        RecipientBuilder builder = new RecipientBuilder()
            .WithRecipientId("xx-xxxx-xxx")
            .WithFirstName("Robert")
            .WithLastName("Smith");

        return builder;
    }
}

This allows me to share my state through pre-built builders and then when I've finalized them, I'll just call the Build method or assign them to the appropriate type. Or, I could just make them immutable instead and not have to worry about such things. Let's modify the above example to take a look at that.

public class ClaimBuilder
{
    private string claimId;
    private DateTime claimDate;
    private readonly List<ClaimLine> claimLines = new List<ClaimLine>();
    private Provider claimProvider;
    private Recipient claimRecipient;

    public ClaimBuilder() {}

    public ClaimBuilder(ClaimBuilder builder)
    {
        claimId = builder.claimId;
        claimDate = builder.claimDate;
        claimLines.AddRange(builder.claimLines);
        claimProvider = builder.claimProvider;
        claimRecipient = builder.claimRecipient;
    }

    public ClaimBuilder WithClaimId(string claimId)
    {
        ClaimBuilder builder = new ClaimBuilder(this) {claimId = claimId};
        return builder;
    }

    public ClaimBuilder WithClaimDate(DateTime claimDate)
    {
        ClaimBuilder builder = new ClaimBuilder(this) { claimDate = claimDate };
        return builder;
    }

    public ClaimBuilder WithClaimLine(ClaimLine claimLine)
    {
        ClaimBuilder builder = new ClaimBuilder(this);
        builder.claimLines.Add(claimLine);
        return builder;
    }

    public ClaimBuilder WithProvider(Provider claimProvider)
    {
        ClaimBuilder builder = new ClaimBuilder(this) { claimProvider = claimProvider };
        return builder;
    }

    public ClaimBuilder WithRecipient(Recipient claimRecipient)
    {
        ClaimBuilder builder = new ClaimBuilder(this) { claimRecipient = claimRecipient };
        return builder;
    }

    // More code here for building
}

So, what we've had to do is provide a copy constructor to initialize the object in the right state. And here I thought I had left those behind with my C++ days. Each With method now creates a new ClaimBuilder, passing in the current instance to copy over the old state, and applies the change to the copy instead of to this. This then makes my class suitable for sharing. Side effect free programming is the way to do it if you can. Of course, this creates a few extra short-lived objects on the heap as you're initializing your aggregate root, but for testing purposes, I haven't really much cared.
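As a quick check that the copying approach really does make a builder safe to share, here is a small sketch. It is a cut-down version of the builder with only two fields and a Build method; the demo class and the claim ids are illustrative only.

```csharp
using System;

public class Claim
{
    public string ClaimId { get; set; }
    public DateTime ClaimDate { get; set; }
}

// Cut-down copy-on-write builder: With methods never touch this.
public class ClaimBuilder
{
    private string claimId;
    private DateTime claimDate;

    public ClaimBuilder() {}

    // Copy constructor: start from the state of an existing builder.
    public ClaimBuilder(ClaimBuilder builder)
    {
        claimId = builder.claimId;
        claimDate = builder.claimDate;
    }

    public ClaimBuilder WithClaimId(string claimId)
    {
        return new ClaimBuilder(this) { claimId = claimId };
    }

    public ClaimBuilder WithClaimDate(DateTime claimDate)
    {
        return new ClaimBuilder(this) { claimDate = claimDate };
    }

    public Claim Build()
    {
        return new Claim { ClaimId = claimId, ClaimDate = claimDate };
    }
}

public static class ImmutableBuilderDemo
{
    public static void Main()
    {
        // A shared starting point that two callers branch from.
        ClaimBuilder template = new ClaimBuilder().WithClaimDate(new DateTime(2008, 5, 6));
        ClaimBuilder first = template.WithClaimId("A-001");
        ClaimBuilder second = template.WithClaimId("B-002");

        Console.WriteLine(first.Build().ClaimId);            // A-001
        Console.WriteLine(second.Build().ClaimId);           // B-002
        Console.WriteLine(template.Build().ClaimId == null); // True: template untouched
    }
}
```

Both branches carry over the claim date from the template, and neither can disturb the other or the template itself.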

Of course I could throw Spec# into the picture once again as enforcing immutability on said builders. To be able to mark methods as being Pure makes it apparent to both the caller and the callee what the intent of the method is. Another would be using NDepend as Patrick Smacchia talked about here.

The other way is just to provide a clone method which would just copy the current object so that you can go ahead and feel free to modify a new copy. This is a pretty easy approach as well.

public ClaimBuilder(ClaimBuilder builder)
{
    claimId = builder.claimId;
    claimDate = builder.claimDate;
    claimLines.AddRange(builder.claimLines);
    claimProvider = builder.claimProvider;
    claimRecipient = builder.claimRecipient;
}

public ClaimBuilder Clone()
{
    return new ClaimBuilder(this);
}

Conclusion

Obeying the CQS is always an admirable thing to do, especially when managing side effects. It isn't always required, such as with builders, but if you plan on sharing these builders, it might be a good idea to think hard about the side effects you are creating. As we move more towards multi-threaded, multi-machine processing, we need to be a bit more aware of our side effects. But, at the end of the day, I'm not entirely convinced that this violates the true intent of CQS since we're not really querying, so I'm not sure how much this is buying me. What are your thoughts?

Adventures in F# - F# 101 Part 9 (Control Flow)

Friday, May 2, 2008
F#
No Comments

Taking a break from the Design by Contract stuff for just a bit while I step back into the F# and functional programming world. If you followed me at my old blog, you'll know I'm pretty passionate about functional programming and looking for new ways to solve problems and express data.

Where We Are

Before we begin today, let's catch up to where we are today:
Today's topic will be covering more imperative code dealing with control flow. But first, the requisite side material before I begin today's topic.

A Survey of .NET Languages And Paradigms

Joel Pobar just contributed an article to the latest MSDN Magazine (May 2008) called "Alphabet Soup: A Survey of .NET Languages And Paradigms". This article introduces not only the different languages that are supported in the .NET space, but the actual paradigms that they operate in. For example, you have C#, VB.NET, C++, F# and others in the static languages space and IronRuby, IronPython among others in the dynamic space. But what's more interesting is the way that each one tackles a particular problem. The article covers a little bit about functional programming and its uses as well as dynamic languages. Of course the mention is made that C# and VB.NET are slowly adopting more functional programming aspects over time. One thing I've lamented is the fact that VB.NET and C# are too similar for my tastes so I'm hoping for more true differentiation come the next spin. Instead, VB would be really interesting as a more dynamic language and not just one that many people just look down their noses at. Ok, enough of the sidetracking and let's get back to the subject at hand.

Control Flow

Since F# is a general purpose language in the .NET space, it supports all imperative ways of approaching problems. This of course includes control flow. F# takes a different approach than many pure functional programming languages, where the evaluation of expressions can happen in essentially any order: F# has a definite evaluation order, and so a strong notion of control flow. The most basic construct is a very succinct one, the if, elif, else expression. Below is a quick example of that:

#light

let IsInBounds (x:int) (y:int) =
    if x < 0 then false
    elif x > 50 then false
    elif y < 0 then false
    elif y > 50 then false
    else true

What I was able to do is to check the bounds of the given integer inputs. Pretty simple example. As opposed to many imperative languages, when you are returning a value from the if, all subsequent elif or elses must also return values. This makes for balanced equations. Also, if you return a value from an if, then you are also forced to have an else which returns a value.

Although F# is using type inference to determine what my IsInBounds function returns, I cannot go ahead and return one type in an if and a different type in the elif or else. F# will complain violently, as it should, because that's really not a good design of a function. Below is some code that will definitely produce a compile error.

#light

let IsInBounds (x:int) (y:int) =
    if x < 0 then "Foo"
    elif x > 50 then false
    elif y < 0 then false
    elif y > 50 then false
    else true

As I said before, the equations must be balanced. But of course if your if expression returns a unit (void type for those imperative folks), then you aren't forced to have an else branch. Pretty self explanatory there.

Let's move onto the for loops. The standard for loop is to start at a particular index value, check for the terminate condition and then increment or decrement the index. F# supports this of course in a pretty standard way, but by default, the index is incremented by 1. You must note though that the body of the for loop is a unit type (void once again) so, if you return a value, F# won't like it. Below is a simple for loop to iterate through all lowercase letters.

#light

let chars = [|'a'..'z'|]

let PrintChars (c:array<char>) =
    for index = 0 to c.Length - 1 do
        print_any c.[index]

PrintChars chars

If I tried to return c from the for loop, F# would complain, but it would allow it to happen. It's just a friendly reminder that it's not going to do anything with the value you specified. I could also specify the for loop with a decrementer, so let's reverse our letters this time.

#light

let chars = [|'a'..'z'|]

let PrintChars (c:array<char>) =
    for index = c.Length - 1 downto 0 do
        print_any c.[index]

PrintChars chars

F# also supports the while construct as well. This of course is the exact same as any imperative construct, but with the caveat of once again, the while loop should not return a value because it is of the unit type.

#light

let chars = ref ['a'..'z']

while (List.nonempty !chars) do
    print_any (List.hd !chars)
    chars := List.tl !chars

This time we're just printing out a char and then removing it from the list collection. Note that we're using the ref keyword and reference cells as we talked about before. Lastly, let's cover one last construct, the foreach statement. This is much like we have in most other languages, just the wording is a bit different. As always, the foreach statement has the unit type, so returning values is a warning.

#light

let nums = [0..99]

for n in nums do
    print_any n

Wrapping It Up

Just a quick walkthrough of just some of the imperative control statements allowed by F#. As you can see, it's not a huge leap here from one language to the next. I have a couple of upcoming talks on F#, so if you're in the Northern VA area on May 17th, come check it out at the NoVA Code Camp.

Side Effecting Functions Are Code Smells Revisited

Thursday, May 1, 2008
C# DBC DDD Spec#
No Comments

After talking with Greg Young for a little while this morning, I realized I missed a few points that I think need to be covered when it comes to side effecting functions being code smells. In the previous post, I talked about side effect free functions and Design by Contract (DbC) features in regards to Domain Driven Design. Of course I had to throw in the requisite Spec# plug as well for how it handles DbC features in C#.

Intention Revealing Interfaces

Let's step back a little bit from the discussion we had earlier. Let's talk about good design for a second. How many times have you seen a method and had no idea what it did, or that it went ahead and called 15 other things that you didn't expect? At that point, most people would take out .NET Reflector (a Godsend BTW) and dig through the code to see the internals. One example of a violator was the ASP.NET Page lifecycle when I first started learning it. Init versus Load versus PreLoad wasn't really clear about what happened where, and most people have learned to hate it.

In the Domain Driven Design world, we have the Intention Revealing Interface. What this means is that we need to name our classes, methods, properties, events, etc to describe their effect and purpose. And as well, we should use the ubiquitous language of the domain to name them appropriately. This allows other team members to be able to infer what that method is doing without having to dig in with such tools as Reflector to see what it is actually doing. In our public interfaces, abstract classes and so on, we need to specify the rules and the relationships. To me, this comes back again to DbC. This allows us to not only specify the name in the ubiquitous language, but the behaviors as well.

Command-Query Separation (CQS)

Dr. Bertrand Meyer, the man behind Eiffel and the author of Object-Oriented Software Construction, introduced a concept called Command-Query Separation. It states that we should break our functionality into two categories:
- Commands - Methods that perform an action or change the state of the system should not return a value.
- Queries - Return a result and do not change the state of the system (aka side effect free)
Of course this isn't a 100% rule, but it's still a good one to follow. Let's look at a simple code example of a good command. This is simplified of course. But what we're doing is side effecting the number of items in the cart.

public class ShoppingCart
{
    public void AddItemToCart(Item item)
    {
        // Add item to cart
    }
}

Should we use Spec# to do this, we could not only check our invariants but also ensure that the number of items in our cart has increased by 1.

public class ShoppingCart
{
    public void AddItemToCart(Item item)
        ensures ItemsInCart == old(ItemsInCart) + 1;
    {
        // Add item to cart
    }
}

So, once again, it's very intention revealing at this point that I'm going to side effect the system and add more items to the cart. Like I said before, it's a simplified example, but it's a very powerful concept. And then we could talk about queries. Let's have a simple method on a cost calculation service that takes in a customer and the item and calculates.

public class CostCalculatorService
{
    public double CalculateCost(Customer c, Item i)
    {
        double cost = 0.0d;

        // Calculate cost

        return cost;
    }
}

What I'm not going to be doing in this example is modifying the customer, nor the item. Therefore, if I'm using Spec#, then I could mark this method as being [Pure]. And that's a good thing.
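To see both halves of CQS on one class, here is a minimal plain C# sketch. The Item type with its Price property, the backing list, and the Total query are assumptions made for illustration; only AddItemToCart and ItemsInCart come from the examples above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only a name and a price are needed for this sketch.
public class Item
{
    public Item(string name, double price)
    {
        Name = name;
        Price = price;
    }

    public string Name { get; }

    public double Price { get; }
}

public class ShoppingCart
{
    private readonly List<Item> items = new List<Item>();

    // Command: changes the state of the cart and returns nothing.
    public void AddItemToCart(Item item)
    {
        items.Add(item);
    }

    // Query: returns a value and leaves the cart untouched.
    public int ItemsInCart { get { return items.Count; } }

    // Query (assumed for this sketch): sums the prices without modifying anything.
    public double Total()
    {
        return items.Sum(i => i.Price);
    }
}
```

Calling Total() or reading ItemsInCart any number of times leaves the cart exactly as it was; only AddItemToCart changes it.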

The one thing that I would hold an exception for is fluent builders. Martin Fowler lays out an excellent case for them here. Not only would we be side effecting the system, but we're also returning a value (the builder itself). So, the rule is not a hard and fast one, but always good to observe. Let's take a look at a builder which violates this rule.

public class CustomerBuilder
{
    private string firstName;

    public static CustomerBuilder New { get { return new CustomerBuilder(); } }

    public CustomerBuilder WithFirstName(string firstName)
    {
        this.firstName = firstName;
        return this;
    }

    // More code goes here
}
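To show why the fluent style is worth the exception, here is a hedged sketch of a completed builder and how it might read at the call site. The Customer class, the WithLastName method, and the Build method are all assumptions; the original only shows WithFirstName.

```csharp
// Hypothetical customer type built by the builder below.
public class Customer
{
    public Customer(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }

    public string LastName { get; }
}

public class CustomerBuilder
{
    private string firstName;
    private string lastName;

    public static CustomerBuilder New { get { return new CustomerBuilder(); } }

    // Changes the builder's state AND returns a value: a deliberate CQS violation.
    public CustomerBuilder WithFirstName(string firstName)
    {
        this.firstName = firstName;
        return this;
    }

    // Assumed for this sketch, following the same pattern.
    public CustomerBuilder WithLastName(string lastName)
    {
        this.lastName = lastName;
        return this;
    }

    // Assumed for this sketch: produces the finished object.
    public Customer Build()
    {
        return new Customer(firstName, lastName);
    }
}
```

With that in place, a call such as CustomerBuilder.New.WithFirstName("Ada").WithLastName("Lovelace").Build() reads as a single sentence, which is the payoff for bending the rule.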

To wrap things up, these are not hard and fast rules and they always come with an "It Depends", but you can rarely go wrong with CQS.

Wrapping It Up

These rules are quite simple for revealing the true intent of your application while using the domain's ubiquitous language. As with anything in our field, it always comes with a big fat "It Depends", but applying the rules as much as you can is definitely to your advantage. They are simple fundamentals, yet they are often overlooked when we design our applications.

Side Effecting Functions are Code Smells

Thursday, May 1, 2008
C# DDD F# Spec#
1 Comment

I know the title might catch a few people off guard, but let me explain. Side effecting functions, for the most part, are code smells. This is a very important concept in Domain Driven Design (DDD) that's often overlooked. For those who are deep in DDD, this should sound rather familiar. And in the end, I think Spec# and some Design by Contract (DbC) constructs can mitigate this, or you can go the functional route as well.

What Is A Side Effect?

When you hear the term side effect, you tend to think of an unintended consequence. In programming, what we mean instead is any observable effect an operation has beyond returning a value. What do I mean by that? Think of these scenarios: reading from or writing to a database, reading from or writing to the console, or even modifying the state of your current object. Haskell and other pure functional languages take a pretty dim view of side effects, which is why they are only allowed through constructs such as monads. F# leans the same way, as "variables" are immutable unless otherwise specified.

Why Is It A Smell?

Well, let's look at it this way. Most of our operations call other operations, which call even more operations. This creates deep nesting, and from that nesting it becomes quite difficult to predict the behaviors and consequences of calling all of those operations. You, the developer, might not have intended for all of those operations to occur because A modified B, which modified C, which modified D. Without any safe form of abstraction, it's pretty hard to test as well. Imagine that any mock object you create would suddenly have to know that it is modified by some function five levels deep. Not necessarily the best thing to do.

Also, when it comes to multi-threaded processing, this becomes even more of an issue. If multiple threads have a reference to the same mutable object, and one thread changes something on that object, then all other threads were just side effected. This may not be something that you'd want. Then again, if you are working on shared memory applications, it might be. But, for the most part, the unpredictability of it can be a bad thing.

Let's take a quick example of side effecting an object: an implementation of a two-dimensional point. We're going to go ahead and allow ourselves to add a Size2D to the point.

public class Point2D
{
    public double X { get; set; }

    public double Y { get; set; }

    public void Add(Size2D other)
    {
        X += other.Width;
        Y += other.Height;
    }
}

public class Size2D
{
    public double Height { get; set; }

    public double Width { get; set; }
}

What's wrong with the above sample is that I just side effected X and Y. Why is this bad? Well, objects like these are usually treated as fire and forget. Anyone who held a reference to this Point2D now has a side effected one that they might not have wanted. Instead, I should probably return a new one, since this is pretty much a value object.
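The aliasing problem is easy to reproduce. The sketch below restates the mutable point (assuming Width offsets X and Height offsets Y) and adds a hypothetical AliasingDemo helper that is not part of the original sample.

```csharp
public class Size2D
{
    public double Height { get; set; }

    public double Width { get; set; }
}

// Mutable point, as in the sample above.
// Assumption in this sketch: Width offsets X and Height offsets Y.
public class Point2D
{
    public double X { get; set; }

    public double Y { get; set; }

    public void Add(Size2D other)
    {
        X += other.Width;
        Y += other.Height;
    }
}

// Hypothetical helper that demonstrates the surprise.
public static class AliasingDemo
{
    public static Point2D Run()
    {
        var origin = new Point2D { X = 1, Y = 2 };

        // A second reference to the SAME object, e.g. handed to other code.
        Point2D alias = origin;
        alias.Add(new Size2D { Width = 10, Height = 20 });

        // origin was never touched directly, yet it has moved.
        return origin;
    }
}
```

The code that only holds origin never called Add, but the point it sees has changed anyway because alias refers to the same instance.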

What Can You Do About It?

Operations that return results without side effects are considered to be pure functions. These pure functions, when called any number of times, will return the same result given the same parameters time and time again. Pure functions are much easier to unit test and are overall pretty low risk.

There are several approaches to being able to fix the above samples. First, you can keep your modifiers and queries separated. Make sure you keep the methods that make changes to your object separate from those that return your domain data. Perform those queries and associated calculations in methods that don't change your object state in any way. So, think of a method that calculates price and then another method that actually sets the price on the particular object.
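As a rough illustration of the price example, the sketch below keeps the calculation and the modification in separate methods. The Product class, its members, and the discount rule are all assumptions invented for this sketch.

```csharp
// Hypothetical product type for illustrating the split.
public class Product
{
    public Product(double basePrice)
    {
        BasePrice = basePrice;
        Price = basePrice;
    }

    public double BasePrice { get; }

    public double Price { get; private set; }

    // Query: computes a price but does not change the product.
    public double CalculateDiscountedPrice(double discountRate)
    {
        return BasePrice * (1.0 - discountRate);
    }

    // Modifier: changes the product and returns nothing.
    public void SetPrice(double newPrice)
    {
        Price = newPrice;
    }
}
```

The calculation can be called and tested as often as needed without touching the object; the state change happens in exactly one obvious place, when the caller passes the result to SetPrice.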

Secondly, you could also just not modify the object at all. Instead, you could return a value object that is created as an answer to a calculation or a query. Since value objects are immutable, you can feel free to hand them off and forget about them, unlike entities, which are entirely mutable. Let's take the above example of the Point2D and switch it around. Think of the DateTime structure. When you want to add x number of minutes, do you side effect the DateTime, or do you get a new one? The answer is, you get a new one. Why? Because DateTime is designed as an immutable value type, and that design avoids a lot of those side effecting problems.

public class Point2D
{
    private readonly double x;
    private readonly double y;

    public Point2D() {}

    public Point2D(double x, double y)
    {
        this.x = x;
        this.y = y;
    }

    public double X { get { return x; } }

    public double Y { get { return y; } }

    public Point2D Add(Size2D other)
    {
        double newX = x + other.Width;
        double newY = y + other.Height;

        return new Point2D(newX, newY);
    }
}
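The DateTime comparison can be checked directly. This small sketch uses only the standard DateTime constructor and AddMinutes; the DateTimeDemo class and the chosen date are illustrative assumptions.

```csharp
using System;

// Hypothetical helper showing that AddMinutes returns a new value.
public static class DateTimeDemo
{
    public static bool OriginalIsUnchanged()
    {
        var start = new DateTime(2008, 5, 1, 9, 0, 0);

        // AddMinutes does not modify start; it returns a new DateTime.
        DateTime later = start.AddMinutes(30);

        return start.Minute == 0 && later.Minute == 30;
    }
}
```

The immutable Point2D.Add above follows the same shape: the receiver stays as it was and the caller gets a fresh value back.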

Spec# is a tool that can help in this matter. Previously I stated why Spec# matters; now let's get into more detail. We can mark our side effect free methods as being pure with the [Pure] attribute. This allows the system to verify that we are indeed not side effecting the system, and that any time I call the method with the same parameters, I will get the same result. It's an extra insurance policy that makes it well known to the caller that I'm not going to side effect myself when you call me. So, let's go ahead and add some Spec# goodness to the equation.

[Pure]
public Point2D Add(Size2D other)
{
    double newX = x + other.Width;
    double newY = y + other.Height;

    return new Point2D(newX, newY);
}

But now Spec# will warn us that other might be null, and that could be bad. So, let's fix that by adding a real constraint as a precondition.

[Pure]
public Point2D Add(Size2D other)
    requires other != null;
{
    double newX = x + other.Width;
    double newY = y + other.Height;

    return new Point2D(newX, newY);
}

Of course I could have put some ensures as well to ensure the result will be the addition, but you get the point.

Turning To Design by Contract

Now of course we have to be pragmatic about things. At no point did I say that we can't have side effects ever. That would be Haskell, which confines side effects so strictly that the only way to perform them is through monads, and those can feel a bit clumsy. Instead, I want to refocus on where we perform side effects and be more aware of what we're modifying.

In our previous examples, we cut down on the number of places where we had our side effects. But this does not eliminate them; instead, it gathers them in the appropriate places. Now when we deal with entities, they are very much mutable, and so we need to be aware of when and how side effects get introduced. To really get to the heart of the matter, we need to verify the preconditions, the postconditions and, most of all, our invariants. In a traditional application written in C#, we could throw all sorts of assertions into our code to make sure that we are in fact conforming to our contract. Or we can write unit tests to ensure that the code conforms to it. This is an important point in Eric Evans' book when talking about assertions in the Supple Design chapter.

Once again, Spec# enters as a possible savior. It allows us to model our preconditions and postconditions in code as part of the method signature. Invariants can be modeled in code as well. These ideas came from Eiffel and are very powerful when used for good.

Let's make a quick example to show how invariants and preconditions and postconditions work. Let's create an inventory class, and keep in mind it's just a sample and not anything I'd ever use, but it proves a point. So let's lay out the inventory class and we'll set some constraints. First, we'll have the number of items remaining. That number of course can never go below zero. Therefore, we need an invariant that enforces that. Also, when we remove items from the inventory, we need to make sure that we're not going to dip below zero. Very important things to keep in mind.

public class Inventory
{
    private int itemsRemaining;
    private int reorderPoint;

    invariant itemsRemaining >= 0;

    public Inventory()
    {
        itemsRemaining = 200;
        reorderPoint = 50;
        base();
    }

    public void RemoveItems(int items)
        requires items <= ItemsRemaining;
        ensures ItemsRemaining == old(ItemsRemaining) - items;
    {
        expose(this)
        {
            itemsRemaining -= items;
        }

        // Check reorder point
    }

    public int ItemsRemaining { get { return itemsRemaining; } }

    // More stuff here in class
}

What I was able to express is that I set up my invariants in the constructor. You cannot continue in a Spec# program unless you set the member variable that's included in the invariant. Also, look at the RemoveItems method. We set one precondition, which states that the number of items requested must be less than or equal to the number left. And we set the postcondition, which states that the items remaining must be the difference between the old items remaining and the items requested. Pretty simple, yet powerful. We had to expose the object while modifying the field so that the invariant could be verified afterwards, however. But, doesn't it feel good to get rid of unit tests that prove what I already did in my method signature?
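For contrast, here is roughly what the same contract might look like in plain C# with hand-written checks. This is only a sketch: the exception types and messages are assumptions, the reorder logic is left out, and unlike Spec# nothing here is verified at compile time.

```csharp
using System;

public class Inventory
{
    private int itemsRemaining = 200;

    public int ItemsRemaining { get { return itemsRemaining; } }

    public void RemoveItems(int items)
    {
        // Precondition: cannot remove more than we have.
        if (items > itemsRemaining)
        {
            throw new ArgumentOutOfRangeException("items");
        }

        int oldItemsRemaining = itemsRemaining;
        itemsRemaining -= items;

        // Postcondition and invariant, checked only at run time.
        if (itemsRemaining != oldItemsRemaining - items || itemsRemaining < 0)
        {
            throw new InvalidOperationException("Inventory contract violated.");
        }
    }
}
```

The contract now lives inside the method body instead of the signature, so a caller has to read the implementation, or trust the unit tests, to learn the rules.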

Wrapping Things Up

So, I hope after reading this, you've thought more about your design and where you are modifying state, and that you have intention revealing interfaces to tell the coder exactly what you are going to do. The Design by Contract features of Spec# also play a role in this: preconditions and postconditions state in no uncertain terms what a method can do, and invariants do the same for the class as a whole. Of course you can use regular C#, or your language of choice, to model the same kinds of things, just not in as intention revealing a way.

So, where to go from here? Well, if you've found Spec# interesting, let Microsoft know about it. Join the campaign that Greg and I are harping on and say, "I Want Spec#!"
