November 2007 - Posts

ParallelFX: multi-processing extensions for .NET Framework

It is interesting how new solutions bring new problems. When Intel noticed that Moore's Law was losing steam, it had to look for a new way of producing ever more powerful computers. Their solution? Put more CPUs on every chip (the famous multi-core). They started shyly (2 with Core Duo), then got up to speed (4, 8 CPUs), and now nobody laughs when someone says that in 10 years home computers will have 32 or 64 CPUs.

But this solution brought a serious problem: most programming languages in use today not only ignore multi-core chips but are downright inadequate to leverage them. To understand why, let's consider this really simplistic example:

x = f() + g();

One idea that quickly comes to mind is to send the execution of f() to one CPU and the execution of g() to another, and to add the two results when both finish. But there is a big issue here: what if f() and g() touch the same global variable (heck, any shared state, like a database connection)? In that case, if f() finishes before g() you may get a different result than if g() finishes before f(). In other words, f() + g() is not deterministic anymore (whereas it was when we had just one CPU).
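To make the order dependence concrete, here is a minimal F# sketch. The bodies of f and g and the shared counter are made up for illustration. It runs the two calls sequentially in both orders, which is enough to show that the sum depends on who touches the shared state first; a truly parallel run would pick one of these interleavings (or a worse one) unpredictably.

```fsharp
// Hypothetical f and g that share one mutable value; the names and
// the arithmetic are assumptions made up for this illustration.
let mutable shared = 1

let f () =
    shared <- shared + 1      // f modifies the shared state...
    shared

let g () =
    shared <- shared * 10     // ...and so does g, in a different way
    shared

// Run f first, then g, starting from shared = 1.
let fThenG =
    shared <- 1
    let a = f ()              // shared becomes 2, a = 2
    let b = g ()              // shared becomes 20, b = 20
    a + b                     // 22

// Run g first, then f, from the same starting state.
let gThenF =
    shared <- 1
    let b = g ()              // shared becomes 10, b = 10
    let a = f ()              // shared becomes 11, a = 11
    a + b                     // 21

printfn "f then g: %d, g then f: %d" fThenG gThenF
```

Same functions, same starting state, two different sums: 22 and 21.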

This is just a simple example of what can go wrong if we try to "just use" a multi-core chip with Java, VB or COBOL. Thanks to Moore's law we programmers used to get ever more computing power for free; with multi-core chips, powerful as they are, Moore's paradise is lost. Now we have to write (or re-write) specific code to leverage the power of parallelism.

But, lucky as we are, every problem seems to have a solution, and in this case we have a few options (in order of "radicality"):

  1. Use languages intentionally crafted for multiprocessing, like Erlang, where processes, message passing and synchronization are part of the primitives of the language.
  2. Use languages whose very nature isolates one function call from another (i.e. where calling f() neither depends on nor affects calling g()): languages where side effects are not allowed or are strictly controlled, like Haskell or F#. In fact, parallelism is one of the reasons functional languages are drawing so much attention these days.
  3. Extend existing platforms with classes and APIs that allow higher-level constructs (i.e. your favorite programming language) to leverage parallelism in an easy (albeit not transparent) way.

This last option is the least radical, but maybe that is exactly why it will become the most prevalent (what's already written doesn't need to be re-written). And that's why Microsoft, a couple of years ago already, started to play with the idea of extending the CLR with classes and APIs for parallel processing. The fruit of this labor is the first CTP of the Parallel Extensions to the .NET Framework (aka ParallelFX). ParallelFX defines classes and methods in .NET that allow C# 3.0 (or VB 9) to express things like this:

Parallel.For(0, 100, delegate(int i) {
  a[ i ] = a[ i ] * a[ i ];
});

This parallel loop splits the 100 iterations into units of work that run on several threads (as many as the available resources allow), each executing the delegate (in this case, just squaring a[ i ]); synchronization, exception handling, etc. are handled by ParallelFX. Intriguing, isn't it? To get a first idea of ParallelFX I suggest this article on MSDN Magazine, and for those of you who want to dive deeper, there is the Parallel Computing Development Center. Beware that much of the material is still experimental and change will happen, but as we say in Latin America, "el que se mete más temprano a la cocina le toca una presa más grande" (roughly: whoever gets into the kitchen earliest gets the biggest piece).
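For comparison, here is a sketch of the same loop written in F#. It assumes the API as it later shipped in .NET Framework 4, where Parallel lives in the System.Threading.Tasks namespace; the namespace in the CTP may differ. The array contents are sample data invented for the example.

```fsharp
open System.Threading.Tasks

// Sample data (an assumption for this sketch): 0, 1, ..., 99
let a = Array.init 100 (fun i -> i)

// Each index is written by exactly one iteration, so the iterations
// share no state and can safely run in any order, on any thread.
Parallel.For(0, 100, fun (i: int) -> a.[i] <- a.[i] * a.[i]) |> ignore

printfn "a.[7] = %d" a.[7]
```

The loop is safe to parallelize precisely because it avoids the shared-state problem described earlier: no iteration reads or writes an element that another iteration touches.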

Posted by Edgar Sánchez with 1 comment(s)

F# basic function definition syntax

In the comments to this post, Anon and Josh complain about the syntax of F#. On one hand, they've got a point: most programmers are used to notations similar to those of Visual Basic, C# or Java, and for them many of the syntactic details of F# will look weird (or just plain annoying ;-) ). On the other hand, it's not the case, as Josh suggests, that Microsoft is creating a new language with a purposely cryptic syntax :-D. Actually, F# was designed to follow as much as possible the syntax (and semantics) of OCaml (1996), a popular language in the functional world. OCaml in turn basically offers object-oriented extensions to Caml (1985), which inherits most of its syntax (and semantics) from ML (circa 1973). So, as much as I would like to say that Microsoft has created a whole new language, it's more like it is moving the spotlight to a tradition as old (LISP anyone?) as imperative languages themselves, by providing an implementation nicely integrated into the .NET Framework.

But, history and traditions apart, as Anon says, F# could look like Greek to someone fluent in C#, so I'll try to give you one or two tips on the basic fsharpian vocabulary:

[Figure: F# function definition basic syntax]

In line 2, you can see how to define a function in F#. A few points worth mentioning:

  1. You don't need parentheses to surround the parameters
  2. You don't use commas to separate parameters, a simple blank space will do (and beware that the comma is used to denote a totally different thing: a tuple, but let's not get ahead of ourselves).
  3. To separate the "head" (my term) of the function from its "body" you put an equal sign
  4. To the right of the equal sign you put the expression that is evaluated every time you call the function (just one expression, mind you, although it can perfectly grow to fill several lines, and it usually does)
  5. There are no procedures here, just functions: everything you call returns something (even if it is only the empty value, unit)
  6. The function definition ends when:
    1. The line finishes, as in my example in line 2
    2. If we are using the so-called "light" syntax (which will be the case most of the time), the body is indented one or more spaces to the right of the let keyword; in this case the function finishes when the indentation finishes, as in the other 3 examples of the figure above
  7. One curious thing is the fact that, most of the time, you don't need to declare the type of the parameters or the result, but beware:
    1. F# is a totally strongly typed language (like C# or Java) so everything has a type
    2. What's going on is that F# has a powerful type inference mechanism, so the compiler (yes, F# is compiled, not interpreted) deduces the types of the parameters and the result from the operations you do with them. For example, when you write oper1 * oper2, the compiler already knows that oper1 and oper2 must be numbers. This works like a charm, and transparently, in most scenarios, but sometimes you have to give the compiler a hand by writing down the types explicitly. Again, let's leave that for (very) later
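The points above can be sketched in a few lines of F#. This is not the code from the figure; apart from addSquares, the function names are invented for illustration.

```fsharp
// One-line definition: no parentheses, no commas between parameters,
// and '=' separates the head from the body.
let addSquares a b = a * a + b * b

// Multi-line definition: the body is indented under 'let' and the
// function ends where the indentation ends.
let addSquaresInSteps a b =
    let aSquared = a * a
    let bSquared = b * b
    aSquared + bSquared          // the last expression is the result

// Types are inferred: with no other hint, '*' and '+' default to int.
// An explicit annotation makes the function work on floats instead.
let addSquaresFloat (a: float) (b: float) = a * a + b * b

// A comma builds a tuple; it does not separate parameters.
let pair = (3, 4)

printfn "%d %d %f" (addSquares 3 4) (addSquaresInSteps 3 4) (addSquaresFloat 1.5 2.0)
```

Calling a function follows the same convention as defining it: arguments are separated by spaces, with parentheses only for grouping.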

All in all, if you think about it, F# syntax is simpler than that of C#. Compare:

let addSquares a b = a * a + b * b

Instead of:

int AddSquares(int a, int b) { return a * a + b * b; }

So in the end we have a very clean result, but, whew, now that I have tried to explain it, maybe Josh has got a point indeed: there's a lot to be said about F# syntax, so I'll leave the explanation of the dreaded match expression for another night ;-). Furthermore, you'll have to trust me: after a few nights toying with the language you'll become comfortable with the syntax.

One last interesting point to note is that, syntax aside, the real power of languages like F# lies in their semantics and power of expression. That's why a sizable portion of that power has been implemented in C# 3.0 and VB 9 (LINQ being its most visible offspring) so, even if you run from the F# syntax, you won't be able to hide from its semantics... ;-)

Posted by Edgar Sánchez with 8 comment(s)

A better way of getting the average salary

Related to my post yesterday in which I tried to show an appealing business sample in F#, David Taylor commented that this:

can't possibly be more appealing than this:

var averageSalary = companyRoll.Average(e=>e.salary);

David is using LINQ of course and his example kills mine by a long shot. I thought a bit about why he did so much better and there are two main reasons:

  1. David is leveraging the framework (LINQ in this case) by using the pre-defined Average() function
  2. Furthermore, he is using a lambda expression (e=>e.salary) to get the salary of every employee
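F# can play the same two cards. As a sketch: the List module offers a predefined List.averageBy, which takes a lambda just like LINQ's Average(). The Employee record and the sample data below are assumptions made up for the example, not the record from my original post.

```fsharp
// Hypothetical record and sample data, assumed for illustration only.
type Employee = { name: string; salary: float }

let companyRoll =
    [ { name = "Ana"; salary = 1000.0 }
      { name = "Luis"; salary = 2000.0 }
      { name = "Eva"; salary = 3000.0 } ]

// List.averageBy applies the lambda to every element and averages the results.
let averageSalary = List.averageBy (fun e -> e.salary) companyRoll

printfn "Average salary: %f" averageSalary
```

Note that List.averageBy raises an exception on an empty list, so a caller that may receive no employees has to handle that case itself.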

I was really silly not to use this last trick, as lambda expressions are *the* core component of functional languages like F#, so I rewrote the function that calculates the average salary using a lambda expression:

I simply stole David's idea and used the lambda function (fun x -> x.salary) to get the salary of an employee. List.map then applies that function to every item in employees, and so we get the salaries list in a more compact and efficient way than in my original salaries function. You may also have noticed that I used the betterAverage function:

This is more compact and readable (for an imperative eye, anyway) than my old average function. The gist of the work (using fold1_left to add up the numbers) is the same as before, but using the if-then-else expression to avoid the division by zero is better than the original match (which I abused, I guess, just to show that I can do pattern matching too).
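Since the two listings were posted as images, here is a sketch of what betterAverageSalary and betterAverage could look like, reconstructed from the description above rather than copied from the originals. It assumes a simplified Employee record and float salaries, and it uses List.reduce, the function that plays the role of fold1_left in current versions of F#.

```fsharp
// Simplified record, assumed for illustration.
type Employee = { name: string; salary: float }

// Average of a list of numbers; an empty list yields 0 instead of
// triggering a division by zero.
let betterAverage (numbers: float list) =
    if List.isEmpty numbers then 0.0
    else List.reduce (+) numbers / float (List.length numbers)

// List.map with a lambda extracts the salaries in one step.
let betterAverageSalary (employees: Employee list) =
    betterAverage (List.map (fun x -> x.salary) employees)

// Sample data, made up for the example.
let staff =
    [ { name = "Ana"; salary = 1000.0 }
      { name = "Luis"; salary = 2000.0 } ]

printfn "%f" (betterAverageSalary staff)
```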

With these new functions betterAverageSalary and betterAverage, you just feel tempted to join them to get something like this:

And so we can get the average salary of a list of employees in one line of code, but without doubt this is too much, because now it's too hard to understand what's going on (this reminds me of the competitions we had at college to see who could solve a problem with the fewest lines of C or Fortran, no matter how incomprehensible the code was).

So we could try to combine both definitions in a more controlled way:

Where we define a local function nonEmptySalaryAverage that uses a more local definition (salaries). It is worth noting that these two functions have employees as an implicit parameter by way of the closure mechanism -but I digress. On second thought, I'll stick to my first refinement, among other reasons, because I'll most certainly reuse betterAverage in other places. Thanks to Dave, I'm already a better functional programmer (and yes, Dave, I know what you meant was that it would be better to show an example that couldn't be easily replaced by pre-defined LINQ functions like Average(), more on that in a few nights).
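A sketch of that more controlled combination, again reconstructed from the description and not the original listing: the outer name combinedAverageSalary and the Employee record are assumptions, while nonEmptySalaryAverage and salaries are the names mentioned above.

```fsharp
// Simplified record, assumed for illustration.
type Employee = { name: string; salary: float }

let combinedAverageSalary (employees: Employee list) =
    // Local function: it takes no employee list of its own and instead
    // reads 'employees' from the enclosing scope (the closure).
    let nonEmptySalaryAverage () =
        // An even more local definition, visible only in here.
        let salaries = List.map (fun x -> x.salary) employees
        List.reduce (+) salaries / float (List.length salaries)
    if List.isEmpty employees then 0.0 else nonEmptySalaryAverage ()

// Sample data, made up for the example.
let staff =
    [ { name = "Ana"; salary = 1000.0 }
      { name = "Luis"; salary = 3000.0 } ]

printfn "%f" (combinedAverageSalary staff)
```

The cost of this version is visible at a glance: the averaging logic is now buried inside the salary function and cannot be reused for other lists of numbers.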

Visual Studio 2008 and F#

Once I downloaded and installed Visual Studio 2008 Team System on this laptop, one of the first things I did was to install F# to see how well it worked in the shiny new IDE. It worked without any hassle, and to celebrate I wrote a typical business example: get the average salary of a group of employees.

First some preparatory work: lines 3 to 7 define an F# record with 4 fields; lines 9 to 13 fill an employee list (not an array, mind you, but a variable-length list).

The interesting stuff starts at line 15, the average function takes a list of numbers and returns, well, its average, in this way:

  • If the list is empty it just returns 0
  • In any other case it adds up all the elements in the list (the fold1_left function accumulates all the elements using '+') and divides the result by the length of the list.

Finally, the averageSalary function takes a list of employees and:

  • Extracts their salaries into a list (this is done by the salaries function, which is defined locally inside averageSalary)
  • Applies the average function to the salary list
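The listing itself was posted as an image, so what follows is only a sketch of a program with the shape described above, not the original code. The four field names, the sample employees and the recursive implementation of salaries are all assumptions, and List.reduce stands in for fold1_left as it is spelled in current versions of F#.

```fsharp
// A record with 4 fields (the field names are assumed).
type Employee = { name: string; department: string; age: int; salary: float }

// A variable-length list of employees (sample data, made up).
let employees =
    [ { name = "Ana"; department = "IT"; age = 30; salary = 1000.0 }
      { name = "Luis"; department = "Sales"; age = 41; salary = 2000.0 }
      { name = "Eva"; department = "IT"; age = 27; salary = 3000.0 } ]

// Average of a list of numbers, using pattern matching to treat the
// empty list separately.
let average (numbers: float list) =
    match numbers with
    | [] -> 0.0
    | _ -> List.reduce (+) numbers / float (List.length numbers)

let averageSalary (emps: Employee list) =
    // Local helper that walks the list and collects the salaries.
    let rec salaries (remaining: Employee list) =
        match remaining with
        | [] -> []
        | e :: rest -> e.salary :: salaries rest
    average (salaries emps)

printfn "Average salary: %f" (averageSalary employees)
```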

A connoisseur may argue that there are better ways of implementing the salaries function, but I didn't want to burden you with more functional delicacies. So, what do you think of this way of programming as compared to C#, VB.NET or Java?
