Functional .NET 4.0 - Tuples and Zip
While covering some of the additions to the .NET 4.0 Framework, such as optional and named parameters, a few of the other additions caught my eye from the perspective of functional programming. Unlike .NET 3.5, this release is targeted less at functional programming and more at dynamic programming and COM interoperability. Still, there are a few items that we can soon take advantage of, including the Tuple type and the Zip operator function.
Looking at Tuples
The tuple comes from mathematics, where it is simply an ordered list of values, called the components of the tuple. These components can be of any type, whether string, integer, or otherwise. We refer to a component by its absolute position in the sequence.
In F#, the tuple is a fully supported data type and perhaps one of the most useful. To define a tuple, group a number of expressions together, separated by commas, to form a new expression such as the following:
// val blogger1 : string * string
let blogger1 = "Matthew", "Podwysocki"
let blogger2 = "Jeremy", "Miller"
let blogger3 = "David", "Laribee"
// val bloggers :
// (string * string) * (string * string) * (string * string)
let bloggers = blogger1, blogger2, blogger3
From there, tuples can be decomposed into their components in either of two ways. Pair tuples, that is, tuples with exactly two components, can be deconstructed using the fst and snd functions such as the following:
let pageHitCount = "http://www.codebetter.com/", 25500
let page = fst pageHitCount
let hitCount = snd pageHitCount
However, it is more common to use a pattern expression to retrieve values from a tuple, such as the following code:
open System
let request =
    "http://codebetter",
    new DateTime(2008, 11, 15),
    "Firefox/3.0.4 (.NET CLR 3.5.30729)"
let host, date, userAgent = request
What we're able to do is break the given request down into three pieces: the host, date and user agent. This makes pattern matching against tuples really powerful, as in the following:
// val permit_request : string * string * int -> bool
let permit_request = function
    | "http", "google.com", 80 -> true
    | _, "microsoft.com", _ -> true
    | "ftp", _, 21 -> true
    | _ -> false
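To see how the rules above behave, here is a minimal sketch that runs the function over a few requests. The sample requests are hypothetical and chosen only to exercise each branch; the function is repeated so the snippet runs on its own.

```fsharp
// Same rules as above, repeated so this snippet is self-contained.
let permit_request = function
    | "http", "google.com", 80 -> true
    | _, "microsoft.com", _ -> true
    | "ftp", _, 21 -> true
    | _ -> false

// Hypothetical sample requests: (protocol, host, port).
let samples =
    [ ("http", "google.com", 80)      // matches the first rule
      ("https", "microsoft.com", 443) // any protocol and port for this host
      ("ftp", "example.org", 21)      // any host for ftp on port 21
      ("http", "example.org", 8080) ] // falls through to the wildcard

let results = samples |> List.map permit_request
// results = [true; true; true; false]
```

The wildcard in each position is what lets a single rule cover a whole family of requests without naming every component.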
Likewise, we could use them in active patterns, so that we could pattern match against parts of a FileVersionInfo to determine which action to take, as in the following:
open System.Diagnostics
let (|FileVersionSections|) (f:FileVersionInfo) =
    (f.FileName, f.ProductName, f.ProductVersion)
let parse_files = function
    | FileVersionSections(fn, "Parallel Extensions for the .NET Framework", _)
        -> printfn "Parallel Extensions file %s" fn
    | FileVersionSections(fn, "Microsoft Office Communicator 2007", _)
        -> printfn "Communicator file %s" fn
    | _ -> printfn "Unknown file"
You might be saying: great, but what does this have to do with .NET 4.0?
Tuples in .NET 4.0
Earlier this month, Justin Van Patten announced on the BCL Team Blog some of the base class library changes coming to .NET 4.0. Among them were tuples, about which he stated:
We are providing common tuple types in the BCL to facilitate language interoperability and to reduce duplication in the framework. A tuple is a simple generic data structure that holds an ordered set of items of heterogeneous types. Tuples are supported natively in languages such as F# and IronPython, but are also easy to use from any .NET language such as C# and VB.
What does that mean exactly? Does this mean we'll get some form of syntactic sugar around them as well? It would be nice if we could decompose them as we do in F#, and initialize them easily without a lot of pomp and circumstance. Judging from the intent of the .NET libraries, something like this might be an option:
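// Hypothetical setup, using the same constructor as the tests later in this post
var t1 = new Tuple<string, int>("http://www.codebetter.com/", 25500);
var i1 = t1.Item1;
var i2 = t1.Item2;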
But unless there is a nicer way to tease apart the data, it just becomes a simple holder of data instead of something that we can decompose easily with patterns. I would love to see some sort of syntactic sugar in C# to allow for this. I am, however, encouraged that they are working with the F# team and other teams to ensure compatibility between the libraries.
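For contrast, here is a minimal sketch of the same access from F#. It assumes an F# version whose tuples are backed by the BCL type (the case for F# on .NET 4.0 and later, which is not the pre-release state described in this post), and the page and hit count values are only illustrative.

```fsharp
open System

// Build the tuple through the BCL factory method instead of F# tuple syntax.
let t1 = Tuple.Create("http://www.codebetter.com/", 25500)

// Decompose it with an ordinary F# pattern; no Item1/Item2 access needed.
let page, hitCount = t1

// fst and snd work too, because t1 has the F# type string * int.
let samePage = fst t1
```

This is the kind of decomposition that would need language support to be available in C#.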
If you open Reflector, you will find the type definitions in mscorlib.dll version 4.0 under the System namespace. You may also be surprised to find them as internal classes only at this point. That is unfortunate, because it means we can't take advantage of them yet, much as with the CodeContracts and BigInteger/BigNumber types in the .NET 3.5 libraries.
As you can see from the above screen capture, a given tuple takes up to 7 type arguments, with any remaining components defined by another, nested tuple. I haven't seen tuples quite that large before, but it's always possible...
The Zip Operator Function
Another functional programming item is the Zip method, which has been added to the System.Linq.Enumerable class in System.Core.dll. This method allows us to combine two collections, using a function to calculate each new element from the corresponding elements of the two collections.
In F#, we have two ways of doing this in the Seq module. They are defined as:
val map2 : ('a -> 'b -> 'c) -> seq<'a> -> seq<'b> -> seq<'c>
val zip : seq<'a> -> seq<'b> -> seq<'a * 'b>
The map2 function takes a function that computes a new item from one item of each collection, plus the two collections themselves, and returns a new collection of the computed items. If one collection is shorter than the other, the remaining items of the longer one are ignored. The zip function is simpler: it takes the items from the first and second collections and combines each pair in a tuple.
An example of each follows:
// Map2
let l1 = seq [1 .. 26]
let l2 = seq ['a' .. 'z']
let mapped =
    Seq.map2 (fun a b -> sprintf "%d%c" a b) l1 l2
// Zip
let z1 = seq['a' .. 'z']
let z2 = seq['A' .. 'Z']
let zipped = Seq.zip z1 z2
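The examples above pair sequences of equal length. As a minimal sketch of what happens when one sequence is shorter, the snippet below feeds five numbers and three letters to both functions; the names and values are illustrative only.

```fsharp
let numbers = seq { 1 .. 5 }
let letters = seq { 'a' .. 'c' }

// Both functions stop as soon as the shorter sequence runs out.
let mappedShort =
    Seq.map2 (fun n c -> sprintf "%d%c" n c) numbers letters
    |> List.ofSeq
// mappedShort = ["1a"; "2b"; "3c"]

let zippedShort = Seq.zip numbers letters |> List.ofSeq
// zippedShort = [(1, 'a'); (2, 'b'); (3, 'c')]
```

The numbers 4 and 5 are never consumed, and no error is raised.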
Now, using the C# version is rather simple. The signature of this method is simply:
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
    this IEnumerable<TFirst> first,
    IEnumerable<TSecond> second,
    Func<TFirst, TSecond, TResult> func)
As you can see, this follows exactly the same pattern as the map2 function from F#, which allows us to compose the two items together via a function that computes the new item. Let's define a few tests that will pass given our knowledge of the Zip function. We can also include the Tuple class (at least the F# version) to show that behavior as well. These are basically tests that I wrote for my Functional C# library, but I hadn't published my tests, and I probably should.
[Fact]
public void ZipWithAdding_ShouldAddRanges()
{
    // Arrange
    var range1 = Enumerable.Range(1, 10);
    var range2 = Enumerable.Range(11, 10);
    // Act
    var range3 = range1.Zip(range2, (i, j) => i + j);
    // Assert
    Assert.True(range3.Count() == 10);
    Assert.True(range3.ElementAt(0) == 12);
    Assert.True(range3.ElementAt(9) == 30);
}
[Fact]
public void ZipWithIntAndChar_ShouldCombine()
{
    // Arrange
    var range1 = Enumerable.Range(1, 5);
    var range2 = new[] { 'a', 'b', 'c', 'd', 'e' };
    // Act
    var range3 = range1.Zip(range2, (i, j) => i + j.ToString());
    // Assert
    Assert.True(range3.Count() == 5);
    Assert.True(range3.ElementAt(0) == "1a");
    Assert.True(range3.ElementAt(4) == "5e");
}
[Fact]
public void ZipWithTuples_ShouldCombineLists()
{
    // Arrange
    var range1 = Enumerable.Range(1, 5);
    var range2 = new[] { 'a', 'b', 'c', 'd', 'e' };
    // Act
    var range3 = range1.Zip(range2, (i, j) => new Tuple<int, char>(i, j));
    // Assert
    Assert.True(range3.Count() == 5);
    Assert.True(range3.ElementAt(0).CompareTo(
        new Tuple<int, char>(1, 'a')) == 0);
    Assert.True(range3.ElementAt(4).CompareTo(
        new Tuple<int, char>(5, 'e')) == 0);
}
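The tests above all use inputs of equal length. Like map2, Enumerable.Zip stops at the end of the shorter input. Here is a minimal sketch of that behavior, calling the method from F#; the sequences are illustrative, and the delegate is constructed explicitly only to keep overload resolution unambiguous.

```fsharp
open System
open System.Linq

let numbers = seq { 1 .. 5 }
let letters = seq { 'a' .. 'c' }

// Wrap the F# lambda in the Func delegate that Zip expects.
let combine = Func<int, char, string>(fun n c -> sprintf "%d%c" n c)

// Zip yields one result per pair and stops after the third letter.
let zippedLinq = Enumerable.Zip(numbers, letters, combine) |> List.ofSeq
// zippedLinq = ["1a"; "2b"; "3c"]
```

Calling the method statically keeps the example close to the signature shown earlier; the extension-method form reads more naturally from C#.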
Because the Parallel Extensions for .NET are becoming part of the BCL in .NET 4.0, the ParallelEnumerable class has the Zip method as well, which allows us to take advantage of data parallelism where our machine allows it, such as this:
[Fact]
public void ParallelZipWithAdding_ShouldAddRanges()
{
    // Arrange
    var range1 = ParallelEnumerable.Range(1, 10);
    var range2 = ParallelEnumerable.Range(11, 10);
    // Act
    var range3 = range1.Zip(range2, (i, j) => i + j);
    // Assert
    Assert.True(range3.Count() == 10);
    Assert.True(range3.ElementAt(0) == 12);
    Assert.True(range3.ElementAt(9) == 30);
}
With the inclusion of the Parallel Extensions for .NET, there is going to be a lot more functional fun built into the BCL. I can only hope for more of this coming down the road.
Wrapping It Up
Although the .NET Framework 4.0 release doesn't offer as many items for functional programming as the .NET 3.5 release did, it has a few interesting ones with the inclusion of the Tuple type and the Zip method. Even more intrigue, of course, comes from adding the Parallel Extensions for the .NET Framework to the base class library as a first-class citizen.
I wish they had made the Tuple class public already so that we could at least play around with some of those features, but I, like many, understand why some of the decisions were made, and I also realize it's still early in the cycle. Hopefully in the next release or so it will become available for us to use, including some syntactic sugar around tuple creation and decomposition, but ultimately that's a language decision and not necessarily a framework decision.
It's great to see a language convergence in terms of libraries now being available to all. F# will continue to be my language of choice, but with C# gaining some of these libraries, switching between the two becomes much easier, without having to reinvent the wheel by either re-implementing the feature or converting Func delegates to F# FastFunc types and back again.