Functional .NET 4.0 - Tuples and Zip

While covering some of the additions to the .NET 4.0 Framework, such as optional and named parameters, a few of the other additions caught my eye from the perspective of functional programming.  Unlike .NET 3.5, this release is targeted less at functional programming and more at dynamic programming and COM interoperability.  Still, there are a few items that we can soon take advantage of, including the Tuple type and the Zip operator function.

 

Looking at Tuples

The tuple comes from mathematics and is simply an ordered list of values, which are the components of that tuple.  These components can be of any type, whether string, integer, or otherwise.  We refer to a component by its absolute position in the sequence.

In F#, the tuple is a fully supported data type and perhaps one of the most useful.  To define a tuple is to define a number of expressions grouped together with comma separation to form a new expression such as the following:

#light 

// val blogger1 : string * string
let blogger1 = "Matthew", "Podwysocki"
let blogger2 = "Jeremy", "Miller"
let blogger3 = "David", "Laribee"

// val bloggers :
// (string * string) * (string * string) * (string * string)
let bloggers = blogger1, blogger2, blogger3
 

From there, tuples can be decomposed into their components in either of two ways.  A pair, that is, a tuple with exactly two components, can be deconstructed using the fst and snd functions, as in the following:

#light

let pageHitCount = "http://www.codebetter.com/", 25500
let page = fst pageHitCount
let hitCount = snd pageHitCount
 

However, it is more common to use a pattern expression to retrieve values from a tuple, such as the following code:

#light

open System

let request = 
  "http://codebetter",
  new DateTime(2008, 11, 15),
  "Firefox/3.0.4 (.NET CLR 3.5.30729)"

let host, date, userAgent = request
 

This breaks the given request down into three pieces: the host, the date, and the user agent.  The same idea makes pattern matching against tuples really powerful, as in the following:

#light

// val permit_request : string * string * int -> bool
let permit_request = function
  | "http", "google.com", 80 -> true
  | _, "microsoft.com", _ -> true
  | "ftp", _, 21 -> true
  | _ -> false

We could even use tuples in active patterns, pattern matching against parts of a FileVersionInfo in order to determine which action to take, as in the following:

#light

open System.Diagnostics

let (|FileVersionSections|) (f:FileVersionInfo) =
  (f.FileName,f.ProductName,f.ProductVersion)
    
let parse_files = function
  | FileVersionSections(fn, "Parallel Extensions for the .NET Framework", _) 
      -> printfn "Parallel Extensions file %s" fn
  | FileVersionSections(fn, "Microsoft Office Communicator 2007", _) 
      -> printfn "Communicator file %s" fn
  | _ -> printfn "Unknown file"

You're saying, great, but what has this to do with .NET 4.0?

 

Tuples in .NET 4.0

Earlier this month, Justin Van Patten announced on the BCL Team Blog some of the base class library changes coming to .NET 4.0.  Among them were tuples, about which the post stated:

We are providing common tuple types in the BCL to facilitate language interoperability and to reduce duplication in the framework.  A tuple is a simple generic data structure that holds an ordered set of items of heterogeneous types.  Tuples are supported natively in languages such as F# and IronPython, but are also easy to use from any .NET language such as C# and VB.

What does that mean exactly?  Does this mean we'll get some form of syntactic sugar around them as well?  It would be nice to decompose them as we do in F#, and to initialize them easily without a lot of pomp and circumstance.  Judging from the intent in the .NET libraries, something like this might be an option:

var t1 = Tuple.Create(1, 'a'); 
var i1 = t1.Item1; 
var i2 = t1.Item2;
 

But unless there is a nicer way to tease apart the data, the tuple becomes just a simple holder of data instead of something that we can decompose easily into patterns.  I would love to see some sort of syntactic sugar in C# to allow for this.  I am, however, encouraged that they are working with the F# team and other teams to ensure compatibility between the libraries.
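Short of language support, one way to get closer to decomposition is a small helper that hands the components to a lambda, so they can at least be named at the call site.  The following is only a sketch: the Match extension method and the TupleExtensions and MatchDemo classes are hypothetical names of my own, not part of the BCL, and the code assumes a public two-element System.Tuple with a Tuple.Create factory like the one shown above.

```csharp
using System;

// Hypothetical helper, not part of the BCL: passes both components of a
// pair to a function so the caller can name them instead of using
// Item1 and Item2 directly.
public static class TupleExtensions
{
    public static TResult Match<T1, T2, TResult>(
        this Tuple<T1, T2> tuple, Func<T1, T2, TResult> selector)
    {
        return selector(tuple.Item1, tuple.Item2);
    }
}

public static class MatchDemo
{
    public static string Describe()
    {
        var pageHitCount = Tuple.Create("http://www.codebetter.com/", 25500);

        // The lambda parameters play the role of the F# pattern
        // "let page, hitCount = pageHitCount".
        return pageHitCount.Match(
            (page, hitCount) => page + " has " + hitCount + " hits");
    }
}
```

This is still a method call rather than a pattern, so it does not help with matching on literal values the way the F# examples do.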

If you open Reflector, you will find the type definitions in mscorlib.dll version 4.0 under the System namespace.  You may also be surprised to find that they are internal classes at this point.  That is unfortunate, because it means we cannot take advantage of them yet, much like CodeContracts and BigInteger/BigNumber in the .NET 3.5 libraries.

[Screenshot: net40_tuple, the Tuple type definitions in mscorlib.dll as shown in Reflector]

As you can see from the above screen capture, a given tuple has up to 7 components of its own, with any remaining components carried by another, nested tuple.  I haven't seen tuples quite that large before, but it's always possible...
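To make the "rest" arrangement concrete, here is a minimal sketch.  It assumes the public System.Tuple API as it eventually shipped in .NET 4.0, where Tuple.Create accepts up to eight arguments and wraps the eighth in a nested tuple exposed through a Rest property.  The LargeTupleDemo class name is my own.

```csharp
using System;

public static class LargeTupleDemo
{
    public static int SeventhComponent()
    {
        var t = Tuple.Create(1, 2, 3, 4, 5, 6, 7, 8);

        // The first seven components are direct properties, Item1..Item7.
        return t.Item7;
    }

    public static int EighthComponent()
    {
        // The static type here is
        // Tuple<int, int, int, int, int, int, int, Tuple<int>>.
        var t = Tuple.Create(1, 2, 3, 4, 5, 6, 7, 8);

        // The eighth component lives inside the nested Rest tuple.
        return t.Rest.Item1;
    }
}
```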

 

The Zip Operator Function

Another functional programming item is the Zip method, which has been added to the System.Linq.Enumerable class in System.Core.dll.  This method allows us to combine two collections using a function that calculates each new element from the corresponding element of each collection. 

In F#, we have two ways of doing this in the Seq module.  They are defined as:

val map2 : ('a -> 'b -> 'c) -> seq<'a> -> seq<'b> -> seq<'c>
val zip : seq<'a> -> seq<'b> -> seq<'a * 'b>
 

The map2 function takes a function that computes a new item from a pair of items, along with two collections, and returns a new collection of the computed items.  If one collection is shorter than the other, the computation stops at the end of the shorter one and the remaining items of the longer one are ignored.  The zip function simply pairs each item from the first collection with the corresponding item from the second in a tuple.

An example of each follows:

// Map2
let l1 = seq[1 .. 26]
let l2 = seq['a' .. 'z']
let mapped = Seq.map2 
  (fun a b -> sprintf "%d%c" a b) l1 l2

// Zip
let z1 = seq['a' .. 'z']
let z2 = seq['A' .. 'Z']
let zipped = Seq.zip z1 z2
 

Using the C# version is rather simple.  The signature of this method is:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
  this IEnumerable<TFirst> first, 
  IEnumerable<TSecond> second, 
  Func<TFirst, TSecond, TResult> func)
 

As you can see, this follows exactly the same pattern as the map2 function from F#, which allows us to compose the two items via a function to compute the new item.  Let's define a few tests that will pass given our knowledge of the Zip function.  We can also include the Tuple class (the F# version, at least) to show that behavior as well.  These are basically tests that I wrote for my Functional C# library, but I hadn't published my tests, and I probably should. 

[Fact]
public void ZipWithAdding_ShouldAddRanges()
{
    // Arrange
    var range1 = Enumerable.Range(1, 10);
    var range2 = Enumerable.Range(11, 10);

    // Act
    var range3 = range1.Zip(range2, (i, j) => i + j);

    // Assert
    Assert.True(range3.Count() == 10);
    Assert.True(range3.ElementAt(0) == 12);
    Assert.True(range3.ElementAt(9) == 30);
}

[Fact]
public void ZipWithIntAndChar_ShouldCombine()
{
    // Arrange
    var range1 = Enumerable.Range(1, 5);
    var range2 = new[] { 'a', 'b', 'c', 'd', 'e' };

    // Act
    var range3 = range1.Zip(range2, (i, j) => i + j.ToString());

    // Assert
    Assert.True(range3.Count() == 5);
    Assert.True(range3.ElementAt(0) == "1a");
    Assert.True(range3.ElementAt(4) == "5e");
}

[Fact]
public void ZipWithTuples_ShouldCombineLists()
{
    // Arrange
    var range1 = Enumerable.Range(1, 5);
    var range2 = new[] { 'a', 'b', 'c', 'd', 'e' };

    // Act
    var range3 = range1.Zip(range2, (i, j) => new Tuple<int, char>(i, j));

    // Assert
    Assert.True(range3.Count() == 5);
    Assert.True(range3.ElementAt(0).CompareTo(
        new Tuple<int, char>(1, 'a')) == 0);
    Assert.True(range3.ElementAt(4).CompareTo(
        new Tuple<int, char>(5, 'e')) == 0);
}
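
The tests above all use inputs of equal length.  As a minimal sketch of the other case, the following shows that Enumerable.Zip, like map2 in F#, stops as soon as the shorter sequence runs out.  The ZipLengthDemo class and its ZipUneven method are names of my own for illustration.

```csharp
using System;
using System.Linq;

public static class ZipLengthDemo
{
    public static string[] ZipUneven()
    {
        var numbers = Enumerable.Range(1, 5);    // five items: 1..5
        var letters = new[] { 'a', 'b', 'c' };   // only three items

        // Zip pairs items by position and ends with the shorter input,
        // so 4 and 5 are never passed to the lambda.
        return numbers
            .Zip(letters, (n, c) => n + c.ToString())
            .ToArray();
    }
}
```

Calling ZipLengthDemo.ZipUneven() yields three strings: "1a", "2b" and "3c".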
 

Because the Parallel Extensions for .NET are becoming part of the BCL in .NET 4.0, the ParallelEnumerable class has the Zip method as well, which allows us to take advantage of data parallelism, should our machine allow it, as in this test:

[Fact]
public void ParallelZipWithAdding_ShouldAddRanges()
{
    // Arrange
    var range1 = ParallelEnumerable.Range(1, 10);
    var range2 = ParallelEnumerable.Range(11, 10);

    // Act
    var range3 = range1.Zip(range2, (i, j) => i + j);

    // Assert
    Assert.True(range3.Count() == 10);
    Assert.True(range3.ElementAt(0) == 12);
    Assert.True(range3.ElementAt(9) == 30);
}
 

With the inclusion of the Parallel Extensions for .NET, there is going to be a lot more functional fun built into the BCL.  I can only hope for more of this coming down the road.

 

Wrapping It Up

Although the .NET Framework 4.0 release doesn't offer as many items for functional programming as the .NET 3.5 release did, it has a few interesting ones with the inclusion of the Tuple type and the Zip method.  Even more intriguing, of course, is the addition of the Parallel Extensions for the .NET Framework to the base class library as a first-class citizen. 

I wish they had made the Tuple class public already so that we could at least play around with some of those features, but I, like many, understand why some of the decisions were made, and I also realize it's still early in the cycle.  Hopefully in the next release or so it will become available for us to use, including some syntactic sugar around tuple creation and decomposition, but ultimately, that's a language decision and not necessarily a framework decision. 

It's great to see a language convergence in terms of libraries now being available to all.  F# will continue to be my language of choice, but with C# gaining some of these libraries, switching between the two becomes much easier, without having to reinvent the wheel by either re-implementing the feature or converting Func delegates to F# FastFunc types and back again.




8 Comments

  • You didn't mention that tuples are fixed length.

  • @Alexey

    Well, it might seem kind of obvious though from their usage that they were fixed length as well as the fact that once assigned are read-only.

    Matt

  • I meant that they are not heterogeneous functional lists such as HList in Haskell.

  • Great post.

    It really is too bad that Tuples aren't public. I mean, come on...

    Glad to see that Zip made it in. Now I won't need to use my own implementation. :)

    I'd really like to see some syntactic sugar around Tuples, for sure. It would be great to be able to return multiple values from a method (and get rid of "out" parameters alltogether.) Also, support for the "I don't care" symbol _ would be nice, too.

  • @Alexey

    Ah, I see what you mean, no it isn't like an HList, but instead just the same as a Haskell tuple.

    Matt

  • @Matt

    Thanks. Agreed, now we can dump our own implementations, although I still keep my overloaded one to automatically Tuple the return should I feel fit, as they didn't. You'll notice the difference between the Map2 and Zip.

    I agree there needs to be the syntactic sugar around the use of tuples in order for them to become useful. Some decomposition ways above and beyond tuple.Item1 and tuple.Item2. It's helpful, but not really all that helpful when I want to break down into patterns.

    I already have the "I Don't Care Symbol" in a limited fashion, because _ is a valid variable name...

    Matt

  • Functional net 4 0 tuples and zip.. Huh, really? :)

  • Wow! I could not even guess about it)) Not bad.
