The F# PowerPack Released on CodePlex

As announced yesterday, the new February 2010 release of F# is out.  For those using Visual Studio 2008 and Mono, you can pick up the download here.  This is much more of a stabilization release than a feature release, with improvements in tooling, the project system and so on.

One of the more interesting announcements today was not only the release, but that the F# PowerPack, a collection of libraries and tools, was released on CodePlex under the MS-PL license.  Releasing the PowerPack on CodePlex allows the F# team to let it grow more naturally, free of major release cycles such as Visual Studio releases.  What’s included in this release?

What’s in the Box?

The PowerPack includes such tools as:

  • FsLex – a Lexical Analyzer, similar in nature to OCamlLex
  • FsYacc – a LALR parser generator which shares the same specification as OCamlYacc
  • FsHtmlDoc – an HTML document generator for F# code

There are also quite a few libraries, including:

  • FSharp.PowerPack.dll – includes additional collections such as the LazyList, extension methods for asynchronous workflows, native interoperability, mathematical structures, units of measure and more
  • FSharp.PowerPack.Compatibility.dll – support for OCaml compatibility
  • FSharp.PowerPack.Linq.dll – provides support for the LINQ provider model
  • FSharp.PowerPack.Parallel.Seq.dll – perhaps the most interesting in that it provides support for Parallel LINQ and the Task Parallel Library

Just to name a few…

Let’s look briefly, though, at some of the features.

LINQ Support

One piece that’s not particularly new, but often overlooked, is the support for LINQ providers through the FSharp.PowerPack.Linq.dll library.  This means we could support providers such as NHibernate, LINQ to SQL, Entity Framework, MongoDB or any other provider.  To enable this behavior, simply pass an F# quotation to the query function.  An F# quotation is much like the .NET BCL Expression, but with a few extra goodies, and is written in the <@ ... @> form.

let result = query <@ ... @>
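To make the idea of a quotation a little more concrete, here is a minimal sketch that uses only the core F# library, not the PowerPack.  It quotes a small arithmetic expression and inspects the top of the resulting expression tree; the names sumExpr and describe are made up for illustration.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns

// A typed quotation: the code is captured as data rather than evaluated.
let sumExpr : Expr<int> = <@ 1 + 2 @>

// Describe the top-level node of an expression tree (hypothetical helper).
let describe (expr : Expr) =
  match expr with
  | Call (_, methodInfo, args) ->
      sprintf "call to %s with %d argument(s)" methodInfo.Name (List.length args)
  | Value (value, _) -> sprintf "constant %A" value
  | _ -> "some other expression"

printfn "%s" (describe (sumExpr :> Expr))
```

A LINQ provider does something similar in spirit: it walks the quoted tree and translates it, rather than running the code directly.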

There are also additional query functions that are necessary when dealing with data, including:

  • contains
  • groupBy
  • groupJoin
  • join
  • maxBy
  • minBy

Each of these is rather self-explanatory, including how it gets transformed back into F# quotations/expressions.  Let’s look at a simple example of grouping customers in California by their customer level from a LINQ to SQL provider.

#if INTERACTIVE
#r "FSharp.PowerPack.Linq.dll"
#endif

open Microsoft.FSharp.Linq
open Microsoft.FSharp.Linq.Query

// DbContext stands for the data context class of the LINQ to SQL model
let context = DbContext()

let groupedCustomers =
  query <@ seq { for customer in context.Customers do
                   if customer.BillingAddress.State = "CA" then
                     yield customer }
               |> groupBy (fun customer ->  customer.Level) @>

As you can see, inside the query function we have a sequence expression that iterates through our customers looking for all of those in California.  Then, outside of the sequence expression, we call the groupBy function, which lets us key off the customer level.
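For comparison, here is a rough in-memory sketch of the same filter-then-group shape using plain Seq functions and no LINQ provider.  The Address and Customer records, the integer Level and the sample data are all assumptions made up for illustration; the real entities would come from the data context.

```fsharp
// Hypothetical in-memory stand-ins for the mapped entities.
type Address = { State : string }
type Customer = { Name : string; Level : int; BillingAddress : Address }

let customers =
  [ { Name = "Ann"; Level = 1; BillingAddress = { State = "CA" } }
    { Name = "Bob"; Level = 2; BillingAddress = { State = "WA" } }
    { Name = "Cy";  Level = 2; BillingAddress = { State = "CA" } }
    { Name = "Di";  Level = 1; BillingAddress = { State = "CA" } } ]

// Same shape as the quoted query: filter first, then key by Level.
let groupedInMemory =
  seq { for customer in customers do
          if customer.BillingAddress.State = "CA" then
            yield customer }
  |> Seq.groupBy (fun customer -> customer.Level)

for (level, group) in groupedInMemory do
  printfn "Level %d: %d customer(s)" level (Seq.length group)
```

The difference with the quoted version is where the work happens: here everything runs in memory, whereas the quotation is handed to the provider to translate.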

Perhaps even more interesting than the LINQ support is what we find in the FSharp.PowerPack.Parallel.Seq.dll library.

Parallel Extensions via PSeq

In a previous post, I went over how you could use F# and Parallel LINQ (PLINQ) as well as the Task Parallel Library (TPL) together nicely.  There was a bit of a translation layer needed at the time due to the inherent mismatch between .NET Func and Action delegates and F# functions.  What’s nice is that support for PLINQ and the TPL now comes out of the box with the F# PowerPack through the aforementioned library, and in particular the PSeq module.  This module contains many of the same combinators as the Seq module, such as filter, map, fold, zip, iter, etc., but with the backing of both PLINQ and the TPL.  For a quick example, we can do the Parallel Grep sample using F# and the PSeq module.

open System
open System.IO
open System.Text.RegularExpressions
open System.Linq       // ParallelMergeOptions
open System.Threading  // ThreadLocal

let regexString = @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$"
let searchPaths = [@"C:\Tools";@"C:\Work"]
let regexOptions = RegexOptions.Compiled ||| 
                   RegexOptions.IgnoreCase
let regex = new ThreadLocal<Regex>(Func<_>(fun () -> Regex(regexString, regexOptions)))

let searchOption = SearchOption.AllDirectories

let files = seq {
  for searchPath in searchPaths do
    for file in Directory.EnumerateFiles(searchPath, "*.*", searchOption) do
      yield file }

type FileMatch = { Num : int; Text : string; File : string }

let matches =
  files
  |> PSeq.withMergeOptions(ParallelMergeOptions.NotBuffered)
  |> PSeq.collect 
       (fun file ->
          File.ReadLines(file)
            |> Seq.map2
                 (fun i s -> { Num = i; Text = s; File = file } )
                 (seq { 1 .. Int32.MaxValue })
            |> Seq.filter (fun line -> regex.Value.IsMatch(line.Text)))

The sample above searches every file under the Tools and Work directories on my C drive, in parallel, for lines that match the email address pattern.  We’ll cover more of this in depth in the near future, but this is enough to whet your appetite.
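The work done per file in the sample can be tried in isolation.  The following is a small sequential sketch of that inner pipeline, run over a hard-coded list of lines instead of a file; the LineMatch record and the sample lines are made up for illustration, and the pattern is the same one used above.

```fsharp
open System
open System.Text.RegularExpressions

// Hypothetical record for this sketch: a line number plus the line text.
type LineMatch = { Num : int; Text : string }

let emailRegex =
  Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$", RegexOptions.IgnoreCase)

// Made-up sample input standing in for File.ReadLines(file).
let sampleLines =
  [ "hello"; "someone@example.com"; "not an address"; "other@example.org" ]

let lineMatches =
  sampleLines
  // Pair each line with a 1-based number; map2 stops at the shorter sequence.
  |> Seq.map2 (fun i s -> { Num = i; Text = s }) (seq { 1 .. Int32.MaxValue })
  // Keep only the lines that match the pattern.
  |> Seq.filter (fun line -> emailRegex.IsMatch(line.Text))
  |> Seq.toList

for m in lineMatches do
  printfn "%d: %s" m.Num m.Text
```

Because the numbers are zipped in before the filter, each match keeps its original line number.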

Conclusion

With the new release of the F# language, we also have a welcome surprise in the F# PowerPack now finding a home on CodePlex.  This move by the F# team allows the PowerPack to grow more naturally and not be confined to major cycles such as .NET Framework or Visual Studio releases.  Sometimes, the best way to learn the language is to just learn how the libraries were written, and given it is on CodePlex, we now easily have that opportunity.
