Tales from the Evil Empire

Bertrand Le Roy's blog

FluentPath: a fluent wrapper around System.IO

.NET is now more than eight years old, and some of its APIs have aged more gracefully than others. System.IO in particular has always been a little awkward. It’s mostly static method calls (Path.*, Directory.*, etc.) and some stateful classes (DirectoryInfo, FileInfo). In these APIs, paths are plain strings.

Since .NET v1, lots of good things have happened to C#: lambda expressions, extension methods and optional parameters, to name just a few. Outside of .NET, other interesting things have happened as well. For example, you might have heard about this JavaScript library that had some success introducing a fluent API to handle the hierarchical structure of the HTML DOM. You know? jQuery.

Knowing all that, every time I need to use the stuff in System.IO, I cringe. So I thought I’d just build a more modern wrapper around it. I used a fluent API based on an essentially immutable Path type and an enumeration of such path objects. To achieve the fluent style, a healthy dose of lambda expressions is used to act on the objects.

Without further ado, here’s an example of what you can do with the new API. In that example, I’m using a Media Center extension that wants all video files to be in their own folder. For that, I need a small tool that creates directories for each video file and moves the files in there. Here’s the code for it:

Path.Get(args[0])
    .GetFiles(p =>
        new string[] {
            ".avi", ".m4v", ".wmv",
            ".mp4", ".dvr-ms", ".mpg", ".mkv"
        }.Contains(p.Extension))
    .CreateDirectory(p => 
        p.Parent
         .Combine(p.FileNameWithoutExtension))
    .Previous()
    .Move(p =>
        p.Parent
         .Combine(p.FileNameWithoutExtension)
         .Combine(p.FileName));

This code creates a Path object for the path given as the first command-line argument of my executable. It then selects all video files. After that, it creates one directory per file, named after the file without its extension. The result of that operation is the set of created directories. We can now get back to the previous set (the video files) using the Previous method, and finally move each file in that set to the corresponding freshly created directory, whose path is the combination of the parent directory and the file name without extension.
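For comparison, here is a rough sketch of the same tool written directly against System.IO. It is not part of FluentPath: the `VideoSorter` class and `MoveVideosIntoOwnFolders` method are names made up for this illustration, and it assumes only the top level of the folder is scanned, which may or may not match what GetFiles does in the library.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, for illustration only; not part of FluentPath.
public static class VideoSorter
{
    // Same extension list as in the fluent sample above.
    private static readonly string[] VideoExtensions =
    {
        ".avi", ".m4v", ".wmv", ".mp4", ".dvr-ms", ".mpg", ".mkv"
    };

    // Moves each video file found directly under `root` into a folder
    // named after the file (without extension). Returns the number of files moved.
    // Assumption: only the top level of `root` is scanned.
    public static int MoveVideosIntoOwnFolders(string root)
    {
        // Materialize the list first so that moving files
        // does not interfere with the directory listing.
        var videos = Directory.GetFiles(root)
            .Where(f => VideoExtensions.Contains(Path.GetExtension(f)))
            .ToList();

        foreach (var file in videos)
        {
            var folder = Path.Combine(root, Path.GetFileNameWithoutExtension(file));
            Directory.CreateDirectory(folder);
            File.Move(file, Path.Combine(folder, Path.GetFileName(file)));
        }
        return videos.Count;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.Error.WriteLine("Usage: VideoSorter <folder>");
            return;
        }
        Console.WriteLine($"Moved {MoveVideosIntoOwnFolders(args[0])} file(s).");
    }
}
```

The path arithmetic is the same in both versions. What the fluent version removes is the explicit loop and the string juggling between Path, Directory and File.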

The new fluent path library covers a fair part of what’s in System.IO in a single, convenient API. Check it out, I hope you’ll enjoy it. Suggestions are more than welcome. For example, should I make this its own project on CodePlex, or is this informal style OK? Anything missing that you’d like to see? Is there a specific example you’d like to see expressed with the new API? Bugs?

The code can be downloaded from here (this is under MS-PL license):
http://weblogs.asp.net/blogs/bleroy/Samples/FluentPath.zip

UPDATE: updated the license; modified the sample code to use Contains; renamed and refactored Select to be overloads of GetFiles, GetDirectories and GetFileSystemEntries. I did not rename Select to Where because this is *not* filtering the current set, it is creating a new one from the contents of the current path.
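The difference between the two semantics can be sketched with plain LINQ over path strings. This is only an illustration of the idea, not the library's implementation; `SetSemantics`, `FilterCurrent` and `FilesUnder` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical illustration; these are not FluentPath types or methods.
public static class SetSemantics
{
    // "Where" semantics: keep only the members of the current set that match.
    public static IEnumerable<string> FilterCurrent(
        IEnumerable<string> current, Func<string, bool> predicate)
    {
        return current.Where(predicate);
    }

    // "GetFiles" semantics: build a new set from the files found inside
    // each directory of the current set, then keep those that match.
    public static IEnumerable<string> FilesUnder(
        IEnumerable<string> current, Func<string, bool> predicate)
    {
        return current
            .Where(dir => Directory.Exists(dir))
            .SelectMany(dir => Directory.EnumerateFiles(dir))
            .Where(predicate);
    }
}
```

Starting from a set that contains a single folder, FilterCurrent with a "is it a video?" predicate yields nothing, because the folder itself is not a video, while FilesUnder yields the videos inside that folder. That second behavior is the one the GetFiles overloads are named for.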

Comments

Henri said:

Please create a CodePlex project for it. It makes it easier to get the source.

# March 10, 2010 4:54 AM

shapper said:

Hello,

This is great.

I often use System.IO when managing files on a CMS in my MVC project.

I think you should create a Codeplex project.

I think it is easier for everybody to follow and contribute with suggestions, code, etc.

Any plans for that?

Cheers,

Miguel

# March 10, 2010 8:43 AM

Aaron said:

Codeplex.  ALWAYS!

Thanks, nice work.

# March 10, 2010 9:36 AM

Paul Knopf said:

That is so cool! I wonder how many other sets of APIs could benefit from a fluent API. Stream readers/writers?

# March 10, 2010 9:47 AM

Pete Blair said:

Very cool, sure makes working with System.IO nicer and maintains continuity with the other code. Every time I need to work with System.IO I feel like I am back coding in .NET 1.1.

I would say toss it up on Codeplex, you will get more feedback if people don't always have to come back to this post and put up a comment.

# March 10, 2010 10:11 AM

Justin said:

This is seriously cool, working with System.IO is always such a beat down, nice work.

# March 10, 2010 10:58 AM

Hightechrider said:

Nice.  

Any reason you called it .Select() instead of .Where() or perhaps Find()?  Looks more like a LINQ Where() operation than a projection.

Instead of throwing InvalidOperationException for null arguments,  ArgumentNullException might be more appropriate.

Also any reason for using HashSet<string>() instead of List<string> or even instead of IEnumerable with lazy execution?

e.g.

public PathCollection ChangeExtension(Func<Path, string> extensionTransformation)
{
    var result = _paths.Select(path => new Path(path)).Select(p => p.ChangeExtension(extensionTransformation(p)).ToString());
    return new PathCollection(result, this);
}

When dealing with very large directories it would be nice if it didn't enumerate the whole structure unless it needs to.

# March 10, 2010 11:58 AM

michael herndon said:

I like Fluent Interfaces for the readability

would it be possible to do the following for the code sample above?

var exts = new List<string> {
    ".avi",
    ".m4v",
    // etc
};

Path.Get(args[0])
    .Select(p => exts.Contains(p.Extension));

# March 10, 2010 12:03 PM

Jeff said:

Why not Ms-PL? BSD makes this hard on agile product teams.

# March 10, 2010 1:27 PM

Joerg Battermann said:

Awesome! A Codeplex site would be nice!

# March 10, 2010 2:06 PM

Phil said:

Interesting and I hope it catches on fire in the community.

# March 10, 2010 4:03 PM

Sean Kearon said:

Like it a lot!  (+1 for CodePlex too.)

# March 11, 2010 7:51 AM

Bertrand Le Roy said:

Thanks for all the kind comments. I made some updates, more to come later.

First, I'm actively looking into putting it on CodePlex but it's not going to be easy.

I renamed Select, but not to Where, because that would indicate filtering the current set. I renamed and refactored it to be overloads of GetFiles, GetDirectories and GetFileSystemEntries, which is what it really is.

I did rename the Filter method on the collection class to be Where though because that is much closer to the existing Where semantics.

I updated the sample after Michael's suggestion.

@Hightechrider: I'm using a HashSet because I don't want duplicates in the sets. Now any lazy version of this would require some more work. Maybe later.

# March 13, 2010 3:16 AM

Bob Cravens said:

I appreciate that the wrapper makes the System.IO methods more fluent. The problem that I have with System.IO is that it is very common to need to test code that serializes to disk. It is easy enough to create an interface so that these methods can be faked. Because your wrapper calls directly to the System.IO methods, it will make testing more difficult. Would love to see your wrapper use an instance of an ISystemIO interface instead of the real System.IO.

I do appreciate the fluent nature of the work.

# March 13, 2010 8:48 AM

Bertrand Le Roy said:

@Bob: totally agree :) Shipping it with a ready-made mock version is what I was thinking.

# March 14, 2010 12:14 AM

Mitsu said:

Codeplex of course ! and nice job Bertrand :)

A few comments:

- I like the mix of several kinds of information in a single class: File, Directory, Attributes, etc.

- A lot of static methods are wrapped in property getters (like CreationTime for example). This new property syntax is great but it gives me the feeling that the property is stored and will not change. Calling GetCreationTime() makes it clear that the value is retrieved each time. Maybe you should keep it in cache or add a clearly visible Update feature. Personally I would prefer to have all the information retrieved once, during the constructor call for example, and then remain immutable.

- Why a PathEnumerator class instead of using yield return ?

- If I am right, the PathCollection is based on an IEnumerable<string> and never stores "children" of the collection... Actually, if it's not a collection, maybe you should rename it. Why don't you write PathCollection methods like CreateDirectory() in a Linq way? You are internally using a HashSet<string>, which stores your transformation result in memory. If you use a Linq query instead, and then pass it to create the new PathCollection(q, this), you will get real deferred execution of your pipeline, and that's what we expect when using your FluentPath syntax. (well that's what I thought it was :) )

# March 29, 2010 10:59 AM

Bertrand Le Roy said:

@Mitsu: thanks for the excellent comments.

The mix of multiple types under a single class is just monkey see monkey do from jQuery and the HTML DOM itself.

I think I disagree about keeping what's under the properties immutable. Seems to me like that would be the more surprising of the two. Is there a rule that a property should return the same value if read twice with no set in between?

I started with yield return but if you try it I think you'll see why it doesn't quite work with the fluent pattern. Didn't try that hard though.

I agree I should rename as it's not a collection. Pure laziness on my part: I started building it as a collection and then walked back and didn't rename.

Deferred execution would be really cool, I agree. Not something I'm very likely to have time for anytime soon though.

# March 29, 2010 6:09 PM

Mitsu said:

Cool !

No, there's no rule on properties :)

But if you look at the IO API returning FileInfo for example, we have loading methods and then classes that keep the data. File information is shared and can be accessed highly concurrently; that's also why, in this case, I prefer having a snapshot of it.

Moreover, if the property is accessed many times just because of the code or maybe the data binding of a list, you could have some data refreshed while scrolling the list. You can consider it to be an advantage..or not.

# March 30, 2010 4:13 AM

Bertrand Le Roy said:

@Mitsu: fair point about concurrency. Maybe you could address both scenarios by having a lock method of sorts on the path class. Or yeah, maybe you're right and it should just be purely immutable from the start. Yeah, I'm beginning to think you're right.

Actually, adding file monitoring events might be an interesting extension to the API, which could be a solution to the rare cases where you want the info to remain accurate.

# March 30, 2010 1:39 PM