FluentPath: a fluent wrapper around System.IO

(c) Bertrand Le Roy 2005

.NET is now more than eight years old, and some of its APIs have aged more gracefully than others. System.IO in particular has always been a little awkward. It’s mostly static method calls (Path.*, Directory.*, etc.) and some stateful classes (DirectoryInfo, FileInfo). In these APIs, paths are plain strings.

Since .NET v1, lots of good things have happened to C#: lambda expressions, extension methods, and optional parameters, to name just a few. Outside of .NET, other interesting things have happened as well. For example, you might have heard about this JavaScript library that had some success introducing a fluent API to handle the hierarchical structure of the HTML DOM. You know? jQuery.

Knowing all that, every time I need to use the stuff in System.IO, I cringe. So I thought I’d just build a more modern wrapper around it. I used a fluent API based on an essentially immutable Path type and an enumeration of such path objects. To achieve the fluent style, a healthy dose of lambda expressions is used to act on the objects.

Without further ado, here’s an example of what you can do with the new API. In that example, I’m using a Media Center extension that wants each video file to be in its own folder. For that, I need a small tool that creates a directory for each video file and moves the file into it. Here’s the code for it:

Path.Get(args[0])
    .GetFiles(p =>
        new string[] {
            ".avi", ".m4v", ".wmv",
            ".mp4", ".dvr-ms", ".mpg", ".mkv"
        }.Contains(p.Extension))
    .CreateDirectory(p => 
        p.Parent
         .Combine(p.FileNameWithoutExtension))
    .Previous()
    .Move(p =>
        p.Parent
         .Combine(p.FileNameWithoutExtension)
         .Combine(p.FileName));

This code creates a Path object for the path given as the first command-line argument of my executable. It then selects all video files. After that, it creates directories that have the same names as each of the files, but without their extension. The result of that operation is the set of created directories. We can now get back to the previous set using the Previous method, and finally we can move each of the files in the set to the corresponding freshly created directory, whose name is the combination of the parent directory and the filename without extension.
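For comparison, here is roughly what the same tool looks like when written directly against System.IO. This is only a sketch to show the contrast, not code from the library: the helper name MoveVideosIntoOwnFolders is made up for illustration, and like the sample above it compares extensions case-sensitively.

```csharp
using System;
using System.IO;
using System.Linq;

// Usage: pass the folder to tidy up as the first command-line argument.
if (args.Length > 0)
{
    Console.WriteLine($"Moved {MoveVideosIntoOwnFolders(args[0])} file(s).");
}

// Hypothetical helper (not part of FluentPath): moves each video file in
// 'root' into a sibling folder named after the file, minus its extension.
static int MoveVideosIntoOwnFolders(string root)
{
    var videoExtensions = new[] {
        ".avi", ".m4v", ".wmv",
        ".mp4", ".dvr-ms", ".mpg", ".mkv"
    };
    var moved = 0;

    // GetFiles returns a snapshot array, so moving files while looping is safe.
    foreach (var file in Directory.GetFiles(root))
    {
        // Path.GetExtension includes the leading dot, e.g. ".avi".
        if (!videoExtensions.Contains(Path.GetExtension(file))) continue;

        var folder = Path.Combine(root, Path.GetFileNameWithoutExtension(file));
        Directory.CreateDirectory(folder);
        File.Move(file, Path.Combine(folder, Path.GetFileName(file)));
        moved++;
    }
    return moved;
}
```

Every step has to go through a different static class, and the paths are strings that get taken apart and recombined by hand, which is the awkwardness the fluent version is trying to hide.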

The new fluent path library covers a fair part of what’s in System.IO in a single, convenient API. Check it out, I hope you’ll enjoy it. Suggestions are more than welcome. For example, should I make this its own project on CodePlex or is this informal style just OK? Anything missing that you’d like to see? Is there a specific example you’d like to see expressed with the new API? Bugs?

The code can be downloaded from here (this is under MS-PL license):
http://weblogs.asp.net/blogs/bleroy/Samples/FluentPath.zip

UPDATE: updated the license; modified the sample code to use Contains; renamed and refactored Select to be overloads of GetFiles, GetDirectories and GetFileSystemEntries. I did not rename Select to Where because this is *not* filtering the current set, it is creating a new one from the contents of the current path.
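
To make that distinction concrete, here is a rough sketch in plain System.IO and LINQ rather than FluentPath itself. The folder and variable names are made up for illustration. Listing the contents of each path in a set produces a brand new set (closer to SelectMany than to Where), whereas filtering only ever returns a subset of the set you started with.

```csharp
using System;
using System.IO;
using System.Linq;

// Illustration only: a scratch folder with two subfolders, "videos" and "docs".
var root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
var videos = Directory.CreateDirectory(Path.Combine(root, "videos")).FullName;
var docs = Directory.CreateDirectory(Path.Combine(root, "docs")).FullName;
File.WriteAllText(Path.Combine(videos, "clip.avi"), "");
File.WriteAllText(Path.Combine(docs, "notes.txt"), "");

// The "current set" holds the two directories.
var current = new[] { videos, docs };

// GetFiles-style: build a NEW set from the contents of each path in the set.
var contents = current
    .SelectMany(dir => Directory.GetFiles(dir))
    .Where(file => Path.GetExtension(file) == ".avi")
    .ToArray();

// Where-style: keep a subset of the current set itself.
var filtered = current
    .Where(dir => Path.GetFileName(dir) == "docs")
    .ToArray();

Console.WriteLine(string.Join(", ", contents.Select(f => Path.GetFileName(f)))); // clip.avi
Console.WriteLine(string.Join(", ", filtered.Select(d => Path.GetFileName(d)))); // docs

// Clean up the scratch folder.
Directory.Delete(root, recursive: true);
```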

19 Comments

  • Please create a CodePlex project for it. It makes it easier to get the source.

  • Hello,

    This is great.
    I often use System.IO when managing files on a CMS in my MVC project.

    I think you should create a Codeplex project.
    I think it is easier for everybody to follow and contribute with suggestions, code, etc.

    Any plans for that?

    Cheers,
    Miguel

  • Codeplex. ALWAYS!

    Thanks, nice work.

  • That is so cool! I wonder how many other sets of APIs could benefit from a fluent API. Stream readers/writers?

  • Very cool, sure makes working with System.IO nicer and maintains continuity with the other code. Every time I need to work with System.IO, I feel like I am back coding in .NET 1.1.

    I would say toss it up on Codeplex, you will get more feedback if people don't always have to come back to this post and put up a comment.

  • This is seriously cool, working with System.IO is always such a beat down, nice work.

  • Nice.

    Any reason you called it .Select() instead of .Where() or perhaps Find()? Looks more like a LINQ Where() operation than a projection.

    Instead of throwing InvalidOperationException for null arguments, ArgumentNullException might be more appropriate.

    Also any reason for using HashSet() instead of List or even instead of IEnumerable with lazy execution?

    e.g.

    public PathCollection ChangeExtension(Func<Path, string> extensionTransformation)
    {
        var result = _paths.Select(path => new Path(path)).Select(p => p.ChangeExtension(extensionTransformation(p)).ToString());
        return new PathCollection(result, this);
    }


    When dealing with very large directories it would be nice if it didn't enumerate the whole structure unless it needs to.

  • I like Fluent Interfaces for the readability

    would it be possible to do the following for the code sample above?

    var exts = new List<string> {
        ".avi",
        ".m4v",
        // etc
    };

    Path.Get(args[0])
    .Select(p => exts.Contains(p.Extension));

  • Why not Ms-PL? BSD makes this hard on agile product teams.

  • Awesome! A Codeplex site would be nice!

  • Interesting and I hope it catches on fire in the community.

  • Like it a lot! (+1 for CodePlex too.)

  • Thanks for all the kind comments. I made some updates, more to come later.
    First, I'm actively looking into putting it on CodePlex but it's not going to be easy.
    I renamed Select, but not to Where, because that would indicate filtering the current set. I renamed and refactored it to be overloads of GetFiles, GetDirectories and GetFileSystemEntries, which is what it really is.
    I did rename the Filter method on the collection class to be Where though because that is much closer to the existing Where semantics.
    I updated the sample after Michael's suggestion.
    @Hightechrider: I'm using a HashSet because I don't want duplicates in the sets. Now any lazy version of this would require some more work. Maybe later.

  • I appreciate that the wrapper makes the System.IO methods more fluent. The problem that I have with System.IO is that it is very common to need to test code that serializes to disk. It is easy enough to create an interface so that these methods can be faked. Because your wrapper calls directly to the System.IO methods, it will make testing more difficult. Would love to see your wrapper use an instance of an ISystemIO interface instead of the real System.IO.

    I do appreciate the fluent nature of the work.

  • @Bob: totally agree :) A ready-made mock version is what I was thinking of.

  • Codeplex of course ! and nice job Bertrand :)

    A few comments:

    - I like the mix of multiple kinds of information in a single class: File, Directory, Attributes, etc.
    - A lot of static methods are wrapped in property getters (like CreationTime for example). This new property syntax is great but it gives me the feeling that the property is stored and will not change. Calling GetCreationTime() makes it clear that the value is retrieved each time. Maybe you should cache it or add a clearly visible Update feature. Personally I would prefer to have all the information retrieved once, during the constructor call for example, and then remain immutable.
    - Why a PathEnumerator class instead of using yield return ?
    - If I am right, the PathCollection is based on an IEnumerable and never stores "children" of the collection... Actually, if it's not a collection, maybe you should rename it. Why don't you write PathCollection methods like CreateDirectory() in a LINQ way? You are internally using a HashSet, which means your transformation result is stored in memory. If you used a LINQ query instead, and then passed it to create the new PathCollection(q, this), you would get real deferred execution of your pipeline, and that's what we expect when using your FluentPath syntax. (Well, that's what I thought it was :) )

  • @Mitsu: thanks for the excellent comments.
    The mix of multiple types under a single class is just monkey see monkey do from jQuery and the HTML DOM itself.
    I think I disagree about keeping what's under the properties immutable. Seems to me like that would be the most surprising of the two. Is there a rule that a property should return the same value if got twice without sets?
    I started with yield return but if you try it I think you'll see why it doesn't quite work with the fluent pattern. Didn't try that hard though.
    I agree I should rename as it's not a collection. Pure laziness on my part: I started building it as a collection and then walked back and didn't rename.
    Deferred execution would be really cool, I agree. Not something I'm very likely to have time for anytime soon though.

  • Cool !
    No, there's no rule on properties :)
    But if you look at the IO API returning FileInfo for example, we have loading methods and then classes that keep the data. File information is shared and can be accessed highly concurrently; that's also why, in this case, I prefer having a snapshot of it.
    Moreover, if the property is accessed many times just because of the code or maybe the data binding of a list, you could have some data refreshed while scrolling the list. You can consider it to be an advantage..or not.

  • @Mitsu: fair point about concurrency. Maybe you could address both scenarios by having a lock method of sorts on the path class. Or yeah, maybe you're right and it should just be purely immutable from the start. Yeah, I'm beginning to think you're right.
    Actually, adding file monitoring events might be an interesting extension to the API, which could be a solution to the rare cases where you want the info to remain accurate.
