Deriving from a fluent class

Here’s an interesting one, and maybe you can help me make my design less crappy. I have this library that I’m a little proud of, called FluentPath. It’s a fluent wrapper around System.IO that enables you to manipulate files not unlike how jQuery manipulates the DOM, with operations over sets, and lots of functional concepts:

Path.Get(args.Length != 0 ? args[0] : ".")
    .Files(
        p => new[] {
            ".avi", ".m4v", ".wmv",
            ".mp4", ".dvr-ms", ".mpg", ".mkv"
        }.Contains(p.Extension))
    .CreateDirectories(
        p => p.Parent()
              .Combine(p.FileNameWithoutExtension))
    .End()
    .Move(
        p => p.Parent()
              .Combine(p.FileNameWithoutExtension)
              .Combine(p.FileName));

In fact, it’s totally inspired by jQuery, and was born from the frustration of having to work with the antiquated monstrosity of an API that System.IO is.

One user of the library wanted to extend it. The way I had designed things, extensions are made through extension methods. One example is available in the source code that adds zip/unzip capabilities.

This user however wanted to derive from the Path class. I’m not a fan of inheritance, but who am I to judge? The problem comes from the way fluent APIs work: their methods return an instance of the same class, in order to enable chaining. For example, here’s the signature of the ChangeExtension method:

public Path ChangeExtension(Func<Path, string> extensionTransformation)

If you want to derive DerivedPath from this class, one big problem you’re going to have is that this method will return a Path, not a DerivedPath, so your extensions won’t be available on the results of chainable methods. This, for example, won’t work (DoStuff is a method on DerivedPath that doesn’t exist on the Path that Parent() returns):

new DerivedPath("foo").Parent().DoStuff();

Whereas this works just fine:

new DerivedPath("foo").DoStuff();

That’s pretty lame and awkward. So how do we ensure that derived classes’ methods vary their parameters and return types so their own types are used in place of the base Path class?

The solution I implemented and checked in was suggested by PlasmaSoft, the user who exposed the problem in the first place: change Path into a base generic class, then have the generic type parameter be an alias of sorts for the class itself that we can use as a parameter or return type.

Here is what the class declaration for the new base class looks like:

public class PathBase<T> : IEnumerable<T> where T : PathBase<T>, new()

Method declarations have been changed to use the T generic parameter in place of Path for almost the whole public surface of the API. For example, ChangeExtension becomes:

public T ChangeExtension(Func<T, string> extensionTransformation)

The Path class itself has been changed into a derived class of that base class, and sealed:

public sealed class Path : PathBase<Path>

If you want to derive from Path, well, don’t. Instead, derive from PathBase:

public class DerivedPath : PathBase<DerivedPath>
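
To make the shape of the pattern concrete, here is a minimal sketch. It is not FluentPath's actual code: the single string Value, the From factory, the Parent implementation and DoStuff are all simplifying assumptions made for illustration.

```csharp
using System;

// Simplified stand-in for the generic base class. A single string replaces
// the real class's set of paths (an assumption made to keep the sketch short).
public class PathBase<T> where T : PathBase<T>, new()
{
    public string Value { get; private set; } = ".";

    // Hypothetical factory, needed because new T() cannot take arguments.
    public static T From(string value)
    {
        var result = new T();
        result.Value = value;
        return result;
    }

    // Returns T rather than PathBase<T>, so the caller keeps its own type.
    public T Parent()
    {
        return From(System.IO.Path.GetDirectoryName(Value) ?? "");
    }
}

public class DerivedPath : PathBase<DerivedPath>
{
    // Hypothetical method that only exists on the derived type.
    public DerivedPath DoStuff()
    {
        Console.WriteLine("Doing stuff with " + Value);
        return this;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Parent() is typed as DerivedPath here, so DoStuff() is available.
        DerivedPath.From("foo/bar").Parent().DoStuff(); // prints "Doing stuff with foo"
    }
}
```

Because T stands for the derived type, the chain never falls back to the base type, which is exactly what the non-generic Path could not offer.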

I really don’t like how this looks (convoluted, redundant), but I don’t see another solution. There will be a few awkward places as well. For example, some methods can take an arbitrary derived type for a parameter, but what the return type of the method should be is not always clear: should it be the type of “this”, or the type of the parameter?

public T Combine<TOther>(PathBase<TOther> relativePath) where TOther: PathBase<TOther>, new()

Another problem is that the compiler gets a little confused and doesn’t know that the current class and T are one and the same. There are many places where we’ll want to instantiate a new PathBase<T>, and return it as a T. We can’t just cast the PathBase<T> after instantiation, because as far as the compiler and runtime are concerned, T derives from PathBase<T>, not the other way around, so we need a way to instantiate a T directly (and that instance will behave as a PathBase<T>). All the compiler knows about T’s constructors is provided by the new() constraint: it has a parameterless constructor, which is not the one we need. In order to work around that, I had to implement a number of Create factory methods and use them internally where I would normally have used a regular constructor. This bled into the quality of the implementation as I had to remove “readonly” qualifiers from private fields for entirely technical reasons. I hate when the language doesn’t let you do the right thing (or doesn’t steer you to it).
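
The constructor problem can be sketched roughly as follows. The _value field, the Create signature and the single-string state are assumptions made for illustration, not the library's actual members.

```csharp
using System;

public class PathBase<T> where T : PathBase<T>, new()
{
    // Would ideally be readonly, but Create has to assign it after new T().
    private string _value = ".";

    public PathBase() { }

    public PathBase(string value) { _value = value; }

    // Hypothetical factory used internally in place of a constructor call.
    protected static T Create(string value)
    {
        // return new T(value);              // does not compile: new() only allows new T()
        // return (T)new PathBase<T>(value); // compiles, but throws InvalidCastException at run time
        var result = new T();
        result._value = value;
        return result;
    }

    public T ChangeExtension(Func<T, string> extensionTransformation)
    {
        // The cast is valid when this instance really is a T, as the pattern intends.
        var newExtension = extensionTransformation((T)this);
        return Create(System.IO.Path.ChangeExtension(_value, newExtension));
    }

    public override string ToString() { return _value; }
}

public class DerivedPath : PathBase<DerivedPath>
{
    public DerivedPath() { }                           // required by the new() constraint
    public DerivedPath(string value) : base(value) { } // does nothing but delegate to base
}

public static class Demo
{
    public static void Main()
    {
        DerivedPath renamed = new DerivedPath("video.avi").ChangeExtension(p => ".mkv");
        Console.WriteLine(renamed); // prints "video.mkv"
    }
}
```

The field can no longer be readonly because it is assigned outside of a constructor, which is the compromise described above.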

Finally, derived classes have to implement a bunch of constructors and cast operators that do nothing but delegate to base.

In the end, everything behaves as it should, all tests pass, and we have the additional derivation feature, but I’m really unhappy about the hoops I had to jump through, and about the disagreeable form of derived classes. I prefer the extension method way of extending the library even more than I did before today.

I do have a question for you, my readers, however: can any of you think of or point me to a better way of writing base fluent classes?

22 Comments

  • You could put this responsibility on those who are extending your API, which is where it really belongs. To do this, they could change the DerivedPath class to return cast versions of their own class instead of the base class of your original API. For example, if this is your base Path class:

    public class Path
    {
        public Path Parent()
        {
            // do some stuff
        }
    }

    The DerivedPath class would look like this:

    public class DerivedPath : Path
    {
        public DerivedPath DoStuff()
        {
            // do some other stuff
        }

        public new DerivedPath Parent()
        {
            // The cast only succeeds if base.Parent() actually returns a DerivedPath instance.
            return (DerivedPath)base.Parent();
        }
    }

    The trick is to use the "new" keyword to hide the API methods with methods that have the same signature (except for the return type, of course).

  • @Ivan: thanks for the feedback. Yes, this is a possibility that I should have mentioned in the article. While it's the standard way of doing this in .NET, in this particular case, it would put an excessive burden on the derived class author. This class is sort of a monster (which is a problem in itself), and doing so would force any author to write dozens of new method wrappers, just to get the privilege of writing an additional method. The signal to noise ratio in those classes would be vanishingly small. This, plus I must not be the only one to feel dirty every time he uses "new" to hide a base method. As Eric Lippert said (http://blogs.msdn.com/b/ericlippert/archive/2008/05/21/method-hiding-apologia.aspx) this would probably be best solved with a derived type covariance feature baked into the language (maybe some This keyword could work to represent the current type), but we don't have that at the moment.

  • The solution you've described is called the curiously recurring template pattern. It lets you survive one level of inheritance, but for each additional level of inheritance that defines extra methods that need to be inherited, you need an additional type parameter. This obviously gets unwieldy quite quickly.

  • @Arthur: yup, you're exactly right. The additional template parameter per inheritance level is an excellent argument against this pattern. I don't see a good solution except with additional language features. I will continue to recommend extension methods to extend the API.

  • I had the exact same situation with my fluent interface and solved it in a similar but simpler way. I didn't need the PathBase class at all. I allowed the Path class to be inherited and changed the public Api to use a generic inheriting from Path

    public T ChangeExtension<T>(Func<T, string> extensionTransformation) where T : Path

    You can have a look at my Fluent Interface in practice at http://navigation.codeplex.com/SourceControl/latest#NavigationSample/App_Start/FluentStateInfoConfig.cs
    If you want any more details let me know. I kept it short in case I'd got the wrong end of the stick.

  • @Graham: so T is still a synonym for Path unless you're using a derived class. I don't like that the public API is exposing such an implementation detail, and a very confusing one at that. This is why I provided a sealed Path class that doesn't have a type parameter. The new class is 100% compatible with the previous version, but there's a technical base class that you only have to know about if you insist on creating a derived type.

  • I didn't realise your ChangeExtension method was on the base class. In my Fluent interface there aren't any methods on the Path class and there's no base class. All the public Api is done through extension methods. I should have written

    public static T ChangeExtension<T>(this T path, Func<T, string> extensionTransformation) where T : Path

    The consumer never knows about the T. They just create a new Path() and then Path flows through the public Api. Even with the DerivedPath class the public Api is still enhanced using extension methods.

    public static T DerivedMethod<T>(this T path, string param) where T : DerivedPath

  • So are the extension methods an option? For example:

    namespace The.SameNamespace.As.Path;

    public static class DerivedPath
    {
        public static Path DoStuff(this Path p)
        {
            // do some stuff

            return p;
        }
    }

  • @Graham: that's interesting, but putting all the implementation on extension methods seems to negate the whole idea of inheritance: can derived classes override extension methods? What happens if the user code doesn't include the implementation namespace? Where is the contract? And I'm sure it calls for a lot of other questions. But that's an interesting and unexpected take for sure.

    @Eric: yes, that's how I recommend you extend the API, except that you wouldn't call that DerivedPath, you'd call it SomethingPathExtensions. I have a set of Zip compression-related extensions as a sample in the source code if you want to take a look at that.

  • I've run into this same situation a few times myself and regretted design decisions that involved inheritance using generic template parameters. It gets unwieldy very quickly, especially if you want to have custom constructors or if you end up needing to inherit multiple levels.

    See this old post of mine: http://weblog.west-wind.com/posts/2009/Aug/18/Generic-Types-and-Inheritance

    A single template parameter can be OK, but more than one and the whole thing comes apart very quickly. I think for what you have here with your fluent path, extension methods are the ideal way to do this, but it doesn't work for everything - especially if you need more complex structures that require properties.

  • It's based on one of the best Fluent Interfaces, Linq. I put the extension methods in the same namespace as Path for simplicity. You do get behavioural inheritance and if you want to allow overriding you could still have virtual methods on the Path class that the extension methods call into.

  • Thanks @Rick, that's very helpful.

    @Graham: good point about LINQ, but I'm not sure how to use the comparison for FluentPath. In particular, I'm not sure where you'd want to use inheritance in LINQ. Another difference is that LINQ is more about having many different implementations of the same interfaces. In the case of FluentPath, the implementation is mostly fixed: it's System.IO, and extensions are more about extending the interface than about creating different implementations of the same interface. There could be a category of extensions that would swap the file system implementation, of course, but that's not what we're talking about here, I think.

  • I'm not trying to compare Linq and FluentPath, just suggesting that the lingua franca of Fluent Apis is extension methods. They allow type information to flow more fluently than do methods on a base class. Give it a go and see if you warm to it. It should eradicate those convoluted situations you mention and would solve the initial problem more elegantly with a DerivedPath inheriting from Path and just the following extension method added

    public static T DoStuff<T>(this T path) where T : DerivedPath

  • @Graham: thanks for the explanation. But you'd still have to declare DerivedPath as a generic class with a new generic parameter that's not on the base class (and that for each inheritance level)? This is actually reinforcing my initial impression, which was that the extension mechanism should be exclusively extension methods, and that I should just seal the class. I'm not sure what scenario justifies using inheritance.

  • Graham correctly points out the only reasonable solution: put all actual functionality in extension methods and have a teensy interface that a class can provide that the actual functionality uses. So in this case you'd have something like:

    public interface IPath { }

    public static class FluentPath {
        public static T DoStuff<T>(this T path) where T : IPath { /* ... */ return path; }
    }

    public class Path : IPath { }

    public class DerivedPath : Path { }

    public static class DerivedPathOverrides {
        public static T DoStuff<T>(this T derivedPath) where T : DerivedPath { /* ... */ return derivedPath; }
    }

    This pulls apart the methods and the classes, much as in the typeclasses in Haskell, which is where LINQ got at least some of its inspiration from.

  • I forked your project to give you a better idea of what I'm driving at, https://fluentpath.codeplex.com/SourceControl/network/forks/GrahamMendick/FluentDerived.
    I deleted the PathBase and moved the code back into Path. I've only changed the Combine method but similar work could be done on the others.

    You'll see the Combine(params string[] pathTokens) method of Path doesn't create or return a Path anymore, instead it returns IEnumerable<string>. This is an important step in a Fluent api that wants to support derived classes.

    But if you don't create a new Path in Combine how is Previous going to work? The trick is to store the state in the Path object and to use the same Path throughout the Fluent api calls. I've added a Stack of paths property to Path to keep track of the 'history'. You can see how this is used in the Combine extension method in PathExtensions. When Previous is called you can Pop the paths off the Stack and set them into the Paths property.

    You can see the end result in the DerivedPathWorks Test method. You've got a DerivedPath inheriting from Path, no generics in sight, and it's using the new Combine extension method.

  • @Arthur & @Graham: thanks a lot, this is extremely helpful. It's incredibly nice of you to spend all this time spelling it out for me. I'm not sure I understand the point of returning IEnumerable<string> instead of Path: I don't see in what sense the API remains fluent if we do this.

    But the more important question I have at this point is: why would I support derivation in the first place? It seems to me like all useful scenarios are efficiently covered by extension methods.

    In any case, I have a major re-refactoring ahead of me to apply some of the great suggestions that have been given in this thread. Thanks again to all who pitched in.

  • As soon as you create a new Path object in your Fluent Api then you've immediately lost the starting type. If the consumer started with a DerivedPath then you've lost that information as soon as you return a Path from any of your methods.
    Derivation is important in a Fluent Api because some extension methods might be relevant for a DerivedPath but not for a Path. Without supporting derivation you can't achieve this separation.
    Anyway, I don't want to hammer the point home and I'm happy you've got a way forward.


  • Right, but why would I want a DerivedPath in the first place, in the specific case of FluentPath?

  • I guess you should look back at the original Issue raised. That was where it all started, right?

  • That's correct, and I did ask the user that question. I have yet to receive an answer.

  • And if you're not going to support DerivedPath, why do you need to refactor at all? Just revert your recent change that added the PathBase.
