Hierarchical Linq Queries

After reading Bill Wagner's excellent introduction to Linq, I really liked what I saw and decided that I wanted to create an extension method as an exercise. I sat and thought what I would have wanted from Linq and decided I wanted to write a method that lets you select hierarchical data.

Here's a mockup of how a query would look like:
from   node in tree.ByHierarchy<HierarchicalData>(startWith, connectBy)
select node;
Each 'node' in the query will be an object holding the original item, its level of depth in the hierarchy and its parent (for convenience's sake):
public class Node<T>
{
    public int Level;
    public Node<T> Parent;
    public T Item;
}
This node will be used to determine what the item is and where it's located.
Let's take an example:
Tree t7 = new Tree { Value = 7 };
Tree t6 = new Tree { Value = 6 };
Tree t5 = new Tree { Value = 5, Left = t6, Right = t7 };
Tree t4 = new Tree { Value = 4 };
Tree t3 = new Tree { Value = 3 };
Tree t2 = new Tree { Value = 2, Left = t3, Right = t4 };
Tree t1 = new Tree { Value = 1, Left = t2, Right = t5 };
 
Tree[] treeNodes = new Tree[] { t1, t2, t3, t4, t5, t6, t7 };
 
var nodes = from   node in treeNodes.ByHierarchy<Tree>(
                               t => t == t1,
                               (parent, child) => (parent.Left == child) ||
                                                  (parent.Right == child))
            select node;
 
foreach (LinqExtensions.Node<Tree> n in nodes)
{
    for (int i = 0; i < n.Level; i++) Console.Write(' ');
 
    Console.WriteLine("Node {0}, child of {1}.",
                      n.Item.Value,
                      (n.Parent != null ? n.Parent.Item.Value.ToString() : "no one"));
}
This example will walk on the tree (which so happens to be a three-tiered complete binary tree) and print it in the order of hierarchy:
Node 1, child of no one.
Node 2, child of 1.
Node 3, child of 2.
Node 4, child of 2.
Node 5, child of 1.
Node 6, child of 5.
Node 7, child of 5.
And here's the code that does this:
using System;
using System.Collections.Generic;
using System.Query;

namespace DotNetZen.Linq
{
    public static partial class LinqExtensions
    {
        public class Node<T>
        {
            internal Node()
            {
            }
 
            public int Level;
            public Node<T> Parent;
            public T Item;
        }
 
        public static IEnumerable<Node<T>> ByHierarchy<T>(
            this IEnumerable<T> source,
            Func<T, bool> startWith,
            Func<T, T, bool> connectBy)
        {
            return source.ByHierarchy<T>(startWith, connectBy, null);
        }

        
        private static IEnumerable<Node<T>> ByHierarchy<T>(
            this IEnumerable<T> source,
            Func<T, bool> startWith,
            Func<T, T, bool> connectBy,
            Node<T> parent)
        {
            int level = (parent == null ? 0 : parent.Level + 1);
 
            if (source == null)
                throw new ArgumentNullException("source");
 
            if (startWith == null)
                throw new ArgumentNullException("startWith");
 
            if (connectBy == null)
                throw new ArgumentNullException("connectBy");
 
            foreach (T value in from   item in source
                                where  startWith(item)
                                select item)
            {
                Node<T> newNode = new Node<T> { Level = level, Parent = parent, Item = value };
 
                yield return newNode;
 
                foreach (Node<T> subNode in source.ByHierarchy<T>(possibleSub => connectBy(value, possibleSub),
                                            connectBy, newNode))
                {
                    yield return subNode;
                }
            }
        }
    }
}

2 Comments

  • Looks very nice, Way to go omer!
    This is a very creative solution to a very old problem. Nice.

    Quick notes.
    I think that this line:
    return source.ByHierarchy(startWith, connectBy, null);
    should be:
    return ByHierarchy(source, startWith, connectBy, null);

    The fact that you yield the results in the middle cause what's called an inner scan of this tree (or in hebrew: סריקה תוכית). you might want to consider how this manages with final and first scans (or in hebrew: סריקה תחילית וסריקה סופית).

  • The lines are actually the same, since the extension method will end up looking like you suggested, only in the resulting IL.

    I didn't really intend on creating a prefix, postfix or infix scan, since this isn't aimed at trees (I only used a binary tree because it's one of the simplest examples of hierarchy), but rather at hierarchial relational entities.

    Actually, it could be nice to try creating those types of scans, but since there's no default implementation of a binary tree, I'll have to resort to a lot of delegates, which will kind of make it uglier.
    Perhaps the Power Collections bits would prove a good default.

Comments have been disabled for this content.