Recursing into Recursion - Memoization
Let's catch up to where we are today:
- Part 1 - Basic Recursion Techniques
- Part 2 - Linear, Tail, and Binary Recursion
- Part 3 - Recursion on Lists
Memoization
When doing heavy computations through recursion, memoization becomes a pretty important topic. The idea behind memoization is to speed up your programs by avoiding repetitive calculations for previously processed function inputs. Let's go ahead and try a simple example using the ever so trite Fibonacci sequence as a quick example.
F#C#
{
return n <= 2 ? 1 : Fibonacci(n - 2) + Fibonacci(n - 1);
}
But, if you call these above functions time and time again with nearly the same input, they have to go through the same computation again and again, which isn't necessary. So, instead, let's try through the technique of memoization to speed up this computation.
F#
let t = new Dictionary<_,_>()
let rec fibCached n =
if t.ContainsKey(n) then t.[n]
elif n <= 2 then 1
else
let res = fibCached(n - 2) + fibCached(n - 1)
t.Add(n, res)
res
fun n -> fibCached n
C#
{
var t = new Dictionary<int, int>();
Func<int, int> fibCached = null;
fibCached = n =>
{
if (t.ContainsKey(n)) return t[n];
if (n <= 2) return 1;
var result = fibCached(n - 2) + fibCached(n - 1);
t.Add(n, result);
return result;
};
return n => fibCached(n);
}
As you can see from the above code, we're using a Dictionary<TKey, TValue> as our storage for already computed values. Then we go ahead and compute the values if we have not seen them before. We'll see a pretty impressive speed boost when calling with repetitive values. But, that's not a generic solution, and I'd rather have the ability to take a generic function and pass any function to this and compute. So, let's go ahead and make this generic.
F#
let t = new System.Collections.Generic.Dictionary<'a, 'b>()
fun n ->
if t.ContainsKey(n) then t.[n]
else
let res = f n
t.Add(n, res)
res
let rec fibonacci=
memoize(fun n -> if n <= 2 then 1 else fibonacci(n - 1) + fibonacci(n - 2))
C#
{
var t = new Dictionary<T, TResult>();
return n =>
{
if (t.ContainsKey(n)) return t[n];
var result = func(n);
t.Add(n, result);
return result;
};
}
static int Fibonacci(int n)
{
return n <= 2 ? 1 : Fibonacci(n - 2) + Fibonacci(n - 1);
}
var fibonacciMemoized = fibonacci.Memoize();
var fibFast1 = fibonacciMemoized(30);
As you can see, I was able to create a generic memoize function that takes a function as its input, checks whether the item is in the cache and if not, adds it to the cache. Using the C# version, we created an extension method for our given Func<TArg, TResult>. This allows us to memoize the function correctly. Alternatively, I could have written the memoized function this way:
C#
{
return Extensions.Memoize((int n) => n <= 2 ? 1 : Fibonacci(n - 2) + Fibonacci(n - 1));
}
var fibFast1 = fibonacci(30);
To me, this reads a bit more of functional programming and the true power of it. I basically took the above methods and combined them so that my Fibonacci function now returns the function instead taking an integer and returning an integer.
Beware of the Bad Memoization
In order to get memoization to work properly, you must remember not to include the value to the function when declaring the function. This will cause a fresh memoization table to be created each and every time. Such functions such as these should be avoided as below:
F#
memoize (fun n -> if n <= 2 then 1 else fibonacciIncorrect(n - 2) + fibonacciIncorrect(n - 1)) n
C#
{
return Extensions.Memoize((int x) => x <= 2 ? 1 : FibonacciIncorrect(n - 2) + FibonacciIncorrect(n - 1))(n);
}
Generic Memoization Service
Due to the subtlety of problems mentioned above, it may be better to return an object instead of a function. These can be pretty simple objects with no more functionality than to return values and discard the given collection. Let's go ahead and do this sample in F# first. What I'm going to do is define the holder type for the calculated value:
inherit System.IDisposable
abstract Item : 'a -> 'b with get
This defines for me an interface with a Discard method and an Item method which accepts 'a and returns 'b via a get accessor. Pretty simple holder for my data. Now, let's define the memoize function which does the work:
let outerTable = new Dictionary<_,_>()
{ new ITable<_,_> with
member t.Item
with get(n) =
if outerTable.ContainsKey(n) then outerTable.[n]
else
let res = f n
outerTable.Add(n, res)
res
member t.Dispose() =
outerTable.Clear()
}
What you can see from the above is that I'm creating an inner class of type ITable which I defined above. Then I implement the given methods and properties I defined on the interface for Item and Discard(). During the Item property, I'm checking whether I have the given item. If I do, I return it from the collection, else I execute the given function and store the result. During my discard, I give myself the ability to reset the cached values. Now, let's implement the Fibonacci sequence function once again with our new function that we defined above.
memoize
(fun n ->
if n <= 2 then 1 else fibonacci.[n - 1] + fibonacci.[n - 2])
Now with this function, I can execute the given methods directly on the class that was returned from the memoize function. As you can see, I am getting values from the fibonacci object and calling the function once again on itself. I can then calculate the given numbers while using this code below:
printfn "%i" result
That's pretty slick, isn't it? Now, let's try in C#. It's not going to be as elegant nor to the point. But, let's define our Table class first.
{
private readonly Dictionary<T, U> dictionary;
private readonly Func<T, U> func;
public Table(Dictionary<T, U> dictionary, Func<T, U> func)
{
this.dictionary = dictionary;
this.func = func;
}
public U this[T n]
{
get
{
if (dictionary.ContainsKey(n))
return dictionary[n];
var result = func(n);
dictionary.Add(n, result);
return result;
}
}
public void Dispose()
{
dictionary.Clear();
}
}
Now that this is defined, then the rest of our methods should come nicely as noted here.
{
return new Table<T, U>(new Dictionary<T, U>(), func);
}
{
get
{
return Memoize((int n) => n <= 2 ? 1 : Fibonacci[n - 2] + Fibonacci[n - 1]);
}
}
var fibFast1 = fibonacci[30];
Not as elegant in the F# version because C# won't let me create an anonymous inner class as F# does. If you look around the F# codebase, you'll find them using this plenty to define such things that return IEnumerable<T> for Map and so on. Not the prettiest solution in C#, as you can well imagine.
Some Notes on Memoization
Don't forget that proper memoization requires the functions that you are writing to be pure. What that means is that given the same input, the same output will always be calculated and returned. Also note that none of the code I wrote was particularly thread safe, so I would definitely recommend if you were going to worry about concurrency here with our shared mutable state, that you would use the appropriate locks and so on.
Wrapping It Up
Now I was able to show you how to speed up some of those recursive algorithms using memoization. This allows for a bit better efficiency when doing large and pure calculations. Next time, I'm going to start going back to my concurrency topics as I have a lot yet untouched. With my recent presentations on the matter, there is plenty to talk about.
/script>/script>/script>/script>/script>/script>/script>