Let’s review the bookends of our so-far successful foray into recursion elimination. The general theme has been replacement of recursive calls by self-application of functions adhering to recursive types. (Now, I think we can even eliminate the recursive types, but we still have a way to go before that.) The first, successful recursive type was
Delegate Function dg(ByVal g As dg, ByVal n As Long) As Long
But we didn’t have closures and currying, so we had to keep an ugly argument slot open for the self-applied function. ‘Inventing’ closures and currying led us to invent a pair of function types:
Delegate Function df(ByVal n As Long) As Long
Delegate Function dh(ByVal h As dh) As df
Along the way, we gained something, but hardly noticed it, and then we lost it, again hardly noticing it. In this installment, we are going to get it back permanently. But what was it? Remember, at the very beginning, we wanted to write
Function fact(f As df, ByVal n As Long) As Long
If n = 0 Then : Return 1
Else : Return n * f(n - 1)
End If
End Function
but we were forced to write the following because we couldn’t get the declarations and the types uncoiled:
Function preFac(ByVal g As dg, ByVal n As Long) As Long
If n = 0 Then : Return 1
Else : Return n * g(g, n - 1)
End If
End Function
So, we channeled closures and currying to get rid of the open argument slot for g, but lost track of a primary wish: to have the function call in the final position NOT exhibit self application so blatantly. Brings to mind gagging on lukewarm bouillabaisse. While building up closures, we drove right by this little roadside poissonnerie:
Function lambdaF(ByVal n As Long) As Long
If n = 0 Then : Return 1
Else : Return n * f(n - 1)
End If
End Function
because this was only half of a real closure, of course. While it got the location of f right, its value didn’t come from the ambient environment. Instead, it was illicitly loaded at creation time, using secret knowledge, the AddressOf lambdaF itself. So, we got the hiding-of-recursion-through-one-level-of-indirection we wanted, but at the cost of defining lambdaF with reference to its own definition. We were trying to escape exactly that with the whole exercise! So we moved down the road to
Delegate Function dh(ByVal h As dh) As df
Private h As dh
Private Function lambdaF(ByVal n As Long) As Long
If n = 0 Then : Return 1
Else : Return n * h(h)(n - 1)
End If
End Function
Public Function lambdaH(ByVal adh As dh) As df
Me.h = adh
Return AddressOf lambdaF
End Function
Now, this new lambdaF doesn’t refer to itself in any way. One more level of indirection fixed it: the creator is responsible for loading the value of h from the ambient environment, where it can OPTIONALLY refer to h itself. The restoration of that option legitimizes the self-application:
Function fact3(ByVal n As Long) As Long
Dim acf3 = New cf3
Return acf3.lambdaH(AddressOf acf3.lambdaH)(n)
End Function
We made it! A place that fixes bouillabaisse, but, alas, it’s lukewarm again. LambdaF is ugly because it must blatantly repeat the self-application trick again, via h(h). At least there is enough curry in the soup to prevent gagging. Or is there?
Notching up the abstraction of the discourse a tiny bit, we keep trying to write something like this
fact = f(f)
and getting our knickers in a twist because f has to do something unnatural for fact. fact wants to talk about numbers and f has to keep talking about functions, sometimes functions that take functions as arguments, and sometimes functions that take numbers as arguments. Let’s just back off and suppose we could pass fact itself to f:
fact = f(fact)
Well, this would do the trick, wouldn’t it? Now, fact could just call itself in the final position, legitimately divorced from self-definition, without carrying around the ugly divorce decree in plain sight as blatant self-application of a free variable.
This is a fixed-point equation: it says that the fact we really want is a fixed point of f, meaning, a ‘point’ in the domain of f that maps to itself. If we were dealing with ordinary function graphs from school math, like y = f(x), such an idea wouldn’t scare us at all. In fact, for many functions, it’s easy to solve such fixed-point problems. Consider the linear function
y = m x + b
which has the fixed-point solution
y = f(y) = b / (1 – m)
In other words, there is a unique fixed point unless m = 1, in which case either no points are fixed points, if b ≠ 0; or all points are fixed points, if b = 0.
But in our world of functions, we’re definitely feeling queasy. We’ve left the secure, dry land of functions on numbers and are pitching along the rolling seas of functions of functions. But there is much adventure and treasure ahead, even if only a properly cooked bouillabaisse.
We have only f to work with. Think, think, think. Ok (flash of genius), we’re already on the high seas of higher-order functions, so, maybe we can imagineer yet-an0ther-function, call it y, that can convert f into fact. We would get the mind-twisting equation:
fact = y(f) = f(y(f)) = f(fact)
Does this have a solution for y (we already know what f is)? Think, think, think. Ok (flash of genius), maybe self-application could help. Think, think, think. Maybe y(f) equals some other function applied to itself. Ok (rolling thunderstorm of genius), how about:
let g(x) = f(x(x)) for any x (notice the free variable, f)
let y(f) = g(g), which equals f(g(g)) = f(y(f))
By gum, we’ve got it! A y that converts any function f into its fixed point, assuming that the fixed point exists. The condition for existence seems to be just that f take a function as its first argument, and we can curry off the rest of the arguments. Of course, this is the famous Y combinator and all that genius is borrowed from Haskell Curry himself.
There is one more little thing we must do before we boil this down into concrete realization in VB. If you actually called y(f) above, it would call g(g), which would call f(g(g)), which would call g(g), and the whole thing would diverge till we get a stack overflow. We must make sure that the g(g) in f(g(g)) is not called yet, so we replace the argument of f with a function of one variable that applies g(g) to that variable. Mathematically, a function of one variable, say j(a), that applies another function to that variable, as in k(a), is just plain equal to that function. In other words j = k, so the equations above hold: a function of a that applies g(g) to a is just g(g), so is mathematically equal to f(g(g)), just as before (this is η conversion). It just delays the evaluation one step: in the argument-position of f, it evaluates to a function that will apply g(g) to its argument later. So we don’t get a runaway loop unless the underlying function, f, has one in it. Here are a bunch of articles on this technique.
The current notation for discussing the mathematical meaning of all this function hacking is very clumsy, however, and gets in the way of understanding, and it will make it REALLY difficult to understand the upcoming VB code. It forces us to name every function, and the whole point of the project-at-large is to avoid naming functions by writing lambdas instead. We’ve got to graduate to the real thing. So here goes:
λ x → Q
Read this as “lambda of x returning Q.” It means the function of x that does (or returns) Q, where Q stands for anything that may depend on x and upon other variables. This is abstraction of x. If x were a free variable in Q, then it is bound in λ x → Q (reminder, a free variable is any variable that is not a function argument at any level of nesting in an expression) We can give this function a name as follows:
P = λ x → Q
This associates to the right, so that
Q → R → S = (Q → (R → S))
(if we think about it, it really only can be read that way). Abstraction relates back to the traditional notation as follows:
P(x) = Q
In the lambda notation, we do NOT need to write the parentheses around the function arguments and this is very helpful in more involved expressions. So, the above is the same as this:
P x = Q
This is function application, and it associates to the left, so
f g h = ((f g) h) function f applied to g, and the result applied to h
To evaluate a function (called β reduction), we substitute an actual argument for a variable,
(λ x → Q) R ~> Q[x ← R] if all free variables in R remain free
Read this as “lambda of x returning Q applied to R becomes Q with R replacing x everywhere.” We can avoid ‘capturing’ free variables in R by just renaming any bound variables in Q that collide with them. See the wiki entry for more.
In the new notation, write the first version of y as
y f = [λ x → f (x x)] [λ x → f (x x)]
Read this as “y of f equals the function of x that returns f applied to x of x, applied to (a copy of) that same function.” Notice that we explicitly overrode the left-associativity in f(x x); we don’t want (f x) x— “f of x applied to x” —which would be the interpretation of just f x x. To evaluate y f, just follow β reduction, replacing x in f(x x) with λx→f(x x), resulting in
f ( [λ x → f (x x)] [λ x → f (x x)] ) = f (y f)
just as required. Doing the η conversion to delay evaluation changes nothing:
y f = [λ x → f (λ a → x x a)] [λ x → f (λ a → x x a)]
because (λ a → x x a) = x x, mathematically.
I’m just warming you up with this notation, because I’m going to put it in the comments when I write the VB version of this, next installment.