Much Ado About Monads – State Edition

Over the past year, I’ve had the opportunity to be a technical reviewer of a number of books including Chris Smith’s Programming F# book.  During my recent trips, I had the chance to sit down and read through the book once again, and in particular Chapter 10 on Computation Expressions (aka Monads).  A section worth noting in here is the use of the State Monad in order to create a library for web scripting.  In the past, I’ve had a series on Much Ado About Monads where I look at the basic Monads such as Maybe and List, but this time, let’s look at what we can do with the State Monad.

Keeping Track of State

Before we get started looking at the state monad and what it entails, let’s look at the overall goal.  What are we trying to solve?  Taking the example of the web scripting DSL, as we perform actions such as clicking a button, we change the state of the system and this state is returned to use each and every call.  This state then needs to passed along to the next step for processing.  This state becomes cumbersome as we have to manage it for each and every call in our function.  What we end up with is a messy API that looks like this:

[<Fact>]
let ``Can find CodeBetter on Bing``() =
  let state1 = openPage "http://bing.com"
  let state2 = enterText "q" "CodeBetter" state1
  let state3 = clickButton "go" state2
  let (result, state4) = containsText "CodeBetter" state3
  isTrue result
  let state5 = closePage state4 

But what I really want is this state to be managed for me underneath the covers so that I don’t have to be reminded of it every call I make.  In this example, we’ll manage our browser object underneath the covers, so that any time we have a call with our bang ( ! ) notation, we’re really invoking some function on our state.

[<Fact>]
let ``Can find CodeBetter on Bing``() =
  runScript ( 
    state {
      do! openPage "http://bing.com"
      do! enterText "q" "CodeBetter"
      do! clickButton "go"
      let! result = containsText "CodeBetter"
      isTrue result
      do! closePage })

In the book, Chris used the State Monad to model this state.  When we say State Monad, what we really mean is an abstraction for a function which takes a state and returns an immediate value and some new state value.  In this case, each action we perform such as entering text or clicking a button would alter our state object and return a new one which must be persisted for the next call.. 

Now that we defined the problem and generally speaking what the State Monad is, let’s go ahead and define it.  To do this, we’ll need an overarching container to describe our state.  In this case, we have the State<’a, ‘s> type which the ‘a is our result type and the ‘s is our state type.  This State constructor takes a function which takes a state parameter and returns a tuple of our result and our modified state.

type State<'a,'s> = State of ('s -> 'a * 's)

In addition, we’ll need a way to run our State so that we can provide it with an initial state and return both our result and our new state.  Let’s create a function called runState which takes our State and an initial state, and returns a tuple of our result and new state.

let runState (State s) initialState = s initialState

To make this useful, we need the ability to both get and set our state.  The getState function should return a State with our constructing function takes our state and returns a tuple pair containing our state.  Conversely, our putState function should take a state value and construct a State with the function which returns a tuple state of unit and our state.  These functions will be important as we need to obtain and modify our state.

let getState = State (fun s -> (s,s))
let putState s = State (fun _ -> ((),s))

Next, we’ll need the ability to both evaluate and execute the state.  What I mean by evaluate is that we want the end result and throw away the resulting state.  Conversely, the execute state returns the modified state and ignores the result.

let evalState m s = runState m s |> fst
let execState m s = runState m s |> snd

Now that we have a way to run our computations, let’s get to the monad part.  As you may remember from previous posts, the monad must support three basic rules.  First, a type construction, for each underlying type, we must be able to get the corresponding monadic type.  Second, we must have a function that maps an underlying value to a value in the corresponding monadic type, which is called the return function.  Third, we must have a binding operation where the first argument is a value in a monadic type, its second argument is a function that maps from the underlying type of the first argument to another monadic type, and its result is in that other monadic type.

Let’s first tackle the return function.  You should recall that our return function must have the following signature, keeping in mind that M defined below is our monadic type.

val returnF : 'a -> M<'a>

In the case of the State Monad, it should look like this where we take in a value and return a State of ‘a and our state.

val returnF : 'a -> State<'a,'s>

Now, let’s look at the implementation.  In this case we have a parameter a, which when given an initial state will produce a tuple of that value and our state.

let returnF a = State (fun s -> (a,s))

Next, we need to define the bind operation.  As you may recall, our bind operation must look like the following code, keeping in mind that M defined below is our monadic type.

val bind : M<'a> -> ('a -> M<'b>) -> M<'b>

In the case of our State Monad, it should look like this:

val bind : State<'a> -> ('a -> State<'b>) -> State<'b>

Let’s break this down a little more to show what is really going on underneath the covers.

val bind : ('s -> ('a,'s))       -> 
           ('a -> 's -> ('b,'s)) -> 
           ('s -> ('b,'s))

What we see is that the first argument is a function which takes some initial state and produces a tuple of a value and our new state.  Our second argument is a function that takes a value and our new states and then generates a tuple of our new state and newer state.  What our goal is to combine these two functions into a larger function from an initial state to a tuple of our new value and newest state. 

let bind m k = 
  State (fun s -> let (a,s') = runState m s in runState (k a) s')

Taking these two functions together, we can now create a builder which can provide us some syntactic sugar when writing functions using the state monad. 

type StateBuilder() =
  member this.Return a = State (fun s -> (a, s))
  member this.Bind(m, k) = 
    State (fun s -> let (a,s') = runState m s in runState (k a) s')
let state = new StateBuilder()

To try it out, let’s take a simple example of the state monad in action.  We’ll create a function call tick which when given an initial state, will bind n to that state.  Then, we will take our existing state, increment it by one and then persist again.  Finally, we’ll return our n, which is our initial state.  Below is what the code will look like once we’re through.

let tick = state {
  let! n = getState
  do! putState (n + 1)
  return n }

We can run this by calling the runState function which will give us the tuple of our value and our new state.

> runState tick 3;;
val it : int * int = (3, 4)

Let’s take a look at another example, this time of a managing the state of a queue.

Queue Example

In this example, we’re going to implement a queue, which is a standard FIFO collection.  Using a structure like an F# list, we can model both the enqueue, which is to add to the end of the list, and the dequeue, which is to take from the front of the list. 

let enqueue a lst = ((), lst @ [a])
let dequeue (hd::tl) = (hd, tl)

The idea behind these functions is that when you enqueue an item, you get back a tuple of a unit and our value appended to our list.  Conversely, when we dequeue an item, we simply return a tuple of our head of our list and the rest of the values.  Using these functions, we could write a simple workflow to add an item to the list, then grab the first item, multiply it by three, enqueue it back to the list, and finally return the head that we dequeued earlier as well as our queue.

let workflow initialState =
  let ((), state1) = enqueue 4 initialState
  let (a, state2) = dequeue state1
  let ((), state3) = enqueue (a * 3) state2
  (a, state3)

As you can see, the code is quite messy and cumbersome as we have to keep binding to a new state to pass along to the next function.  That gets unwieldy quite quickly.  Instead, if we apply some of the lessons learned from above, we can clean this up quite a bit.  What if we could rewrite our enqueue and dequeue using the State Monad and then have our stateful computation abstracted for us?  Going back to the drawing board, we could rewrite our functions as so:

let enqueue a = State (fun s -> ((), s @ [a]))
let dequeue = State (fun (hd::tl) -> (hd, tl))

The enqueue function takes a value, and when given an initial queue state, returns a tuple of unit and the value appended to our queue state.   Our dequeue function, when given a state of our list with a head and tail, returns a tuple of our head and tail.  Now, we can rewrite our workflow from above like the following to use the state builder.

let workflow = state {
  let! queue = getState
  do! enqueue 4
  let! hd = dequeue
  do! enqueue (hd * 3)
  return hd }

Like the previous example, we’re first getting passed an initial state, which we bind to queue.  Then we enqueue four to the list, take the head of the list, then add it back to the list after multiplying the value by three.  Finally, we return hd, which behind the scenes will return to us a tuple of that head value and our queue.  Let’s run it through real quickly to take a look at what it does:

> runState workflow [3..6];;
val it : int * int list = (3, [4; 5; 6; 4; 9])

As you can see, if we give an initial state of a list from 3 to 6, the function will return the head of 3 and the rest of our list with both the 4 and our head of 3 multiplied by 3.  This is quite useful for knowing when we find ourselves having to carry around some form of state from each function call manually.  Instead, using this technique, we can abstract this out and have it managed for us.

But what about this browser example that I showed in the beginning?  Could we really do something with it?

The Browser Example

The first example I gave above of using a web browser to somehow manage the state behind the scenes to create some sort of DSL is interesting.  In fact, we could take some of the lessons learned here and apply them immediately to our solution.  Taking a library such as WatiN, we could manage the state of our browser behind the scenes and then use custom functions to manipulate the state of the browser.

Let’s take a few of the functions listed above and implement them in the state monad for things such as opening and closing a browser, entering text into a field and clicking a button.  As you’ll see, each time we get the Browser out of state and then return our action, whether it be closing our browser, entering text or finding out whether some text exists on a page.

open WatiN.Core

let openPage (url:string) = state {
  let! (browser : Browser) = getState
  return browser.GoTo(url) }

let enterText (textBox : string) (text : string) = state {
  let! (browser : Browser) = getState
  return browser.TextField(Find.ByName(textBox)).TypeText(text) }
 
let containsText (text : string) = state {
  let! (browser : Browser) = getState
  return browser.ContainsText(text) }
 
let clickButton (buttonText : string) = state {
  let! (browser : Browser) = getState
  return browser.Button(Find.ByName(buttonText)).Click() }

let closePage = state {
  let! (browser : Browser) = getState
  return browser.Dispose() }

Then we can behind the scenes manage our Browser instance, whichever Browser we so choose.  We can create a function called runScript which calls our overall function with an initial state of a new Browser.

let runScript (script:State<'a,Browser>) =
  evalState script (new IE() :> Browser)

Once this is defined we should be able to run our test as before and actually have it work while using xUnit.net and some custom assertions that I made.

[<Fact>]
let ``Can find CodeBetter on Bing``() =
  runScript ( 
    state {
      do! openPage "http://bing.com"
      do! enterText "q" "CodeBetter"
      do! clickButton "go"
      let! result = containsText "CodeBetter"
      isTrue result
      do! closePage })

This works great and seems like an interesting abstraction for a DSL, but, is this monad abuse?

The Practicality of It All

What you’ve noticed from my runScript and those other functions which deal with manipulating the Browser, you’ll notice that I never did persist state again once I had the initial state.  The reason for this is that the mutable state is really being managed by the Browser object behind the scenes anyways, and any subsequent call does not return me a new browser, but instead mutates the existing instance.  Instead, we should more focus on an implementation of an abstraction which deals with persisting the initial state only once and only allows reads, which of course brings us to the Reader Monad which we’ll talk about next time in this series.

Another question does come up though about whether in a multi-paradigm language such as F# that this abstraction is needed at all?  After all, we have many other patterns we could apply as well.  It’s an interesting question and well open to debate.  But knowing of these abstractions can give you more options as you consider how we manage state in our applications.

Conclusion

By looking at Monads, we can discover what abstractions can be accomplished with them, but just as well, what they are not about.  Certainly, functional languages don’t need them, but they certainly can be quite useful.  When we find repeated behavior, such as maintaining state, this abstraction can be quite powerful.  Practically though in languages such as F#, they can be useful, but other patterns and abstractions can as well.  Find the right abstraction and use it.

2 Comments

Comments have been disabled for this content.