Functional Programming Unit Testing - Part 1
As you noticed from my last post regarding functional programming and unit testing, there is a bit to be discussed. A programming language is more than the language itself: the frameworks and tooling around it matter just as much, and functional languages are no exception. Let's focus on the tooling around testing with functional languages.
What kind of options do we have? In the Haskell world, just as in the F# world, there are several tools at our disposal.
- HUnit
A traditional xUnit testing framework for unit testing. Analogous to such frameworks as xUnit.net, NUnit and MbUnit in the .NET world.
- QuickCheck
A tool for which the developer provides a specification of the program, in the form of properties that functions should satisfy. QuickCheck then tests that the properties hold across a large number of randomly generated cases. There are variants of this tool for many languages, including F# (FsCheck), Erlang, Scala, Java, Python, Standard ML and others.
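To make the idea of a property concrete, here is a minimal sketch, assuming the QuickCheck package is installed. The property name prop_reverseTwice is made up for this illustration and has nothing to do with the API built later in this post.

```haskell
import Test.QuickCheck (quickCheck)

-- Property: reversing a list twice gives back the original list.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

main :: IO ()
main = quickCheck prop_reverseTwice  -- checks the property on random lists
```

Rather than listing individual inputs, we state a rule and let QuickCheck generate the cases.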
Today we're going to focus on HUnit as part of developing an API in Haskell. Some of these lessons apply well to any functional language, but they are told well using Haskell.
Starting with HUnit
HUnit is a simple, easy to use xUnit-style testing framework for Haskell. It's so bare bones, in fact, that there are only two main assertion functions that people use: assertEqual and assertBool. The APIs are straightforward and easy to extend. I'll do that in a subsequent post to bring the functionality closer to that of, say, xUnit.net.
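As a quick illustration of those two assertions, here is a minimal sketch, assuming the HUnit package is installed. The test names sumTest and nullTest are made up for this example.

```haskell
import Test.HUnit

-- assertEqual takes a message, the expected value, then the actual value.
sumTest :: Test
sumTest = TestCase (assertEqual "sum of [1,2,3]" 6 (sum [1, 2, 3 :: Int]))

-- assertBool takes a message and a condition that must be True.
nullTest :: Test
nullTest = TestCase (assertBool "empty list should be null" (null ([] :: [Int])))

main :: IO Counts
main = runTestTT (TestList [sumTest, nullTest])
```

The message is only shown when the assertion fails, so it pays to make it describe the expectation.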
Let's walk through an example of creating an API for performing calculations on a list. Since I have a background in quantitative methods, I'll start with some of those. The first function we need to create is the average function, which takes a list of numbers and calculates their average. To do this, let's first define a test that pins down the behavior.
module HUnitTests (main) where

import Test.HUnit
import HUnitExample (average)

averageExpected :: Test
averageExpected =
  TestCase (assertEqual "Should average number from list" x (average xs))
  where xs = [1,2,3]
        x = 2.0

main :: IO Counts
main =
  runTestTT $ TestList [averageExpected]
Now that we've defined our criteria for success, let's turn our attention to implementing this function.
module HUnitExample (average) where

import Data.List

average :: (Fractional a) => [ a ] -> a
average xs = (sum xs) / fromIntegral (length xs)
When running this test from the GHC Interactive (ghci.exe), I get the following results.
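But, wait! What happens when we pass an empty list? Both the sum and the length are zero, so we would be dividing zero by zero. For a Double that does not even raise an exception; it quietly returns NaN, which is hardly a useful average. What we need to do is add another pattern to our average function to trap that case and report a standard error. Let's define a test case for that.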
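import Control.Exception (ErrorCall(..), evaluate, handleJust)
import Test.HUnit
import HUnitExample (average)

averageExpected :: Test
averageExpected =
  TestCase (assertEqual "Should average number from list" x (average xs))
  where xs = [1,2,3]
        x = 2.0

averageEmpty :: Test
averageEmpty =
  TestCase $ handleJust errorCalls (\_ -> return ()) performCall
  where
    -- Select only exceptions raised by error; anything else propagates.
    errorCalls (ErrorCall msg) = Just msg
    performCall = do
      -- evaluate forces the result so the error fires inside the handler.
      _ <- evaluate (average [] :: Double)
      assertFailure "average [] must throw an error"

main :: IO Counts
main =
  runTestTT $ TestList [averageExpected, averageEmpty]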
Now to define that failure pattern in my average function. Until it is in place the new test fails, because average [] quietly evaluates to NaN and execution reaches the assertFailure line. Note that the test only checks that some call to error happened; I could also inspect the message it carries, but I'll do that in another post.
module HUnitExample (average) where

import Data.List

average :: (Fractional a) => [ a ] -> a
average [] = error "empty list"
average xs = (sum xs) / fromIntegral (length xs)
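To see what the guard changes, here is a small sketch that compares the two definitions on an empty list, assuming the elements are Double. The name averageUnguarded is hypothetical and only exists to keep the one-line version around for comparison.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Unguarded version: the original one-line definition.
averageUnguarded :: Fractional a => [a] -> a
averageUnguarded xs = sum xs / fromIntegral (length xs)

-- Guarded version: the empty list is rejected explicitly.
average :: Fractional a => [a] -> a
average [] = error "empty list"
average xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = do
  -- 0 / 0 on a Double is NaN, not an exception, so this prints True.
  print (isNaN (averageUnguarded ([] :: [Double])))
  -- The guarded version raises an ErrorCall, which try captures as a Left.
  result <- try (evaluate (average ([] :: [Double]))) :: IO (Either ErrorCall Double)
  putStrLn (either (const "error raised") show result)
```

The guard turns a silent NaN into a failure that a test can detect.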
Running our tests again, we find that both of them now pass. I think I can generalize this a little bit, though. Say, for example, I have a list of tuples or record types. I can't average them as is; instead, I would have to provide a way to extract the value that I care about. Let's define a test that takes a list of tuples, grabs the second value of each and averages those. I'll omit the rest of the file, as it stays the same except for adding our test function to the main function's test list.
averageByExpected :: Test
averageByExpected =
  TestCase $ assertEqual "Should get averaged number from list" x (averageBy f xs)
  where xs = [("One", 1), ("Two", 2), ("Three", 3)]
        x = 2.0
        f = snd
Now the code to implement this should be rather straightforward. I'll omit the rest of the file and just concentrate on the new averageBy function.
averageBy f xs = (sum . map f) xs / fromIntegral (length xs)
Instead of using the standard sum on its own, I compose it with a map projection. This allows me to add my own custom function to the mix. Once we get this code implemented, another test passes. But once again, we forgot about the empty list. Let's write a test for that case and watch it fail.
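-- Needs ErrorCall(..), evaluate and handleJust from Control.Exception.
averageByEmpty :: Test
averageByEmpty =
  TestCase $ handleJust errorCalls (\_ -> return ()) performCall
  where
    -- Select only exceptions raised by error; anything else propagates.
    errorCalls (ErrorCall msg) = Just msg
    performCall = do
      _ <- evaluate (averageBy f xs :: Double)
      assertFailure "averageBy f [] must throw an error"
    xs = []
    f = snd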
This test fails against the one-line averageBy: dividing zero by zero on a Double gives NaN instead of raising an error, so the assertFailure line is reached. Let's put the guard in there to make the failure explicit and feel better about ourselves.
averageBy _ [] = error "empty list"
averageBy f xs = (sum . map f) xs / fromIntegral (length xs)
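The projection is what makes averageBy useful beyond plain lists of numbers. Here is a minimal sketch using both a tuple list and a record type. The Reading record and its fields are hypothetical and exist only for this illustration.

```haskell
-- Hypothetical record type, used only for this illustration.
data Reading = Reading { label :: String, value :: Double }

averageBy :: Fractional b => (a -> b) -> [a] -> b
averageBy _ [] = error "empty list"
averageBy f xs = (sum . map f) xs / fromIntegral (length xs)

main :: IO ()
main = do
  let readings = [Reading "a" 1, Reading "b" 2, Reading "c" 3]
  print (averageBy value readings)               -- average of the value field
  print (averageBy snd [("One", 1), ("Two", 2)]) -- average of the second tuple element
```

Any function that turns an element into a fractional number will do as the projection, including a record field accessor.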
But looking at this code, I think it's time for a refactoring. The average and averageBy functions are very similar and could be generalized. Why? Because averageBy takes a projection function, average is just averageBy with a projection that returns its argument unchanged. Let's redo our average function to use averageBy with that extra input.
average :: (Fractional a) => [ a ] -> a
average = averageBy id  -- id is the same as (\x -> x)
We can use currying and partial application to our advantage here: we supply only the arguments we need to and leave the list to be filled in by the caller. Running things once again, we see that all four tests still pass. But I'm still not satisfied. Why not? Because I don't always want to deal with errors, and would like to offer a safe alternative to the error-prone average and averageBy functions. Let's use the Maybe type to represent failure this time around, using a new function called tryAverageBy.
tryAverageByExpected =
  TestCase $ assertEqual "Should get Just averaged number from list" x (tryAverageBy f xs)
  where xs = [("One", 1), ("Two", 2), ("Three", 3)]
        x = Just 2.0
        f = snd
And the implementation will then look like this to get it to pass.
tryAverageBy f xs = Just ( (sum . map f) xs / fromIntegral (length xs) )
And the dance continues until I have fully fleshed out tryAverageBy with both cases, as well as the tryAverage function. But it looks like I could generalize the averageBy function as well, having it call our try version and decide whether to throw an error. We only want to write the algorithm once and reuse it if we can. Maybe something like this might work.
averageBy f x =
  case triedAvg of
    Nothing -> error "Empty list"
    Just value -> value
  where triedAvg = tryAverageBy f x
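Putting the pieces together, the finished module might look something like the sketch below. The empty-list case of tryAverageBy and the definition of tryAverage are not shown above, so treat both as assumptions about where the refactoring ends up.

```haskell
-- Safe core: Nothing for an empty list, Just the mean otherwise.
-- The empty-list clause is an assumption; the post only shows the Just case.
tryAverageBy :: Fractional b => (a -> b) -> [a] -> Maybe b
tryAverageBy _ [] = Nothing
tryAverageBy f xs = Just ((sum . map f) xs / fromIntegral (length xs))

-- Assumed definition: tryAverage is named in the post but never shown.
tryAverage :: Fractional a => [a] -> Maybe a
tryAverage = tryAverageBy id

-- The throwing variants delegate to the safe core.
averageBy :: Fractional b => (a -> b) -> [a] -> b
averageBy f xs =
  case tryAverageBy f xs of
    Nothing    -> error "Empty list"
    Just value -> value

average :: Fractional a => [a] -> a
average = averageBy id

main :: IO ()
main = do
  print (tryAverageBy snd [("One", 1), ("Two", 2), ("Three", 3)]) -- Just 2.0
  print (tryAverage ([] :: [Double]))                             -- Nothing
  print (average [1, 2, 3])                                       -- 2.0
```

With this layout the averaging logic lives in one place, and each of the other three functions is a thin wrapper around it.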
When all is said and done, we have eight passing tests and a nicer code base because we took the time to refactor. Not that these implementations are perfect, but they show how a code base evolves when using HUnit and TDD in the Haskell environment.
Conclusion
Building our systems means caring about design, quality and correctness. In a language such as Haskell, purity, polymorphism and an expressive type system help us write code that is very modular, refactorable and testable. Along the way, there are tools to help, such as HUnit and QuickCheck. Next time, I'll be covering type-based property checking using QuickCheck, as well as how we can extend HUnit a little further to suit our needs.