Pex - A Tool in Search of an Identity

Tuesday, February 5, 2008

A cohort turned my attention to something from Microsoft Research called "Pex: Dynamic Analysis and Test Generation for .NET".

I only took a quick glance at it (there don't seem to be any downloads available, just whitepapers and a screencast), but from what I see I already don’t like it.

First off, I have an issue with a statement that shows up almost right off the bat: “By automatically generating unit tests, it helps to find bugs early”. I don’t believe “automatically generating unit tests” is of very much value. TDD (and more recently BDD) is about building a system that meets a business need and driving that solution out with executable specifications that can be understood by anyone. With the release of VS2005 Microsoft gave us “automatically generated unit tests”: you pointed it at code and it created a bunch of crap tests that more or less only tested the .NET framework (make sure string x is n long, lather, rinse, repeat). I'm also not sure how automatically generating unit tests can find bugs early (which is what Pex claims). That seems to be a mystical conjuration I missed out on.

Pex claims to be taking test driven development to the next level. I don't think it even knows what level it itself is at yet.

Pex feels to me like it's trying to be like an automated FxCop (and we all know what that might be like). Looking at the walkthrough you still write a test (now called a "Parameterized Unit Test"). This smells to me like a RowTest in MbUnit terms but doesn't look like one, and it is used to generate more tests (as partial classes of your own test class, it seems). Then you run Pex against it from inside the IDE. Here's where it gets a little fuzzy. Pex generates test cases and reports from them, with suggestions as to how to fix the failing code. For example in the walkthrough the test case suggestion is to validate the length of a string before trying to extract a substring. What is a little obscure is what exactly that suggested snippet is for: the test case or the code you're testing?
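
To make the "parameterized test plus generated inputs" idea concrete, here's a minimal sketch in Python (not C#, and not Pex's actual API). Everything in it is made up for illustration: `substring` only imitates the way .NET's `String.Substring` throws on out-of-range arguments, `first_chunk_unguarded` and `first_chunk_guarded` are hypothetical code under test, and `explore` is a crude brute-force stand-in for whatever analysis the real tool does. It also assumes the suggested length check belongs in the code under test rather than in the test.

```python
import itertools


def substring(s, start, length):
    """Imitate .NET String.Substring, which throws on out-of-range arguments."""
    if start < 0 or length < 0 or start + length > len(s):
        raise ValueError("start/length out of range")
    return s[start:start + length]


def first_chunk_unguarded(text, size):
    # Hypothetical code under test: no length check before the substring.
    return substring(text, 0, size)


def first_chunk_guarded(text, size):
    # The kind of fix the walkthrough suggests: validate the length first.
    if size > len(text):
        size = len(text)
    return substring(text, 0, size)


def parameterized_test(chunker, text, size):
    """One hand-written test that should hold for any input pair."""
    chunk = chunker(text, size)
    assert len(chunk) <= size
    assert text.startswith(chunk)


def explore(chunker):
    """Crude stand-in for the tool: try boundary inputs, collect the failures."""
    texts = ["", "a", "abcdef"]
    sizes = [0, 1, 3, 10]
    failures = []
    for text, size in itertools.product(texts, sizes):
        try:
            parameterized_test(chunker, text, size)
        except (ValueError, AssertionError) as exc:
            failures.append((text, size, type(exc).__name__))
    return failures


unguarded_failures = explore(first_chunk_unguarded)
guarded_failures = explore(first_chunk_guarded)
print(len(unguarded_failures), "failing inputs without the guard")
print(len(guarded_failures), "failing inputs with the guard")
```

Each failing `(text, size)` pair is roughly what one generated test case amounts to: a concrete set of arguments fed into the one test you wrote by hand.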

"High Code Coverage". The test cases generated by Pex "give high code coverage". Again a monkey's paw here. High code coverage means very little in the real world. Just because you have some automated "thing" hitting your code doesn't mean the code is right, or that it's really doing what you intended it to. I can have 100% code coverage on a lot of crap code and still have a buggy system. At best you'll catch stupid programmer errors like missing bounds checks and null object references. While catching those is a good thing, you can accomplish the same thing a lot quicker by writing a little code yourself than by writing a specific unit test to generate other tests for you. Maybe that's grunt work and silly unit test code to write, and maybe sparing you from it is the benefit of Pex.
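
Here's a small Python sketch of the "100% coverage, still buggy" point. The `apply_discount` function and both tests are hypothetical, invented purely for illustration. The first test executes every line and branch and passes, while the second, which states the intended behaviour, exposes the bug.

```python
def apply_discount(total, is_member):
    """Intended behaviour: members get 10% off. Bug: it subtracts a flat 10."""
    if is_member:
        return total - 10  # wrong: should be total * 0.9
    return total


def coverage_style_test():
    # Runs both branches, so line and branch coverage are 100%.
    # It even passes a value check, because 100 - 10 equals 100 * 0.9.
    results = [apply_discount(100, True), apply_discount(100, False)]
    assert results == [90, 100]
    return results


def should_give_members_ten_percent_off():
    # A test named after the behaviour, using an input where the bug shows.
    return apply_discount(50, True) == 45


coverage_style_test()
print("spec holds:", should_give_members_ten_percent_off())
```

The coverage number is the same whether or not the second test exists, which is why coverage on its own says so little about correctness.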

"Integrates with Unit Testing Frameworks". This is another red herring. What it really means is "Integrates with VSTS Unit Testing Framework". Nowhere in the documentation or on the site can I see it integrating with MbUnit or NUnit. It does however mention it can run with MbUnit or NUnit so I assume something can be done here (maybe through template generation), but little substance is available right now.

Then there are the mock objects, [PexMock]. Again, no meat here as these are early looks, but Pex supports mocking interfaces and virtual methods. Yes, in addition to Microsoft building its own NUnit clone (MSTest), NDoc clone (SandCastle), Castle.Windsor (DIAB), and NAnt (MSBuild), you can now get your very own Rhino clone in the form of PexMock! It looks a little more complex to set up and use than Rhino, but then who says Microsoft tools are simple? If it's simple to use, it can't be powerful, can it?

I watched the screencast which walks through the chunker demo (apparently the only demo code they have as everything is based around it). It starts innocently enough with someone writing a test, decorated with the [PexTest] attribute. Once enough code is written to make it compile (red) you "Pex It" from the context menu. This generates some unit tests, somehow giving 73% coverage and failing (because at this point the Chunker class returns null). Pex suggests how to fix your business code along with suggestions for modifying the test.

From the error list you can jump to the generated test code (there's also an option to "Fix it" which we'll get to in a sec). The developer then implements the logic code to try to fix the test. By selecting the "Fix it" option, Pex finds the place where the null reference might occur (in the constructor) and injects code into your logic (by surrounding it with "// [Pex]" tags, ugh, horror flashbacks of Rational Rose come to my mind).

The problem with the tool is that generated tests come out like "DomainObjectValueTypeOperation_70306_211024_0_01" and "DomainObjectValueTypeOperation_70306_211024_0_02". One of the values of TDD and unit tests is for someone to look at a set of unit tests and know how the domain is supposed to behave. I know for example exactly what a spec or test called "Should_update_customer_balance_when_adding_a_new_item_to_an_existing_order" does. I don't have to crack open my Customer.cs, Order.cs and CustomerOrder.cs files to see what's going on. "CustomerStringInt32_1234_102965_0_01" means nothing to me. Okay, these are generated tests so why should I care?

This probably gets to the crux of what Pex is doing. It's generating tests for code coverage. Nothing more. I can't tell what my system does from the Pex test names or maybe even from looking at the tests themselves. Maybe there's an option in Pex to template the naming, but even that's just going to make the tests a little more readable, and still far from comprehensible to a new developer coming onto the project. Maybe I'm wrong, but if all Pex is doing is covering my butt for having bad developers, then I would rather train my developers better (to check for null references, for example) than have them rely on a tool to do their job for them.

A lot of smart dudes (much smarter than me) have worked on this and obviously Microsoft is putting a lot of effort into it. So who am I to say this is good, bad, or ugly? I suppose time will tell as it gets released and we see what we can really do with it. These are casual observations from a casual developer who really doesn't have any clout in the grand scheme of things. For me, I'm going to continue to write executable specs in a more readable BDD form that helps me understand the problems I'm trying to solve and not focus on how much code coverage I get from string checking, but YMMV.

@Michael: I tend to blather on and sometimes miss the point in my blog posts. One of the main points I wanted to get across is that Pex isn't a tool to help you with TDD or design. It's being touted as an "extension" to TDD "taking it to the next level" but I disagree. Unit testing and TDD are orthogonal concepts. TDD is a matter of design whereas unit testing is a means. I will agree that maybe Pex is a good tool to cover those edge cases, but I personally wouldn't start there when building a system. Is it valuable? In a way yes, but the prime value to me is in what the user does and how the software delivers that, not 100% code coverage from unit tests.

Bil Simser - Wednesday, February 6, 2008 6:25:10 AM

I saw Peli demo Pex at Lang.NET 2008. All I can say is, wow. I want it!

Jason Bock - Wednesday, February 6, 2008 6:56:00 AM

Even with TDD and BDD there are a couple of aspects (at least) to unit testing. With BDD, yes, that includes behaviour-based testing. But unit testing of methods based upon the implied contract of their interface is important too. Tests like that don't test the expected behaviour, but I believe both should be tested. If Pex supported some contracts, boundary/corner cases could be unit tested as well. But as it stands, edge case unit tests of method parameters are a good thing, in addition to testing behaviour.

If you use Pex, and continue to do your regular BDD, your quality can only be better.

peter ritchie - Wednesday, February 6, 2008 8:38:03 AM

If you're doing TDD and want higher coverage, aren't DbC tools like Spec# a much better solution?

Hopefully (but not likely) the message will come out "this is how you can increase coverage, find edge cases" but not "this is how you unit test or do TDD".

Jimmy Bogard - Wednesday, February 6, 2008 9:11:50 AM

@Jimmy/Peter: That's exactly the message I hope gets out. Pex isn't a replacement, it's a supplement and don't treat it as a design tool. Thanks!

Bil Simser - Wednesday, February 6, 2008 10:18:06 AM

@Jimmy: yes, DbC tools are a step in the more correct direction; but we're stuck without contract features in most of the .NET languages right now...

But DbC really only covers the boundary cases (i.e. passing a value outside the contract range of a method's argument is a compile error); it does nothing to say that any value within the valid range won't cause a problem.

As Greg Young said (I don't remember where and I'm paraphrasing) you don't have to test what's been guaranteed correct by the compiler.

peter ritchie - Wednesday, February 6, 2008 1:30:32 PM

i've heard that Pex is pretty awesome and will make all the chicks swoon. So don't go knockin what you don't yet know and all that jazz ;-)
i think this is an attempt to make up for that vs2005 monstrosity.

secretGeek - Friday, February 8, 2008 7:09:47 AM
