When I originally published my small rant on testing, it generated a spirited discussion about a number of different topics. It even led Bryan Liles to post a great set of testing guidelines to balance out my unfocused rant. But the topic that overshadowed almost everything else was that of best practices regarding mock objects. In this two-part article, we’ll try to shine some light on that topic, because it is clearly still a point of confusion and occasionally even controversy within our community.

In Issue #20, I will go over some examples of when I use mock objects and when I don’t, and try to come up with some guidelines for building test suites that do their job without becoming too brittle. But before we can really discuss practices, we need to establish a baseline level of theory and background knowledge, which is what this post is all about.

Rather than doing the heavy lifting myself, I will point you to the one article you should read to better understand mock objects. It was written in 2004 (which is about the time that I first read it), and then revised in 2007. It is, of course, Martin Fowler’s essay Mocks Aren’t Stubs.

The article is long, somewhat dry, and includes large amounts of Java code. Don’t let that discourage you from reading the whole thing from end to end, and if necessary, reading it again. Despite the title, Fowler goes into much deeper topics than mocks vs. stubs, and hits on many of the key ideas that separate ‘mockists’ from ‘classicists’. Personally, I feel this is a false dichotomy, but you’ll still be hard-pressed to find a better article that gives the historical background of the design ideas that motivated the creation of testing and mocking frameworks in the first place.

I find Fowler’s assessment to be reasonably fair, incredibly comprehensive, and a very useful place to start from if you want to form any argument about one approach vs. another when it comes to mocking. That having been said, I am critical of certain aspects of this essay, partly because I am looking at it with a 2011 perspective, and partly because I didn’t come to Ruby from Java. For this reason, I’ve included my commentary on Fowler’s article below. I encourage you to read his article in full before reading my comments, as they’ll make much more sense that way.

Commentary on Fowler’s “Mocks Aren’t Stubs”

Fowler explores two different concepts in this article: behavior-based vs. state-based verification, and classical vs. mockist TDD. While he doesn’t directly draw the lines between them, he sort of implies that mockists always focus on behavior verification and that classical TDD leans heavily towards state-based verification. There are some issues with this line of thinking.

Claiming that mockists inherently focus on behavior is valid: if you mock everything except the object under test, there are no ‘real objects’ left to perform state verification against. But behavior-focused testing does not actually require mocking everything except the object under test; what it requires is more carefully written tests.
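
To make the distinction concrete, here’s a rough sketch using RSpec. The Order class and its gateway collaborator are names I’ve invented purely for illustration, not anything taken from Fowler’s essay. The first spec checks the state the order ends up in, while the second verifies the messages sent to the gateway.

```ruby
# State-based vs. behavior-based verification, in miniature.
# Order, pay_with, and the gateway collaborator are hypothetical.

class Order
  def initialize(total)
    @total = total
    @paid  = false
  end

  def pay_with(gateway)
    @paid = true if gateway.charge(@total)
  end

  def paid?
    @paid
  end
end

# A trivial hand-rolled stub: a "real enough" collaborator for state checks.
class ApprovingGateway
  def charge(_amount)
    true
  end
end

RSpec.describe Order do
  it "is paid after a successful charge (state verification)" do
    order = Order.new(100)
    order.pay_with(ApprovingGateway.new)

    expect(order).to be_paid
  end

  it "charges the gateway for its total (behavior verification)" do
    gateway = double("gateway")
    expect(gateway).to receive(:charge).with(100).and_return(true)

    Order.new(100).pay_with(gateway)
  end
end
```

Both specs drive the same design and focus on a single behavior; they simply differ in what they assert, and only the second one depends on the exact protocol between the two objects.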

Fowler claims that classicists tend towards writing single tests that explicitly test large clusters of code simultaneously, which requires them to produce a large amount of fixture data just to get their tests to run. But in a post-BDD world, most people know how to isolate their test cases so that they focus on one behavior at a time, whether or not they’re utilizing mock objects. We also know to write comprehensive tests at both the higher and lower levels of our projects, and so it isn’t necessary to worry about exercising every possible path through our low-level objects when calling them through a high-level interface.

Personally, when I’m testing a feature that is towards the top layer of my stack, I try to make it require as little configuration as possible to initialize. It shouldn’t be necessary to load up fixture data for low-level features I won’t use; really, I only need to trace a single path of execution and provide the right data to make it a valid path. I weigh the cost of doing that against using a mock object, and whenever the two are comparable, I prefer the former. Clearly this doesn’t make me a mockist, but does it fit with Fowler’s definition of a classicist? I don’t know.
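
Here’s a rough sketch of what I mean, with invented names (Cart and InMemoryCatalog are stand-ins for illustration, not real libraries): the higher-level feature is exercised with just enough real data to trace one valid path, and no mocks are involved.

```ruby
# A cheap real collaborator instead of a mock or a pile of fixtures.

class InMemoryCatalog
  def initialize(prices)
    @prices = prices
  end

  def price_of(sku)
    @prices.fetch(sku)
  end
end

class Cart
  def initialize(catalog)
    @catalog = catalog
    @skus    = []
  end

  def add(sku)
    @skus << sku
    self
  end

  def total
    @skus.map { |sku| @catalog.price_of(sku) }.reduce(0, :+)
  end
end

RSpec.describe Cart do
  it "totals the prices of the items added to it" do
    # One priced item is all this path needs -- no full product fixture set.
    catalog = InMemoryCatalog.new({ "rbp-book" => 25 })
    cart    = Cart.new(catalog).add("rbp-book")

    expect(cart.total).to eq(25)
  end
end
```

If building a real catalog were expensive, a mock would start to look more attractive; the point is that the decision is a cost comparison, not a rule.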

I was never deeply involved in Java programming, but from my limited experience with it, I feel that a lot of the arguments Fowler formed in this essay were and probably still are more relevant in the Java world. In Java, because you don’t have things like mixins, indirection is much more common than in Ruby. You might need to create six objects just to do one small, simple thing. In such an environment, mock objects must seem like a godsend: when you multiply that phenomenon across your entire project, the cost of maintaining mocks would be far less than the cost of building complex setups for all those objects. But if you’re experiencing the same sorts of problems in Ruby, you have a horrible design for your project.

In Ruby, it is possible and often advisable to build systems that don’t have very deep object nesting. For this reason, the ability to focus only on mocking the direct neighbors of an object under test isn’t as much of a selling point. If we take away the complex object graphs, we are mostly left with the idea that mockists prefer to write mocks so that they can focus on driving the object under test, and then go back and use their mocks as a contract for the next object they need to create. Again, that makes a lot of sense in languages that punish you for creating new objects. Ruby is not like that.

In almost every scenario I can imagine, it’s better to just go ahead and create a skeleton version of an object you need than it is to form a mock that is sort of floating in space. It will likely take less time, and working with the real object will give better insight into its design than trying to dream it up through a cumbersome mock interface. Fowler does touch on this approach being a valid one but claims that the mockist approach provides more design guidance. I don’t see any evidence to support this claim, as the two are essentially functionally equivalent with respect to the object under test.
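
As a sketch of what I mean by a skeleton object (Mailer and Signup are made-up names for illustration): rather than mocking a mailer that doesn’t exist yet, I’d write the smallest real class that satisfies the collaboration and let working with it inform its design.

```ruby
# The smallest real Mailer that lets the Signup feature be driven by a test.

class Mailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(to:, subject:)
    @deliveries << { to: to, subject: subject }
  end
end

class Signup
  def initialize(mailer)
    @mailer = mailer
  end

  def complete(email)
    @mailer.deliver(to: email, subject: "Welcome aboard!")
  end
end

RSpec.describe Signup do
  it "delivers a welcome message to the new user" do
    mailer = Mailer.new
    Signup.new(mailer).complete("newuser@example.com")

    expect(mailer.deliveries).to eq([{ to: "newuser@example.com", subject: "Welcome aboard!" }])
  end
end
```

When a real delivery mechanism shows up later, the skeleton either grows into it or gets swapped out, and the Signup spec doesn’t need to change.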

Fowler does an excellent job of covering the arguments about test isolation, and I don’t have too much to add there except to say that I am firmly in favor of watching my whole test suite go up in smoke when I make a far-reaching change. The false positives that mocks produce are downright dangerous in these scenarios, and arguments about it being difficult to find what caused the breakage are most likely an indication of some deeper problem: I’ve never had that issue even on my most complex projects.
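
Here’s a rough sketch of the kind of false positive I’m talking about (Reporter and Notifier are invented names): the real collaborator’s method has been renamed, so the collaboration is broken in production, yet the mocked spec keeps passing because it still encodes the old contract.

```ruby
class Notifier
  def send_alert(message)    # renamed from deliver(message)
    warn(message)
  end
end

class Reporter
  def initialize(notifier)
    @notifier = notifier
  end

  def report(problem)
    @notifier.deliver(problem)   # stale call site: NoMethodError at runtime
  end
end

RSpec.describe Reporter do
  it "passes even though the real Notifier no longer responds to #deliver" do
    notifier = double("notifier")
    expect(notifier).to receive(:deliver).with("disk full")

    Reporter.new(notifier).report("disk full")
  end
end
```

A test that exercised a real Notifier would blow up immediately here, which is exactly the suite-wide breakage I’d rather see.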

Fowler’s entire discussion about Design Style for classicists vs. mockists misses the mark. It probably had a lot of truth to it at the time he wrote the article, and may still have some truth outside of Ruby. But really, what he is describing here is the distinction between old-fashioned regression-suite style TDD and what we now call Behavior Driven Development. In my opinion, BDD is just a new style of TDD that is more principled and focused on design as a first-class component of writing testable code. So when Fowler says that mockists favor role-based systems, I think this actually applies more generally to anyone practicing modern TDD.

Reflections

As I said at the very beginning of this article, I think the distinction between mockists and classicists is a false dichotomy. I do agree that there is a wide chasm to cross between the original purpose of test frameworks and the new way of looking at things. But really, once you’ve decided that tests are more than just a safety net for dealing with regressions, you have already fallen outside of Fowler’s classicist point of view. In my opinion, there is room for people who focus on behavior rather than state, but don’t necessarily feel like mock objects are a good tool to be using by default. These folks are just as concerned about design and driving code through tests, but do not subscribe to absolutist viewpoints that require a single technique to be used at all times.

Since I consider myself to be in the third category that I’ve wedged between Fowler’s two groups, I will need to share some examples of what that means in practical terms. The next article should help with that, because it provides an outline of how I decide when to mock and when to use real objects instead. Until then, I’d be happy to hear your thoughts on this topic, especially what you think of Fowler’s article.

NOTE: This article has also been published on the Ruby Best Practices blog. There may be additional commentary over there worth taking a look at.