Refactoring is not redesign
NOTE: This is one of four lessons learned from my 90 day self-study on test-driven development. If this topic interests you, be sure to check out the other lessons!
To maintain a productive TDD workflow, you need understand the difference between refactoring and redesign. These two activities are distinct from one another, but because they are often done in lockstep, it can be challenging to mentally separate them.
The problem I noticed in my own work is that seemingly simple changes often spiral into much more complex modifications. Whenever that happens, it is easy to make bad decisions that can cause progress to grind to a halt. Having a good way to distinguish between what can be accomplished via simple refactorings and what requires careful design consideration seems to be the key to preventing this problem.
My hope is that by reading what I have learned from my own experiences, you will be able to avoid some of these obstacles along your own path. These lessons are not fun to learn the hard way!
What is refactoring?
Refactoring in the traditional sense has to do with making small and safe transformations to a codebase without altering its external behavior. Because refactorings are designed to be atomic and almost trivial, you can apply them whenever you feel that they will make life easier for you down the road. For example, it is rarely a bad idea to clean up messy code by introducing a couple helper methods:
def belongs_to(parent, params)
- mapper.record_class.send(:define_method, parent) do
- Object.const_get(params[:class]).find(send(params[:key]))
+ define_association(parent) do
+ BrokenRecord.string_to_constant(params[:class])
+ .find(send(params[:key]))
end
end
def has_many(children, params)
table_primary_key = mapper.primary_key
- mapper.record_class.send(:define_method, children) do
- Object.const_get(params[:class])
- .where(params[:key] => send(table_primary_key))
+ define_association(children) do
+ BrokenRecord.string_to_constant(params[:class])
+ .where(params[:key] => send(table_primary_key))
end
end
On the surface, this change is very superficial, as a proper refactoring ought to be. However, it has several immediate advantages worth pointing out:
-
The
define_association
helper makes the code reveal its intentions much more clearly by hiding some awkward metaprogramming. -
The
BrokenRecord.string_to_constant
method makes it easy to extend this code so that it handles fully qualified constant names (i.e.SomeProject::Person
), without the need to add a bunch of extra noise in multiple places. -
Both helper methods cut down on duplication, eliminating the connascence of algorithm that was present in the original code.
-
Both helper methods reduce the amount of implementation details that the
belongs_to
andhas_many
methods need to be directly aware of, making them more adaptive to future changes.
The important thing to notice here is that while making this change opens a lot of doors for us, and has some immediate tangible benefits, it does not introduce any observable functional changes, both from the external perspective, and from the perspective of the object’s collaborators.
What is redesign?
While the concept of refactoring is easy to define and categorize, the process of redesigning code is not nearly as straightforward. Rather than attempting to provide an awkard definition for it, I will demonstrate what makes redesign different from refactoring by showing you a real example from my study.
When working on BrokenRecord (my toy implementation of the Active Record pattern), I initially designed it so that a single object was responsible for running queries against the database and mapping their results to user-defined models. This worked fine as a proof of concept, and the code was pretty easy to follow.
However, designing things this way lead to very high coupling between the query API and the underlying database implementation, as you can see in the following code:
module BrokenRecord
class Table
#...
def create(params)
escapes = params.count.times.map { "?" }.join(", ")
fields = params.keys.join(", ")
BrokenRecord.database.execute(
"insert into #{@table_name} (#{fields}) values (#{escapes})",
params.values
)
end
def find(id)
BrokenRecord.database
.execute("select * from #{@table_name} where id = ?", [id])
.map { |r| @row_builder.build_row(self, r) }.first
end
end
end
Even though I had no intentions of making BrokenRecord into a library that could be used for practical applications, this design was fundamentally inconsistent with what it means to be an object-relational mapper. The lack of abstraction made any sort of query optimization impossible, and also prevented the possibility of introducing support for multiple database backends.
In addition to these concerns about future extensibility, the current design made it much harder to test this code, and much harder to do some common queries without directly hijacking the global reference to the underlying database adapter. All these things combined meant that a redesign was clearly in order.
Taking a first glance at the implementation of BrokenRecord::Table, it was tempting to think that all that was needed here was to extract a class to encapsulate the database interactions. But because this object had come into existence as a result of a broad-based integration test rather than a series of focused unit tests, I was hesitant to perform an extraction without writing a few more tests first.
Thinking about the problem a little more, I noticed that the changes I wanted
were deeper than just putting together an internal object to hide
some implementation details and reduce coupling. The fact that Table
was
the best name I could think of for my extracted object even though that was
the name of the original class was a sign that I was in the process of
changing some responsibilities in the system, not just grouping related
bits of functionality together.
Taking a TDD-friendly approach to redesign
The mistake I’ve made in the past when it comes to redesigning internal objects is that I tended to make my changes recursively, often without introducing new tests as I went. So for example, I might take a helper object that had gotten too complex and break it into two objects, testing both objects only indirectly through some higher level test. That kind of change would often reveal to me that I wanted to extract even more classes or methods, or possibly even change the protocols between the low-level collaborators in the system.
Sooner or later, I would end up with a complicated web of internal objects that were all being tested through a single use case at the high level, and so any defects I introduced became very hard to track down. Even though my tests were protecting external defects from creeping into the system, I had negated the design and debugging benefits that come along with doing TDD more rigorously.
After discussing this bad habit of mine with Eric Hodel during one of Mendicant University’s study sessions, I came to realize that there are some simple ways to sidestep this problem. In particular, I realized that I could redesign systems by introducing new components from the bottom up, cutting over to the new implementation only when it was ready to be integrated.
Wanting to try out these new ideas in BrokenRecord, I started out by renaming
the BrokenRecord::Table
object to BrokenRecord::RecordTable
. I put virtually
no thought into the new name, because what I was really trying to do was free
up the BrokenRecord::Table
name so that I could completely change the
responsibilities associated with it. This allowed me to experience a similar
amount of freedom that simply deleting the original class would have given
me, but without the cost of having to work through a bunch of orphaned
references and broken tests in my system.
I drove the new BrokenRecord::Table
object test first, mostly mirroring the
ideas from the original object but sticking strictly to the interactions with
the database and representing records as simple Hash objects. I also
added a new feature which provided information about the columns
in a given table. You can get a rough idea for how I sketched out that
feature by checking out the following test:
it "must be able to retrieve column information" do
columns = table.columns
columns.count.must_equal(3)
columns[:id][:type].must_equal("integer")
columns[:title][:type].must_equal("text")
columns[:body][:type].must_equal("text")
end
The original BrokenRecord::Table
object was just a first iteration spike,
and so it expected that all model objects explicitly defined what fields
were in the tables they mapped to. This helped keep the implementation
simple, which was essential when the class was taking on two
responsibilities at once. However, in the new BrokenRecord::Table
object, this kind of low level database interaction looked perfectly at
home, and paved the way for removing the tedious BrokenRecord::RowMapper
object in the newly designed system.
Throughout the process of building better internals from the bottom
up, I was able to make these kinds of revisions to several objects, and
also introduced a couple more internal objects to help out with various
things. Sooner or later, I reached the point where I was ready to create
an object that could serve as a drop-in replacement for the original
BrokenRecord::Table
object (the one I renamed RecordTable
).
Feeling like I might actually keep this new object around for a while,
I decided to name it TableMapper
, which at least sounded slightly
less horrible than RecordTable
. Its methods ended up looking
something this:
module BrokenRecord
class TableMapper
# ...
def create(params)
id = @table.insert(params)
find(id)
end
def find(id)
fields = @table.where(:id => id).first
return nil unless fields
@record_class.new(:table => @table,
:fields => fields,
:key => id)
end
end
end
Functionality-wise, the newly created BrokenRecord::TableMapper
was nearly a
drop in replacement for the original system, even though it had a much better
underlying design. Because it only needed to implement a handful of methods
to maintain API compatibility, integrating it went very smoothly, and required
almost no changes to the original top-level tests. Once I cut things over
and had all the tests passing, I was able to completely remove the
BrokenRecord::RecordTable
object without any issues.
Reflections
If I had not taken this more disciplined approach and instead followed my old ways, I probably would have ended up in about the same place design-wise, but it would have come at a much higher cost. I would have had fewer tests, spent more time debugging trivial errors, and probably would have cut corners in places out of impatience or frustration. The overall codebase would have still been quite brittle, and future changes would be harder to make rather than easier. Taking that less disciplined approach might have allowed me to implement this particular set of changes a little faster, but my past experiences have taught me that I always end up having to pay down my techinical debt sooner or later.
By teaching myself to think of refactoring and redesign as distinct activities, I am much more likely to stop myself from going on long exploratory cleanup missions with little guidance from my tests. This has already made a big difference in my own work, so I’d definitely recommend giving it a try.
If you have questions, or a story of your own to share, please leave me a comment!
Practicing Ruby is a Practicing Developer project.
All articles on this website are independently published, open source, and advertising-free.