NOTE: This is one of four lessons learned from my 90 day self-study on test-driven development. If this topic interests you, be sure to check out the other lessons!

When used in moderation, experimental spikes can be a very powerful tool for shining some light into the dark corners of your projects. However, there is a natural tension between chaotic spiking and formal TDD practices that needs to be balanced if you want to use the two techniques side by side. Equalizing these forces can be very challenging, and it is something that I have struggled with throughout my career.

Because I started programming as a self-taught hobbyist, I spent many years writing code without a well defined process. As I started to work on larger and more important projects, I learned how to program in a more disciplined way. I developed an interest in object-oriented design and also picked up the basics of test-driven development. These methodologies helped me work in a more controlled fashion when I needed to, but they did not do much to change my everyday coding habits. I still relied on lots of messy experimentation; I just knew how to clean up my code so that I didn’t end up shipping sloppy work in the end.

While I have managed to be very productive over the years, my day to day efficiency has been very unpredictable because of the way that I do things. This is something I have been aware of for some time, and was one of the main problems that I wanted to take a closer look at during this study. With that in mind, I will now walk you through three examples of where I broke away from TDD to try out some experiments and then share my thoughts on what worked and what didn’t work.

Exploring the unknown

I knew when I started working on Blind that I would need to learn how to do two things with the Ray game engine that I hadn’t done before: work with positional audio, and write tests against the UI layer. I knew that these things were supported by Ray because the documentation had examples for them, but I needed to convince myself that they would work in practice by building a small proof of concept.

Rather than trying to build realistic examples that matched how I would end up using these features, I instead focused on their most basic prerequisites. For example, I knew that I’d never be able to have dynamically positioned sound emitters in a three-dimensional space if I couldn’t play a simple beeping sound without any positioning at all. I also saw from the documentation that in order to write tests against Ray it was necessary to use its class-based API rather than using its fancy DSL. Combining those two ideas together lead me to build the following (almost trivial) spike solution:

require "ray"

class MainScene < Ray::Scene
  scene_name :main

  def setup
    @sound = sound("beep.wav")
  end

  def register
    always do
      @sound.play
      sleep @sound.duration
    end
  end

  def render(win)
  end
end

class Game < Ray::Game
  def initialize
    super "Awesome Game"

    MainScene.bind(self)

    scenes << :main
  end
end

Game.new.run

While this code was little more than the end result of mixing a couple examples from Ray’s documentation together, it helped me verify that there weren’t any problems playing sounds on my system, and that the documentation I was reading was up to date.

Coincidentally, this tiny script helped me notice that my wife’s laptop was missing the core audio dependencies that Ray needed; which is a perfect example of what this kind of spike is made to test. It also gave me an opportunity to answer some questions that the documentation didn’t make clear to me. For example, removing the sleep call made me realize that playing a sound was a non-blocking operation, and deleting the render method made me realize that it only needed to be provided if it was doing something useful. In a fairly complex and immature project like Ray, this kind of trial-and-error based investigation is often a faster way to find answers than digging through source code.

I was actually very happy with the outcomes from this spike, and the effort I put into it was minimal compared to what I got out of it. While I can’t say the same for the other experiments I am about to show you, this little script serves as a nice example of spiking done right.

Trying out a new design

Mid-way through working on Blind, I decided to completely change the way I was modeling things. All elements in the game were originally modeled as rectangles, but as I tweaked the game rules, I started to realize that all I really cared about was point-to-point distance between the player and various locations in the world. The hoops I was having to jump through to work with rectangular game elements eventually got annoying enough that I decided to try out my new ideas on an experimental branch.

I started working on this redesign from the bottom up, test-driving a couple supporting objects that I knew I’d need, including a very boring Point class. Despite the smooth start, it eventually became clear to me that this approach would only take me so far: the original Game class was tightly coupled to a particular representation of Blind’s world. To make matters worse, the UI code I had written was a messy prototype that I hadn’t cleaned up or tested properly yet. These issues left me stuck between a rock and a hard place.

I had already sunk a lot of time into building the new object model, but didn’t want to keep investing in it without being reasonably sure that it was the right way to go. To build up my confidence, I decided to do a quick spike to transform the old UI into something that could work on top of the new object model.

Within an hour or two, I had a working game running on top of the new codebase. I made several minor changes and added a couple new features to various objects in the process of doing so, without writing any tests for them. I originally assumed that I didn’t need to write tests because I expected to throw all this code away, but after wrapping up my experiment I decided that the code was good enough to merge could be easily cleaned up later. This decision eventually came back to haunt me.

Over the next several days, I ran into small bugs in various edge case scenarios in the code that had been implemented during the spike. For example, the randomized positioning of mines and exit locations had not been rewritten to account for the fact that the game no longer defined regions as rectangles, and that would occasionally cause them to spawn in the wrong regions. The following patch was required to fix that problem:

       @current_position = Blind::Point.new(0,0)
 
       @mine_positions   = mine_count.times.map do
-        Blind::Point.new(rand(MINE_FIELD_RANGE), rand(MINE_FIELD_RANGE))
+        random_minefield_position
       end
 
-      @exit_position = 
-        Blind::Point.new(rand(MINE_FIELD_RANGE), rand(MINE_FIELD_RANGE))
+      @exit_position = random_minefield_position
     end
 
     attr_reader :current_position, :mine_positions, :exit_position
@@ -42,5 +41,15 @@ def current_region
         :outer_rim
       end
     end
+
+    private
+    
+    def random_minefield_position
+      begin 
+        point = Blind::Point.new(rand(MINE_FIELD_RANGE), rand(MINE_FIELD_RANGE))
+      end until MINE_FIELD_RANGE.include?(@center.distance(point))
+
+      point
+    end
   end
 end

Similarly, whenever I wanted to refactor some code to introduce a change or extend functionality in some way, I needed to write tests to fill the coverage gaps that were introduced during my experiment. This lead to a temporary but sharp rise in the cost of change, and that caused my morale to plummet.

Looking back on what happened, I think the problem was not that I created an experimental branch with some untested code on it, but that I decided to keep that code rather than throwing it out and starting fresh. Wiring up my new data model to the UI and seeing a playable game come out of it was a huge confidence booster, and it only cost me a couple hours to get to that point. But because I decided to merge that code into master, I inherited several more hours of unpredictable maintenance work that might have been avoided if I had redone the work in a more disciplined way.

Sketching out an idea

About mid-way through my study, I had an idea for a project that I knew I wouldn’t have time for right away: an abstract interface for describing vector drawings. However, because I couldn’t stop thinking about the problem, I decided I needed to make a simple prototype to satisfy my curiosity. An entire evening of hacking got me to the point where I was able to generate the following image in PDF format using Prawn:

The basic idea of my abstract interface was that rather than making direct calls to Prawn’s APIs, you could instead describe your diagrams in a general way, such as in the following example:

drawing = Vellum::Drawing.new(300,400)

drawing.layer(:box) do |g|
  g.rect(g.top_left, g.width, g.height)
end

drawing.layer(:x) do |g|
  g.line(g.top_left,  g.bottom_right)
   .line(g.top_right, g.bottom_left)
end

drawing.layer(:cross) do |g|
  g.line([g.width / 2, 0], [g.width / 2, g.height])
   .line([0, g.height / 2], [g.width, g.height/2])
end

drawing.style(:x,     :stroke_color => "ff0000") 

drawing.style(:box,   :line_width   => 2, 
                      :fill_color   => "ffffcc")

drawing.style(:cross, :stroke_color => "00ff00")

A Vellum::Renderer object would then be used to turn this abstract representation into output in a particular format, using some simple callbacks. A Prawn-based implementation is shown below:

require "prawn"

pdf      = Prawn::Document.new
renderer = Vellum::Renderer.new

renderer.on(Object) do |shape, style|
  pdf.stroke_color = style.fetch(:stroke_color, "000000")
  pdf.fill_color   = style.fetch(:fill_color, "ffffff")
  pdf.line_width   = style.fetch(:line_width, 1)
end

renderer.on(Vellum::Line) do |shape, style|
  pdf.stroke_line(shape.p1, shape.p2)
end

renderer.on(Vellum::Rectangle) do |shape, style|
  pdf.fill_and_stroke_rectangle(shape.point, shape.width, shape.height)
end

renderer.render(drawing)

pdf.render_file("foo.pdf")

Looking back on this code, I’m still excited by the basic idea, because it would make it possible for backend-agnostic graphics code to be written, and would allow for more than a few interesting manipulations of the abstract structures prior to rendering. However, I can’t help but think that for a throwaway prototype, there is far too much detail here.

If you take a closer look at how I actually implemented Vellum, you’ll find that I shoved together several classes into a single file, which I stowed away on a gist. I never bothered to record the history of my experiment, which I assume was actually built up incrementally rather than designed all at once. Without a single test to guide me, I would need to study the implementation code all over again if I wanted to begin to understand what I had actually learned from my experiment.

While it is hard to say whether this little prototype was worth the effort or not, it underscores a bad habit of mine that bites me from time to time: I can easily get excited about an idea and then dive into it with reckless abandon. In this particular situation, I ended up with some working code at the end of my wild hacking session, but there were several other ideas I worked on during my study that I ended up getting nowhere with.

What makes spiking different from cowboy coding?

The main thing I learned from taking a look at how I work on experimental ideas is that there is a big difference between spiking and cowboy coding.

When you are truly working on a spike, you have a specific question in mind that you want to answer, you know roughly how much you’re willing to invest in finding out that answer, and you cut as many corners as possible to get that answer as quickly as possible. The success of a spike is measured by what you learn, not what code you produce. Once you feel that you understand what needs to be done, you pull yourself out of spike mode and return to your more disciplined way of doing things.

Cowboy coding, on the other hand, is primarily driven by gut feelings, past experiences, and on-the-spot decision making. This kind of programming can be fun because it allows you to write code quickly without thinking deeply about its consequences, but in most circumstances, you end up needing to pay for your lack of discipline somewhere down the line.

Of the three examples I discussed in this article, the first one looks and feels like a true spike, and the third one is the result of straight-up guns-blazing cowboy coding. The second example lies somewhere between those two extremes, and perhaps represents a spike that turned into a cowboy coding session. I think scenarios like that are what we really need to look out for, because it is very easy to drop our good practices but much harder to return to them.

Now that I’ve laid this all out on the line for you, I’d love to hear some of your own stories! Please leave a comment if you have an interesting experience to share, or if you have any questions for me.

NOTE: While doing some research for this article, I stumbled across a nice excerpt from “The Art of Agile Development” which describes how to safely make use of spike solutions. It’s definitely worth checking out if you’re interested in studying this topic more.