Writing confident code
This article was contributed by Avdi Grimm. Avdi has been wrangling Ruby code for over a decade and shows no signs of slowing down. He is the author of *Exceptional Ruby* and *Objects on Rails*. His next book, *Confident Ruby*, focuses on writing Ruby code with a confident and straightforward style.
Losing the plot
Have you ever read a “choose your own adventure” book? Nearly every page ends with a question like this:
- If you fight the angry troll with your bare hands, turn to page 137.
- If you try to reason with the troll, turn to page 29.
- If you don your invisibility cloak, turn to page 6.
You’d pick one option, turn to the indicated page, and the story would continue.
Did you ever try to read one of those books from front to back? It’s a surreal experience. The story jumps forward and back in time. Characters appear out of nowhere. One page you’re crushed by the fist of an angry troll, and on the next you’re just entering the troll’s realm for the first time.
What if each individual page was this kind of mish-mash? What if every page read like this:
You exit the passageway into a large cavern. Unless you came from page 59, in which case you fall down the sinkhole into a large cavern. A huge troll, or possibly a badger (if you already visited Queen Pelican), blocks your path. Unless you threw a button down the wishing well on page 8, in which case there nothing blocking your way. The [troll or badger or nothing at all] does not look happy to see you.
If you came here from chapter 7 (the Pool of Time), go back to the top of the page and read it again, only imagine you are watching the events happen to someone else.
If you already received the invisibility cloak from the aged lighthouse-keeper, and you want to use it now, go to page 67. Otherwise, forget you read anything about an invisibility cloak.
If you are facing a badger (see above), and you choose to run away, turn to page 93…
Not the most compelling narrative, is it? The story asks you to carry so much mental baggage for it that just getting through a page is exhausting.
Code as narrative
What does this have to do with software? Well, code can tell a story as well. It might not be a tale of high adventure and intrigue. But it’s a story nonetheless; one about a problem that needed to be solved, and the path the developer(s) chose to accomplish that task.
A single method is like a page in that story. And unfortunately, a lot of methods are just as convoluted, equivical, and confusing as that made-up page above.
In the following sections, we’ll take a look at some examples of code that unnecessarily obscures the storyline of a method. We’ll also explore some techniques for minimizing distractions and writing methods that straightforwardly convey their intent.
Secure the borders
Here is some code that’s having some trouble sticking to the plot:
require 'date'
class Employee
attr_accessor :name
attr_accessor :hire_date
def initialize(name, hire_date)
@name = name
@hire_date = hire_date
end
def due_for_tie_pin?
raise "Missing hire date!" unless hire_date
((Date.today - hire_date) / 365).to_i >= 10
end
def covered_by_pension_plan?
# TODO Someone in HR should probably check this logic
((hire_date && hire_date.year) || 2000) < 2000
end
def bio
if hire_date
"#{name} has been a Yoyodyne employee since #{hire_date.year}"
else
"#{name} is a proud Yoyodyne employee"
end
end
end
We can speculate about the history of this class. It looks like over
the course of development, three different developers discovered that
#hire_date
might sometimes be nil
. They each chose to handle this
fact in a slightly different way. The one who wrote
#due_for_tie_pin?
added a check that raises an exception if the hire
date is missing. The developer responsible for
#covered_by_pension_plan
substituted a (seemingly arbitrary) default
value for nil
. And the writer of #bio
went with an if
statement
switching on the presence of #hire_date
.
This class has serious problems with second-guessing itself. And the
root of all this insecurity is the fact that the #hire_date
attribute is unreliable—even though it’s clearly important to the operation of the class!
One of the purposes of a constructor is to establish an object’s
invariant: a set of properties which should always hold true for that
object. In this case, it really seems like one of those invariants should be: employee hire date is a Date
.
But the constructor, whose job it is to stand guard against initial values which are not compatible with the class invariant, has fallen asleep on the job. As a result, every other method dealing with hire dates is burdened with the additional responsibility of checking whether the value is present.
This is an example of a class which needs to set some
boundaries. Since there is no obvious “right” way to handle a missing
hire date, it probably needs to simply insist on having a valid hire
date, thereby forcing the cause of these spurious nil
values to be
discovered and sorted out. To do this, it should guard its own
integrity by checking the value wherever it is set, either in the
constructor or elsewhere:
require 'date'
class Employee
attr_accessor :name
attr_reader :hire_date
def initialize(name, hire_date)
@name = name
self.hire_date = hire_date
end
def hire_date=(new_hire_date)
raise TypeError, "Invalid hire date" unless new_hire_date.is_a?(Date)
@hire_date = new_hire_date
end
def due_for_tie_pin?
((Date.today - hire_date) / 365).to_i >= 10
end
def covered_by_pension_plan?
hire_date.year < 2000
end
def bio
"#{name} has been a Yoyodyne employee since #{hire_date.year}"
end
end
In this version, the hire_date
attribute is protected by type check
in the setter method. Since the constructor now delegates to this
setter method to initialize the attribute, it is no longer possible to
construct new Employee
objects without a valid hire date. Now that
the “borders” of the object are guarded, all the other methods can
focus on telling their own stories, without being distracted by
a potentially missing hire_date
.
Be assertive
In the last section we saw how assertions in a class’ constructor or setter methods can help keep the other methods focused. But uncertainty and convoluted code can come from sources other than input parameters.
Let’s say you’re working on some budget management software. The next
user story requires the application to pull in transaction data from a
third-party electronic banking API. According to the meager
documentation you can find, you need to use the
Bank#read_transactions
method in order to load bank
transactions. The first thing you decide to do is to stash the loaded
transactions into a local data store.
class Account
def refresh_transactions
transactions = bank.read_transactions(account_number)
# ... now what?
end
end
Unfortunately the documentation doesn’t say what the
#read_transactions
method returns. An Array
seems likely. But what
if there are no transactions found? What if the account is not found?
Will it raise an exception, or perhaps return nil
? Given enough time
you might be able to work it out by reading the API library’s source
code, but it’s pretty convoluted and you might still miss some edge
cases.
You decide to make an assumption… but as insurance, you document your assumption with an assertion.
class Account
def refresh_transactions
transactions = bank.read_transactions(account_number)
transactions.is_a?(Array) or raise TypeError, "transactions is not an Array"
transactions.each do |transaction|
# ...
end
end
end
You manually test the code against a test account and it doesn’t blow up, so it seems your suspicion was correct. Next, you move on to pulling amount information out of the individual transactions.
You ask your teammate, who has had some experience with this API, what format transactions are in. She says she thinks they are hashes with string keys. You decide to tentatively try looking at the “amount” key.
transactions.each do |transaction|
amount = transaction["amount"]
end
You look at this for a few seconds, and realize that if there is no
“amount” key, you’ll just get a nil
back. Then you’d have to check
for the presence of nil
everywhere the amount is used. You’d prefer
to document your assumption more explicitly. So instead, you make an
assertion by using the Hash#fetch
method:
transactions.each do |transaction|
amount = transaction.fetch("amount")
end
Hash#fetch
will raise a KeyError
if the given key is not found,
signaling that one of your assumptions about the Bank API
was
incorrect.
You make another trial run and you don’t get any exceptions, so you proceed onward. Before you can store the value locally, you want to make sure the transaction amount is in a format that your local transaction store can understand. Nobody in the office seems to know what format the amounts come in as. You know that many financial system store dollar amounts as an integer number of cents, so you decide to proceed with the assumption that it’s the same with this system. In order to once again document your assumption, you make another assertion:
transactions.each do |transaction|
amount = transaction["amount"]
amount.is_a?(Integer) or raise TypeError, "amount not an Integer"
end
You put the code through it’s paces and… BOOM. You get an error.
TypeError: amount not an Integer
You decide to drop into the debugger on the next round, and take a look at the transaction values coming back from the API. You see this:
[
{"amount" => "1.23"},
{"amount" => "4.75"},
{"amount" => "8.97"}
]
Well that’s… interesting. The amounts are reported as decimal strings.
You decide to convert them to integers, since that’s what
your internal Transaction
class uses.
transactions.each do |transaction|
amount = transaction.fetch("amount")
amount_cents = (amount.to_f * 100).to_i
# ...
end
Once again, you find yourself questioning this code as soon as you
write it. You remember something about #to_f
being really forgiving
in how it parses numbers. A little experimentation proves this to be
true.
"1.23".to_f # => 1.23
"$1.23".to_f # => 0.0
"a hojillion".to_f # => 0.0
Only having a small sample of demonstration values to go on, you’re
not confident that the amounts this API might return will always be in
a format that #to_f
understands. What about negative numbers? Will
they be formatted as “-4.56”? Or as “(4.56)”? Having an unrecognized
amount format silently converted to zero could lead to nasty bugs down
the road.
Yet again, you want a way to state in no uncertain terms what kind of values the code is prepared to deal with. This time, you use Kernel#Float to assert that the amount is in a format Ruby can parse unambiguously as a floating point number:
transactions.each do |transaction|
amount = transaction.fetch("amount")
amount_cents = (Float(amount) * 100).to_i
cache_transaction(:amount => amount_cents)
end
Kernel#Float
is much stricter than String#to_f
:
Float("$1.23")
# ~> -:1:in `Float': invalid value for Float(): "$1.23" (ArgumentError)
# ~> from -:1:in `<main>'
The final code is chock full of assertions:
class Account
def refresh_transactions
transactions = bank.read_transactions(account_number)
transactions.is_a?(Array) or raise TypeError, "transactions is not an Array"
transactions.each do |transaction|
amount = transaction.fetch("amount")
amount_cents = (Float(amount) * 100).to_i
cache_transaction(:amount => amount_cents)
end
end
end
This code clearly states what it expects. It communicates a great deal of information about your understanding of the external API at the time you wrote it. It explicitly establishes the parameters within which it can operate confidently; as soon as any of its expectations are violated it fails quickly, with a meaningful exception message.
Unfortunately, as a result it is really telling two stories: one about refreshing transactions (remember, that’s nominally what this method is about) and one about the format of an external data source. This is quickly mended, however:
class Account
def refresh_transactions
fetch_transactions do |transaction_attributes|
cache_transaction(transaction_attributes)
end
end
# Yields a hash of cleaned-up transaction attributes for each transaction
def fetch_transactions
transactions = bank.read_transactions(account_number)
transactions.is_a?(Array) or raise TypeError, "transactions is not an Array"
transactions.each do |transaction|
amount = transaction.fetch("amount")
amount_cents = (Float(amount) * 100).to_i
yield(:amount => amount_cents)
end
end
end
By failing early rather than allowing misunderstood inputs to contaminate the system, it reduces the need for type-checking and coercion in other methods. And not only does this code document your assumptions now, it also sets up an early-warning system should the third-party API ever change unexpectedly in the future.
Represent special cases with objects
The most common causes of code that tells a confusing story are special cases. Let’s look at an example of a special case, in the context of our budgeting application.
You’ve implemented transaction import and it’s working great. Except for one little problem: users have been reporting bugs about the reported balances being off. And not just the balances; in fact, all of the reports seem to have incorrect numbers for some accounts.
You do some investigation into the system logs, and eventually discover the culprit. It turns out that some banks, when they receive a authorization for a credit card charge, immediately report it as a pending transaction in the transaction list. The data looks something like this:
{"amount" => "55.08", "type" => "pending", "id" => "98765"}
Then, when the charge is completed or “captured”, another transaction is recorded:
{"amount" => "55.08", "type" => "charge", "id" => "98765"}
Aside: if you’ve ever written code to deal with actual banking APIs, you’ve probably figured out by now that I have not. I’m making this up for the sake of example. I expect real banking APIs are just as idiosyncratic, though, in their own ways.
The result of these “double entries” is that your calculations get thrown off. Your application has routines for summing transactions, averaging them, breaking them down by month and quarter, and many more. And every one of these calculations uses the amount field to arrive at its results.
You briefly consider simply throwing out pending transactions. But after a quick consultation with your team you realize this would only introduce more problems. There is sanity-checking code in place which checks that the bank servers and the local cache have the same transaction count and contain the same transaction IDs. And not only that, you might actually want to use the pending transaction information for upcoming features.
Your second option is to handle the special case… specially, everywhere that the amount field is referenced. For example:
def account_balance
cached_transactions.reduce(starting_balance) do |balance, transaction|
if transaction.type == "pending"
balance
else
balance + transaction.amount
end
end
end
You’ll have to carefully audit the code base, adding conditionals to every use of amount. Not only that, you’ll have to make sure anyone else who works on this code understands the special case.
That doesn’t seem like a very attractive option. Thankfully, there is a third way. And the code above actually gives you the hint you needed to discover it.
Let’s look at that conditional again:
if transaction.type == "pending"
This code is branching on the type of a value. This is a huge clue. In an object-oriented language, anytime we branch on an object’s type, we’re doing work that the language could be doing for us.
You realize that this special case calls for a special type of object to represent it.
You decide to try this approach out. You find the code where transaction objects are being instantiated:
# Note: we expect that transaction attributes have already been
# converted to use Symbol keys at this point.
def cache_transaction(attributes)
cached_transactions << Transaction.new(attributes)
end
You change it to instantiate a different kind of object for pending
transactions. Because you want to quickly spike this approach, you use
an OpenStruct
to create a rough-and-ready ad-hoc object:
def cache_transaction(attributes)
transaction =
case attributes[:type]
when "pending"
pending_attributes = {
:amount => 0,
:pending_amount => attributes[:amount]
}
OpenStruct.new(attributes.merge(pending_attributes))
else
Transaction.new(attributes)
end
cached_transactions << transaction
end
This switches on type as well, but it only does it once. After that, the transaction can be used as-is in all of your existing algorithms.
You run some tests, and discover this fixes the problem! You consider simply leaving the code as it is, since it’s working now. But on reflection you decide that the concept of a pending transaction would be best represented by a proper class. That way you have a place to put documentation about this special case, as well as any more special logic you realize you need down the road.
# A pending credit-card transaction
class PendingTransaction
attr_reader :id, :pending_amount
def initialize(attributes)
@id = attributes.fetch(:id)
@pending_amount = attributes.fetch(:amount)
end
def amount
# Pending transactions duplicate finished transactions, thus
# throwing off calculations. For the purpose of calculations and
# reports, a pending transaction always has a zero amount. The
# real amount is available from #pending_amount.
0
end
end
You then rewrite #cached_transactions
to use this new class.
def cache_transaction(attributes)
transaction =
case attributes[:type]
when "pending"
PendingTransaction.new(attributes)
else
Transaction.new(attributes)
end
cached_transactions << transaction
end
This code solves the immediate problem of a special type of transaction, without duplicating logic for that special case all throughout the codebase. But not only that, it is exemplary: it sets a good example for code that follows. When, inevitably, another special case transaction type turns up, whoever is tasked with dealing with it will see this class and be guided towards representing the new case as a distinct type of object.
Conclusion
Ruby is a language which values expressiveness over just about everything else. It is optimized to help us programmers say exactly what we mean, without any extraneous fluff, to both the computer and to future readers of our code. This is what makes it so much fun to code in.
When we allow our methods to become cluttered up with ifs and maybes and provisos and digressions, we let go of that expressiveness. We start to lose the clear, confident narrative voice. We force the future maintainers of the code to navigate through a twisty path full of logical forks in the road in order to understand the purpose of a method. Reading and updating the code stops being fun.
My challenge to you is this: when you are writing a new method, keep a clear idea in mind of the story you are trying to tell. When detours and diversions start to show up along the way, figure out what you need to do to restore the narrative, and do it. You might get rid of repetitive data integrity checks by introducing preconditions in the initializer of a method. Maybe you can surround an external API in assertions that document your beliefs about it, rather than trying to handle anything it throws at you. Or perhaps you can eliminate a family of often-repeated conditionals by representing a special case as a class in its own right.
However you do it, keep your focus on telling a straightforward tale. Not only will the future readers of your code thank you for it, but I think you’ll find that it makes your code more robust and easier to maintain as well.
Practicing Ruby is a Practicing Developer project.
All articles on this website are independently published, open source, and advertising-free.