Jonathan Davies

Friendly Failing Tests

One thing I’ve been thinking about recently is making test failures clearer. It can be a bit scary when your terminal explodes with large diffs, and you often have to wade through a lot of extraneous information to understand what is actually happening.

Additionally, when working on a team, you want your tests to be as expressive as possible. Think not only about how they’re put together (for example, following the four-phase test pattern) but also about how they act as a tool for solving issues when they do fail.

Here’s a simple example.

Let’s say I’ve got a method that returns just unread RSS feed items (e.g. blog posts):

class Item < ApplicationRecord

	def self.unread
		where(read_at: nil)
	end

end

My spec looks like this:

RSpec.describe Item, type: :model do
	context '.unread' do
		it "only returns unread items" do
			# An unread item has no read_at timestamp; a read item has one.
			unread_item = create(:item, read_at: nil)
			read_item = create(:item, read_at: 1.day.ago)

			items = Item.unread

			expect(items).to eq([unread_item])
		end
	end
end

Now, what happens when this test fails? Let’s say that, for some reason, we switch the unread method to check for a bookmarked_at value instead of read_at (the specifics of why it failed don’t really matter).

Failures:

  1) Item.unread only returns unread items
     Failure/Error: expect(items).to eq([unread_item])

       expected: [#<Item id: 532, title: "Unread Item", content: "Content", permalink: "http://marvin.com/kurt", publi...5-12 15:53:28.096098000 +0000", updated_at: "2022-05-12 15:53:28.096098000 +0000", account_id: 997>]
            got: #<ActiveRecord::Relation [#<Item id: 532, title: "Unread Item", content: "Content", permalink: "http:...-12 15:53:28.105968000 +0000", updated_at: "2022-05-12 15:53:28.105968000 +0000", account_id: 999>]>

       (compared using ==)

       Diff:
       @@ -1,26 +1,51 @@
       -[#<Item id: 532, title: "Unread Item", content: "Content", permalink: "http://marvin.com/kurt", published_at: "2022-05-11 15:53:28.069097000 +0000", feed_id: 781, entry_id: "entry-id-1", read_at: nil, bookmarked_at: nil, created_at: "2022-05-12 15:53:28.096098000 +0000", updated_at: "2022-05-12 15:53:28.096098000 +0000", account_id: 997>]
       +[#<Item:0x0000000105993c38
       +  id: 532,
       +  title: "Unread Item",
       +  content: "Content",
       +  permalink: "http://marvin.com/kurt",
       +  published_at: Wed, 11 May 2022 15:53:28.069097000 UTC +00:00,
       +  feed_id: 781,
       +  entry_id: "entry-id-1",
       +  read_at: nil,
       +  bookmarked_at: nil,
       +  created_at: Thu, 12 May 2022 15:53:28.096098000 UTC +00:00,
       +  updated_at: Thu, 12 May 2022 15:53:28.096098000 UTC +00:00,
       +  account_id: 997>,
       + #<Item:0x0000000105993af8
       +  id: 533,
       +  title: "Read Item",
       +  content: "Content",
       +  permalink: "http://abbott-wolf.net/guadalupe",
       +  published_at: Wed, 11 May 2022 15:53:28.098398000 UTC +00:00,
       +  feed_id: 782,
       +  entry_id: "entry-id-2",
       +  read_at: Wed, 11 May 2022 15:53:28.098437000 UTC +00:00,
       +  bookmarked_at: nil,
       +  created_at: Thu, 12 May 2022 15:53:28.105968000 UTC +00:00,
       +  updated_at: Thu, 12 May 2022 15:53:28.105968000 UTC +00:00,
       +  account_id: 999>]

As you can see from the amount of scrolling you’ve had to do, the test isn’t doing a very good job of describing exactly what has gone wrong. It failed because we’re now returning read items too.

How can we improve the test? Here are some improvements:

RSpec.describe Item, type: :model do
	context '.unread' do
		it 'only returns unread items' do
			create(:unread_item, title: 'Unread Item')
			create(:read_item, title: 'Read Item')

			items = Item.unread.map(&:title)

			expect(items).to eq(["Unread Item"])
		end
	end
end

Now what do we get when we run the tests?

Item.unread only returns unread items
     Failure/Error: expect(items).to eq(["Unread Item"])

       expected: ["Unread Item"]
            got: ["Unread Item", "Read Item"]

       (compared using ==)

Now that’s a lot easier to understand.

Here’s why I like it:

  • We’re leveraging attributes on the records themselves, without having to come up with new variable names (unread_item, read_item - yuck)
  • The names express the intent of the object and why we’re using it
  • By mapping over the title attribute we’re able to remove the noise that was clogging up the diff.
  • If you’re following a TDD approach, you want your failures to give you as big a clue as possible about which step to implement next.
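
If you want to see the effect of mapping over the title without Rails or RSpec, here’s a minimal plain-Ruby sketch. Both names are assumptions for illustration: Item is a Struct standing in for the real model, and failure_message is a hypothetical helper that only imitates the expected/got layout of a matcher failure. Neither comes from the app above.

```ruby
# Plain-Ruby stand-in for the Item model (assumption: the real one is an ActiveRecord class).
Item = Struct.new(:id, :title, :permalink, :read_at, keyword_init: true)

# Hypothetical helper that imitates the expected/got layout of a failure message.
def failure_message(expected, actual)
  "expected: #{expected.inspect}\n     got: #{actual.inspect}"
end

unread = Item.new(id: 1, title: "Unread Item", permalink: "http://example.com/a", read_at: nil)
read   = Item.new(id: 2, title: "Read Item", permalink: "http://example.com/b", read_at: Time.at(0).utc)

# Simulate the broken scope, which returns read items too.
returned = [unread, read]

# Comparing whole records drags every attribute into the message.
noisy = failure_message([unread], returned)

# Comparing titles keeps only the attribute the test cares about.
concise = failure_message(["Unread Item"], returned.map(&:title))

puts concise
# expected: ["Unread Item"]
#      got: ["Unread Item", "Read Item"]

puts "noisy: #{noisy.length} chars, concise: #{concise.length} chars"
```

The more attributes a record has, the bigger the gap between the two messages gets, which is why the real Item above produced such a long diff.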