Friendly Failing Tests

One thing I’ve been thinking about recently is making test failures clearer. It can be a bit scary when your terminal explodes with large diffs, and you often have to wade through a lot of extraneous information to understand what is actually happening.

Additionally, when working on a team, you want your tests to be as expressive as possible. Think not only about how they are put together (e.g. the four-phase test pattern) but also about how well they act as a tool for solving issues when they do fail.

Here’s a simple example.

Let’s say I’ve got a method that returns just unread RSS feed items (e.g. blog posts):

class Item < ApplicationRecord

	def self.unread
		where(read_at: nil)
	end

end

My spec looks like this:

RSpec.describe Item, type: :model do
	context '.unread' do
		it "only returns unread items" do
			# An unread item has no read_at timestamp; a read item has one.
			unread_item = create(:item, read_at: nil)
			read_item = create(:item, read_at: 1.day.ago)

			items = Item.unread

			expect(items).to eq([unread_item])
		end
	end
end

Now, what happens when this test fails? Let’s say that, for some reason, we switch the unread method to check bookmarked_at instead of read_at (the specifics of why it failed don’t really matter).

Failures:

  1) Item.unread only returns unread items
     Failure/Error: expect(items).to eq([unread_item])

       expected: [#<Item id: 532, title: "Unread Item", content: "Content", permalink: "http://marvin.com/kurt", publi...5-12 15:53:28.096098000 +0000", updated_at: "2022-05-12 15:53:28.096098000 +0000", account_id: 997>]
            got: #<ActiveRecord::Relation [#<Item id: 532, title: "Unread Item", content: "Content", permalink: "http:...-12 15:53:28.105968000 +0000", updated_at: "2022-05-12 15:53:28.105968000 +0000", account_id: 999>]>

       (compared using ==)

       Diff:
       @@ -1,26 +1,51 @@
       -[#<Item id: 532, title: "Unread Item", content: "Content", permalink: "http://marvin.com/kurt", published_at: "2022-05-11 15:53:28.069097000 +0000", feed_id: 781, entry_id: "entry-id-1", read_at: nil, bookmarked_at: nil, created_at: "2022-05-12 15:53:28.096098000 +0000", updated_at: "2022-05-12 15:53:28.096098000 +0000", account_id: 997>]
       +[#<Item:0x0000000105993c38
       +  id: 532,
       +  title: "Unread Item",
       +  content: "Content",
       +  permalink: "http://marvin.com/kurt",
       +  published_at: Wed, 11 May 2022 15:53:28.069097000 UTC +00:00,
       +  feed_id: 781,
       +  entry_id: "entry-id-1",
       +  read_at: nil,
       +  bookmarked_at: nil,
       +  created_at: Thu, 12 May 2022 15:53:28.096098000 UTC +00:00,
       +  updated_at: Thu, 12 May 2022 15:53:28.096098000 UTC +00:00,
       +  account_id: 997>,
       + #<Item:0x0000000105993af8
       +  id: 533,
       +  title: "Read Item",
       +  content: "Content",
       +  permalink: "http://abbott-wolf.net/guadalupe",
       +  published_at: Wed, 11 May 2022 15:53:28.098398000 UTC +00:00,
       +  feed_id: 782,
       +  entry_id: "entry-id-2",
       +  read_at: Wed, 11 May 2022 15:53:28.098437000 UTC +00:00,
       +  bookmarked_at: nil,
       +  created_at: Thu, 12 May 2022 15:53:28.105968000 UTC +00:00,
       +  updated_at: Thu, 12 May 2022 15:53:28.105968000 UTC +00:00,
       +  account_id: 999>]

As you can tell from the amount of scrolling you’ve had to do, the test isn’t doing a very good job of describing exactly what has gone wrong. It failed because we’re now returning read items too.
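To make that regression concrete without a database, here is a minimal plain-Ruby sketch. The Struct, the in-memory ITEMS list, and both filter methods are stand-ins I've assumed for illustration; the real code is the ActiveRecord method shown earlier.

```ruby
# Plain-Ruby stand-in for the Item model (hypothetical; no Rails required).
Item = Struct.new(:title, :read_at, :bookmarked_at, keyword_init: true)

ONE_DAY = 24 * 60 * 60 # seconds

ITEMS = [
  Item.new(title: "Unread Item", read_at: nil, bookmarked_at: nil),
  Item.new(title: "Read Item", read_at: Time.now - ONE_DAY, bookmarked_at: nil)
].freeze

# Intended behaviour: an item is unread when read_at is nil.
def unread(items)
  items.select { |item| item.read_at.nil? }
end

# The regression: filtering on bookmarked_at lets read items through,
# because neither item has been bookmarked.
def unread_with_regression(items)
  items.select { |item| item.bookmarked_at.nil? }
end

puts unread(ITEMS).map(&:title).inspect                 # ["Unread Item"]
puts unread_with_regression(ITEMS).map(&:title).inspect # ["Unread Item", "Read Item"]
```

Both items have a nil bookmarked_at, so the broken filter returns the read item as well, which is exactly the extra record buried in the diff.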

How can we improve the test? Here are some improvements:

RSpec.describe Item, type: :model do
	context '.unread' do
		it 'only returns unread items' do
			create(:unread_item, title: 'Unread Item')
			create(:read_item, title: 'Read Item')

			items = Item.unread.map(&:title)

			expect(items).to eq(["Unread Item"])
		end
	end
end

Now what do we get when we run the tests?

Item.unread only returns unread items
     Failure/Error: expect(items).to eq(["Unread Item"])

       expected: ["Unread Item"]
            got: ["Unread Item", "Read Item"]

       (compared using ==)

Now that’s a lot easier to understand.

Here’s why I like it:

We’re leveraging attributes on the records themselves, without having to come up with new variable names (unread_item, read_item - yuck).
The names express the intent of the object and why we’re using it.
By mapping over the title attribute, we’re able to remove the noise that was clogging up the diff.
If you’re following a TDD approach, you want your failures to give you as big a clue as possible about which step to implement next.
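As a rough illustration of the noise point, here is a plain-Ruby sketch that compares the two styles of failure message. FeedItem and failure_message are hypothetical names of my own, and the message format only loosely imitates what a test runner prints; it is not RSpec's actual implementation.

```ruby
# Hypothetical stand-in for a record with several attributes.
FeedItem = Struct.new(:id, :title, :content, :permalink, :read_at, keyword_init: true)

# Hypothetical helper that mimics the shape of an equality failure message.
def failure_message(expected, actual)
  "expected: #{expected.inspect}\n     got: #{actual.inspect}"
end

unread_item = FeedItem.new(id: 1, title: "Unread Item", content: "Content",
                           permalink: "http://example.com/a", read_at: nil)
read_item = FeedItem.new(id: 2, title: "Read Item", content: "Content",
                         permalink: "http://example.com/b", read_at: Time.at(0))

# What a broken unread filter might hand back.
returned = [unread_item, read_item]

# Comparing whole records prints every attribute of every record.
noisy = failure_message([unread_item], returned)

# Comparing titles prints only the attribute that tells the story.
focused = failure_message([unread_item.title], returned.map(&:title))

puts focused
puts "noisy: #{noisy.length} characters, focused: #{focused.length} characters"
```

The focused message carries the same information a reader needs (a read item slipped through) in a fraction of the text.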