Spend more time playing.

I've been working with Ruby on Rails since 2012, mostly helping small businesses and startups get from 0 to profitable in the most cost-efficient way possible. You will learn from more than a decade of my experience as a software engineer & product manager, and from my new challenges as a founder. My goal is to help you build your business in a fast & maintainable way, while saving you time and money, and have fun the whole time.

Aug 07 • 4 min read

How I test my Rails apps


Solo Rails / Issue #2

There is a lot of noise about how to test a Rails application. It's easy to get lost. Let's get down to the essentials.

What to test first? Does coverage matter?

Ultimately, you'll want to test everything that matters. Coverage matters, but not in the way you think. Traditional code coverage tools, which measure coverage by lines of code or functions called, measure the wrong thing. You need to approach coverage from the user's perspective. The failure state isn't "there is a bug in production", it's "this person leaves, stops using and/or paying for my app". I approach this in terms of critical paths of user experience.

This includes, but is not limited to: sign up, sign in, password recovery, common billing tasks (subscribe, up/down sell, cancel), interacting with your main Domain Object (e.g. if you're a blogging platform, drafting/publishing/editing a post), everything that explicitly costs money (e.g. premium features).

When I have limited time or energy to spend on testing, or I just need to pick one to start, I go through the following metrics:

1. Screw It Probability: the chance that if that part of the code breaks in some way, the user gives up and goes to look for an alternative.

2. Dollar Lost per Minute Risk: the chance that the part of the code breaking turns into a bug that loses me dollars per minute (e.g. a security risk, inserting incorrect data, accidentally deleting data).

3. Code Churn: (not to be confused with subscription churn) code that changes often, either for technical or product reasons. You'd think this is counter-intuitive (why bother writing tests if it changes so often?), but every time you significantly alter code, there's a chance you're introducing an unseen bug or missing a side effect.

Again, ultimately everything should be tested, but everything doesn't mean every line of code, it means every conceivable interaction between your application and its user.
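The three metrics above can be sketched as a back-of-the-napkin ranking. This is illustrative Ruby only: the area names and the 1-5 scores are made up, and in practice the scoring is a judgment call, not a measurement.

```ruby
# Illustrative only: rank areas of the app by the three metrics above.
# Scores are subjective gut-feel values from 1 (low) to 5 (high).
CANDIDATES = {
  "billing/subscribe" => { screw_it: 5, dollar_loss: 5, churn: 2 },
  "posts/publish"     => { screw_it: 4, dollar_loss: 2, churn: 4 },
  "settings/avatar"   => { screw_it: 1, dollar_loss: 1, churn: 1 }
}

# A naive priority: just sum the three scores. You could weight them
# (e.g. double dollar_loss) if one risk dominates for your business.
def test_priority(scores)
  scores[:screw_it] + scores[:dollar_loss] + scores[:churn]
end

ranked = CANDIDATES.sort_by { |_, scores| -test_priority(scores) }
ranked.each { |name, scores| puts "#{name}: #{test_priority(scores)}" }
# billing/subscribe ranks first, settings/avatar last.
```

The point isn't the arithmetic; it's forcing yourself to compare areas of the app against each other instead of testing whatever file happens to be open.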

What type of test?

There are multiple test "granularity levels", from broadest to narrowest:

  • System (or End-to-End, E2E): mimics a real in-browser interaction as closely as possible, with as many real dependencies as possible (e.g. real database, real browser with JS enabled).
  • Feature: like System, but without JS enabled. Other dependencies may optionally be omitted or mocked.
  • Controller/routing: focused on the behavior of controllers (e.g. response statuses, formats) and on routing behavior. I rarely use them, even in large projects.
  • Unit: tests the behavior of a class or method as an isolated unit. This includes mailer, job, service/interactor & other Rails-specific unit types.

We want to start from the top down, so our main test for a given interaction will be a system test (if the interaction requires JS), or a feature test (if it doesn't require JS).

Note: having feature specs run in a barebones browser engine without JavaScript (rack_test) is an optimization detail specific to my projects. Your setup may vary, and you simply might not be able to afford it; for example, if your project uses a heavy JS front-end, like a React SPA, I would recommend writing your E2E specs in a specialized test runner like Playwright.

I will note, however, that being able to run many if not most E2E specs without JavaScript (and hence having a massively faster test suite) is a fantastic perk of a modern Rails stack using Turbo 8 and morphing.
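That split between fast JS-less feature specs and browser-driven system specs can be wired up roughly like this. This is a sketch assuming RSpec and Capybara; the driver names are Capybara's commonly registered defaults, not necessarily the article's exact setup.

```ruby
# spec/support/drivers.rb — illustrative sketch, assuming RSpec + Capybara.
RSpec.configure do |config|
  # Feature specs: no JavaScript, barebones rack_test engine (very fast).
  config.before(:each, type: :feature) do
    Capybara.current_driver = :rack_test
  end

  # System specs: a real headless browser with JavaScript enabled
  # (slower, but closest to what the user actually experiences).
  config.before(:each, type: :system) do
    driven_by :selenium_chrome_headless
  end
end
```

With a setup along these lines, promoting a spec from "no JS needed" to "needs a real browser" is just a matter of changing its type.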

Here's what that looks like:

  • "Golden path" (feature/system spec): I visit the page / click on the button / the expected result happens in a way the user can see.
  • Authorization - likely case (feature spec): I visit a page with a record I'm not allowed to see: I am redirected with a message.
  • Empty state case (feature spec): I visit a page with a list of records, the list is empty, I'm seeing a meaningful empty state.
  • Mutation/interactor spec (unit spec): if a mutation (creating/updating a record, etc) has complex logic, it gets its own unit spec.
  • Filtering logic - canary case (feature spec): I filter a collection from a control on the page, validating that additional parameters are correctly routed to the query object/interactor that handles filtering, and the expected results are displayed.
  • Filtering logic - additional cases (unit spec): searching, sorting and other filters are tested in isolation in the aforementioned object.
  • Authorization - additional cases (unit spec): likewise, authorization is one of the layers which is likely to have a lot of scenarios & combinations. So, it's tested in isolation, provided a canary case is tested end-to-end.
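To make the list above concrete, here is what the "golden path" and "authorization, likely case" entries might look like as feature specs. This is an illustrative sketch, not code from the article: it assumes RSpec, Capybara, FactoryBot factories, a hypothetical Post model, and a sign_in test helper.

```ruby
# spec/features/posts_spec.rb — illustrative sketch; all names are hypothetical.
require "rails_helper"

RSpec.describe "Publishing a post", type: :feature do
  let(:user) { create(:user) }                     # assumes FactoryBot
  let(:post) { create(:post, author: user, state: :draft) }

  before { sign_in user }                          # assumes a sign-in helper

  it "publishes a draft (golden path)" do
    visit edit_post_path(post)
    click_button "Publish"

    # Assert on what the user can see, not on internal state.
    expect(page).to have_content("Your post is live")
  end

  it "redirects from someone else's post (authorization, likely case)" do
    other_post = create(:post)                     # belongs to another user

    visit edit_post_path(other_post)

    expect(page).to have_current_path(root_path)
    expect(page).to have_content("You are not allowed to do that")
  end
end
```

Note how each example reads as a user story: visit a page, act, observe a visible result. The complex authorization matrix itself would live in a unit spec for whatever object encapsulates that logic.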

Why not just feature specs for everything?

Since they have the most comprehensive coverage, it would be easiest to just write feature/system specs for everything. However, they are much slower than unit tests, by at least 1-2 orders of magnitude.

I think of feature specs as the first line of defense, or the proverbial canary in the coal mine. They're meant to be broad and catch most shallow mistakes. Every time I need to test more complex or more variable behavior, I drill down one level.

It's all about balancing coverage, performance of the test suite, and confidence in the stability of the application. Perfect coverage is not worth a 20-minute test run, and a blazing fast suite isn't worth all the bugs it lets through.

Performance does matter a lot. If your suite is too slow, you will run it as rarely as possible, you will be tempted to skip it before deploying, etc. It will become a burden.

In summary

  • Start with system/feature specs with broad coverage. Cover the most likely cases.
  • Drill down to specialized object unit specs when you're dealing with complexity or repetition.
  • Use controller specs where appropriate if you don't have object-encapsulated logic.
  • Balance performance with coverage & precision.

Going further

This article is already getting long, and I haven't touched upon the following:

  • How I approach mocking & testing with external dependencies
  • What libraries I actually use to test
  • How I deal with test data (fixtures vs. factories)
  • How I deal with flaky tests
  • My take on TDD (Test-Driven Development)

These are all subjects I want to write about eventually, but I want your feedback to shape the direction of this newsletter. If one of those interests you more than the others, please let me know.

I also realize this is a very abstract subject, and this article goes over a lot of things quickly. If you'd like a more detailed breakdown of something, please let me know as well.

Build well, build smart, but spend more time playing.

— Kevin



