Writing UI tests in React applications without breaking a sweat
Written by German Ivanov
We all want our software to be stable. To achieve this, we test the software and we test it a lot. The same principle should be applied to customer-facing applications. However, it’s often neglected, which leads to un(der)tested code in production.
What do people do in these cases? They try to test their code according to the testing pyramid. It is an idea that states that the majority of your tests should be unit tests, some should be integration tests, and only a few should be end-to-end tests. This model tries to minimize the number of automated tests you have to write without compromising on overall stability.
In this article, we’ll look at how we’ve approached this problem at Sixfold, and the lessons we’ve learned along the way. We’ll also look at why the classical testing pyramid might not be the best pick for modern UI applications. Examples in this article are React-centric, but I believe they apply to any other component-based application framework.
Note: this article is not an introductory one. It is targeted towards more experienced front-end developers.
Unit testing
Let’s start with the foundation of this pyramid — unit tests. They’re supposed to be the most frequently written ones. One could even argue you should strive for 100% coverage of your codebase with them. In theory, it sounds like a great idea that brings a lot of value. You get resilient pieces of reusable, bullet-proof code. But do you though? Let’s focus on the specifics of our domain and see what usually happens.
Unit tests have to be isolated to be effective. In theory, this should play well with React components, which are also supposed to be self-contained to an extent. So that we have something to discuss, let's use this piece of code as an example:
We would then probably write something like this to verify it:
This is a typical test which checks whether the callback is fired on the click event and how the content is rendered. However, this test wouldn't provide much value to you. First of all, it can't say whether the button has the correct visual appearance, which is arguably very valuable in UI testing. Secondly, it doesn't say whether this button would work as part of a larger form, where there are many more moving pieces. Thirdly, even on this isolated level, this test can fall short. Let's assume that the styles of this button look like this:
This button won’t work in an actual browser, yet the tests pass. Why is that? Because jsdom is an environment that can't fully simulate how browsers work. In fact, the project's stated goal is only to emulate enough of a browser to be useful. Tests have to be written with this peculiarity in mind.
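To make the mismatch concrete, here is a framework-free toy model of the situation. It is a sketch, not jsdom or Testing Library code: `FakeButton`, `dispatchClick` and `userLikeClick` are hypothetical names, and the offending style is assumed to be `pointer-events: none` purely for illustration. The point is that a raw event dispatch runs the handler without asking whether a pointer could ever reach the element, while a user-like click asks that question first.

```typescript
// Toy model only: FakeButton, dispatchClick and userLikeClick are
// hypothetical names, not jsdom or Testing Library APIs.
interface FakeButton {
  label: string;
  // Stands in for the computed CSS value of `pointer-events`.
  pointerEvents: "auto" | "none";
  onClick: () => void;
}

// Raw event dispatch: the handler runs regardless of styling,
// because nothing checks whether a pointer could reach the element.
function dispatchClick(button: FakeButton): void {
  button.onClick();
}

// User-like click: refuses to act on an element a real pointer
// could not hit, so the broken style surfaces as a failure.
function userLikeClick(button: FakeButton): void {
  if (button.pointerEvents === "none") {
    throw new Error(`Cannot click "${button.label}": pointer-events is none`);
  }
  button.onClick();
}

let clicks = 0;
const broken: FakeButton = {
  label: "Search",
  pointerEvents: "none", // assumed style; the article's stylesheet is not shown
  onClick: () => {
    clicks += 1;
  },
};

dispatchClick(broken); // the handler fires, so a naive test would pass

let userClickFailed = false;
try {
  userLikeClick(broken);
} catch (_error) {
  userClickFailed = true; // the user-like check exposes the problem
}
```

In this model the dispatch-based check reports a working button, while the user-like check reports the same failure a person with a mouse would run into.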
End-to-end (E2E) testing
Based on the aforementioned issues, E2E testing seems like a better fit. It’s executed in an actual browser so it gets as close to the user experience as possible. However, the source of benefits is also the main source of disadvantages. Browser tests tend to be slow, expensive and hard to make stable.
The main reason is the number of parts these tests depend on: the network, your cloud provider, your backend API and your luck on the day. Sometimes this is a good thing; it works well for a smoke test, for example. But when we’re trying to verify highly interactive user scenarios, we want results we can rely on at scale.
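A rough back-of-the-envelope calculation shows why scale matters here. The figures below are invented for illustration, and the sketch assumes the parts fail independently of each other, which real systems rarely do exactly. Even so, it gives a feel for how quickly small per-part risks compound.

```typescript
// Back-of-the-envelope sketch. The reliability figures are made up
// for illustration and the parts are assumed to fail independently.

// Chance that one test passes when it needs every part to behave.
function testPassProbability(partReliability: number, partCount: number): number {
  return Math.pow(partReliability, partCount);
}

// Chance that a whole suite is green when every test must pass.
function suitePassProbability(perTest: number, testCount: number): number {
  return Math.pow(perTest, testCount);
}

// Four parts (network, cloud provider, backend API, "luck"), each 99% reliable.
const singleTest = testPassProbability(0.99, 4);

// A hypothetical suite of 200 such browser tests.
const fullSuite = suitePassProbability(singleTest, 200);
```

Under these assumptions a single test passes about 96% of the time, but a fully green run of the whole suite happens well under 1% of the time. That is why a handful of browser tests can be tolerable while hundreds of them become a constant source of reruns.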
There are several things browser tests are very good for: visual testing, cross-browser compatibility checks, smoke tests. These scenarios are usually less painful to get stable and bring considerable value compared to the amount of effort put in. They might also be used for covering some business-critical happy scenarios which are worth spending the time on.
So, in this case, I agree with the test pyramid that end-to-end tests should be the least frequently written ones. If neither unit tests nor end-to-end tests can give us enough confidence, what else can we do about it?
Enhanced integration testing
I believe the most frequently written test suites in UI applications have to be integration tests. They are relatively cheap, fast to run and can cover all usage scenarios your app has — all happy and not-so-happy paths.
So, the first step to success is to simulate events as realistically as possible. For that, we can use a library called @testing-library/user-event. It does many neat tricks to ensure our components behave the same way during tests as they do in the browser. It’s part of the Testing Library family, which encourages many good practices. The main motivation behind it is:
The more your tests resemble the way your software is used, the more confidence they can give you.
In essence, you should try to test the behaviour rather than the implementation details of your component. It might seem hard at first but it’ll pay off in the long run.
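The difference between testing behaviour and testing implementation can be shown without any framework. In the sketch below, `SearchBox`, `stringBacked`, `keystrokeBacked` and `searchesForTypedText` are all hypothetical names invented for this illustration. Two versions of the same component keep their internal state differently, and a single behaviour-level check passes for both because it only uses the actions a user would take and observes what comes out.

```typescript
// Hypothetical, framework-free stand-in for a search form component.
interface SearchBox {
  type(text: string): void; // what the user does
  submit(): void; // what the user does
}

type Factory = (onSearch: (query: string) => void) => SearchBox;

// Version 1 keeps the query in a single string.
const stringBacked: Factory = (onSearch) => {
  let query = "";
  return {
    type: (text) => {
      query += text;
    },
    submit: () => onSearch(query),
  };
};

// Version 2 is a refactor that stores individual keystrokes instead.
const keystrokeBacked: Factory = (onSearch) => {
  const keys: string[] = [];
  return {
    type: (text) => {
      keys.push(...text.split(""));
    },
    submit: () => onSearch(keys.join("")),
  };
};

// A behaviour-level check: it drives only the public actions and
// observes what the outside world receives.
function searchesForTypedText(create: Factory): boolean {
  const received: string[] = [];
  const box = create((query) => {
    received.push(query);
  });
  box.type("react");
  box.type(" testing");
  box.submit();
  return received.length === 1 && received[0] === "react testing";
}
```

A check that peeked at the internal `query` string would break as soon as the component was refactored into the second version, even though nothing changed for the user. The behaviour-level check keeps passing across that refactor.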
Let’s try to write a more complex test with this idea in mind. We’ll test a form that uses ComplexCustomButton:
We would then test it the same way a user would use it. They would probably enter some text first and then hit Search, right? So, let’s write it.
This test is as close as possible to the way users will interact with the form. They will look for an input field, enter a query and click a button whose text suggests it submits the form. What’s even better, @testing-library/user-event is smart enough to understand that the button is not clickable, so the test will fail. This assures us that a passing test really does mean working code.
Conclusion
As we demonstrated, tests can lead to false confidence. The unit test passes even though the functionality is actually not there. Similarly, tests may lead to frustration, in the form of a browser test failing due to increased network latency. The art is in finding the balance, be it in an integration test, or some other approach entirely.
There are likely still many projects where the classical testing pyramid is a good model to follow. However, we should not follow the model blindly. Instead, we should focus on the intent of the tests: ensuring that users are able to use our software, and that the software responds correctly when they do.