Card sorting is a great way to create a reliable information architecture. A classic mistake is structuring the information architecture around what we think is right, which rarely matches how users expect it to be organised.
We have written an informative article on card sorting that explains what card sorting is, how it is conducted, and which card sorting tools we recommend: Card Sorting: Understand Your Users For Better Information Architecture
Now that we’re up to speed on card sorting, we ask the question: “How many test users is enough?”
The simple answer to this question: 15 test users is enough.
Read on to find out why 15 test users is enough for reliable card sorting results.
Card Sorting Correlation Scores
Correlation scores run from -1 to +1. A correlation of -1 is a perfect negative relationship, indicating that the two datasets are opposites of each other; 0 shows no relationship; and a correlation of +1 indicates that the two datasets are perfectly aligned.
For card sorting, Nielsen reports a correlation of only 0.75 between the results from five users and the ultimate results. That score isn't sufficient for reliable results, indicating that card sorting requires more than five users.
Nielsen's calculation stems from a study by Tullis & Wood, who conducted an in-depth analysis to assess the minimum number of participants required for a card-sorting study. The research article is available if you'd like to read it: How Many Users Are Enough for a Card-Sorting Study?
To reach a correlation score of 0.90, 15 users are required. Beyond 15 users, the correlation doesn't increase much: testing with 30 people only brings it to 0.95.
The additional 0.05 is slightly better, but it isn't worth doubling the number of test users. Testing with 60 people only raises the correlation to 0.98.
Plainly put, 15 is the point where diminishing returns set in. If you were to insist on testing with over 100 users instead of, say, 15-30, the results would be more or less the same.
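The diminishing returns are easy to see if you work out the correlation gained per extra participant. This quick calculation uses only the approximate scores quoted above from the Tullis & Wood figures:

```python
# Correlation scores at each sample size, as quoted in this article.
scores = {5: 0.75, 15: 0.90, 30: 0.95, 60: 0.98}

sizes = sorted(scores)
for prev, curr in zip(sizes, sizes[1:]):
    gain = scores[curr] - scores[prev]
    per_user = gain / (curr - prev)
    print(f"{prev} -> {curr} users: +{gain:.2f} correlation "
          f"({per_user:.4f} per extra user)")
```

The gain per extra user shrinks from 0.015 (going from 5 to 15 users) to roughly 0.001 (going from 30 to 60), which is why recruiting past 15 buys very little.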
Why is 15 test users enough?
If you have tried to Google an answer to this question, most of the search results recommend 20-30 test users for card sorting.
So why are we suggesting that 15 users are enough?
The correlation score of 0.90 for 15 users is good enough for practical purposes, and we suggest this number with expenses in mind. Recruiting an additional 45 users for an added 0.08 correlation would cost far more time and money. Are an extra 45 users worth the additional 0.08?
Card sorting is meant to be a relatively easy and painless research study. As long as you recruit the right test users, ones who represent your target users, 15 of them will provide sufficiently reliable results. The reality is that there is a UX budget to adhere to, and card sorting is just one of many research studies we will run.
If your company has a lavish UX research budget, or if you're working on a huge well-funded project, you could splurge and double the sample size to 30 users, bringing the correlation score to 0.95.
Let’s take a look at this graph for a better understanding:
As you can see, after a sample size of 15 users the line reaches a plateau: the average correlation flattens out, indicating diminishing returns.
Why Do We Need 3 Times More Users Compared to Usability Testing?
Five users are enough for most usability testing, so why does card sorting need three times as many users to reach a similar level of insight?
Usability testing is an evaluation method: we already have a design, and we want to determine whether it matches users' needs and mindset. Although users differ in computer skills and ability, problematic design elements surface clearly after testing with just a few of them.
In usability testing, all we are trying to find out is which design elements are problematic and require improvement. Because the study concerns concrete design elements, no more than five users are needed to uncover the underlying problems.
Card sorting, by contrast, is a generative method. There isn't a design yet; the goal is to explore how people think about certain issues. Mental models and vocabulary vary far more than reactions to design elements: people describe the same concepts in different ways, so naturally more users are required to build a stable understanding of their preferred structure and to determine the best way to accommodate the differences among them.
Recommendation: Best of Both Worlds
Consider combining the two methods. First, conduct card sorting, the generative method, to set the direction and establish a reliable starting point.
Then create a draft based on the findings and conduct a usability evaluation to refine the design; it also functions as a quality check on the initial generative findings.
Doing so will allow you to catch any minor mistakes in subsequent user testing instead of tripling the size of the card sorting studies (and tripling the expenses).
Listen to Users
It is important to point out that we shouldn’t design the information architecture purely on card sorting’s numeric similarity scores.
When it comes to deciding what goes where, pay attention to the qualitative insights gained from the test session. A plethora of valuable information comes from listening to users' comments as they sort the cards: understanding why users group certain cards together provides in-depth insight into their mental models.
15 test users is sufficiently reliable for a practical card-sorting study; consider doubling that to 30 users if you have spare budget to utilise.
Consider combining card sorting and user testing to catch any small mistakes without over-recruiting and over-spending.
Don’t forget to listen to your users as they sort through the cards; the qualitative insights provide a valuable understanding of their mental models.