UI/UX for Devs: Have You Thought About the Mental Model?

Friday, July 31, 2020

My day job sometimes involves reviewing existing or planned user interfaces of business applications outside my product team. When I give my feedback to frontend developers, I try to use the opportunity to convey knowledge on UI/UX basics. For instance, by showing them how my findings are informed by a set of questions that can they can use themselves as an early check of their work.

One of these questions is: What is the intended mental model?

I find it helpful because in my experience,

it highlights problems in a constructive way and
is a great starter for discussions on working towards a solution.

So, let’s talk about mental models.

What is a mental model?

Generally, a mental model is a simplified, often incomplete (or even wrong) representation of the surrounding world in your mind. The model forms based on the available visible, audible, or sensory information, prior personal experience, what other people have told you or just by guessing.

The benchmark for a mental model is how good it helps you navigate the real world, i.e. if it can explain something that has happened or can offer a reasonable prediction for what will happen.

Take gravity for example. The simple mental model of “things falls down until they get stopped by something” is good enough for most situations in our everyday life.

If a mental model fails to explain an observation (“why doesn’t this ‘balloon’ thing fall down?”) or when additional information becomes available (Helium is lighter than air), the model may evolve, be replaced by another model, or by several models for different situations. But the model(s) remain simple – you usually do not explicitly think about buoyancy if no water is involved, or the gravitational forces that cosmic objects exert on your car.

Example: A mental model for scrolling

Let’s get back to user interfaces. Imagine you have never seen something like scrolling before. Then you watch somebody drag the narrow rectangle at the right side of the window down, causing the larger part to move up:

After you have seen some more scrolling, noticing that content that left the screen can be brought back, the mental model in your head will most likely look like this:

Is this model correct? Do the areas in light gray really exist?

From a purely technical point of view, this is debatable – especially when you think about virtualized scrolling, where the UI “lies” to you, creating the visible part on the fly from pure data (e.g. the text in a text editor).

From a user’s point of view this is not important. The model explains why parts of the screen moved the way they did when you moved the scrollbar. And it allows a reliable prediction what action will bring back content that was moved off the viewport.

In other words: The model is good enough.

Visual cues are key

When a user perceives an application as easy to understand, it means that he/she was able to quickly develop an initial mental model that matches the reality of the user interface. That model surely will continue to grow and evolve and may have to be adjusted over time. But what to look out for are gross misunderstandings, when a wrong mental model leads to confusion, or worse, data loss.

Your job when designing a UI is to “nudge” the user towards a correct mental model. This is a case of “show, don’t tell”: Explaining the basic ideas of the UI in an online manual or a onboarding experience (likely to be skipped) will not reach all users. The UI needs visual cues that convey the mental model that you intended.

For example, when you cannot show all content at once due to space constraints, you somehow must tell the user that there is more content to discover:

Partially visible content that gets cut off at an edge hints that the area can be scrolled or at least resized.
Page numbers or simply dots look like paging is possible.
Stack-like visuals suggest that content can be switched.

Example: A mental model for tabbing

Tab views show different sets of one or more controls in the same space. This user interface pattern is useful if you…

cannot (or do not want to) show all controls at the same time, but need fast switching and
you can cluster the controls into groups that make sense.

A mental model for a tab view comes straight from the real world, just imagine e.g. three sheets of paper:

Like other UI elements, the design of tabs has evolved over the years. Moving away from mimicking physical objects and their behavior, the overall trend went to visually reduced designs.

While the following design still tries to convey a foreground-background relationship…

…the second version does not care about how things would work in the real world:

Which is not necessarily a bad thing, to be clear. In UI design, you must deal with the fact that a user has a certain “attention budget”. In your UI, each pixel different from the main background color chips away from that budget. Thus, getting rid of lines, borders and colored area is a good way to reduce “cognitive load” – until it is not. This is the point when you have removed important visual cues that help forming the mental model that you originally intended.

How visual cues influence your mental model

Look at the following four sketches of an application with two levels of navigation:

Notice how the gray backgrounds influence your mental model:

The upper left does not give you an idea what to expect when you click an item in the top row or one of the icons on the left side.
The lower right is very clear about the relationships of the different parts of the UI. At the same time, the design is pretty “busy”.
The upper right and lower left are similar in design but convey different mental models. Which one is “better” depends on factors like scrolling, expected window sizes, etc. So, the answer is the usual “it depends”.

Ask yourself: What is the intended mental model?

The application

Users do not develop a mental model in a vacuum. They look for similarities to other applications. In unfamiliar situations, their behavior is influenced by earlier experiences. While not all applications are the same, certain archetypes have developed over time.

If you aim for a quick route of your users to familiarity with the application, you should be clear about the general model. For instance:

Is the application a document editor?
Most business users know how to to deal with Office documents. If your application has “things” that can be handled like documents (regardless of whether they are actually stored in files), users can use their previous experience with documents. At the same time, this comes with certain expectations. For example, editing or deleting one “document” should not affect another document (unless there is an explicit relationship).
Is it something that manages one or more lists of “things”, offering views on the data at different detail levels?
This is the model behind applications that have some kind of database at their core. Browsing, searching, or filtering, followed by viewing/editing of single data items are comparable across business applications. Users of your application could e.g. expect that after they have narrowed down the results list, and dive into a single item, that the list will be “somewhere off-screen”. This is an example where a mental model is used for predicting what will happen, i.e. that is possible somehow to come back to the result list.
Is it like a machine that builds something after you have specified what you want to build?
Software that uses the concept of “projects” and “builds” is what developers work with all day. Some single-source documentation systems use the same approach. Another example are e.g. video production tools.

Your application surely can combine different models or be something unique. And if it is similar to existing software, being different can be an advantage. The best thing users can say is “it is like X, but you no longer have to do Y manually”.

What you must avoid is intentionally trying to be similar to other software and then offering a bad experience. For example, early web applications sometimes tried very hard to be like their desktop counterparts before the technology was reliable enough. Which could result in losing work simply by pressing a single key at the wrong time (some may remember: a page-based application, a large text field, some JavaScript that goes wild, lost focus, backspace – everything gone).

The data

Internally, most (.NET) applications deal with objects with properties which in turn can be objects, lists of objects or references to one or more objects. One of the challenges of UI design is to decide how much users should or should not know of these structures.

Is what a user experiences as a “thing” the same as an object in your code?
Sometimes it is good to hide the complexity of composite objects, sometimes it is not. Take, for example, a form for entering a person’s data. Do you present the address as something that feels like a “thing”, or as something that is part of the person similar to its given and family name? As usual, it depends. On one hand you want a clean and simple UI, on the other hand imagine a database that tracks all addresses that a person has ever lived at.
How do you handle incomplete or invalid “things”?
Let’s say a user has a concept of “things” that map pretty well to objects behind the scenes. Unfortunately, the infrastructure for storing these objects requires them to be complete and valid. Does this impact the user, forcing an “all or nothing” approach when creating a new thing? Or do you allow incomplete data as a “draft”, even if it means it must be stored separately because you e.g. have no influence on the available storage infrastructure?

What to do

Prepare yourself to be asked “what’s the mental model?”
Even if you will never get asked, this forces you to come up with a short summary that helps you communicating the basic ideas of your application (or a specific part of it).
Remember mental models are simplified and incomplete
When a user writes an email and presses the “send” button, in their mind, this sends the email. Conceptually, emails are moved into the outbox, then collected and sent. But even people who know about this (and may have seen an email stuck in the outbox) tend to think of “click equals send”, because this mental model is – you may have guessed it by now – “good enough”. When you think about mental models in your application, expect users to start with the simplest model they can work with.
Watch yourself how you figure out how other software works
Which visual cues do help you? Where does the software rely on known interaction patterns? Why can you use an application that you have not used before?
Think of instances when your mental model was wrong
Was it your mistake? Did you have assumptions based on prior experience? Was it the application’s fault? What could the software have done differently to prevent this? What about your application? Could this happen to users of your software?