XR Expectations Part Four
"We're walking the tightrope of immersion, trying to keep the balance of not exposing any of the telltale signs of trickery to our audience, a process we at PlayStation used to describe as 'Presence Wrangling' - the skill of keeping the user in that sweet spot of immersion for sustained periods, with distraction and activity.
"To do it right, it becomes a huge, complex performance, and as a result it becomes exponentially more difficult to control the experience. Any one of those components going slightly awry might just give the whole game away."
Quote | Me, quoting myself - very classy! - from Part 3 of this series
Here's a question: If we accept that Immersion is a major goal of an XR experience, should we always be aiming for the maximum level of immersion? Shouldn’t we aim for realism wherever we can? The more realistic the better, because realism is always going to be more immersive... right?
Well, I can say in all honesty that Yes, that’s completely right. But at the same time, I can also say honestly that No ... that’s not entirely true.
A perfect simulation of reality would absolutely be just as immersive and believable as the real world. But reality is hard to believably recreate. We live in it every day, we know what to consistently expect from reality, and as such we’re good at spotting when something’s not right. And so when we try and replicate reality, it’s going to be a long time before we have the technology to get everything right. Until that day, moments of unreality will always risk reminding us of the artificiality of the simulation. And as soon as the user is hit with that thought, the illusion is dispelled.
As we have seen in the previous parts of this series, it’s all about expectations.
Users’ expectations don’t just come from the real world; they come from movies, video games, things we’ve heard, videos we’ve watched, stuff we’ve read. It’s impossible to think of any experience without immediately applying some expectations about how it would look and feel. Whatever wild idea an immersive designer might have, they are already framing it in terms of their own experience as they think of it. Humans are highly experiential; we imagine things in terms of not just the visual, but also the physical: the weight, the textures, the feelings we get from touching, tasting, smelling and holding them. We have both intellectual and sensory expectations of almost any activity or interaction before we engage with it. And where we haven’t experienced it first-hand, we rely on second-hand information and analogous experiences to fill in the gaps. You’ve probably never put, say, a live octopus on your face, or jumped into a swimming pool full of jelly, but you can’t help but immediately conjure some expectations about what that might feel like.
As long as we don’t go counter to the user’s expectations, the truth is that users are unlikely to question the veracity of the experience from moment-to-moment. Users are generally happy to go with the flow, and in most cases are naturally inclined to be complicit with the experience. They want it to work for them, because we owe them.
There are of course a bunch of things that can break this immersion state, an overwhelming number of potential presence-breakers. There’s lurking risk everywhere for the immersive designer to be aware of. We know that things we do all the time in immersive experiences carry with them inherent mismatches that, if noticed, can snap the user out of it. Reality is hard to simulate.
And so we have a number of tools at our disposal to try and make sure the user doesn't notice them. XR is just a fancy high-tech stage illusion after all, and the magician has a bag of ‘Presence Wrangling’ tricks and techniques they can rely on. Let's take a look at how some of these illusions work. I've split them into 4 different approaches - Distraction, Abstraction, Consistency and Influencing the user's Expectations.
1 | Distraction
The first set of tricks for presence wrangling are based around the idea of using distraction techniques to manage the player’s focus. With the player’s attention arrested, they are thinking diegetically within the experience, immersed in the illusion, rather than thinking non-diegetically, external to the experience itself, as a user experiencing a piece of software.
Use Action & Interaction as distractions
Giving the user something to do, like shooting enemies or interacting with elements of the world, can be a strong distraction. Sony’s cockney criminal VR classics The London Heist and its spiritual successor Blood and Truth both make use of this during the car chase sequences. The lateral movements of the car weaving about and changing lanes never happen in isolation; these potential discomfort triggers occur while the user is focused on shooting bikers and mob cars, so they don’t affect comfort or break the immersion. The need to find ammo from within the vehicles keeps pulling your attention away from the visual movement cues that are strong discomfort triggers.
Use Visual / Audio distractions
Visual and audio cues can lead the eye and influence what the user is looking at. In The London Heist, enemies drive past to draw your aim – and your eyes – away from the worst discomfort trigger elements. Your driver and muscle Mickey is a likeable, animated lunk who constantly engages you in conversation and pulls your attention to him. This technique is used to set up a couple of unexpected and effective narrative moments during the chase. The age-old techniques of characters chatting in your ear or video and audio recordings filling in the backstory can keep the user engaged and immersed during journey sections and lulls in the action. It can be especially easy to lose users when they have nothing to focus their in-experience attention on; bored minds and idle hands tend to start pulling on the threads of believability if left alone too long.
Use Narrative / Emotional distractions
Many users might get involved in a good narrative, and if a narrative makes them think about the situations and the characters from moment-to-moment, this means they’re distracted from other things that are going on. Involvement and investment in the characters and narrative can go a long way to distract your audience. This works doubly so when your narrative exerts a strong emotional pull on your user. Building unease and suspense in horror games is a classic example of this; it tends to totally absorb the player until it is dispelled.
Player to Character interactions can be similarly arresting as long as no shortcomings are evident. Blood and Truth has standout character sequences that make you care about your bickering crime family through ambitious character modelling and performances. Half-Life: Alyx puts players in the shoes of a scripted protagonist who shares constant, engaging dialogue over their comm-link. The strong characterizations and fun dialogue keep the user immersed and propelled through journey sections that might otherwise be narrative and action dead-spots where the player’s level of immersion may diminish to the point they start thinking out-of-experience, admiring the graphics or thinking about the game from a structural perspective (“Why would there be an empty room here? Am I missing something?”).
2 | Set, Change or Influence Expectations
The second trick in the magician’s bag is influence over the user’s expectations. You can re-frame your user’s expectations of how reality should work, or create entirely new paradigms and make them feel like such a step-up that existing expectations are brushed aside in favour of the new.
Teach the user new Expectations
A familiar example of this is the humble tutorial, which sets the ground rules and tells us how to interact with the experience. When we introduce an interaction or gameplay mechanism, we’re also setting the user’s expectations of that element for when they encounter it later: how the user will interact with it, what happens when they do, and what the utility of it is to the user. Sometimes we teach you as part of the narrative - before you step on a launch pad and suddenly find yourself flung through virtual reality, we might show it happening first to the guy in front of you so you know what to expect. Making him giggle with glee will set different expectations than if he screams in terror. The user’s expectations are constantly being set through things like layout, narrative events, music, dialogue, and environmental storytelling.
Frame it as having increased convenience or utility
We don’t miss old experiences when new experiences are better. Immersive technologies give us, uniquely, lots of opportunities to improve on reality. We all wish we could teleport to work rather than commuting. We’d all like to force-pull distant objects into our hands rather than walking over and fetching them. Virtual, and to a lesser extent augmented, realities can make this a possibility. But those improvements can change the experience; it’s no longer the same thing.
This exposes a fundamental question which haunts practically every design choice you make in XR. Do you aim for realism or convenience?
If you can improve on reality, users may be happy to accept a tool or interaction that doesn’t match their expectations. That makes the immersive designer’s job much easier and can be used to avoid difficult mismatches between the real and virtual experiences. But it comes at the cost of realism and the immersion that authenticity brings. It’s a subjective choice in every case.
My go-to example of this comes from Sony’s early proto-metaverse PlayStation Home. The team were working on a centerpiece monorail system that would run through all the different expansive areas and connect them together. If it was a real theme park it would have been perfect, but Home didn’t need to be constrained to work like reality.
The designers believed, with noble intent, that their users would appreciate it. It would promote exploration of lesser visited areas, encourage social contact with fellow travelers, build anticipation ahead of your destination, and would simply feel grounded, be more immersive and thus more engaging for users. But the counter argument was that users would almost always just choose to teleport there directly if they could. Why spend time riding the monorail when the system could take you straight to your destination?
The designers’ counter-counter argument couldn’t be ignored though. They were worried that allowing fast travel wouldn’t just replace the need for a monorail, it would also fundamentally change the whole experience users had exploring Home. They put it simply; what would happen to the commerce model and the user experience of Disneyland if nobody needed to walk between attractions any more?
Realism vs Convenience crops up like this in all sorts of cases where we can improve on reality – should we? In crafting these worlds and experiences we can unlock lots of quality-of-life improvements for users, but we have to consider how much of the essential soul we lose from what we’re trying to evoke.
If we choose to make it more convenient for the user, there may be a cost to the user experience and to immersion. It can sometimes be a challenging choice.
Frame it as something else that’s familiar
Re-framing how a user sees something is often a good way to change their expectations of it, and we’ll explore more use cases in a moment. One technique that can be quite useful is to borrow the experiential qualities of something else that the user might be more familiar with.
With DriveClubVR, getting the in-seat positioning right was a headache. Users would always slump into their real-life seats over time; most sofas support a different posture than a racing car seat. Real drivers also have rich expectations about what feels right and what feels wrong with a real-life car seat. If you’re familiar with driving, feeling taller or shorter than normal directly affects your immersion. But trying to get the VR position set just right was fiddly and got in the way of the game, so we added controls to fine-tune the user’s position in VR.
Rather than pull you out of the experience to explain the complexities and operation of adjusting the camera position, and why it was different from the standard VR reset, we just labelled them as ‘seat adjust’ and everyone instantly understood how to use it. Not only that, people even said that this 'tap tap tap' digital input was how they hoped cars of the future would let you adjust your position, rather than using the handles, levers and wheels we have today.
Other common examples include making an alien vehicle’s controls work like a car’s - the user’s expectations about how it should work and feel are set when they recognise a steering wheel as the main control - or using 'telekinetic powers' to move objects around in VR, which is familiar and aspirational from movies and TV. It also allows a generous amount of 'helper' code to guide the objects to the right places, something that can easily break immersion if done with direct 1:1 hand control.
3 | Abstraction
It’s a fundamental tenet of all kinds of design disciplines that by simplifying the representation of something, the representation can carry a stronger, clearer message. Abstracted, simplified art styles and communication can work just as effectively, if not more so, than realistic and detailed representation. Abstraction can cut through the mismatches with reality by presenting itself plainly as being stylised and pointedly not-real, thereby changing the user's immediate expectations.
Amplifying your meaning through simplification
The idea of Amplification through simplification is one suggested by comics sensei Scott McCloud, in his detailed deconstruction of the medium, Understanding Comics.
‘Abstracting away from realism’ is something that we regularly do in immersive experiences too, and it can be a useful tool for evoking engagement and deepening immersion. Remember, everything we experience through XR mediums is an abstraction of reality anyway; for now there will always be some shortcomings in anything we simulate in XR. For this reason I always explain to students and clients that, in actuality, immersive design is not concerned with simulation; we work exclusively in the realms of emulation.
Our task is to evoke the qualities of an experience for the user, using the tools and techniques that we have to do it convincingly enough that they don’t stop to question the illusion. If you throw an object in VR, there’s very little simulation in what happens next. Users throw like they would in real life, and expect the outcome they would get in real life. But without the weight of anything other than the controller in your hand, without feeling the heft and shape of the object, and relying only on limited movement data, it’s a tricky thing to simulate. So instead the code will be designed to work out what you’re expecting to happen, and to help you achieve it. It uses clues like where you’re looking, which possible targets are nearby and how much each is likely to matter to you, and maybe throws in some randomness and fail parameters so you don’t hit the bullseye every time. We can’t simulate it easily in VR, but we can infer a lot about your intentions and give you something that feels more natural and believable than the flawed simulation data would allow. Emulation is always the way; mixing simulation, prediction and creative design to evoke the expected output from your inputs.
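To make that a little more concrete, here is a minimal sketch of what a throw-assist along those lines might look like. It isn't taken from any shipped game; the names (Target, pick_intended_target, assisted_direction), the 30 degree cone, the equal weighting of throw and gaze, and the assist and miss_chance values are all assumptions made up for illustration, and it ignores gravity and arcs entirely to keep the idea visible.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Target:
    name: str
    position: Vec3
    priority: float = 1.0  # hypothetical designer-assigned importance


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def unit(a: Vec3) -> Vec3:
    length = math.sqrt(dot(a, a))
    if length == 0.0:
        return a  # leave a zero vector untouched rather than divide by zero
    return (a[0] / length, a[1] / length, a[2] / length)


def pick_intended_target(origin: Vec3, throw_dir: Vec3, gaze_dir: Vec3,
                         targets: List[Target],
                         max_angle_deg: float = 30.0) -> Optional[Target]:
    """Guess which target the user meant, from throw direction, gaze and priority."""
    cos_limit = math.cos(math.radians(max_angle_deg))
    best, best_score = None, 0.0
    for target in targets:
        to_target = unit(sub(target.position, origin))
        throw_align = dot(unit(throw_dir), to_target)
        gaze_align = max(dot(unit(gaze_dir), to_target), 0.0)
        if throw_align < cos_limit:
            continue  # the throw was nowhere near this target, so don't assist
        score = target.priority * (0.5 * throw_align + 0.5 * gaze_align)
        if score > best_score:
            best, best_score = target, score
    return best


def assisted_direction(origin: Vec3, throw_dir: Vec3, gaze_dir: Vec3,
                       targets: List[Target], assist: float = 0.7,
                       miss_chance: float = 0.1, rng=random) -> Vec3:
    """Blend the raw throw towards the inferred target, with an occasional honest miss."""
    raw = unit(throw_dir)
    target = pick_intended_target(origin, throw_dir, gaze_dir, targets)
    if target is None or rng.random() < miss_chance:
        return raw  # no plausible target, or a deliberate 'fail' roll
    ideal = unit(sub(target.position, origin))
    blended = (raw[0] + assist * (ideal[0] - raw[0]),
               raw[1] + assist * (ideal[1] - raw[1]),
               raw[2] + assist * (ideal[2] - raw[2]))
    return unit(blended)
```

The design choice worth noticing is that the assist only ever nudges a throw that was already roughly on target. A wild throw stays wild, which keeps the result feeling like the user's own doing rather than the system's.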
Job Simulator by Owlchemy Labs is a perfect example of this; every interaction evokes a real world activity. The representation is abstracted and simplified down to the bare essentials needed to evoke the experience of using it. And yet it works perfectly to immerse the user in familiar settings where we get to break the rules. We recognise each type of object and device from their streamlined representations, and the properties they possess are as we expect. The delight comes in the unexpected outcomes of combining them or using them illogically, where those outcomes are surprising, compelling, entertaining and, on reflection, entirely logical. It’s easy to stay deeply immersed for long periods as you play in the sandbox and explore what happens when we supersize things other than soda and fries, or photocopy a doughnut, or throw things at the customer. The simplification and abstraction don’t detract from the immersive qualities of the game, but instead shift the focus away from the realism of the world to the twisted realism of how the elements react with one another, short-cutting straight to the true immersive pull of the experience and keeping the user there, happy within a framework of sometimes unexpected, but consistent and logical outcomes.
Frame it with a different rule set
This is an extension of the idea that abstracted, simplified representations set our expectations differently.
Tone and Framing affect the user’s response to things happening in the world by conjuring certain types of expectations. Cartoon worlds suggest cartoon affordances. Realistic settings suggest real world rules will be in play, unless we’re told otherwise.
Think about the simple, cel-shaded art styles which are prevalent on the Quest store. The varied cartoon aesthetics are good fits for the on-board processing capabilities of the standalone device. But the style also acts as a useful framing device for setting the user’s expectations about the fidelity of the experience.
If it looks cartoony, the user is less likely to have their immersion broken when they experience unexpected physics and behaviors. We can tweak gravity to make jumps higher and falling objects slower. We can encounter characters who are whimsical and weird while avoiding seeing them as physically grotesque, because we expect such characteristics from animation. These stylized characters would have a different impact framed against a believable and real-world setting; their proportions and personalities might seem alarming and bizarre, and quite possibly terrifying with a realistic rendering. This is exactly why people were repulsed by ‘Ugly Sonic’.
The real world isn’t like the movies, but it’s possible to evoke a cinematic tone and framing in VR, where action-movie super-realism is presented as the norm, to make expectations shift. Sony’s Blood and Truth demonstrates this perfectly; the game’s ‘Lock Stock’ cockney action stylings make leaning out of a fast moving van or diving through a 10th floor office window seem thematically sensible. When high action moments switch into slow motion, we are reminded of a hundred scenes in action movies, and it pulls us deeper into the fantasy. It not only feels incredibly immersive and appropriate and in line with our expectations, it also makes the frantic, complex scenes easier to manage on the tech-side too.
It's also common to use videogame framing to change the audience’s expectations about the rules of the world. In videogames, anything goes, and real world rules comfortably sit alongside abstractions such as health bars, lives and enemy weak spots. Framing your immersive experience as a video-game style experience immediately opens up the user’s expectations, and strict adherence to real-world expectations can often be relaxed without breaking immersion. 10-15 years ago videogames went through a trend of reducing and removing familiar GUI elements to heighten the sense of immersion. With XR design currently limited by the degree to which we can satisfyingly conjure realism, there’s presence-wrangling value in moving the other way, and using techniques to gently suggest our immersive experiences shouldn’t carry real-world expectations, but videogame expectations.
We can do this to different degrees, by utilising familiar videogame conventions like non-diegetic GUI elements, scoring mechanisms, and joystick/button interactions. Evoking familiar videogame imagery and interactions can also have the same effect. Just like the videogame tonalities adopted by Scott Pilgrim’s surreal fights, re-framing your immersive experience, or moments of your immersive experience, to suggest videogame expectations should be applied can cause fewer moments where the real and the virtual don’t match up.
4 | Consistency
Consistency is the last trick we’re going to look at from the magician’s bag. It’s a regular technique in the illusionist’s act. By establishing a truth consistently, over and over, the audience comes to accept it without question. We see it all the time in modern politics. But rather than thinking about how this technique can be misused in the real world, think about the many possibilities it presents to help the player stay immersed in a virtual one.
Between designer and user there is an implicit agreement on how the rules of the world will work. We call it the fidelity contract, and it’s simply a promise on the part of the designer that the world, and the rules governing it, will always act consistently and as expected.
A common example: when an experience allows you to pick up one object from a desk, there’s an implicit understanding that the user will be able to pick up any of the objects. Another familiar example is doors – if you can grab the handle of one door and open it, it breaks immersion when the next door you find is just set dressing, with a handle you can't interact with. Suddenly the user snaps out of their immersion to wonder why the two doors are so different. The realisation that it’s a shortcoming of the immersive experience, not a diegetic curiosity within the world, can easily yank the user out of their immersion. Not just this time, either. Every time they reach for a door in future, they’ll step out of their immersion for a moment to wonder ‘which type of door will this be?’
Maintaining consistency is key, and can provide a strong through-line of believability and dependability that can make the world predictable, and thus seem more real. Making your core mechanics consistent is central to this; if your user is spending most of their time engaged in believable interactions, their expectations and level of immersion will be deeply influenced.
So, while by no means an exhaustive list, distraction, abstraction, consistency and influencing the user's expectations are four common and very useful tools the XR designer can use for Presence Wrangling. The best immersive magicians have the skill and dexterity to use many of these techniques at once to impressive effect, managing to keep the player entranced throughout, maintaining a state of deep immersion, and rarely giving the user cause to remember that the real world exists.
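The door problem is the kind of thing a simple content check could flag before a user ever finds it. As a rough sketch only, assuming a made-up SceneObject record with a kind and an interactive flag (these names are hypothetical, not from any particular engine), an audit might look like this:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SceneObject:
    name: str
    kind: str          # hypothetical category label, e.g. "door" or "mug"
    interactive: bool  # can the user actually grab or operate it?


def find_inconsistent_kinds(objects: Iterable[SceneObject]) -> List[str]:
    """Return the kinds that are interactive in some places and set dressing in others."""
    states_by_kind = defaultdict(set)
    for obj in objects:
        states_by_kind[obj.kind].add(obj.interactive)
    # A kind seen as both True and False breaks the user's learned expectation.
    return sorted(kind for kind, states in states_by_kind.items() if len(states) > 1)


scene = [
    SceneObject("front door", "door", True),
    SceneObject("cupboard door", "door", False),
    SceneObject("mug", "mug", True),
    SceneObject("second mug", "mug", True),
]
print(find_inconsistent_kinds(scene))
```

A check like this won't tell you which way to resolve the mismatch, only that one exists. Whether you make every door openable or make the fake ones look obviously different is still a design decision.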
But before we finish...
Let's look at Half-Life: Alyx, a Masterclass in managing Immersion
To close, I want to zoom in on one last example which showcases this brilliantly. Half-Life: Alyx, Valve’s extraordinary 2020 VR game, is one of the best-in-class examples of keeping the user deeply immersed for sustained periods. In reality, Alyx is not a complex game. There are few game systems for the user to learn, a modest cast of creatures, a straightforward linear narrative, compact spaces separated by loads, and no kind of scoring or reward framework to distract the player. However the experience is deeply involving, and is often celebrated for conjuring a sense of presence in the world unmatched by other VR games.
This simplicity belies how smartly different techniques have been applied. Nowhere is this more apparent than in Alyx’s gravity glove, which seems so very cool and extraordinary to use. In reality it’s a ‘force pull’ mechanism; point at an object and pull it towards you. We’ve seen this a thousand times in VR already. But Alyx gets so much right that it elevates the mechanic to feel fresh and incredible again.
While it is a fantastical sci-fi gadget, it exists within the realist context provided by the game’s setting and framing, which is near-future, but grounded, and familiar. The world is believable, the experimental tech is born of narrative conceits that have been established through the previous games in the series. The operation of the glove works through very natural 1:1 gestures, which become quickly intuitive after a couple of goes, and it works reliably and consistently for the user. Yanking objects into the air and catching them becomes a satisfying interaction and is super-useful compared to moving over and picking things up. It's not just a quality-of-life improvement, it also makes the user feel empowered in the world, to feel they have a vital advantage over the environment and their foes. In a medium where videogame super powers are granted to us in many experiences, Alyx's simple little interaction arguably evokes that feeling of 'Great Power' better than any of its contemporaries, because of how cleverly Valve contextualise and frame the ability.
One thing that delighted me as a designer is how the experience sets your expectations so solidly that getting this device feels transformative. Before you get the glove, the game has been showcasing some very cool aspects of their tech. One of these is the fantastic hand modelling and the manipulation of objects. It’s cutting-edge stuff. The opening apartment scenes are full of all sorts of toys and trash to pick up, manipulate and play with. And the focus is all on your hands, establishing just how good those systems are, and showcasing the finger-tracking functions of Valve’s own headset the Index. The world and the objects in it feel real. It’s convincing and you’re expecting to probably be doing fine-grained handling of objects and complex physical interactions later in the game. Because why else would you be given these play-spaces full of toys to stack and balance and throw and catch?
The fine-fingered interactions you’ve experienced up to now have done a great job of establishing the realism of the world. Now it’s time to introduce the fantastical device that empowers you to break the rules of realism. Before too long you receive the gravity glove as part of the story, and you never really interact with the environment the same way again. You still can, at any time, walk over, pick up an object and play with it. But why would you when the gravity glove is so much quicker, easier, and more useful in the heat of the action? It’s very natural and simply fun and satisfying to use. It makes you feel cool. It’s simply better than reality.
You can see it as soon as you start playing with the device. Your expectations are established in a short and funny character-driven training sequence, a natural part of the ongoing linear narrative, which introduces the mechanic and allows the user free time to practice it in a realistic setting. Pull things, catch things, launch things, smash things. It sets rock-solid expectations about the glove’s behavior that then persist for the rest of the game.
How the glove works soon feels authentic, and because the user has some agency over the speed and direction of the ‘pull’, it’s gently skill-based, meaning mistakes feel like they’re the user’s fault, not the game’s.
It’s a perfect example of a fantastical element that nobody has ever experienced in the real world, but that still manages to feel just as you’d expect every time you use it. Much of the gameplay is built around this simple core interaction, and its robustness and reliability mean there are very few moments featuring the glove that might cause a break in the immersion.
Even more importantly, having that rewarding, consistent core interaction being used regularly throughout the game gives the immersive qualities a solid and reliable anchor that helps enable sustained presence. Despite being an entirely imaginary way to interface with the world, it becomes a core reason for the user remaining deeply immersed in the illusion for long periods.
So not only is Half-Life: Alyx’s gravity glove a fantastic virtual illusion in and of itself – robust, believable, and empowering – it’s also one of the key ingredients of a bigger trick that’s being pulled off, that of the sustained deep immersion that Alyx is celebrated for. It sets its own rules and its own expectations, and then sticks rigidly to the fidelity contract it has defined. That consistently robust experience makes the abstraction of the interface melt away. We don’t think about it, we just do it. We can all soon gravity-glove like a pro. I think this is one of the key elements that keeps the user experience tethered to something deeply believable, even when we’re dodging head-crabs that are flying at our face.
Most experiences can’t rely on emotion-tracking tech to give us insight into the user’s level of immersion at any given time. In the absence of direct live feedback, these techniques can give greater confidence that you will be mitigating or avoiding some common immersion-breaking occurrences. They’re techniques for managing immersion levels, though, not magical solutions to suddenly make everything more immersive. It still falls to the XR magicians, the creators and developers, to craft engaging and immersive experiences, using their creative skills and their understanding of what the user wants and expects. That’s challenging for sure, but it’s the same challenge as every creative medium faces.