Why Designing for XR is Illusionism on Hard Mode
Updated: Jun 20, 2022
The term illusionism is used to describe a painting that creates the illusion of a real object or scene, or a sculpture where the artist has depicted the figure in such a realistic way that it seems alive and part of the real world. Rather than an artistic interpretation of the world, these works seek to convince us that they are the real world. As is their way, the French have a wonderful name for this approach: ‘Trompe l’oeil’.
Trompe l’oeil means “to deceive the eye”. It is a historical art tradition in which the artist fools us into thinking a flat image has depth and perspective and is a three-dimensional part of the world – in short, that we’re looking at a real thing. Whether it’s Edward Collier’s collection of newspapers, letters and writing implements on a wooden board, Edgar Mueller’s pioneering 3D pavement art, Boba Fett lording it over Times Square, or my personal favorite, the unexpected tiled floor at the Casa Ceramica showroom here in my home town of Manchester, Trompe-l'œil makes the viewer question the boundary between the created world and ours.
These artworks can take a lot of time, dedication and artistry to plan and create, all in service of casting a temporary spell that might persist for just a moment or two, before we realise that what we’re seeing can only be a clever illusion. But rather than robbing the experience of value, realizing we’ve been fooled just focuses our appreciation on the talent of the artist in pulling off the trick. If only for a moment, we applaud them for making us believe their trickery is reality.
The technique plays not just on our understanding of the painting’s subject, but also the expectations we bring regarding the location and setting that the work is presented in. Often the context in which the work is viewed is essential to the success of the trick.
New Jersey artist Alysia shows the illusion from all sides. Seeing the work from the wrong vantage point entices the viewer into voluntarily seeking out the perfect perspective.
For instance, with 3D pavement art, the viewing angle is often all-important. You won’t be fooled by the trick if you approach from anything other than the perfect direction. Knowing this only encourages you to walk around the work and find the point where the illusion is perfect. With a canvas painting using Trompe l’oeil techniques, its placement in the gallery and the direction of light hitting the canvas can be essential to achieving the desired effect.
There are two important factors here which put me in mind of the grand trickery that we know of as an immersive XR experience.
Users are willing to be complicit in the trick; they’ll play along and do their bit to maximize the magic.
Changing the setting and context the work is viewed in can affect the believability of the illusion.
I've long regarded XR design as being an amped-up offshoot of this popular art tradition. To me, XR design encapsulates the principles of Trompe-l'œil techniques, but with an almost ridiculously expanded level of scope. It's illusion writ large, turbocharged with added layers of believability and immersion, happening live, in real time, in your world. We’re in the same business of encouraging users to buy into an illusion that they know cannot be truth. Just like any form of Trompe-l'œil, we’re practicing the art of not just fooling the audience but asking them to fool themselves.
I am starting to think that perhaps we should regard immersive design as the start of a new tradition, one of Tromper les sense if you will, where the role of the artist is to fool not just the eyes, but all of the participant’s senses. Tricking as many sensory inputs as possible, through a mix of convincing portrayal where we can and misdirection where we can't, to make the user believe that the magic they are experiencing is real.
Right now we're at the point where, for the many people discovering VR, MR or AR as new users, Arthur C Clarke's famous law that states "Any sufficiently advanced technology is indistinguishable from magic" can be applied with little hesitation. The spectacle of the technology tends to provide the early enchantment. New users, drunk on the delights the tech has shown them, tend to rave about the magic of the first applications they've tried, even if they're poor examples compared to what's out there. But of course the tech alone doesn't maintain that sense of enchantment as the user gains experience. That becomes the job of the immersive designers who are crafting the experiences, but as any immersive designer will confess, it is never easy. Magic takes practice and dedication, study and research. And most importantly an open mind to exploring new ideas, and finding ways to use existing ideas in new and unexpected ways to keep the magic fresh. Stage magicians have been reinventing and re-presenting the same bag of tricks and illusions to increasingly smart audiences for hundreds of years.
XR has many more aspects for the immersive designer to consider. Beyond what the user is merely seeing, as with most Trompe l’oeil illusions, we also have to consider lots of other aspects. Here's a quick list just off the top of my head, but it's by no means exhaustive:
Locomotion in AR and MR is normally 1:1 with our real-world movement because the play space is our real-life space. We need to maintain the illusion with solid tracking and positioning of augmented elements as the user moves around. But we rely on the user providing a clear, clutter-free and safe environment. If we place an object, we need to ensure the user can actually reach it if they need to. And for home users at least, we have no control over that environment, the layout or the clutter that will get in the way of the illusion working.
In VR, locomotion can be even harder, of course. Because we want to offer worlds bigger than your real life space, and we don't want users walking into the limits of their real-life spaces while using that space, we need unique solutions that wouldn't make sense for a mixed / augmented reality application. Almost all of these solutions can easily break the user's immersion and thus spoil the trick.
Their interactions with the virtual objects we have added, whether they behave as expected, and whether they satisfy the utility needs of the user. A believable-looking shovel I can pick up and intuitively manipulate with believable grip points and natural posture is immersive and incredibly hard to pull off, but we won't feel the weight of it or the physical reaction of striking a surface with it. Clearly false, boos for the magician.
The sounds the user is hearing, both natural from the outside world and generated by the experience, need to provide clear information to the user. Every artificial object needs to sound real, spatially correct, and appropriately mixed in with the soundscape. In AR and MR, that's a real soundscape - challenging. In VR, the soundscape is entirely artificial - a lot of work.
The movements and gestures the user is making, and how the experience reacts to them. In real life, a body that behaves as expected is the norm, and our body language is an integral part of us and of how we communicate with others. If we wave at a character or flip them the bird in an XR experience and they don't register it, boom! The trick is blown wide open and we feel disappointed.
The (limited) touch sensations we can conjure through haptic vibrations rarely give us the feedback we'd get from real life. And without additional haptic inputs like a suit, users are only going to feel them through the controllers anyway (and soon, thanks to PSVR2, through the headset itself). Haptic feedback is better than no haptic feedback, and is becoming increasingly sophisticated, but it's a long way from the true sensations we're familiar with from real life. Clever is cool but it doesn't fool.
However users interact with the illusion, we're usually doing it through several layers of abstraction. Anything involving interaction requires some degree of translation on the user's part - rather than reaching out and grabbing that object you might be tapping a screen (AR), using a pinch gesture (finger tracking in MR and VR), or pressing a button on a controller. All of these need to be learned, and none of them may be a convincing replacement for the illusion you're trying to create.
These are just some of the challenges we face, and so we have developed techniques we deploy from our magician’s bag of tricks to make the extended illusion happen for our audience. We try to maintain immersion to a degree where the user feels present in the scene, in the moment, and buys into that illusion without questioning it. Combined, these techniques can present a robust and convincing Tromper les sense illusion. But with more elements adding to the trick, there are many more potential points of failure.
If any one of those things goes wrong, or fails to convince, the veil of illusion might be whipped away. As immersive designers we fear crossing into the uncanny valley where expectations are tested to their limits. The more believability we add, the more we risk breaking the whole trick. In a sense, we’re adding more magicians to the stage and performing more varied tricks. At the same time, we're walking the tightrope of immersion, trying to keep the balance of not exposing any of the telltale signs of trickery to our audience, a process we at PlayStation used to describe as 'Presence Wrangling' - the skill of keeping the user in that sweet spot of immersion for sustained periods, with distraction and activity.
To do it right, it becomes a huge, complex performance, and as a result it becomes exponentially more difficult to control the experience. Any one of those components going slightly awry might just give the whole game away.
And unlike Trompe l’oeil, the XR designer is never aiming for the user to realise they've been fooled. We want to keep them believing for as long as we can. They know going in that this will be a virtual reality, or an augmented reality. They're holding up their phone or adjusting how their headset is sitting on their head. That moment of honesty and realisation that Trompe l’oeil requires is something that Tromper les sense is actively seeking to avoid.
And we know where this is going. In the future, the immersive experiences our users explore will be working even harder, across multiple technologies and techniques: to respond to their voices, to respond to the eye contact they make with virtual characters, and to react to the emotional responses we capture through eye-tracking and biometrics. We’re developing new tools all the time that will let us add more nuanced control to how the trick works for our users. More tools and more techniques for the magicians. This is going to open up the possibilities of the illusions we can create, expand the scale and scope, but that means the orchestration of those tricks becomes more complex as a result. More magicians join the stage, then. And now some of them are juggling. On unicycles. Tromper les sense is going to demand the greatest show on earth.
Remember what we said about the importance of the context the artwork is presented in. Whether we’re designing for a virtual reality or an augmented/mixed reality, just like viewing the 3D pavement art from the right viewpoint, the context of the presentation is always of supreme importance to the success of the illusion. Augmented (AR/MR) and Virtual (VR) realities give different contexts.
With MR and AR the virtual has to successfully blend with the pass-through real world you can see behind the illusion. There is no other choice. For the magic trick not to fail in the eyes of the audience, we strive to keep the tracking flawless to maintain the positioning and orientation of the illusion within our real space. We work to ensure the mixing of the real with the virtual appears as seamless as possible, displaying the expected visual fidelity and inheriting the real-world lighting of the scene. We aim for the virtual sounds to feel natural and integrated spatially into the real world soundscape. And most importantly, we want the behaviors of these elements, and how they interact with the real world, to work as per our user’s expectations. The real world framework sets hard rules, because that’s the environment we’re operating in with MR and AR. Anything virtual that augments that world, and represents itself as part of it, is expected to inherit those rules as far as possible, to behave accordingly and as expected.
With VR the major difference is that everything our user is seeing is an illusion. The whole world and everything in it. Because we control the whole realm, the context within which the illusion is being viewed is very different, and that sets different obstacles for creators. Unlike AR and MR, we don’t face the challenge of making the virtual seem believable within the framework of the real world, where juxtaposing the two seamlessly is essentially the core of the trick. Instead with VR the challenge is creating a virtual whole that’s convincing enough and consistent enough to allow us to suspend our disbelief. Creators have some flexibility over how that world can work and can tailor the environment to support the trick. There’s no real world in view to match the experience against directly, so we could in theory create any world we wanted, with any set of rules. But users always enter VR with expectations as to how the world should work. We have to observe the rules if we want to present reality, but those rules can be more flexible than when we’re augmenting the visible real world.
If AR and MR are street magic, tricks played out in a real world setting that can’t be rigged or controlled, then VR is theatrical magic, where the environment is stage managed and fully under the conjuror’s control. The whole world is part of the trick with VR, and that opens many new opportunities to distract and misdirect the audience.
In either case, if any of these immersive expectations fall short, the illusion is spoiled.
If an augmented object suddenly disappears due to a tracking blip, or if it tracks lazily and sluggishly against our real world movements, any magic we’re experiencing can quickly dissipate as we have to switch to a different cognitive state to register the cause of the issue. We might have to consider what action we need to take to get the trick working again. We switch from one expectation model (we're seeing a whale swimming outside our window and are caught in the magic) to a different expectation model with a lower set of standards to meet (we see it clip through a wall and are instantly reminded that this is not all real). It is reality augmented with unreality, a falseness, a lie.
And as we become more familiar with these behaviors, more and more we go in expecting to see a trick, rather than expecting to see magic.
Every time one of these illusions ‘fails’ for us in XR, as designers we accept that we break the user’s belief in the moment. They emerge from their immersion, sometimes explicitly in a clean break, and sometimes little-by-little as a sense of unreality slowly emerges. At the same time that we have spoiled that one moment for the audience by reminding them it’s a trick, we are also slowly chipping away at the wonder and building a different set of expectations in the user’s mind. Over time that user will put on the headset expecting to see these shortcomings.
And that’s where seasoned XR users find themselves. We take our seats in front of the stage hoping to be wowed, certainly, but mainly we’re there to see the quality of the performance. As users we launch into these games and applications with an experienced awareness of what to expect from a virtual or augmented reality experience. We go in understanding the limitations and constraints. That ‘XR Wow’ feeling that’s in plentiful supply when you’re a virgin user becomes an increasingly rare treat as time goes on. Our expectations of what that medium will deliver are no longer inherited from our expectations of reality, where magic evokes wonder because we cannot explain it.
Instead, we’ve become Penn and Teller, watching and rating every XR magician's tricks with a critical eye, saving our excitement for when something can come along and really fool us. The winners are those experiences where we’re so caught up in the performative, immersive qualities that we find ourselves wowed at the trick and lost again in the magic, not cognitively stepping back to recognise how the trick became obvious to our expert eyes.
When you’re designing an immersive experience, it’s important to always remember that new users and experienced users are going to have very different expectations. As XR designers, we always have to think about that context and consider the different experiences different users will have with the same piece of software, the different experiences they’ll have using the hardware itself, all depending on their own familiarity with the medium. That means we have to consider how well the magic tricks that we employ to immerse our users in our unrealities are going to work when many of them understand how the trick works.
As someone who’s spent a decade working at entertaining people through the various magics of VR, MR and AR, I see myself as a few years further down the line than the average user. I’m distinctly in the ‘seasoned magic fan’ camp at this point. I know how the assistant gets sawed in half. I know the secret of the vanishing cabinet and how the magician catches the bullet in their teeth. So the feeling of genuine awe and disbelief is much rarer for me now, and more precious. I take my seat at the show largely to see the implementation of the trick and to appreciate the technique, the magician’s mastery. But I take my seat always hoping to be wowed, to be fooled by the trick and not think about the technique, if only for a moment or two. The magic for me is when I find I have been immersed for a stretch of time, fully believing in the unreality around me without stepping back from the illusion and seeing the trick at work.
That point of realisation comes around when the trick fails to convince me. I want to be lost in the illusion, and the thing that gets me there and keeps me there is careful craft that effectively manages my immersion from moment-to-moment, using Presence Wrangling tools of skillful distraction and engagement, and steering me away from mismatched expectations that pull away the curtain and shatter the illusion. As a seasoned magic fan, I know it’s a hell of a trick to keep the performance going for an extended period, all those magicians working away flawlessly without me noticing a single one of them messing up. From my Penn-and-Teller viewpoint, you’ve fooled me. That’s illusionism succeeding on Hard mode.
Trompe-l'œil is incredibly challenging to pull off successfully, but if there were such a thing as Tromper les sense it would represent the next step up in complexity and difficulty for the artist. As more and more users become familiar with the technologies, VR, MR and AR are the platforms where the most accomplished conjurors are needed to step up to that challenge as the technology and user expectations around them advance. This is a time where smart and creative XR design will need to take center stage and push illusionism to new levels. Because soon we’ll be performing for a more sophisticated, experienced audience, and they’ll all be in on the trick.
Bring on the next level of XR Magic. The audience is going to expect to be fooled like they've never been fooled before.