I’m still processing this article from Mel Slater, where he also defines the Place and Plausibility illusions. It contains a lot of fundamental ideas, so let’s now talk about “sensorimotor contingencies” (SCs) and Valid Actions, and how they could help standardize VR systems and applications.
Bear with me, it’s long but interesting, and it’s a good warmup for the upcoming year 🙂
Again, complicated names for a simple idea: sensorimotor contingencies and Valid Actions represent the set of possible actions and perceptions of your physical VR system.
Valid Actions
Immersive systems can be characterised by the sensorimotor contingencies (SC) that they support. SCs refer to the actions that we know to carry out in order to perceive, for example, moving your head and eyes to change gaze direction, or bending down and shifting head and gaze direction in order to see underneath something.
The SCs supported by a system define a set of valid actions that are meaningful in terms of perception within the virtual environment depicted. For example, turn your head or bend forward and the rendered visual images ideally change the same as they would if you were in an equivalent physical environment. If head tracking was not enabled, then turning your head would have no effect, and therefore such an action could not be useful for perception. We define the set of Valid Sensorimotor Actions with respect to a given IVR system to be those actions that consistently result in changes to images [Cb: in the sense of ‘perceptive images’] (in all sensory modalities) so that perception may be changed meaningfully.
We define the set of Valid Effectual Actions as those actions that the participant can take in order to effect changes in the environment. We call the union of these two sets the set of Valid Actions – the actions that a participant can take that can result in changes in perception, or changes to the environment.
For example, consider an environment displayed visually through a head-tracked HMD. A participant in such an environment can usually quickly learn the effect of head movements on visual perception – the SCs. Such head movements will be Valid Sensorimotor Actions. However, suppose the participant reaches out to touch a virtual object, but feels nothing because there is no haptics in this system. Here, the reaching out to touch something is not a valid sensorimotor action for this IVR.
Now imagine an environment displayed visually on a large back-projected screen – again with head tracking. However, now when the participant looks far enough to one side, visual elements from the surrounding real world would intrude into the field of view. Actions that result in perception from outside of the virtual environment are also not valid sensorimotor actions.
This is also why HMDs may result in fewer breaks in presence: if the outside world is blocked, well, you won’t see the outside world! That happens a bit too often in CAVEs, unless you’re in a 6-sided CAVE.
Please note that we’re only talking about the physical actions of the user, and not what happens in the virtual environment.
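To make these definitions concrete, here’s a minimal sketch in TypeScript of the two sets and their union. Every action name in it is invented for illustration; nothing here comes from the paper:

```typescript
// All action names below are invented for illustration.
type Action = string;

// Valid Sensorimotor Actions: actions that change what you perceive.
const validSensorimotorActions = new Set<Action>([
  "turn head",
  "bend forward",
  "crouch and look underneath",
]);

// Valid Effectual Actions: actions that change the environment itself.
const validEffectualActions = new Set<Action>([
  "grab object",
  "press virtual button",
]);

// Valid Actions = the union of the two sets.
const validActions = new Set<Action>([
  ...validSensorimotorActions,
  ...validEffectualActions,
]);

console.log(validActions.size); // 5
```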
Defining your system
You might think that all those definitions are useless. It looks like we’re simply putting words on things you already intuitively know.
But actually, they’re the root of the definition of a VR system!
When you’re talking about the displays and devices of your system, you’re describing the technical solution to the question: “What is possible with my VR system?”.
The real answer is the set of Valid Actions as described above. Guess what type of VR system I’m describing:
– “I can move my head and one hand, and they will be tracked in a 3x2x2m box with a resolution of 0.1cm”
– “If I look forward, to the right, to the left and down, I will see the virtual environment”
– “My field of view is always at least 60°”
So, what system fulfills those Valid Actions?
A typical CAVE with 4 faces is a good candidate… but a (good) HMD-based setup will also work! You can see now that an application should provide exactly the same experience on different systems, as long as those systems provide the same Valid Actions!
Different VR systems are equivalent if they allow the same set of Valid Actions. Right now, I don’t think any CAVE is equivalent to an HMD-based system, and vice versa, but one day it could be!
Those definitions can also be used to classify VR systems:
In this view therefore, we describe immersion not by displays plus tracking, but as a property of the valid actions that are possible within the system. Generally, system A is at a higher level of immersion than system B if the valid actions of B form a proper subset of those of A.
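If you model each system’s Valid Actions as a set, this “higher level of immersion” relation becomes a simple proper-subset test. A minimal sketch, with made-up action names:

```typescript
type Action = string;

// System A is at a higher level of immersion than system B if the
// valid actions of B form a proper subset of those of A.
function isMoreImmersive(a: Set<Action>, b: Set<Action>): boolean {
  if (b.size >= a.size) return false; // a proper subset must be strictly smaller
  for (const action of b) {
    if (!a.has(action)) return false;
  }
  return true;
}

// A 6-sided CAVE vs a 4-sided one, with invented action names:
const sixSided = new Set<Action>([
  "look forward", "look left", "look right", "look down", "look up", "look behind",
]);
const fourSided = new Set<Action>([
  "look forward", "look left", "look right", "look down",
]);
console.log(isMoreImmersive(sixSided, fourSided)); // true
```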
Now, every aspect of your VR system is linked to the Valid Actions you want to provide:
In this framework displays and interactive capabilities are inseparable. Consider for example the issue of display resolution. At first sight this may appear to have nothing to do with interaction or SCs, but in fact if the participant wants to examine an object very closely, then the extent to which this is possible will be limited by the resolution of the display. Relatively low visual display resolution will mean that the normal action of bringing an object closer in vision by moving the body, head and eyes closer to it, will fail earlier than it would in physical reality, and at different times in different systems.
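A rough back-of-the-envelope calculation shows the limit: at the same viewing distance, a display with a coarser angular resolution than the eye’s acuity resolves coarser detail, so the “move closer to see more” action gives out sooner than in reality. The numbers below are assumptions for illustration, not figures from the paper:

```typescript
// At viewing distance d, a feature of size s subtends an angle of roughly
// atan(s / d). It is resolvable only if that angle exceeds the limiting
// angular resolution -- your eye's in reality, the display's in VR.
const DEG = Math.PI / 180;

function smallestVisibleFeatureMm(distanceM: number, angularResDeg: number): number {
  return distanceM * Math.tan(angularResDeg * DEG) * 1000;
}

const eyeAcuityDeg = 1 / 60;  // ~1 arcminute, typical human visual acuity
const displayPixelDeg = 0.05; // assumed: ~3 arcminutes per pixel

// At 50 cm, the eye could resolve ~0.15 mm, but this display stops at ~0.44 mm:
console.log(smallestVisibleFeatureMm(0.5, eyeAcuityDeg).toFixed(2));    // "0.15"
console.log(smallestVisibleFeatureMm(0.5, displayPixelDeg).toFixed(2)); // "0.44"
```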
That’s why, rather than giving the technical specifications of a VR system (it has 4 faces, 2x2x2m, 1 HD projector per face and optical tracking), we should instead give the set of Valid Actions that are possible in it: the head and hand are tracked in a 2x2x2m area with 0.1cm precision and a 120 Hz update rate; the horizontal FOV is 200°, and the vertical FOV is 90° up and 100° down when standing at the origin; the pixel angular resolution is …
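Here’s what such a description could look like as data. This is a sketch only; every field name is hypothetical and doesn’t correspond to any existing API:

```typescript
// Hypothetical, illustrative field names -- no real API is implied.
interface TrackedVolume {
  widthM: number;
  depthM: number;
  heightM: number;
  precisionCm: number;
  updateHz: number;
}

interface ValidActionsSpec {
  headTracking: TrackedVolume;
  handTracking?: TrackedVolume;
  horizontalFovDeg: number;
  verticalFovUpDeg: number;    // measured standing at the origin
  verticalFovDownDeg: number;
  pixelAngularResDeg?: number; // left unspecified in the text above
}

// The system described above, as data:
const mySystem: ValidActionsSpec = {
  headTracking: { widthM: 2, depthM: 2, heightM: 2, precisionCm: 0.1, updateHz: 120 },
  handTracking: { widthM: 2, depthM: 2, heightM: 2, precisionCm: 0.1, updateHz: 120 },
  horizontalFovDeg: 200,
  verticalFovUpDeg: 90,
  verticalFovDownDeg: 100,
};
```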
What particular hardware is used to achieve that doesn’t really matter. We still have to define the set of existing Valid Actions, but I think with current systems the set will be pretty small!
This framework also has the advantage of putting the user at the center of the VR system: everything is defined with respect to the user’s perception and actions. This is of course how systems are designed, but those requirements are never printed in press releases or on websites.
Portability
Now suppose you want to write a VR application that can run on many VR systems. How do you define the set of valid systems for your application, and how can the application know about those requirements?
Some VR software offers virtual devices that abstract your application from the physical device that will be used: simply instantiate a compatible device at runtime and your application will work. But if you only define a 3D tracker, your application won’t know the available tracking range. If the application knew that the head was within 20 cm of the tracking boundary (or 20 cm from a wall), it could warn the user, or trick them into moving in another direction, as in the sketch below.
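For illustration, here’s the kind of check an application could run if the abstraction exposed the tracked volume. The shapes are assumptions, and the 3x2x2m numbers are taken from the example system above; this isn’t any real API:

```typescript
// Hypothetical shapes -- just enough to express the idea.
interface Vec3 { x: number; y: number; z: number; }
interface Volume { min: Vec3; max: Vec3; }

// Distance from a point to the nearest face of an axis-aligned volume.
function distanceToBoundary(p: Vec3, vol: Volume): number {
  return Math.min(
    p.x - vol.min.x, vol.max.x - p.x,
    p.y - vol.min.y, vol.max.y - p.y,
    p.z - vol.min.z, vol.max.z - p.z,
  );
}

// A 3x2x2m tracked area; warn when the head gets within 20 cm of its edge.
const trackedArea: Volume = { min: { x: 0, y: 0, z: 0 }, max: { x: 3, y: 2, z: 2 } };
const headPosition: Vec3 = { x: 2.85, y: 1.0, z: 1.7 };

if (distanceToBoundary(headPosition, trackedArea) < 0.2) {
  console.log("Warning: you are about to leave the tracked area!");
}
```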
More advanced software will provide a user abstraction. This way, your application only relies on the virtual user’s head and hand positions, plus a virtual device for interaction; how each part of this virtual user is actually moved depends on the physical system. This is a step forward compared to virtual devices alone, since it formally puts the user back in the loop and provides more abstraction (and more standardization) than virtual devices only.
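A sketch of what such a user abstraction might look like; the interface is invented for illustration and doesn’t correspond to any particular middleware:

```typescript
// Hypothetical abstraction: the application sees only this interface;
// a CAVE driver and an HMD driver would each implement it from their
// own tracking hardware.
interface Pose {
  position: [number, number, number];
  orientation: [number, number, number, number]; // quaternion
}

interface VirtualUser {
  head: Pose;
  hand: Pose;
  buttonPressed(id: number): boolean;
}

// Application code depends only on VirtualUser, not on any device:
function isHandRaised(user: VirtualUser): boolean {
  // Illustrative check on the hand pose, however it was produced.
  return user.hand.position[1] > user.head.position[1];
}
```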
Defining the compatible VR systems for your application
All this is only one part of the solution.
Suppose you’re writing a VR tennis game. Your game will only work if you have a minimum volume of tracked area, and if your hand-tracking device is able to accurately record hand movement during a very fast swing to hit the ball.
Or say you want to create a VR climbing game. You will need to be able to track the user’s hands up to at least 2m high, and you’ll probably want a way for the user to look down to feel the height.
You’re simply defining the set of Valid Actions that your application requires to function correctly.
And any VR system that offers all those Valid Actions will be compatible with your application.
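One way to picture that matching step: the application declares the Valid Actions it requires, the system declares the ones it provides, and compatibility is a simple containment check. All names and numbers below are invented for illustration:

```typescript
// Hypothetical matching step between an application and a system.
interface AppRequirements {
  minTrackedHeightM: number; // e.g. climbing: hands tracked up to 2 m
  minHandUpdateHz: number;   // e.g. tennis: fast swings must register
  requiredActions: Set<string>;
}

interface SystemCapabilities {
  trackedHeightM: number;
  handUpdateHz: number;
  validActions: Set<string>;
}

function isCompatible(app: AppRequirements, sys: SystemCapabilities): boolean {
  if (sys.trackedHeightM < app.minTrackedHeightM) return false;
  if (sys.handUpdateHz < app.minHandUpdateHz) return false;
  for (const action of app.requiredActions) {
    if (!sys.validActions.has(action)) return false;
  }
  return true;
}

// Invented numbers for a climbing game on a hypothetical system:
const climbingGame: AppRequirements = {
  minTrackedHeightM: 2,
  minHandUpdateHz: 60,
  requiredActions: new Set(["look down", "reach above head"]),
};
const someSystem: SystemCapabilities = {
  trackedHeightM: 2.2,
  handUpdateHz: 120,
  validActions: new Set(["look down", "look forward", "reach above head"]),
};
console.log(isCompatible(climbingGame, someSystem)); // true
```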
Now, how do you define those Valid Actions for applications and systems? How do you match them? How will this impact the creation of an application? Will it be simpler or more complicated to create a VR application?
In any case, it should improve application portability and the overall user experience by making the minimum requirements of an application explicit.
This is a nice research topic and is left as an exercise to the reader 😉