Introduction

Immersive virtual reality (IVR) places the user into a computer-generated but vivid scenario rich with sensory input, eliciting a strong sense of immersion and creating the illusion of a real experience (Suh & Prophet, 2018). IVR also prompts strong, “realistic” psychological and physiological user responses (e.g., emotions; Diemer et al., 2015).

IVR is accessed using either an HMD (head-mounted display) or a CAVE (Cave Automatic Virtual Environment). An HMD is a headset worn by the user that has two integrated lenses. A CAVE projects the IVR content onto multiple screens that physically surround the user. Both systems track the user’s head movements, among other things, and continuously update what is displayed to match the user’s viewing perspective as they move. Gonzalez-Franco and Lanier (2017) argue this aligns IVR content with what the user’s brain expects to happen next, so users “tend to treat the simulated reality as real, which in turn will engage additional neural mechanisms to further the veracity of the illusion” (p. 2). The hypothesized supporting neural mechanisms are threefold (Gonzalez-Franco & Lanier, 2017): sensory input from multiple modalities is congruent and can thus be integrated well (bottom-up multisensory processing), the user feels the motor actions they take in IVR are indeed initiated by them (sensorimotor self-awareness), and the user can take actions in IVR to confirm predicted outcomes (top-down prediction manipulation). However, whether IVR indeed elicits the same brain response as a real experience remains controversial. Related empirical findings are still scarce, as established neuroimaging methods conflict with the modalities of IVR (Bailenson, 2018): The user’s ability to move their body to navigate the virtual space is key to IVR but causes issues for EEG (e.g., increased noise due to movement artifacts) and fMRI (subjects are required to lie still).

Hence, applied studies focusing on IVR’s practical implications for brain plasticity – the human brain’s remarkable ability to adapt its structure and connections in response to stimuli and changing demands (Kolb & Gibb, 2011) – are more numerous. For example, IVR is used successfully in treating deficits in clinical populations (e.g., treatment of phobias; Rizzo & Koenig, 2017) and enhancing abilities in regular populations (e.g., learning and teaching in schools; Hew & Cheung, 2010). Notably, one of the technology’s core advantages is access to realistic experiences without the risks of the real-life scenario. This has proven especially useful for training social skills in people with autism spectrum disorder (ASD), as it enables them to practice social interaction flexibly and repeatedly without embarrassment or trouble (Ip et al., 2018). Similarly, IVR is useful for training empathy and prosocial behavior in adults (Herrera et al., 2018): it creates a vivid illusion of reality in which the user can embody any object or living being, including other people, and thus take someone else’s perspective. Hence, the present essay reviews two related pieces of evidence: a study by Ip et al. (2018) using IVR to enhance social and emotional skills in children with ASD, and a study by Herrera et al. (2018) using IVR to improve adults’ empathy and perspective-taking abilities.

Study A

Although the characterization is controversial, the research literature broadly describes people with ASD as having “deficits in emotional and social adaptation skills” (Ip et al., 2018, p. 1). To help children with ASD better integrate into society, Ip et al. (2018) ran an IVR intervention program with 94 primary school students from Hong Kong, targeting said emotional and social adaptation skills. It is the largest study conducted in this strand of research thus far (see Bailey et al., 2021; Mosher et al., 2021 for extensive reviews).

The study employed an experimental design in which the intervention group (N = 47) went through the IVR intervention, accompanied by a trainer, whereas the control group (N = 47) did not receive the intervention. Children in the intervention group completed a pre-assessment phase, an intervention phase, and a post-assessment phase. Children in the control group only completed the pre- and post-assessment phases. The authors hypothesized that the intervention group children would show significant improvement on all post-assessment measures of emotional and social adaptation skills, while control group children would show no significant improvement.

In the pre- and post-assessment phases, all children (and in some cases their parents) completed questionnaires and tests regarding their cognitive profile, emotion recognition, emotion expression, emotion regulation, and social adaptive skills. In the intervention phase, the children completed 28 IVR training sessions over 14 weeks. The authors designed six IVR scenarios that broadly matched the daily experience of Hong Kong schoolchildren. For instance, in scenario three, the user is in the school library and handles various social challenges like sharing their workspace with others and borrowing a book at the library counter (Ip et al., 2018). All scenarios were presented in half-CAVE, i.e., the IVR content was projected onto a setup with four screens surrounding the user.

In summary, the study reported most of the expected positive effects: By the end of the 14 weeks, the intervention group children had significantly improved on measures of emotion expression, emotion regulation, and social reciprocity. However, they showed no improvement in their emotion recognition or social adaptive skills. Interestingly, while control group children showed no pre-post differences on most measures, they improved on two social skills subscales, i.e., communication and community use (e.g., road crossing behavior). Unfortunately, the authors do not mention whether the control group children could participate in other interventions or training parallel to the study. Since the study spanned five months, and all children in the sample were “required to attend mainstream school in the inclusive education setting” (Ip et al., 2018, p. 1), it is possible that control group children underwent training in road crossing behavior, leading to the observed improvements.

The study aligns with the rest of the recent empirical evidence (see Mosher et al., 2021) in indicating that IVR interventions can help children with ASD build social skills in an engaging and effective way. However, the sample contained disproportionately few girls, i.e., just 8 out of 94 participants. Future studies should recruit a much higher proportion of girls so that the generalizability of the results is not limited to boys. Replication is also difficult for researchers and practitioners interested in (re-)testing the intervention: not only is the IVR setup highly specific (i.e., complicated hardware and IVR content tailored toward Hong Kong schoolchildren), but it is also expensive according to the authors.

Study B

Herrera et al. (2018) designed and conducted two separate studies, of which only the first is reviewed in the present essay. They investigated how an IVR perspective-taking task affects empathy and prosocial behavior compared to a traditional perspective-taking task, both short- and long-term. The target population to empathize with was the homeless because they are “considered an extreme outgroup that people often struggle to empathize with … [and] anyone can become homeless” (p. 6).

One group completed a traditional narrative-based perspective-taking task (NPT; N = 56) and the other a perspective-taking task in IVR (VRPT; N = 61). The authors hypothesized that compared to NPT participants, VRPT participants would show increased empathy and prosocial behaviors both right after the intervention and in the long term, within eight weeks after the intervention. In the NPT, participants imagined becoming homeless and then read a narrative text. In the VRPT, participants experienced homelessness in IVR. The narrative shown in IVR matched the narrative in the NPT’s text, containing three scenes: 1) eviction from your own apartment, 2) living out of your car, leading to a police encounter, and 3) seeking shelter on a bus with the worry of getting your things stolen.

The study contained a pre-intervention phase, an intervention phase, a post-intervention phase, and a follow-up phase. In the pre-intervention phase, participants answered demographic questions and completed two scales, one assessing differences in empathy and the other assessing beliefs about empathy. Next, participants completed either the NPT or the VRPT in about 15 minutes. In the post-intervention phase, right after task completion, all participants completed a range of self-report scales and questionnaires on empathic concern, personal distress, closeness and connectedness to homeless people, dehumanization of homeless people, and attitude toward homeless people. Participants were also asked about their agreement with “proposition A” concerning affordable housing, their willingness to sign a related petition, and what amount they were willing to donate to homeless shelters. In the follow-up phase, participants completed further prosocial behavior tasks (writing letters about the homeless and stating how much they agreed with “measure B” regarding affordable housing).

In summary, the VRPT intervention showed some positive short-term effects (increased empathy and personal distress) and long-term effects (improved attitude toward the homeless and less dehumanization) compared to NPT. However, other measures such as closeness and connectedness with, dehumanization of, and attitudes toward the homeless saw no change. As for prosocial behavior, while the amount donated to homeless shelters and the support for “proposition A” did not differ across conditions, significantly more VRPT participants signed the petition supporting affordable housing.

There are a few criticisms to be raised. First, among the many measures the authors administered, they did not include a measure of social desirability. This is surprising – the study design lends itself to influences of social desirability, and valid short-form measures (e.g., “Marlowe-Crowne Social Desirability Scale”; Fischer & Fick, 1993) are readily available. Second, the authors argue IVR avoids the biases and preexisting schemas of participants since the task scenario “is all rendered digitally, and […] at an advantage in terms of accurate content, in comparison to traditional perspective-taking” (p. 4). However, this argument ignores that IVR content is also programmed by humans with their own biases and schemas. Thus, researchers should not assume it is inherently bias-free. On a positive note, Study B provided detailed descriptions of its IVR setup and narrative, even making the IVR content it used freely downloadable from the Steam Store (https://store.steampowered.com/app/738100/Becoming_Homeless_A_Human_Experience/). This is a great example of facilitating replication for researchers and practitioners.

Discussion and Conclusion

Ip et al. (2018) and Herrera et al. (2018) have provided evidence for IVR’s potential in building the social and emotional skills of children with ASD and in enhancing adults’ empathy and perspective-taking abilities, respectively. Both studies succeeded in helping individuals build skills essential to integration into society, human interaction, and society’s functioning as a whole. They also jointly demonstrate two major strengths of IVR-based interventions: First, IVR can mimic scenes from real life in an authentic, engaging, and safe way. This is of practical use to children with ASD, who can practice social interaction flexibly, repeatedly, and without embarrassment. It also allows anyone to dive into scenarios that are otherwise associated with considerable real-life risks, such as the experience of homelessness. Second, IVR comes with high control over the content shown to the user, which is advantageous to both interventions and experimental studies as it ensures standardization and replicability.

As for brain plasticity, neither of the reviewed studies employed neuroimaging methods. Still, they add to the applied research literature in discussing IVR’s potential for treating deficits in clinical populations and enhancing abilities in regular populations. Although ASD is considered a developmental disorder with persistent symptoms and deficits (Diagnostic and Statistical Manual of Mental Disorders, 2013), Ip et al.’s (2018) work adds to the evidence (see Bailey et al., 2021; Mosher et al., 2021) that these aspects are not immutable but can be improved so that the individual can better deal with society’s demands. Herrera et al.’s (2018) work also highlights IVR’s ability to make users perceive a simulated experience as real, although outcomes were mixed. Whether this illusion of a real experience can be explained by the neural mechanisms Gonzalez-Franco and Lanier (2017) propose – i.e., bottom-up multisensory processing, sensorimotor self-awareness, and top-down prediction manipulation – requires more neuroscientific study and advances in VR-conducive imaging methods.

Despite these promising findings and the many potential use cases of IVR, it should be noted that corresponding research is still in its infancy – even recent reviews note the general effects of IVR use on development are largely unknown (Bailey & Bailenson, 2017; Common Sense Media, 2018). While VR manufacturers seem to acknowledge this limitation, age recommendations still vary strongly, ranging from “age seven and up, with adult supervision” to “should not be used by children under age 13” (Common Sense Media, 2018). It is thus essential that both the VR industry and scientists continue treading with caution in how and with whom they employ IVR. Bailenson (2018, p. 322) puts it succinctly: “[…] if an experience is not impossible, dangerous, expensive, or counterproductive, then you should seriously consider using a different medium—or even doing it in the real world. Save VR for the special moments.”