The VR experience is nothing if it is not immersive, and in language learning, the value of immersion in VR is seen to be the way in which it can lead to what we might call ‘engagement’ or ‘flow’. When learners are fully immersed in a VR world, learning can be maximized, or so the thinking goes (Lan, 2020; Chen & Hsu, 2020). ‘By blocking out visual and auditory distractions in the classroom, VR has the potential to help students deeply connect with the material’ (Gadelha, 2018). ‘There are no distracting classroom windows to stare out of when students are directly immersed into the topic they are investigating’ (Bonner & Reinders, 2018: 36). Such is the allure of immersion that it is no surprise to find the word in the names of VR language learning products like Immerse and ImmerseMe (although the nod to bilingual immersion programmes, such as those in Canada, is an added bonus).

There is, however, immersion and immersion. A common categorisation of VR is into:

  • non-immersive (e.g. a desktop game with a 2D screen and avatars)
  • semi-immersive (e.g. high-end arcade games and flight simulators with large projections)
  • fully immersive (e.g. with a head-mounted display, headphones, body sensors)

Taking things a little further is the possibility of directly inducing responses in the nervous system with molecular nanotechnology. We’re some way off that, but, fear not, people are working on it. At this point, it’s worth noting that this hierarchy of immersivity is driven by technological considerations: more tech = more immersion.

In ELT, the most common VR applications are currently at the low end of this scale. Probably the most talked about is the use of 360° photography and a very simple headset like Google Cardboard, along with headphones, to take students on virtual field trips – anywhere from a museum or a Disney castle to a coral reef or outer space. See Raquel Ribeiro’s blog post for CUP for more ideas. Then, there are self-study packages, like Velawoods, which is a sort of combination of The Sims with interaction made possible through speech recognition. The syllabus will be familiar to anyone used to using contemporary coursebooks.

And, now, up a technological notch or two, is Immerse, which requires an Oculus headset. It appears to be a sort of Second Life where language learners can interact with each other and a trainer in a number of role plays, set in, for example, a garden barbecue, a pool bar, a conference or a deserted island. In addition to interacting with each other, students can interact with virtual objects, picking up darts and throwing them at questions they want to focus on, for example. ‘Total physical engagement with the environment’ is how this is described by Immerse’s Chief Product Officer. You can find out more in this promotional video.

Paul Driver has suggested that the evolution of VR can be ‘traced back through time as a constant struggle to create more immersive experiences. From the intricate scrolls of twelfth-century China, the huge panoramic paintings of the nineteenth century and early experiments in stereoscopic photography, to the promising but over-hyped 1990s arcade machines (which raised hopes and then dashed expectations for a whole generation), the history of virtual reality has been a meandering march forward, punctuated with long periods of stagnation’. Immerse may be fairly sophisticated as a VR language learning platform, but it has a long way to go as an immersive environment in comparison to games like Meeting Rembrandt: Master of Reality or Project VR Fishing. Its animations are crude and clunky, its scenarios short of detail.

But however ‘lifelike’ games like these are, their immersive potential is extremely limited if you have no interest in Rembrandt or fishing. VR is only as immersive as the intrinsic interest of (1) the ‘real world’ it is attempting to replicate, and (2) what you can do in it. The novelty factor may hold attention for a while, but not for long.

With simpler 360° Google Cardboard versions of VR, you can’t actually do anything in the VR world besides watch, listen and marvel, so the intrinsic interest of the content is even more important. I quite like exploring the Okavango Delta, but I have no interest in rollercoasters or parachute jumps. But, to be immersed, I don’t actually need the 360° experience at all, if the quality of the video is good enough. In many ways, I prefer an old-fashioned screen where my hands are not tied up with holding the phone in the Cardboard and the Cardboard to my nose.

360° videos are usually short, and I can see how they can be used in a language class as a springboard for other work. But as a language learning tool, old-fashioned screens (with good content) may offer more potential than headsets (whether Cardboard or Oculus) because we can do other things (like communicate with other people, use a dictionary or take notes) at the same time.

VR technology in language learning cannot, therefore, (whatever its claims) generate immersion or engagement on its own. For the time being, it can, for some, captivate initial curiosity. For others, already used to high-end Oculus games, programmes like Immerse are more likely to generate a resounding ‘meh’. Engagement in learning is a highly complex phenomenon. Mercer and Dörnyei (2020: 102 ff.) argue that engaging learning materials must be designed for particular groups of learners (in terms of level and interests, for example) and they must get learners emotionally invested. Improvements in VR technology won’t really change anything.

VR is already well established and successful in some forms of education: military, healthcare and engineering, especially. Virtual reality is obviously a good place to learn how to defuse a bomb or carry out keyhole surgery. In other areas, such as soft skills training in corporate contexts, its use is growing, but its effectiveness is much less clear. In language learning, the purported advantages of VR (see, for example, Alizadeh, 2019, which has a useful bibliography, or Lloyd et al., 2017) are not convincing. There is no problem in language learning for which VR is the solution. This doesn’t mean that VR does not have a place in language learning / teaching. VR field trips may offer occasional moments of variety. Conversation in VR worlds like Facebook Spaces may be welcomed by some. And there will be markets for dedicated platforms like Velawoods, Mondly or Immerse.

Predictions about edtech are often thinly disguised attempts to accelerate a predicted future. Four years ago I went to a conference presentation by Saul Nassé, Chief Executive of Cambridge Assessment. All the participants were given a Cambridge branded Google Cardboard. At the time, Nassé wrote the following:

‘The technology is only going to get better and cheaper. In two or three years it will be wireless and cost less than a smart phone. That’s the point when you’ll see whole classrooms equipped with VR. And I like to think we’ll find a way of Cambridge English content being used in those classrooms, with people learning English in a whole new way. It may have been a long time coming, but I think the VR revolution is now truly here to stay’.

The message was echoed in Lloyd et al. (2017), all three of whom worked for Cambridge Assessment, and amplified in a series of blog posts and conference presentations around that time. Since then, it has all gone rather quiet. There are still people out there (including the investors who have just pumped $1.5 million into Immerse in Series A funding) who believe that VR will be the next big thing in language learning. But edtech investors have a long track record of turning a blind eye to history. VR, as Saul Nassé observed, ‘has been the next big thing for thirty years’. And maybe for the next thirty years, too.


As both a language learner and a teacher, I have a number of questions about the value of watching subtitled videos for language learning. My interest is in watching extended videos, rather than short clips for classroom use, so I am concerned with incidental, rather than intentional, learning, mostly of vocabulary. My questions include:

  • Is it better to watch a video that is subtitled or unsubtitled?
  • Is it better to watch a video with L1 or L2 subtitles?
  • If a video is watched more than once, what is the best way to start and proceed? In which order (no subtitles, L1 subtitles and L2 subtitles) is it best to watch?

For help, I turned to three recent books about video and language learning: Ben Goldstein and Paul Driver’s Language Learning with Digital Video (CUP, 2015), Kieran Donaghy’s Film in Action (Delta, 2015) and Jamie Keddie’s Bringing Online Video into the Classroom (OUP, 2014). I was surprised to find no advice, but, as I explored more, I discovered that there may be a good reason for these authors’ silence.

There is now a huge literature out there on subtitles and language learning, and I cannot claim to have read it all. But I think I have read enough to understand that I am not going to find clear-cut answers to my questions.

The learning value of subtitles

It has been known for some time that the use of subtitles during extensive viewing of video in another language can help in the acquisition of that language. The main gains are in vocabulary acquisition and the development of listening skills (Montero Perez et al., 2013). This is true of both L1 subtitles with an L2 audio track, sometimes called interlingual subtitles (Incalcaterra McLoughlin et al., 2011), and L2 subtitles with an L2 audio track, sometimes called intralingual subtitles or captions (Vanderplank, 1988). Somewhat more surprisingly, vocabulary gains may also come from what are called reversed subtitles (L2 subtitles and an L1 audio track) (Burczyńska, 2015). Of course, certain conditions apply for subtitled video to be beneficial, and I’ll come on to these. But there is general research agreement (an exception is Karakaş & Sariçoban, 2012) that more learning is likely to take place from watching a subtitled video in a target language than an unsubtitled one.

Opposition to the use of subtitles as a tool for language learning has mostly come from three angles. The first of these, which concerns L1 subtitles, is an antipathy to any use at all of L1. Although such an attitude remains entrenched in some quarters, there is no evidence to support it (Hall & Cook, 2012; Kerr, 2016). Researchers and, increasingly, teachers have moved on.

The second reservation that is sometimes expressed is that learners may not attend to either the audio track or the subtitles if they do not need to. They may, for example, ignore the subtitles in the case of reversed subtitles or ignore the L2 audio track when there are L1 subtitles. This can, of course, happen, but it seems that, on the whole, it does not. In an eye-tracking study by Bisson et al. (2012), for example, it was found that most people followed the subtitles, irrespective of what kind they were. Unsurprisingly, they followed the subtitles more closely when the audio track was in a language that was less familiar. When conditions are right (see below), reading subtitles becomes a very efficient and partly automatized cognitive activity, which does not prevent people from processing the audio track at the same time (d’Ydewalle & Pavakanun, 1997).

Related to the second reservation is the concern that the two sources of information (audio and subtitles), combined with other information (images and music or sound effects), may be in competition and lead to cognitive overload, impacting negatively on both comprehension and learning. Recent research suggests that this concern is unfounded (Kruger et al., 2014). L1 subtitles generate less cognitive load than L2 subtitles, but overload is not normally reached and mental resources are still available for learning (Baranowska, 2020). The absence of subtitles generates more cognitive load.

Conditions for learning

Before looking at the differences between L1 and L2 subtitles, it’s a good idea to look at the conditions under which learning is more likely to take place with subtitles. Some of these are obvious, others less so.

First of all, the video material must be of sufficient intrinsic interest to the learner. Secondly, the subtitles must be of a sufficiently high quality. This is not always the case with automatically generated captions, especially if the speech-to-text software struggles with the speaker’s accent. It is also not always the case with professionally produced L1 subtitles, especially when the ‘translations are non-literal and made at the phrase level, making it hard to find connections between the subtitle text and the words in the video’ (Kovacs, 2013, cited by Zabalbeascoa et al., 2015: 112). As a minimum, standard subtitling guidelines, such as those produced for the British Channel 4, should be followed. These limit, for example, the number of characters per line to about 40 and the number of lines on screen to a maximum of two.
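For anyone preparing their own subtitle files, the two limits quoted above are easy to check automatically. What follows is only a rough sketch in Python, not part of any published guideline: the `Cue` structure, the function names and the exact thresholds (40 characters, two lines) are my own assumptions for illustration, and real guidelines contain many more rules (timing, reading speed, line breaks) that are not covered here.

```python
from dataclasses import dataclass

# Assumed limits, based only on the figures quoted in the text above
# (about 40 characters per line, at most two lines per subtitle).
MAX_CHARS_PER_LINE = 40
MAX_LINES = 2


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure, for illustration only)."""
    index: int
    text: str  # on-screen lines separated by "\n"


def check_cue(cue: Cue) -> list[str]:
    """Return a list of readable problems found in a single cue."""
    problems = []
    lines = cue.text.split("\n")
    if len(lines) > MAX_LINES:
        problems.append(f"cue {cue.index}: {len(lines)} lines (max {MAX_LINES})")
    for n, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(
                f"cue {cue.index}, line {n}: {len(line)} characters "
                f"(max {MAX_CHARS_PER_LINE})"
            )
    return problems


def check_subtitles(cues: list[Cue]) -> list[str]:
    """Collect the problems for every cue, in order."""
    problems = []
    for cue in cues:
        problems.extend(check_cue(cue))
    return problems
```

Calling `check_subtitles` on a list of cues returns an empty list when every cue is within the assumed limits. A check like this says nothing about the accuracy of the subtitles, which is the harder problem with automatically generated captions.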

For reasons that I’ll come on to, learners should be able to switch easily between L1 and L2 subtitles. They are also likely to benefit if reliably accurate glosses or hyperlinks are ‘embedded in the subtitles, making it possible for a learner to simply click for additional verbal, auditory or even pictorial glosses’ (Danan, 2015: 49).

At least as important as considerations of the materials or tools is a consideration of what the learner brings to the activity (Frumuselu, 2019: 104). Vanderplank (2015) describes these different kinds of considerations as the ‘effects of’ subtitles on a learner and the ‘effects with’ subtitles on learner behaviour.

In order to learn from subtitles, you need to be able to read fast enough to process them. Anyone with a slow reading speed (e.g. some dyslexics) in their own language is going to struggle. Even with L1 subtitles, Vanderplank (2015: 24) estimates that it is only around the age of 10 that children can do this with confidence. Familiarity with both the subject matter and with subtitle use will impact on this ability to read subtitles fast enough.

With L2 subtitles, the language proficiency of the learner relative to the level of difficulty (especially lexical difficulty) of the subtitles will clearly be of some significance. It is unlikely that L2 subtitles will be of much benefit to beginners (Taylor, 2005). This also suggests that, at lower levels, materials need to be chosen carefully. On the whole, researchers have found that higher proficiency levels correlate with greater learning gains (Pujadas & Muñoz, 2019; Suárez & Gesa, 2019), but one earlier meta-analysis (Montero Perez et al., 2013) did not find that proficiency levels were significant.

Measures of general language proficiency may be too blunt an instrument to help us all of the time. I can learn more from Portuguese than from Arabic subtitles, even though I am a beginner in both languages. The degree of proximity between two languages, especially the script (Winke et al., 2010), is also likely to be significant.

But a wide range of other individual learner differences will also impact on the learning from subtitles. It is known that learners approach subtitles in varied and idiosyncratic ways (Pujolá, 2002), with some using L2 subtitles only as a ‘back-up’ and others relying on them more. Vanderplank (2019) grouped learners into three broad categories: minimal users who were focused throughout on enjoying films as they would in their L1, evolving users who showed marked changes in their viewing behaviour over time, and maximal users who tended to be experienced at using films to enhance their language learning.

Categories like these are only the tip of the iceberg. Sensory preferences, personality types, types of motivation, the impact of subtitles on anxiety levels and metacognitive strategy awareness are all likely to be important. For the last of these, Danan (2015: 47) asks whether learners should be taught ‘techniques to make better use of subtitles and compensate for weaknesses: techniques such as a quick reading of subtitles before listening, confirmation of word recognition or meaning after listening, as well as focus on form for spelling or grammatical accuracy?’

In short, it is, in practice, virtually impossible to determine optimal conditions for learning from subtitles, because we cannot ‘take into account all the psycho-social, cultural and pedagogic parameters’ (Gambier, 2015). With that said, it’s time to take a closer look at the different potential of L1 and L2 subtitles.

L1 vs L2 subtitles

Since all other things are almost never equal, it is not possible to say that one kind of subtitles offers greater potential for learning than another. As regards gains in vocabulary acquisition and listening comprehension, there is no research consensus (Baranowska, 2020: 107). Research does, however, offer us a number of pointers.

Extensive viewing of subtitled video (both L1 and L2) can offer ‘massive quantities of authentic and comprehensible input’ (Vanderplank, 1988: 273). With lower level learners, the input is likely to be more comprehensible with L1 subtitles, and, therefore, more enjoyable and motivating. This makes them often more suitable for what Caimi (2015: 11) calls ‘leisure viewing’. Vocabulary acquisition may be better served with L2 subtitles, because they can help viewers to recognize the words that are being spoken, increase their interaction with the target language, provide further language context, and increase the redundancy of information, thereby enhancing the possibility of this input being stored in long-term memory (Frumuselu et al., 2015). These effects are much more likely with Vanderplank’s (2019) motivated, ‘maximal’ users than with ‘minimal’ users.

There is one further area where L2 subtitles may have the edge over L1. One of the values of extended listening in a target language is the improvement in phonetic retuning (see, for example, Reinisch & Holt, 2013), the ability to adjust the phonetic boundaries in your own language to the boundaries that exist in the target language. Learning how to interpret unusual speech-sounds, learning how to deal with unusual mappings between sounds and words and learning how to deal with the acoustic variations of different speakers of the target language are all important parts of acquiring another language. Research by Mitterer and McQueen (2009) suggests that L2 subtitles help in this process, but L1 subtitles hinder it.

Classroom implications?

The literature on subtitles and language learning echoes with the refrain of ‘more research needed’, but I’m not sure that further research will lead to less ambiguous, practical conclusions. One of my initial questions concerned the optimal order of use of different kinds of subtitles. In most extensive viewing contexts, learners are unlikely to watch something more than twice. If they do (watching a recorded academic lecture, for example), they are likely to be more motivated by a desire to learn from the content than to learn language from the content. L1 subtitles will probably be preferred, and will have the added bonus of facilitating note-taking in the L1. For learners who are more motivated to learn the target language (Vanderplank’s ‘maximal’ users), a sequence of subtitle use, starting with the least cognitively challenging and moving to greater challenge, probably makes sense. Danan (2015: 46) suggests starting with an L1 soundtrack and reversed (L2) subtitles, then moving on to an L2 soundtrack and L2 subtitles, and ending with an L2 soundtrack and no subtitles. I would replace her first stage with an L2 soundtrack and L1 subtitles, but this is based on hunch rather than research.

This sequencing of subtitle use is common practice in language classrooms, but, here, (1) the video clips are usually short, and (2) the aim is often not incidental learning of vocabulary. Typically, the video clip has been selected as a tool for deliberate teaching of language items, so different conditions apply. At least one study has confirmed the value of the common teaching practice of pre-teaching target vocabulary items before viewing (Pujadas & Muñoz, 2019). The drawback is that, by getting learners to focus on particular items, less incidental learning of other language features is likely to take place. Perhaps this doesn’t matter too much. In a short clip of a few minutes, the opportunities for incidental learning are limited, anyway. With short clips and a deliberate learning aim, it seems reasonable to use L2 subtitles for a first viewing, and no subtitles thereafter.

An alternative frequent use of short video clips in classrooms is to use them as a springboard for speaking. In these cases, Baranowska (2020: 113) suggests that teachers may opt for L1 subtitles first, and follow up with L2 subtitles. Of course, with personal viewing devices or in online classes, teachers may want to exploit the possibilities of differentiating the subtitle condition for different learners.


