
I was intrigued to learn earlier this year that Oxford University Press had launched a new online test of English language proficiency, called the Oxford Test of English (OTE). At the conference where I first heard about it, I was struck by the fact that the presentation of the OUP-sponsored plenary speaker was entitled ‘The Power of Assessment’ and dealt with formative assessment / assessment for learning. Oxford clearly want to position themselves as serious competitors to Pearson and Cambridge English in the testing business.

The brochure for the exam kicks off with a gem of a marketing slogan, ‘Smart. Smarter. SmarTest’ (geddit?), and the next few pages give us all the key information.

Faster and more flexible

‘Traditional language proficiency tests’ is presumably intended to refer to the main competition (Pearson and Cambridge English). Cambridge First takes, in total, 3½ hours; the Pearson Test of English Academic takes 3 hours. The OTE takes, in total, 2 hours and 5 minutes. It can be taken, in theory, on any day of the year, although this depends on the individual Approved Test Centres, and, again in theory, it can be booked as little as 14 days in advance. Results should take only two weeks to arrive. Further flexibility is offered in the way that candidates can pick ’n’ choose which of the four skills they want to have tested, just one or all four, although, as an incentive to go the whole hog, they will only get a ‘Certificate of Proficiency’ if they do all four.

A further incentive to do all four skills at the same time can be found in the price structure. One centre in Spain is currently offering the test for one single skill at €41.50, but do the whole lot, and it will only set you back €89. For a high-stakes test, this is cheap. In the UK right now, both Cambridge First and Pearson Academic cost in the region of £150, and IELTS a bit more than that. So, faster, more flexible and cheaper … Oxford means business.

Individual experience

The ‘individual experience’ on the next page of the brochure is pure marketing guff. This is, after all, a high-stakes, standardised test. It may be true that ‘the Speaking and Writing modules provide randomly generated tasks, making the overall test different each time’, but there can only be a certain number of permutations. What’s more, in ‘traditional tests’, like Cambridge First, where there is a live examiner or two, an individualised experience is unavoidable.

More interesting to me is the reference to adaptive technology. According to the brochure, ‘The Listening and Reading modules are adaptive, which means the test difficulty adjusts in response to your answers, quickly finding the right level for each test taker. This means that the questions are at just the right level of challenge, making the test shorter and less stressful than traditional proficiency tests’.

My curiosity piqued, I decided to look more closely at the Reading module. I found one practice test online which is the same as the demo that is available at the OTE website. Unfortunately, this example is not adaptive: it is at B1 level. The actual test records scores between 51 and 140, corresponding to levels A2, B1 and B2.

[Image: Test scores]

The tasks in the Reading module are familiar from coursebooks and other exams: multiple choice, multiple matching and gapped texts.

[Image: Reading tasks]

According to the exam specifications, these tasks are designed to measure the following skills:

  • Reading to identify main message, purpose, detail
  • Expeditious reading to identify specific information, opinion and attitude
  • Reading to identify text structure, organizational features of a text
  • Reading to identify attitude / opinion, purpose, reference, the meanings of words in context, global meaning

The ability to perform these skills depends, ultimately, on the candidate’s knowledge of vocabulary and grammar, as can be seen in the examples below.

[Images: Task 1 and Task 2]

How exactly, I wonder, does the test difficulty adjust in response to the candidate’s answers? The algorithm that is used depends on measures of the difficulty of the test items. If these items are to be made harder or easier, the only significant way that I can see of doing this is by making the key vocabulary lower- or higher-frequency. This, in turn, is only possible if vocabulary and grammar have been tagged as being at a particular level. The best-known tools for doing this have been developed by Pearson (with the GSE Teacher Toolkit) and Cambridge English Profile. To the best of my knowledge, Oxford does not yet have a tool of this kind (at least, none that is publicly available). However, the data that OUP will accumulate from OTE scripts and recordings will be invaluable in building a database which their lexicographers can use in developing such a tool.
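Out of curiosity, here is a minimal sketch of how a computer-adaptive test of this kind might select items. It assumes a one-parameter (Rasch) model, a common choice in adaptive testing, but nothing here is based on any published detail of the OTE: the function names, the item format and the fixed step size are all my own invention.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under a one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def update_ability(ability, difficulty, correct, step=0.5):
    """Nudge the ability estimate towards the observed response.
    (Operational tests use maximum-likelihood or Bayesian estimation;
    a fixed step keeps the sketch simple.)"""
    expected = rasch_probability(ability, difficulty)
    return ability + step * ((1.0 if correct else 0.0) - expected)

def next_item(item_bank, administered_ids, ability):
    """Choose the unused item whose difficulty is closest to the current
    ability estimate, i.e. the most informative one for this candidate."""
    unused = [item for item in item_bank if item["id"] not in administered_ids]
    return min(unused, key=lambda item: abs(item["difficulty"] - ability))
```

Whatever the real mechanics, the whole mechanism stands or falls on the difficulty values attached to the items, which brings us back to the question of how those values are arrived at.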

Even when a data-driven (and numerically precise) tool is available for modifying the difficulty of test items, I still find it hard to understand how the adaptivity will impact on the length or the stress of the reading test. The Reading module is only 35 minutes long and contains only 22 items. Anything that is significantly shorter must surely impact on the reliability of the test.
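For a rough sense of that trade-off, the classical Spearman-Brown prophecy formula predicts how reliability falls as a test is shortened. Adaptive item selection can claw some of this back by targeting items at the candidate’s level, but the figures below (invented for illustration, not OTE data) show why 22 items already feels like a floor rather than a starting point.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is shortened or lengthened
    by the given factor (classical test theory)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Purely illustrative: if a 22-item module had a reliability of 0.85,
# halving it to 11 items (factor 0.5) would predict:
print(round(spearman_brown(0.85, 0.5), 2))  # 0.74
```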

My conclusion from this is that the adaptive element of the Reading and Listening modules in the OTE is less important to the test itself than it is to building a sophisticated database (not dissimilar to the GSE Teacher Toolkit or Cambridge English Profile). The value of this will be found, in due course, in calibrating all OUP materials. The OTE has already been aligned to the Oxford Online Placement Test (OOPT) and, presumably, coursebooks will soon follow. This, in turn, will facilitate a vertically integrated business model, like those of Pearson and CUP, where everything from placement testing, through coursework and formative assessment, to final proficiency testing is on offer.

There has been wide agreement for a long time that one of the most important ways of building the mental lexicon is by having extended exposure to language input through reading and listening. Some researchers (e.g. Krashen, 2008) have gone as far as to say that direct vocabulary instruction serves little purpose, as there is no interface between explicit and implicit knowledge. This remains, however, a minority position, with a majority of researchers agreeing with Barcroft (2015) that deliberate learning plays an important role, even if it is only ‘one step towards knowing the word’ (Nation, 2013: 46).

There is even more agreement when it comes to the differences between deliberate study and extended exposure to language input, in terms of the kinds of learning that take place. Whilst basic knowledge of lexical items (the pairings of meaning and form) may be developed through deliberate learning (e.g. flash cards), it is suggested that ‘the more “contextualized” aspects of vocabulary (e.g. collocation) cannot be easily taught explicitly and are best learned implicitly through extensive exposure to the use of words in context’ (Schmitt, 2008: 333). In other words, deliberate study may develop lexical breadth, but, for lexical depth, reading and listening are the way to go.

This raises the question of how many times a learner would need to encounter a word (in reading or listening) in order to learn its meaning. Learners may well be developing other aspects of word knowledge at the same time, of course, but a precondition for this is probably that the form-meaning relationship is sorted out. Laufer and Nation (2012: 167) report that ‘researchers seem to agree that with ten exposures, there is some chance of recognizing the meaning of a new word later on’. I’ve always found this figure interesting, but strangely unsatisfactory: I was never quite sure what, precisely, it was telling me. Now, with the recent publication of a meta-analysis looking at the effects of repetition on incidental vocabulary learning (Uchihara, Webb & Yanagisawa, 2019), things are becoming a little clearer.

First of all, the number ten is a ballpark figure, rather than a scientifically proven statistic. In their literature review, Uchihara et al. report that ‘the number of encounters necessary to learn words rang[es] from 6, 10, 12, to more than 20 times’. That is to say, ‘the number of encounters necessary for learning of vocabulary to occur during meaning-focussed input remains unclear’. If you ask a question to which there is a great variety of answers, there is a strong probability that there is something wrong with the question. That, it would appear, is the case here.

Unsurprisingly, there is, at least, a correlation between repeated encounters of a word and learning, described by Uchihara et al. as statistically significant (with a medium effect size). More interesting are the findings about the variables in the studies that were looked at. These included ‘learner variables’ (age and the current size of the learner’s lexicon), ‘treatment variables’ (the amount of spacing between encounters, listening versus reading, the presence or absence of visual aids, the degree to which learners ‘engage’ with the words they encounter) and ‘methodological variables’ in the design of the research (the kinds of words that are being looked at, word characteristics, the use of non-words, the test format, and whether or not learners were told that they were going to be tested).

Here is a selection of the findings:

  • Older learners tend to benefit more from repeated encounters than younger learners.
  • Learners with a smaller vocabulary size tend to benefit more from repeated encounters with L2 words, but this correlation was not statistically significant. ‘Beyond a certain point in vocabulary growth, learners may be able to acquire L2 words in fewer encounters and need not receive as many encounters as learners with smaller vocabulary size’.
  • Learners made greater gains when the repeated exposure took place under massed conditions (e.g. on the same day), rather than under ‘spaced conditions’ (spread out over a longer period of time).
  • Repeated exposure during reading and, to a slightly lesser extent, listening resulted in more gains than reading while listening and viewing.
  • ‘Learners presented with visual information during meaning-focused tasks benefited less from repeated encounters than those who had no access to the information’. This does not mean that visual support is counter-productive: only that the positive effect of repeated encounters is not enhanced by visual support.
  • ‘A significantly larger effect was found for treatments involving no engagement compared to treatment involving engagement’. Again, this does not mean that ‘no engagement’ is better than ‘engagement’: only that the positive effect of repeated encounters is not enhanced by ‘engagement’.
  • ‘The frequency-learning correlation does not seem to increase beyond a range of around 20 encounters with a word’.
  • Experiments using non-words may exaggerate the effect of frequent encounters (i.e. in the real world, with real words, the learning potential of repeated encounters may be less than indicated by some research).
  • Forewarning learners of an upcoming comprehension test had a positive impact on gains in vocabulary learning. Again, this does not mean that teachers should systematically test their students’ comprehension of what they have read.

For me, the most interesting finding was that ‘about 11% of the variance in word learning through meaning-focused input was explained by frequency of encounters’. This means, quite simply, that a wide range of other factors, beyond repeated encounters, will determine the likelihood of learners acquiring vocabulary items from extensive reading and listening. The frequency of word encounters is just one factor among many.
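A quick back-of-the-envelope check (my arithmetic, not the authors’): variance explained is the square of the correlation coefficient, so 11% of variance implies a correlation of about .33, which tallies with the ‘medium effect size’ mentioned above.

```python
import math

r_squared = 0.11          # variance in learning explained by frequency of encounters
r = math.sqrt(r_squared)  # recover the underlying correlation
print(round(r, 2))        # 0.33 -- a 'medium' effect by Cohen's conventions
```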

I’m still not sure what the takeaways from this meta-analysis should be, besides the fact that it’s all rather complex. The research does not, in any way, undermine the importance of massive exposure to meaning-focussed input in learning a language. But I will be much more circumspect in my teacher training work about making specific claims concerning the number of times that words need to be encountered before they are ‘learnt’. And I will be even more sceptical about claims for the effectiveness of certain online language learning programs which use algorithms to ensure that words reappear a certain number of times in written, audio and video texts that are presented to learners.
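To make that last point concrete, the recycling mechanism in such programs presumably looks something like the sketch below. Everything in it is invented for illustration (the actual programs are proprietary); the point is simply that a hard-coded encounter target, of exactly the kind the meta-analysis problematises, ends up driving text selection.

```python
from collections import Counter

TARGET_ENCOUNTERS = 10  # the kind of hard-coded figure the research does not really license

def exposures_needed(text, counts, target_words):
    """How many under-exposed target words this text would recycle."""
    return sum(
        1
        for w in text.lower().split()
        if w in target_words and counts[w] < TARGET_ENCOUNTERS
    )

def choose_next_text(texts, counts, target_words):
    """Prefer the text that recycles the most under-exposed target words."""
    return max(texts, key=lambda t: exposures_needed(t, counts, target_words))

def record_exposures(text, counts, target_words):
    """Update the running tally of encounters after a text is read."""
    for w in text.lower().split():
        if w in target_words:
            counts[w] += 1

# e.g. counts = Counter(); targets = {"scrutiny", "plummet"}
#      next_text = choose_next_text(text_library, counts, targets)
```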

References

Barcroft, J. 2015. Lexical Input Processing and Vocabulary Learning. Amsterdam: John Benjamins

Krashen, S. 2008. The comprehension hypothesis extended. In Piske, T. & Young-Scholten, M. (Eds.), Input Matters in SLA (pp. 81-94). Bristol: Multilingual Matters

Laufer, B. & Nation, I.S.P. 2012. Vocabulary. In Gass, S.M. & Mackey, A. (Eds.), The Routledge Handbook of Second Language Acquisition (pp. 163-176). Abingdon: Routledge

Nation, I.S.P. 2013. Learning Vocabulary in Another Language (2nd edition). Cambridge: Cambridge University Press

Schmitt, N. 2008. Review article: instructed second language vocabulary learning. Language Teaching Research 12 (3): 329-363

Uchihara, T., Webb, S. & Yanagisawa, A. 2019. The effects of repetition on incidental vocabulary learning: a meta-analysis of correlational studies. Language Learning 69 (3): 559-599. Available online: https://www.researchgate.net/publication/330774796_The_Effects_of_Repetition_on_Incidental_Vocabulary_Learning_A_Meta-Analysis_of_Correlational_Studies