Posts Tagged ‘vocabulary’

Knowble, claims its developers, is a browser extension that will improve English vocabulary and reading comprehension. It also describes itself as an 'adaptive language learning solution for publishers'. It's currently in beta and free, and it sounds right up my street, so I decided to give it a run.

[Image: Knowble reader]

Users are asked to specify a first language (I chose French) and a level (A1 to C2): I chose B1, but this did not seem to affect anything that subsequently happened. They are then offered a menu of about 30 up-to-date news items, grouped into 5 categories (world, science, business, sport, entertainment). Clicking on one takes you to the article on the source website. There's a good selection, including USA Today, CNN, Reuters, the Independent and the Torygraph from Britain, the Times of India, the Independent from Ireland and the Star from Canada. A large number of words are underlined: a single click brings up a translation in the extension box. Double-clicking on all other words will also bring up translations. Apart from that, there is one very short exercise (which has presumably been automatically generated) for each article.

For my trial run, I picked three articles: ‘Woman asks firefighters to help ‘stoned’ raccoon’ (from the BBC, 240 words), ‘Plastic straw and cotton bud ban proposed’ (also from the BBC, 823 words) and ‘London’s first housing market slump since 2009 weighs on UK price growth’ (from the Torygraph, 471 words).

Translations

Research suggests that the use of translations, rather than definitions, may lead to greater learning gains, but the problem with Knowble is that it relies entirely on Google Translate. Google Translate is fast improving. Take, for example, its translation of the first sentence of the 'plastic straw and cotton bud' article. It's not bad, but it gets the word 'bid' completely wrong, translating it as 'offre' (= offer) where 'tentative' (= attempt) is needed. So, we can still expect a few problems with Google Translate …

One of the reasons that Google Translate has improved is that it no longer treats individual words as individual lexical items. It analyses groups of words and translates chunks or phrases (see, for example, the way it translates 'as part of'). It doesn't do word-for-word translation. Knowble, however, have set their software to ask Google for translations of each word as an individual item, so the phrase 'as part of' is translated 'comme' + 'partie' + 'de'. Whilst this example is comprehensible, problems arise very quickly. 'Cotton buds' ('cotons-tiges') become 'coton' + 'bourgeon' (= botanical shoots of cotton). Phrases like 'in time', 'run into', 'sleep it off', 'take its course', 'fire station' or 'going on' (all from the stoned raccoon text) all cause problems. In addition, Knowble are not using any parsing tools, so the system does not identify parts of speech, and further translation errors inevitably appear. In the short article of 240 words, about 10% are wrongly translated. Knowble claim to be using NLP tools, but there's no sign of it here. They're just using Google Translate rather badly.
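To make the difference concrete, here's a minimal sketch of the two calling strategies. The `translate` function is a stand-in for whatever MT API is being used (this is not Knowble's actual code), and the example sentence is invented for illustration:

```python
# Contrast between per-word translation (what Knowble appears to do) and
# chunk-level translation. `translate` is a placeholder for a real MT
# client (e.g. the Google Cloud Translation API); nothing here is
# Knowble's actual code.

def translate(text: str, source: str = "en", target: str = "fr") -> str:
    """Stand-in for a machine translation API call."""
    raise NotImplementedError("wire up a real MT client here")

def translate_word_by_word(sentence: str) -> str:
    """Strategy 1: one API call per word. Multi-word units are destroyed
    before the translator ever sees them: 'cotton buds' arrives as
    'cotton' + 'buds', and 'as part of' as three unrelated items."""
    return " ".join(translate(w) for w in sentence.rstrip(".").split())

def translate_as_chunk(sentence: str) -> str:
    """Strategy 2: one call for the whole sentence, letting the MT system
    translate chunks in context ('cotton buds' -> 'cotons-tiges')."""
    return translate(sentence)
```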

Highlighted items

NLP tools of some kind are presumably being used to select the words that get underlined. Exactly how this works is unclear. On the whole, it seems that very high frequency words are ignored and that lower frequency words are underlined. Here, for example, is the list of words that were underlined in the stoned raccoon text. I've compared them with (1) the CEFR levels for these words in the English Profile Text Inspector, and (2) the frequency information from the Macmillan dictionary (more stars = more frequent). In the other articles, some extremely high frequency words were underlined (e.g. price, cost, year) while much lower frequency items were not.
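Knowble doesn't document its selection method, so the following is no more than a guess at the kind of logic involved: a bare frequency cut-off, with toy data standing in for a real frequency list. A selector like this would produce exactly the inconsistencies described above, because it knows nothing about level, L1 or multi-word units:

```python
# Hypothetical reconstruction of a frequency-threshold word selector.
# The frequency ranks and the cut-off of 2,000 are illustrative guesses.

FREQ_RANK = {"the": 1, "year": 356, "price": 512, "raccoon": 18502}  # toy data

def words_to_underline(text: str, cutoff: int = 2000) -> set[str]:
    """Underline any word ranked below the cut-off, or not in the list
    at all. Note what this ignores: learner level, L1, cognates, and
    whether the word is part of a multi-word unit."""
    tokens = {w.strip(".,!?'\"").lower() for w in text.split()}
    return {w for w in tokens if FREQ_RANK.get(w, float("inf")) > cutoff}
```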

It is, of course, extremely difficult to predict which items of vocabulary a learner will know, even if we have a fairly accurate idea of their level. Personal interests play a significant part, so, for example, some people, even at a low level, will have no problem with 'cannabis', 'stoned' and 'high', even though these are low frequency. First language, however, is a reasonably reliable indicator, as cognates can be expected to be easy. A French speaker will have no problem with 'appreciate', 'unique' and 'symptom'. A recommendation engine that can meaningfully personalize vocabulary suggestions will, at the very least, need to consider cognates.
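A cognate screen doesn't require heavy machinery. Here's a crude sketch using nothing but surface similarity from the Python standard library; a serious system would also need regular sound-spelling correspondences and a blocklist of false friends, but even this would be a start:

```python
from difflib import SequenceMatcher

def is_probable_cognate(en_word: str, l1_word: str,
                        threshold: float = 0.8) -> bool:
    """Crude cognate test: surface similarity between an English word
    and its L1 translation. False friends (e.g. 'actually' / French
    'actuellement') would need a separate blocklist."""
    ratio = SequenceMatcher(None, en_word.lower(), l1_word.lower()).ratio()
    return ratio >= threshold

# For a French speaker, these score high and could be deprioritized:
assert is_probable_cognate("unique", "unique")
assert is_probable_cognate("symptom", "symptôme")
```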

In short, the selection and underlining of vocabulary items, as it currently stands in Knowble, appears to serve no clear or useful function.

Vocabulary learning

Knowble offers a very short exercise for each article. These are of three types: word completion, dictation and drag-and-drop (see the example). The rationale for the selection of the target items is unclear but, in any case, these exercises are tokenistic in the extreme and are unlikely to lead to any significant learning gains. More valuable would be the possibility of exporting items into a spaced repetition flashcard system.

The claim that Knowble's 'learning effect is proven scientifically' seems to me to be without any foundation. If there has been any proper research, it's not signposted anywhere. Sure, reading lots of news articles (with a look-up function – if it works reliably) can only be beneficial for language learners, but they can do that with any decent dictionary running in the background.

Similar in many ways to en.news, which I reviewed in my last post, Knowble is another example of a technology-driven product that shows little understanding of language learning.


Last month, I wrote a post about the automated generation of vocabulary learning materials. Yesterday, I got an email from Mike Elchik, inviting me to take a look at the product that his company, WeSpeke, has developed in partnership with CNN. Called en.news, it offers a regularly updated and wide selection of video clips and texts from CNN, which are used to 'automatically create a pedagogically structured, leveled and game-ified English lesson'. Available on the App Store and Google Play, as well as in a desktop version, it's free. Revenues will presumably be generated through advertising and later sales to corporate clients.

With 6.2 million dollars in funding so far, WeSpeke can leverage some state-of-the-art NLP and AI tools. Co-founder and chief technical adviser of the company is Jaime Carbonell, Director of the Language Technologies Institute at Carnegie Mellon University, described in Wikipedia as one of the gurus of machine learning. I decided to have a closer look.

[Image: en.news home page]

Users are presented with a menu of CNN content (there were 38 items from yesterday alone). These are tagged with broad categories (Politics, Opinions, Money, Technology, Entertainment, etc.) and given a level from 1 to 5, although the vast majority of the material is at the two highest levels.

[Image: en.news lesson menu]

I picked two lessons: a reading text about Mark Zuckerberg's Congressional hearing (level 5) and a 9-minute news programme of mixed items (level 2 – illustrated above). In both cases, the lesson begins with the text. With the reading, you can click on words to bring up dictionary entries from the Collins dictionary. With the video, you can activate captions and again click on words for definitions. You can also slow down the speed. So far, so good.

There then follows a series of exercises which focus primarily on a set of words that have been automatically selected. This is where the problems began.

Level

It's far from clear what the levels (1–5) refer to. The Zuckerberg text is 930 words long and is rated B2 by one readability tool. But, according to the English Profile Text Inspector, it contains 19 word types at C1 level, 14 at C2, and 98 that are unlisted. That suggests something substantially higher than B2. The CNN10 video is delivered at breakneck speed (as is often the case with US news shows). Yes, it can be slowed down, but that still won't help with some passages, such as the one below:

A squirrel recently fell out of a tree in Western New York. Why would that make news? Because she bwoke her widdle leg and needed a widdle cast! Yes, there are casts for squirrels, as you can see in this video from the Orphaned Wildlife Center. A windstorm knocked the animal's nest out of a tree, and when a woman saw that the baby squirrel was injured, she took her to a local vet. Doctors say she's going to be just fine in a couple of weeks. Well, why 'rodent' she be? She's been 'whiskered' away and cast in both a video and a plaster. And as long as she doesn't get too 'squirrelly' before she heals, she'll have quite a 'tail' to tell.

It’s hard to understand how a text like this got through the algorithms. But, as materials writers know, it is extremely hard to find authentic text that lends itself to language learning at anything below C1. On the evidence here, there is still some way to go before the process of selection can be automated. It may well be the case that CNN simply isn’t a particularly appropriate source.
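Incidentally, the type-counting that Text Inspector performs is straightforward to approximate once you have a CEFR-tagged lexicon. A minimal sketch (the four-word lexicon is a stand-in; real resources like the English Vocabulary Profile cover tens of thousands of items):

```python
from collections import Counter

# Stand-in for a CEFR-tagged lexicon; the real resource is far larger.
CEFR_LEVEL = {"ask": "A1", "hearing": "B1", "testify": "C1",
              "deliberative": "C2"}

def cefr_profile(text: str) -> Counter:
    """Count word types (not tokens) per CEFR band; anything not in the
    lexicon is bucketed as 'unlisted'."""
    types = {w.strip(".,!?'\"").lower() for w in text.split()}
    return Counter(CEFR_LEVEL.get(t, "unlisted") for t in types)

# A level rating should arguably come from the top of this distribution
# (the C1 / C2 / unlisted counts), not from a readability formula alone.
```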

Target learning items

The primary focus of these lessons is vocabulary learning, and it’s vocabulary learning of a very deliberate kind. Applied linguists are in general agreement that it makes sense for learners to approach the building of their L2 lexicon in a deliberate way (i.e. by studying individual words) for high-frequency items or items that can be identified as having a high surrender value (e.g. items from the AWL for students studying in an EMI context). Once you get to items that are less frequent than, say, the top 8,000 most frequent words, the effort expended in studying new words needs to be offset against their usefulness. Why spend a lot of time studying low frequency words when you’re unlikely to come across them again for some time … and will probably forget them before you do? Vocabulary development at higher levels is better served by extensive reading (and listening), possibly accompanied by glosses.

The target items in the Zuckerberg text were: advocacy, grilled, handicapping, sparked, diagnose, testified, hefty, imminent, deliberative and hesitant. One of these, 'grilled', is listed as A2 by the English Vocabulary Profile, but that is with its literal, not its metaphorical, meaning. Four of them are listed as C2 and the remaining five are off-list. In the CNN10 video, the target items were: strive, humble (verb), amplify, trafficked, enslaved, enacted, algae, trafficking, ink and squirrels. Of these, one is B1, two are C2 and the rest are unlisted. What is the point of studying these essentially random words? Why spend time going through a series of exercises that practise these items? Wouldn't your time be better spent just doing some more reading? I have no idea how the automated selection of these items takes place, but it's clear that it's not working very well.

Practice exercises

There is plenty of variety of task-type, but there are, I think, two reasons to query the claim that these lessons are 'pedagogically structured'. The first is the nature of the practice exercises; the second is the sequencing of the exercises. I'll restrict my observations to a selection of the tasks.

1. Users are presented with a dictionary definition and an anagrammed target item which they must unscramble. For example:

existing for the purpose of discussing or planning something     VLREDBETEIIA

If you can't solve the problem, you can always scroll through the text to find the answer. But the problem is in the task design. Dictionary definitions have been written to help language users decode a word. They simply don't work very well when they are used for another purpose (as prompts for encoding).
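Generating such an item is trivial, which may explain the task's popularity with developers; here is a sketch of the kind of scrambler involved (hypothetical, not en.news's actual code):

```python
import random

def anagram(word: str) -> str:
    """Scramble a target item for the unscrambling task, re-shuffling
    until the result differs from the original."""
    if len(set(word)) < 2:          # nothing to scramble
        return word.upper()
    letters = list(word.upper())
    while True:
        random.shuffle(letters)
        scrambled = "".join(letters)
        if scrambled != word.upper():
            return scrambled

print(anagram("deliberative"))  # e.g. 'VLREDBETEIIA'
```

The ease of generation is, of course, no argument for the task's pedagogical value.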

2. Users are presented with a dictionary definition for which they must choose one of four words. There are many potential problems here, not the least of which is that definitions are often more complex than the word they are defining, or they present other challenges. As an example: 'cause to be unpretentious' as the prompt for 'to humble'. On top of that, lexicographers often need or choose to embed the target item in the definition. For example:

a hefty amount of something, especially money, is very large

an event that is imminent, especially an unpleasant one, will happen very soon

When this is the case, it makes no sense to present these definitions and ask learners to find the target item from a list of four.

The two key pieces of content in this product – the CNN texts and the Collins dictionaries – are both less than ideal for their purposes.
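A pre-filter that rejects definitions embedding the target item would at least be cheap to build. A sketch, with a deliberately crude stem comparison standing in for proper lemmatization:

```python
def definition_usable(definition: str, target: str) -> bool:
    """Reject a definition as a multiple-choice prompt if it contains
    the target item or an obvious inflection of it. The stem check is
    crude; a real system would lemmatize both sides."""
    stem = target.lower()[:max(4, len(target) - 2)]
    return all(not token.strip(".,;").lower().startswith(stem)
               for token in definition.split())

assert not definition_usable(
    "a hefty amount of something, especially money, is very large", "hefty")
assert definition_usable(
    "existing for the purpose of discussing or planning something",
    "deliberative")
```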

3. Users are presented with a box of jumbled words which they must unscramble to form sentences that appeared in the text.

[Image: 'Rearrange words to make sentences' task]

The sentences are usually long and hard to reconstruct. You can scroll through the text to find the answer, but I'm unclear what the point of this would be. The example above contains a mistake ('vie' instead of 'vice'), but this was one of only two glitches I encountered.

4. Users are asked to select the word that they hear on an audio recording. For example:

squirreling     squirrel     squirreled     squirrels

Given the high level of challenge of both the text and the target items, this was a rather strange exercise to kick off the practice. The meaning has not yet been presented (in a matching / definition task), so what exactly is the point of this exercise?

5. Users are presented with gapped sentences from the text and asked to choose the correct grammatical form of the missing word. Some of these were hard (e.g. adjective order), others were very easy (e.g. some vs any). The example below struck me as plain weird for a lesson at this level.

________ have zero expectation that this Congress is going to make adequate changes. (I or Me ?)

6. At the end of both lessons, there were a small number of questions that tested your memory of the text. If, like me, you can't remember all that much about the text after twenty minutes of vocabulary activities, you can scroll through it to find the answers. This is not a task type that will develop reading skills: I am unclear what it could possibly develop.

Overall?

Using the lessons on offer here wouldn't do a learner (as long as they already had a high level of proficiency) any harm, but it wouldn't be the most productive use of their time, either. If a learner is motivated to read the text about Zuckerberg, rather than doing lots of 'busy' work on a very odd set of words with gap-fills and matching tasks, they'd be better advised just to read the text again once or twice. They could use a look-up for words they want to understand and import them into a flashcard system with spaced repetition (en.news does have flashcards, but there's no sign of spaced practice yet). Better still, they could check out another news website and read / watch other articles on the same subject (perhaps choosing websites with a different slant from CNN) and get valuable narrow-reading practice in this way.

My guess is that the technology has driven the product here, but without answering the fundamental questions about which words it’s appropriate for individual learners to study in a deliberate way and how this is best tackled, it doesn’t take learners very far.

A personalized language learning programme that is worthy of the name needs to offer a wide variety of paths to accommodate the varying interests, priorities, levels and preferred approaches to learning of its users. For this to be possible, a huge quantity of learning material is needed (Iwata et al., 2011: 1): the preparation and curation of this material is extremely time-consuming and expensive (despite the pittance that is paid to writers and editors). It's not surprising, then, that a growing amount of research is being devoted to exploring ways of automatically generating language learning material. One area that has attracted a lot of attention is the learning of vocabulary.

[Image: Memrise screenshot]

Many vocabulary learning tasks are relatively simple to generate automatically. These include matching tasks of various kinds, such as the matching of words or phrases to meanings (either in English or the L1), to pictures or to collocations, as in many flashcard apps. Doing this well is rather harder: the definitions or translations have to be good and appropriate for learners at the level in question, and the pictures need to be appropriate too. If, as is often the case, the lexical items have come from a text or form part of a group of some kind, sense disambiguation software will be needed to ensure that the right meaning is being practised. Anyone who has used flashcard apps knows that the major problem is usually the quality of the content (whether it has been automatically generated or written by someone).

A further challenge is the generation of distractors. In the example here (from Memrise), the distractors have been so badly generated as to render the task more or less a complete waste of time. Distractors must, in some way, be viable alternatives (Smith et al., 2010) but still clearly wrong. That means they should normally be the same part of speech, and true cognates should be avoided. Research into the automatic generation of distractors is well advanced (see, for instance, Kumar et al., 2015), with Smith et al. (2010), for example, using a very large corpus and various functions of Sketch Engine (the best-known corpus query tool) to find collocates and other distractors. Their TEDDCLOG (Testing English with Data-Driven CLOze Generation) system produced distractors that were deemed acceptable 91% of the time. Whilst impressive, there is still a long way to go before human editing / rewriting is no longer needed.
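These constraints translate directly into a filter pipeline. A minimal sketch follows; the candidate pool of (word, part-of-speech) pairs is assumed to come from a suitable frequency band of a corpus, and this is not TEDDCLOG's actual code:

```python
import random
from difflib import SequenceMatcher

def pick_distractors(target: str, pos: str,
                     candidates: list[tuple[str, str]],
                     l1_translation: str | None = None,
                     n: int = 3) -> list[str]:
    """Keep candidates that share the target's part of speech, are not
    the target itself, and are not near-cognate with the learner's L1
    translation (a crude screen for 'true cognates to be avoided')."""
    def too_cognate(word: str) -> bool:
        if l1_translation is None:
            return False
        ratio = SequenceMatcher(None, word.lower(),
                                l1_translation.lower()).ratio()
        return ratio >= 0.8

    pool = [w for w, p in candidates
            if p == pos and w != target and not too_cognate(w)]
    return random.sample(pool, min(n, len(pool)))
```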

Another area that has attracted attention is, of course, testing, including tasks such as those in TOEFL (see image). Susanti et al. (2015, 2017) were able, given a target word, to automatically generate a reading passage from web sources along with questions of the TOEFL kind. However, only about half of these were considered good enough to be used in actual tests. Again, that is some way off avoiding human intervention altogether, but the automatically generated texts and questions can greatly facilitate the work of human item writers.

[Image: TOEFL task]

Other tools that might be useful include the University of Nottingham AWL (Academic Word List) Gapmaker. This allows users to type or paste in a text, from which items from the AWL are extracted and replaced with gaps. See the example below. It would, presumably, not be too difficult to combine this approach with automatic distractor generation to create multiple-choice tasks, as sketched after the image.

[Image: Nottingham AWL Gapmaker]
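The core of a gap-maker of this kind fits in a dozen lines, given the AWL as a word set. A sketch (the excerpt below is tiny, the real AWL has 570 word families, and a real implementation would match inflected family members, not just surface forms):

```python
import re

AWL = {"analyse", "concept", "data", "environment", "establish",
       "research", "significant"}  # tiny excerpt of the 570-family list

def awl_cloze(text: str) -> tuple[str, list[str]]:
    """Replace every AWL item in the text with a numbered gap and
    return the gapped text together with the answer key."""
    answers: list[str] = []

    def gap(match: re.Match) -> str:
        word = match.group(0)
        if word.lower() in AWL:
            answers.append(word)
            return f"({len(answers)}) ________"
        return word

    return re.sub(r"[A-Za-z]+", gap, text), answers

gapped, key = awl_cloze("The research data were significant.")
# gapped: 'The (1) ________ (2) ________ were (3) ________.'
# key:    ['research', 'data', 'significant']
```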

There are a number of applications that offer the possibility of generating cloze tasks from texts selected by the user (learner or teacher). These have not always been designed with the language learner in mind, but one that was is the Android app WordGap (Knoop & Wilske, 2013), described by its developers as a tool that 'provides highly individualized exercises to support contextualized mobile vocabulary learning …. It matches the interests of the learner and increases the motivation to learn'. It may well do all that, but then again, perhaps not. As Knoop & Wilske acknowledge, it is only appropriate for adult, advanced learners, and its value as a learning task is questionable. The target item that has been automatically selected is 'novel', a word that features in the Oxford 2000 Keywords list (as do all three distractors), and therefore ought to be well below the level of the users. Some people might find this fun, but, in terms of learning, they would probably be better off using an app that made instant look-up of words in the text possible.

More interesting, in my view, is TEDDCLOG (Smith et al., 2010), a system that, given a target learning item (the focus here is on collocations), trawls a large corpus to find the best sentence to illustrate it. 'Good sentences' were defined as those which are short (but not too short, or there is not enough useful context), begin with a capital letter and end with a full stop, have a maximum of two commas, and otherwise contain only the 26 lowercase letters. A sentence must also be at a lexical and grammatical level that an intermediate learner of English could be expected to understand, and it must be well-formed and without too much superfluous material. All others were rejected. TEDDCLOG uses Sketch Engine's GDEX function (Good Dictionary Example Extractor, Kilgarriff et al., 2008) to do this.
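The surface criteria listed above translate almost line for line into code; it's the 'lexical and grammatical level' check that needs real resources (a graded wordlist, a parser). A sketch of the surface checks only, in the spirit of TEDDCLOG rather than a reimplementation of it:

```python
import re

def is_good_carrier_sentence(s: str, min_words: int = 6,
                             max_words: int = 20) -> bool:
    """Surface checks: sensible length, initial capital, final full
    stop, at most two commas, and otherwise only lowercase letters and
    spaces. The 'intermediate lexical and grammatical level' test is
    out of scope here."""
    words = s.split()
    if not (min_words <= len(words) <= max_words):
        return False
    if not (s[0:1].isupper() and s.endswith(".")):
        return False
    if s.count(",") > 2:
        return False
    body = s[1:-1]  # between the initial capital and the full stop
    return re.fullmatch(r"[a-z ,]*", body) is not None

assert is_good_carrier_sentence(
    "The committee reached a unanimous decision after a long debate.")
assert not is_good_carrier_sentence("Dr. Smith et al. disagree (2010).")
```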

My own interest in this area came about as a result of my work in the development of the Oxford Vocabulary Trainer. The app offers the possibility of studying both pre-determined lexical items (e.g. the vocabulary list of a coursebook that the learner is using) and freely chosen ones (any item can be activated and sent to a learning queue). In both cases, practice takes the form of sentences with the target item gapped. There is a range of hints and help options available to the learner, and feedback is both automatic and formative (i.e. if the supplied answer is not correct, hints are given to push the learner to do better on a second attempt). Leveraging some fairly heavy technology, we were able to achieve a fair amount of success in the automation of intelligent feedback, but what had, at first sight, seemed a lesser challenge – the generation of suitable 'carrier sentences' – proved more difficult.

The sentences which ‘carry’ the gap should, ideally, be authentic: invented examples often ‘do not replicate the phraseology and collocational preferences of naturally-occurring text’ (Smith et al., 2010). The technology of corpus search tools should allow us to do a better job than human item writers. For that to be the case, we need not only good search tools but a good corpus … and some are better than others for the purposes of language learning. As Fenogenova & Kuzmenko (2016) discovered when using different corpora to automatically generate multiple choice vocabulary exercises, the British Academic Written English corpus (BAWE) was almost 50% more useful than the British National Corpus (BNC). In the development of the Oxford Vocabulary Trainer, we thought we had the best corpus we could get our hands on – the tagged corpus used for the production of the Oxford suite of dictionaries. We could, in addition and when necessary, turn to other corpora, including the BAWE and the BNC. Our requirements for acceptable carrier sentences were similar to those of Smith et al (2010), but were considerably more stringent.

To cut quite a long story short, we learnt fairly quickly that we simply couldn’t automate the generation of carrier sentences with sufficient consistency or reliability. As with some of the other examples discussed in this post, we were able to use the technology to help the writers in their work. We also learnt (rather belatedly, it has to be admitted) that we were trying to find technological solutions to problems that we hadn’t adequately analysed at the start. We hadn’t, for example, given sufficient thought to learner differences, especially the role of L1 (and other languages) in learning English. We hadn’t thought enough about the ‘messiness’ of either language or language learning. It’s possible, given enough resources, that we could have found ways of improving the algorithms, of leveraging other tools, or of deploying additional databases (especially learner corpora) in our quest for a personalised vocabulary learning system. But, in the end, it became clear to me that we were only nibbling at the problem of vocabulary learning. Deliberate learning of vocabulary may be an important part of acquiring a language, but it remains only a relatively small part. Technology may be able to help us in a variety of ways (and much more so in testing than learning), but the dreams of the data scientists (who wrote much of the research cited here) are likely to be short-lived. Experienced writers and editors of learning materials will be needed for the foreseeable future. And truly personalized vocabulary learning, fully supported by technology, will not be happening any time soon.

References

Fenogenova, A. & Kuzmenko, E. 2016. 'Automatic Generation of Lexical Exercises'. Available online at http://www.dialog-21.ru/media/3477/fenogenova.pdf

Iwata, T., Goto, T., Kojiri, T., Watanabe, T. & T. Yamada. 2011. ‘Automatic Generation of English Cloze Questions Based on Machine Learning’. NTT Technical Review Vol. 9 No. 10 Oct. 2011

Kilgarriff, A. et al. 2008. 'GDEX: Automatically Finding Good Dictionary Examples in a Corpus.' In E. Bernal and J. DeCesaris (eds.), Proceedings of the XIII EURALEX International Congress: Barcelona, 15-19 July 2008. Barcelona: l'Institut Universitari de Lingüística Aplicada (IULA) de la Universitat Pompeu Fabra, 425–432.

Knoop, S. & Wilske, S. 2013. ‘WordGap – Automatic generation of gap-filling vocabulary exercises for mobile learning’. Proceedings of the second workshop on NLP for computer-assisted language learning at NODALIDA 2013. NEALT Proceedings Series 17 / Linköping Electronic Conference Proceedings 86: 39–47. Available online at http://www.ep.liu.se/ecp/086/004/ecp13086004.pdf

Kumar, G., Banchs, R.E. & D’Haro, L.F. 2015. ‘RevUP: Automatic Gap-Fill Question Generation from Educational Texts’. Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, 2015, pp. 154–161, Denver, Colorado, June 4, Association for Computational Linguistics

Smith, S., Avinesh, P.V.S. & Kilgarriff, A. 2010. 'Gap-fill tests for Language Learners: Corpus-Driven Item Generation'. Proceedings of ICON-2010: 8th International Conference on Natural Language Processing, Macmillan Publishers, India. Available online at https://curve.coventry.ac.uk/open/file/2b755b39-a0fa-4171-b5ae-5d39568874e5/1/smithcomb2.pdf

Susanti, Y., Iida, R. & Tokunaga, T. 2015. 'Automatic Generation of English Vocabulary Tests'. Proceedings of 7th International Conference on Computer Supported Education. Available online at https://pdfs.semanticscholar.org/aead/415c1e07803756902b859e8b6e47ce312d96.pdf

Susanti, Y., Tokunaga, T., Nishikawa, H. & H. Obari 2017. 'Evaluation of automatically generated English vocabulary questions'. Research and Practice in Technology Enhanced Learning 12 / 11

More and more language learning is taking place, fully or partially, on online platforms, and the affordances of these platforms for communicative interaction are exciting. Unfortunately, most platform-based language learning experiences are a relentless diet of drag-and-drop, drag-till-you-drop grammar or vocabulary gap-filling. The chat rooms and discussion forums that the platforms incorporate are underused or ignored. Lindsay Clandfield and Jill Hadfield's new book is intended to promote online interaction between and among learners and the instructor, rather than between learners and software.

Interaction Online is a recipe book, containing about 80 different activities (many more if you count the suggested variations). Subtitled 'Creative activities for blended learning', the book's activities have been selected and designed so that any teacher using any degree of blend (from platform-based instruction to occasional online homework) will be able to use them. The activities do not depend on any particular piece of software, as they are all designed for basic tools like Facebook, Skype and chat rooms. Indeed, almost every single activity could be used, sometimes with slight modification, by teachers in face-to-face settings.

A recipe book must be judged on the quality of the activities it contains, and the standard here is high. They range from relatively simple, short activities to much longer tasks which will need an hour or more to complete. An example of the former is a sentence-completion activity (‘Don’t you hate / love it when ….?’ – activity 2.5). As an example of the latter, there is a complex problem-solving information-gap where students have to work out the solution to a mystery (activity 6.13), an activity which reminds me of some of the material in Jill Hadfield’s much-loved Communication Games books.

In common with many recipe books, Interaction Online is not an easy book to use, in the sense that it is hard to navigate. The authors have divided up the tasks into five kinds of interaction (personal, factual, creative, critical and fanciful), but it is not always clear precisely why one activity has been assigned to one category rather than another. In any case, the kind of interaction is likely to be less important to many teachers than the kind and amount of language that will be generated (among other considerations), and the table of contents is less than helpful. The index at the back of the book helps to some extent, but a clearer tabulation of activities by interaction type, level, time required, topic and language focus (if any) would be very welcome. Teachers will need to devise their own system of referencing so that they can easily find activities they want to try out.

Again, like many recipe books, Interaction Online is a mix of generic task-types and activities that will only work with the supporting materials that are provided. Teachers will enjoy the latter, but will want to experiment with the former, and it is these generic task-types that they are most likely to add to their repertoire. In activity 2.7 ('Foodies' – personal interaction), for example, students post pictures of items of food and drink, to which other students must respond with questions. The procedure is clear and effective, but, as the authors note, the pictures could be of practically anything. 'From pictures to questions' might be a better title for the activity than 'Foodies'. Similarly, activity 3.4 ('Find a festival' – factual interaction) uses a topic ('festivals'), rather than a picture, to generate questions and responses. The procedure is slightly different from activity 2.7, but the interactional procedures of the two activities could be swapped around as easily as the topics could be changed.

Perhaps the greatest strength of this book is the variety of interactional procedures that is suggested. The majority of activities contain (1) suggestions for a stimulus, (2) suggestions for managing initial responses to this stimulus, and (3) suggestions for further interaction. As readers work their way through the book, they will be struck by similarities between the activities. The final chapter (chapter 8: ‘Task design’) provides an excellent summary of the possibilities of communicative online interaction, and more experienced teachers may want to read this chapter first.

Chapter 7 provides a useful, but necessarily fairly brief, overview of considerations regarding feedback and assessment.

Overall, Interaction Online is a very rich resource, and one that will be best mined in multiple visits. For most readers, I would suggest an initial flick through and a cherry-picking of a small number of activities to try out. For materials writers and course designers, a better starting point may be the final two chapters, followed by a sampling of activities. For everyone, though, Interaction Online is a powerful reminder that technology-assisted language learning could and should be far more than what it usually is.

(This review first appeared in the International House Journal of Education and Development.)

In the last post, I suggested a range of activities that could be used in class to 'activate' a set of vocabulary before doing more communicative revision / recycling practice. In this one, I'll suggest a variety of more communicative tasks. As before, the activities require zero or minimal preparation on the part of the teacher.
1 Simple word associations
Write on the board a large selection of words that you want to recycle. Choose one word (at random) and ask the class if they can find another word on the board that they can associate with it. Ask one volunteer to (1) say what the other word is and (2) explain the association they have found between the two words. Then, draw a line through the first word and ask students if they can now choose a third word that they can associate with the second. Again, the nominated volunteer must explain the connection between the two words. Then, draw a line through the second word and ask for a connection between the third and fourth words. After three examples like this, it should be clear to the class what they need to do. Put the students into pairs or small groups and tell them to continue until there are no more words left, or it becomes too difficult to find connections / associations between the words that are left. This activity can be done simply in pairs or it can be turned into a class / group game.
As a follow-up, you might like to rearrange the pairs or groups and get students to see how many of their connections they can remember. As they are listening to the ideas of other students, ask them to decide which of the associations they found the most memorable / entertaining / interesting.
2 Association circles (variation of activity #1)
Ask students to look through their word list or flip through their flashcard set and make a list of the items that they are finding hardest to remember. They should do this with a partner and, together, should come up with a list of twelve or more words. Tell them to write these words in a circle on a sheet of paper.
Tell the students to choose, at random, one word in their circle. Next, they must find another word in the circle which they can associate in some way with the first word that they chose. They must explain this association to their partner. They must then find another word which they can associate with their second word. Again they must explain the association. They should continue in this way until they have connected all the words in their circle. Once students have completed the task with their partner, they should change partners and exchange ideas. All of this can be done orally.
3 Multiple associations
Using the same kind of circle of words, students again work with a partner. Starting with any word, they must find and explain an association with another word. Next, beginning with the word they first chose, they must find and explain an association with another word from the circle. They continue in this way until they have found connections between their first word and all the other words in the circle. Once students have completed the task with their partner, they should change partners and exchange ideas. All of this can be done orally.
4 Association dice
Prepare two lists (six words in each) of words that you want to recycle. Write these two lists on the board (list A and list B) with each word numbered 1–6. Each group in the class will need a dice.
First, demonstrate the activity with the whole class. Draw everyone’s attention to the two lists of the words on the board. Then roll a dice twice. Tell the students which numbers you have landed on. Explain that the first number corresponds to a word from List A and the second number to a word from List B. Think of and explain a connection / association between the two words. Organise the class into groups and ask them to continue playing the game.
Conduct feedback with the whole class. Ask them if they had any combinations of words for which they found it hard to think of a connection / association. Elicit suggestions from the whole class.
5 Picture associations #1
You will need a set of approximately eight pictures for this activity. These should be visually interesting and can be randomly chosen. If you do not have a set of pictures, you could ask the students to flick through their coursebooks and find a set of images that they find interesting or attractive. Tell them to note the page numbers. Alternatively, you could use pictures from the classroom: these might include posters on the walls, views out of the window, a mental picture of the teacher’s desk, a mental picture generated by imagining the whiteboard as a mirror, etc.
In the procedure described below, the students select the items they wish to practise. However, you may wish to select the items yourself. Make sure that students have access to dictionaries (print or online) during the lesson.
Ask the students to flip through their flashcard set or word list and make a list of the words that they are finding hardest to remember. They should do this with a partner and, together, should come up with a list of twelve or more words. The students should then find an association between each of the words on their list and one of the pictures that they select. They discuss their ideas with their partner, before comparing their ideas with a new partner.
6 Picture associations #2
Using the pictures and word lists (as in the activity above), students should select one picture, without telling their partner which picture they have selected. They should then look at the word list and choose four words from this list which they can associate with that picture. They then tell their four words to their partner, whose task is to guess which picture the other student was thinking of.
7 Rhyme associations
Prepare a list of approximately eight words that you want to recycle and write these on the board.
Ask the students to look at the words on the board. Tell them to work in pairs and find a word (in either English or their own language) which rhymes with each of the words on the list. If they cannot find a rhyming word, allow them to choose a word which sounds similar even if it is not a perfect rhyme.
The pairs should now find some sort of connection between each of the words on the list and their rhyming partners. When everyone has had enough time to find connections / associations, combine the pairs into groups of four, and ask them to exchange their ideas. Ask them to decide, for each word, which rhyming word and connection will be the most helpful in remembering this vocabulary.
Conduct feedback with the whole class.
8 Associations: truth and lies
In the procedure described below, no preparation is required. However, instead of asking the students to select the items they wish to practise, you may wish to select the items yourself. Make sure that students have access to dictionaries (print or online) during the lesson.
Ask students to flip through their flashcard set or word list and make a list of the words that they are finding hardest to remember. Individually, they should then write a series of sentences which contain these words: the sentences can contain one, two, or more of their target words. Half of the sentences should contain true personal information; the other half should contain false personal information.
Students then work with a partner, read their sentences aloud, and the partner must decide which sentences are true and which are false.
9 Associations: questions and answers
Prepare a list of between 12 and 20 items that you want the students to practise. Write these on the board (in any order) or distribute them as a handout.
Demonstrate the activity with the whole class before putting students into pairs. Make a question beginning with Why / How do you … / Why / How did you … / Why / How were you … which includes one of the target items from the list. The questions can be rather strange or divorced from reality. For example, if one of the words on the list were ankle, you could ask How did you break your ankle yesterday? Pretend that you are racking your brain to think of an answer while looking at the other words on the board. Then, provide an answer, using one of the other words from the list. For example, if one of the other words were upset, you might answer I was feeling very upset about something and I wasn't thinking about what I was doing. I fell down some steps. If necessary, do another example with the whole class to ensure that everyone understands the activity.
Tell the students to work in pairs, taking it in turns to ask and answer questions in the same way.
Conduct feedback with the whole class. Ask if there were any particularly strange questions or answers.
(I first came across a variation of this idea in a blog post by Alex Case, 'Playing with our Word Bag'.)
10 Associations: question and answer fortune telling
Prepare for yourself a list of items that you want to recycle. Number this list. (You will not need to show the list to anyone.)
Organise the class into pairs. Ask each pair to prepare four or five questions about the future. These questions could be personal or about the wider world around them. Give a few examples to make sure everyone understands: How many children will I have? What kind of job will I have five years from now? Who will win the next World Cup?
Tell the class that you have the answers to their questions. Hold up the list of words that you have prepared (without showing what is written on it). Elicit a question from one pair. Tell them that they must choose a number from 1 to X (depending on how many words you have on your list). Find the word with that number on your list and say it aloud or write it on the board.
Tell the class that this is the answer to the question, but the answer must be ‘interpreted’. Ask the students to discuss in pairs the interpretation of the answer. You may need to demonstrate this the first time. If the question was How many children will I have? and the answer selected was precious, you might suggest that Your child will be very precious to you, but you will only have one. This activity requires a free imagination, and some classes will need some time to get used to the idea.
Continue with more questions and more answers selected blindly from the list, with students working in pairs to interpret these answers. Each time, conduct feedback with the whole class to find out who has the best interpretation.
11 Associations: narratives
In the procedure described below, no preparation is required. However, instead of asking the students to select the items they wish to practise, you may wish to select the items yourself. Make sure that students have access to dictionaries (print or online) during the lesson.
This activity often works best if it is used as a follow-up to ‘Picture Associations’. The story that the students prepare and tell should be connected to the picture that they focused on.
Ask students to flip through their flashcard set and make a list of the words that they are finding hardest to remember. They should do this with a partner and, together, should come up with a list of twelve or more words.
Still in pairs, they should prepare a short story which contains at least seven of the items in their list. After preparing their story, they should rehearse it before exchanging stories with another student / pair of students.
To extend this activity, the various stories can be ‘passed around’ the class in the manner of the game ‘Chinese Whispers’ (‘Broken Telephone’).
12 Associations: the sentence game
Prepare a list of approximately 25 items that you want the class to practise. Write these, in any order, on one side of the whiteboard.
Explain to the class that they are going to play a game. The object of the game is to score points by making grammatically correct sentences using the words on the board. If the students use just one of these words in a sentence, they will get one point. If they use two of the words, they’ll get two points. With three words, they’ll get three points. The more ambitious they are, the more points they can score. But if their sentence is incorrect, they will get no points and they will miss their turn. Tell the class that the sentences (1) must be grammatically correct, (2) must make logical sense, (3) must be single sentences. If there is a problem with a sentence, you, the teacher, will say that it is wrong, but you will not make a correction.
Put the class into groups of four students each. Give the groups some time to begin preparing sentences which contain one or more of the words from the list.
Ask a member from one group to come to the board and write one of the sentences they have prepared. If it is an appropriate sentence, award points. Cross out the word(s) that have been used from the list on the board: these words can no longer be used. If the sentence was incorrect, explain that there is a problem and turn to a member of the next group. This person can either (1) write a new sentence that their group has prepared, or (2) try, with the help of the other members of their group, to correct a sentence that is on the board. If their correction is correct, they score all the points for that sentence. If their correction is incorrect, they score no points and it is the end of their turn.
The game continues in this way with each group taking it in turns to make or correct sentences on the board.

(There are a number of comedy sketches about word associations. My favourite is this one. I've used it from time to time in presentations on this topic, but it has absolutely no pedagogical value … unlike the next autoplay suggestion that was made for me, which has no comedy value.)


A few years ago, I wrote a couple of posts about the sorts of things that teachers can do in classrooms to encourage the use of vocabulary apps and to deepen the learning of the target items. You can find these here and here. In this and a future post, I want to take this a little further. These activities will be useful and appropriate for any teachers wanting to recycle target vocabulary in the classroom.

The initial deliberate learning of vocabulary usually focuses on the study of word meanings (e.g. target items along with translations), but for these items to be absorbed into the learner's active vocabulary store, learners will need opportunities to use them in meaningful ways. Classrooms can provide rich opportunities for this. However, before setting up activities that offer learners the chance to do this, teachers will need in some way to draw attention to the items that will be practised. The simplest way of doing this is to ask students to review, for a few minutes, the relevant word set in their vocabulary apps or the relevant section of the word list. Here are some more interesting alternatives.

The post after this will suggest a range of activities that promote communicative, meaningful use of the target items (after they have been ‘activated’ using one or more of the activities below).

1             Memory check

Ask the students to spend a few minutes reviewing the relevant word set in their vocabulary apps or the relevant section of the word list (up to about 20 items). Alternatively, project / write the target items on the board. After a minute or two, tell the students to stop looking at the target items. Clean the board, if necessary.

Tell students to work individually and write down all the items they can remember. Allow a minute or two. Then, put the students into pairs: tell them to (1) combine their lists, (2) check their spelling, (3) check that they can translate (or define) the items they have, and (4) add to the lists. After a few minutes, tell the pairs to compare their lists with the work of another pair. Finally, allow students to see the list of target items so they can see which words they forgot.

2             Simple dictation

Tell the class that they are going to do a simple dictation, and ask them to write the numbers 1 to X (depending on how many words you wish to recycle: about 15 is recommended) on a piece of paper or in their notebooks. Dictate the words. Tell the students to work with a partner and check (1) their spelling, and (2) that they can remember the meanings of these words. Allow the students to check their answers in the vocabulary app / a dictionary / their word list / their coursebook.

3             Missing vowels dictation

As above (‘Simple dictation’), but tell the students that they must only write the consonants of the dictated words. When comparing their answers with a partner, they must reinsert the missing vowels.

4             Collocation dictation

As above ('Simple dictation'), but instead of single words, dictate simple collocations (e.g. verb – complement combinations, adjective – noun pairings, adverb – adjective pairings). Students write down the collocations. Before they compare their answers with a partner, dictate the collocations again, identifying one word in each that the students must underline. In pairs, students then check their answers and think of one or two different words that can collocate with the underlined word.

5             Simple translation dictation

As above ('Simple dictation'), but tell the students that they must only write down the translation into their own language of the word (or phrase) that you have given them. Afterwards, when they are working with a partner, they must write down the English word. (This activity works well with multilingual groups – students do not need to speak the same language as their partner.)

6             Word count dictation

As above (‘Simple translation dictation’): when the students are doing the dictation, tell them that they must first silently count the number of letters in the English word and write down this number. They must also write down the translation into their own language. Afterwards, when they are working with a partner, they must write down the English word. As an addition / alternative, you can ask them to write down the first letter of the English word. (This activity works well with multilingual groups – students do not need to speak the same language as their partner.)

I first came across this activity in Morgan, J. & M. Rinvolucri (2004) Vocabulary 2nd edition. (Oxford: Oxford University Press).

7             Dictations with tables

Before dictating the target items, draw a simple table on the board of three or more columns. At the top of each column, write the different stress patterns of the words you will dictate. Explain to the students that they must write the words you dictate into the appropriate column.

[Image: stress pattern table]

As an alternative to stress patterns, you could use different categories for the columns. Examples include: numbers of syllables, vowel sounds that feature in the target items, parts of speech, semantic fields, items that students are confident about / less confident about, etc.

8             Bilingual sentence dictation

Prepare a set of short sentences (eight maximum), each of which contains one of the words that you want to recycle. These sentences could be from a vocabulary exercise that the students have previously studied in their coursebooks or example sentences from vocab apps.

Tell the class that they are going to do a dictation. Tell them that you will read some sentences in English, but they must only write down translations into their own language of these sentences. Dictate the sentences, allowing ample time for students to write their translations. Put the students into pairs or small groups. Ask them to translate these sentences back into English. (This activity works well with multilingual groups – students do not need to speak the same language as their partner.) Conduct feedback with the whole class, or allow the students to check their answers with their apps / the coursebook.

From definitions (or translations) to words

An alternative to providing learners with the learning items and asking them to check the meanings is to get them to work towards the items from the meanings. There is a very wide variety of ways of doing this, and a selection of these follows below.

9             Eliciting race

Prepare a list of words that you want to recycle. The list will need to be printed on a handout. You will need at least two copies of this handout, but for some variations of the game you will need more.

Divide the class into two teams. Get one student from each team to come to the front of the class and hand them the list of words. Explain that their task is to elicit from their team each of the words on the list. They must not say the word that they are trying to elicit. The first team to produce the target word wins a point, and everyone moves on to the next word.

The race can also be played with students working in pairs. One student has the list and elicits from their partner.

10          Eliciting race against the clock

As above (‘Eliciting race’), but the race is played ‘against the clock’. The teams have different lists of words (or the same lists but in a different order). Set a time limit. How many words can be elicited in, say, three minutes?

11          Mime eliciting race

As above (‘Eliciting race’), but you can ask the students who are doing the eliciting to do this silently, using mime and gesture only. A further alternative is to get students to do the eliciting by drawing pictures (as in the game of Pictionary).

12          The fly-swatting game

Write the items to be reviewed all over the board. Divide the class into two teams. Taking turns, one member of each group comes to the board. Each of the students at the board is given a fly-swatter (if this is not possible, they can use the palms of their hands). Choose one of the items and define it in some way. The students must find the word and swat it. The first person to do so wins a point for their team. You will probably want to introduce a rule where students are only allowed one swat: this means that if they swat the wrong word, the other player can take as much time as they like (and consult with their team members) before swatting a word.

13          Word grab

Prepare the target items on one or more sets of pieces of paper / card (one item per piece of paper). With a smallish class of about 8 students, one set is enough. With larger classes, prepare one set per group (of between four and eight students). Students sit in a circle with the pieces of paper spread out on a table or on the floor in the middle. The teacher calls out the definitions and the students try to be the first person to grab the appropriate piece of paper.

As an alternative to this teacher-controlled version of the game, students can work in groups of three or four (more sets of pieces of paper will be needed). One student explains a word and the others compete to grab the right word. The student with the most words at the end is the ‘winner’. In order to cover a large number of items for recycling, each table can have a different set of words. Once a group of students has exhausted the words on their table, they can exchange tables with another group.

14          Word hold-up

The procedures above can be very loud and chaotic! For a calmer class, ensure that everyone (or every group) has a supply of blank pieces of paper. Do the eliciting yourself. The first student or team to hold up the correct answer on a piece of paper wins the point.

15          Original contexts

Find the words in the contexts in which they were originally presented (e.g. in the coursebook); write the sentences with gaps on the board (or prepare this for projection). First, students work with a partner to complete the gaps. Before checking that their answers are correct, insert the first letter of each missing word so students can check their own answers. If you wish, you may also add a second letter. Once the missing words have been checked, ask the students to try to find as many different alternatives (i.e. other words that will fit syntactically and semantically) as they can for the missing words they have just inserted.

Quick follow-up activities

16          Word grouping

Once the learning items for revision have been ‘activated’ using one of the activities above, you may wish to do a quick follow-up activity before moving on to more communicative practice. A simple task type is to ask students (in pairs, so that there is some discussion and sharing of ideas) to group the learning items in one or more ways. Here are a few suggestions for ways that students can be asked to group the words: (1) words they remembered easily / words they had forgotten; (2) words they like / dislike; (3) words they think will be useful to them / will not be useful to them; (4) words that remind them of a particular time or experience (or person) in their life; (5) words they would pack in their holiday bags / words they would put in the deep-freeze and forget about for the time being (thanks to Jeremy Harmer for this last idea).

I'm a sucker for meta-analyses, those aggregates of multiple studies that generate an effect size, and I am even fonder of meta-meta-analyses. I skip over the boring stuff about inclusion criteria and statistical procedures and zoom in on the results and discussion. I've pored over Hattie (2009) and, more recently, Dunlosky et al. (2013), and quoted both more often than is probably healthy. Hardly surprising, then, that I was eager to read Luke Plonsky and Nicole Ziegler's 'The CALL–SLA interface: insights from a second-order synthesis' (Plonsky & Ziegler, 2016), an analysis of nearly 30 meta-analyses (later whittled down to 14) looking at the impact of technology on L2 learning. The big question they were looking to answer? How effective is computer-assisted language learning compared to face-to-face contexts?

Plonsky & Ziegler

Plonsky and Ziegler found that there are unequivocally ‘positive effects of technology on language learning’. In itself, this doesn’t really tell us anything, simply because there are too many variables. It’s a statistical soundbite, ripe for plucking by anyone with an edtech product to sell. Much more useful is to understand which technologies, used in which ways, are likely to have a positive effect on learning. It appears from Plonsky and Ziegler’s work that the use of CALL glosses (to develop reading comprehension and vocabulary) provides the strongest evidence of technology’s positive impact on learning. The finding is reinforced by the fact that this particular technology was the best-represented research area in the meta-analyses under review.

What we know about glosses

A gloss is ‘a brief definition or synonym, either in L1 or L2, which is provided with [a] text’ (Nation, 2013: 238). Glosses can take many forms (e.g. annotations in the margin or at the foot of a printed page), but electronic or CALL glossing is ‘an instant look-up capability – dictionary or linked’ (Taylor, 2006; 2009), which is becoming increasingly standard in on-screen reading. One of the most widely used is probably the translation function in Microsoft Word: here’s the French gloss for the word ‘gloss’.

Language learning tools and programs are making increasing use of glosses. Here are two examples. The first is Lingro, a dictionary tool that learners can have running alongside any webpage: clicking on a word brings up a dictionary entry, and the word can then be exported into a wordlist which can be practised with spaced repetition software. The example here is using the English-English dictionary, but a number of bilingual pairings are available. The second is from Bliu Bliu, a language learning app that I unkindly reviewed here.


So, what did Plonsky and Ziegler discover about glosses? There were two key takeaways:

  • both L1 and L2 CALL glossing can be beneficial to learners’ vocabulary development (Taylor, 2006, 2009, 2013)
  • CALL / electronic glosses lead to more learning gains than paper-based glosses (p.22)

On the surface, this might seem uncontroversial, but if you take a good look at the three examples (above) of online glosses, you’ll see that something is not quite right here. Lingro’s gloss is a fairly full dictionary entry: it contains too much information for the purpose of a gloss. Cognitive Load Theory suggests that ‘new information be provided concisely so as not to overwhelm the learner’ (Khezrlou et al, 2017: 106): working out which definition is relevant (the appropriate one is actually the sixth in the list) will overwhelm many learners and interfere with the process of reading … which the gloss is intended to facilitate. In addition, the language of the definitions is more difficult than the defined item, so cognitive load is further increased. Lingro needs to use a decent learner’s dictionary (with a limited defining vocabulary), rather than relying on the free Wiktionary.

Nation (2013: 240) cites research which suggests that a gloss is most effective when it provides a ‘core meaning’ which users will have to adapt to what is in the text. This is relatively unproblematic, from a technological perspective, but few glossing tools actually do this. The alternative is to use NLP tools to identify the context-specific meaning: our ability to do this is improving all the time but remains some way short of total accuracy. At the very least, NLP tools are needed to identify part of speech (which will increase the probability of hitting the right meaning). Bliu Bliu gets things completely wrong, confusing the verb and the adjective ‘own’.
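By way of illustration, here is a minimal sketch of the kind of part-of-speech check that would prevent the ‘own’ confusion, using NLTK’s off-the-shelf tagger. This is my assumption about one plausible tool for the job, not a claim about what any of the products discussed here actually uses; the sentences are invented and the output tags are Penn Treebank codes.

import nltk
# Resource names may vary slightly across NLTK versions.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

for sentence in ["They own a small flat.", "She wants her own room."]:
    tagged = dict(nltk.pos_tag(nltk.word_tokenize(sentence)))
    # 'own' should come out as VBP (verb) in the first sentence, JJ (adjective) in the second.
    print(sentence, "->", tagged["own"])

A glossing tool that ran even this crude check before querying its dictionary would at least hit the right part of the entry.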

Both Lingro and Bliu Bliu fail to meet the first requirement of a gloss: ‘that it should be understood’ (Nation, 2013: 239). Neither is likely to contribute much to the vocabulary development of learners. We will need to modify Plonsky and Ziegler’s conclusions somewhat: they are contingent on the quality of the glosses. This is not, however, something that can be assumed … as will be clear from even the most cursory look at the language learning tools that are available.

Nation (2013: 447) also cites research that ‘learning is generally better if the meaning is written in the learner’s first language. This is probably because the meaning can be easily understood and the first language meaning already has many rich associations for the learner. Laufer and Shmueli (1997) found that L1 glosses are superior to L2 glosses in both short-term and long-term (five weeks) retention and irrespective of whether the words are learned in lists, sentences or texts’. Not everyone agrees, and a firm conclusion either way is probably not possible: learner variables (especially learner preferences) preclude anything conclusive, which is why I’ve highlighted Nation’s use of the word ‘generally’. If we have a look at Lingro’s bilingual gloss, I think you’ll agree that the monolingual and bilingual glosses are equally unhelpful, equally unlikely to lead to better learning, whether it’s vocabulary acquisition or reading comprehension.

 

The issues I’ve just discussed illustrate the complexity of the ‘glossing’ question, but they only scratch the surface. I’ll dig a little deeper.

1 Glosses are only likely to be of value to learning if they are used selectively. Nation (2013: 242) suggests that ‘it is best to assume that the highest density of glossing should be no more than 5% and preferably around 3% of the running words’. Online glosses make the process of look-up extremely easy. This is an obvious advantage over look-ups in a paper dictionary, but there is a real risk, too, that the ease of online look-up encourages unnecessary look-ups. More clicks do not always lead to more learning. The value of glosses cannot, therefore, be considered independently of the level (i.e. the appropriacy) of the text they are being used with.
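A glossing tool could enforce this kind of selectivity quite cheaply. The sketch below underlines only lower-frequency words and caps the glosses at roughly 3% of the running words, along the lines Nation suggests; the frequency ranks and the threshold of 3,000 are invented placeholders.

def select_gloss_targets(tokens, rank, max_density=0.03, min_rank=3000):
    # Gloss only words rarer than min_rank, rarest first, up to 3% of the text.
    budget = max(1, int(len(tokens) * max_density))
    candidates = [t for t in set(tokens) if rank.get(t, 10**6) > min_rank]
    return sorted(candidates, key=lambda t: -rank.get(t, 10**6))[:budget]

tokens = ("the stoned raccoon climbed onto the fire engine "
          "and slept until the animal control officers arrived").split()
rank = {"the": 1, "and": 3, "until": 300, "control": 700, "fire": 900,
        "onto": 950, "animal": 1100, "arrived": 1600, "officers": 1900,
        "engine": 2500, "climbed": 4100, "slept": 5200,
        "raccoon": 14000, "stoned": 21000}
print(select_gloss_targets(tokens, rank))  # -> ['stoned'] for this 16-word text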

2 A further advantage of online glosses is that they can offer a wide range of information, e.g. pronunciation, L1 translation, L2 definition, visuals, example sentences. The review of literature by Khezrlou et al (2017: 107) suggests that ‘multimedia glosses can promote vocabulary learning but uncertainty remains as to whether they also facilitate reading comprehension’. Barcroft (2015), however, warns that pictures may help learners with meaning, but at the cost of retention of word form, and the research of Boers et al (2017) did not find evidence to support the use of pictures. Even if we were to accept the proposition that pictures might be helpful, we would need to bear two caveats in mind. First, the amount of multimodal support should not lead to cognitive overload. Second, pictures need to be clear and appropriate: a condition that is rarely met in online learning programs. The quality of multimodal glosses is more important than their inclusion / exclusion.

3 It’s a commonplace to state that learners will learn more if they are actively engaged or involved in the learning, rather than simply (receptively) looking up a gloss. So, it has been suggested that cognitive engagement can be stimulated by turning the glosses into a multiple-choice task, and a fair amount of research has investigated this possibility. Barcroft (2015: 143) reports research that suggests that ‘multiple-choice glosses [are] more effective than single glosses’, but Nation (2013: 246) argues that ‘multiple choice glosses are not strongly supported by research’. Basically, we don’t know, and even if we have replication studies to re-assess the benefits of multimodal glosses (as advocated by Boers et al, 2017), it is again likely that learner variables will make it impossible to reach a firm conclusion.

Learning from meta-analyses

Discussion of glosses is not new. Back in the late 19th century, ‘most of the Reform Movement teachers took the view that glossing was a sensible technique’ (Howatt, 2004: 191). Sensible, but probably not all that important in the broader scheme of language learning and teaching. Online glosses offer a number of potential advantages, but there is a huge number of variables that need to be considered if the potential is to be realised. In essence, I have been arguing that asking whether online glosses are more effective than print glosses is the wrong question. It’s not a question that can provide us with a useful answer. When you look at the details of the research that has been brought together in the meta-analysis, you simply cannot conclude that there are unequivocally positive effects of technology on language learning, if the most positive effects are to be found in the digital variation of an old, sensible technique.

Interesting and useful as Plonsky and Ziegler’s study is, I think it needs to be treated with caution. More generally, we need to be cautious about using meta-analyses and effect sizes. Mura Nava has a useful summary of an article by Adrian Simpson (Simpson, 2017), which looks at inclusion criteria and statistical procedures and warns us that we cannot necessarily assume that the findings of meta-meta-analyses are educationally significant. More directly related to technology and language learning, Boulton’s paper (Boulton, 2016) makes a similar point: ‘Meta-analyses need interpreting with caution: in particular, it is tempting to seize on a single figure as the ultimate answer to the question: Does it work? […] More realistically, we need to look at variation in what works’.

For me, the greatest value in Plonsky and Ziegler’s paper was nothing to do with effect sizes and big answers to big questions. It was the bibliography … and the way it forced me to be rather more critical about meta-analyses.

References

Barcroft, J. 2015. Lexical Input Processing and Vocabulary Learning. Amsterdam: John Benjamins

Boers, F., Warren, P., He, L. & Deconinck, J. 2017. ‘Does adding pictures to glosses enhance vocabulary uptake from reading?’ System 66: 113 – 129

Boulton, A. 2016. ‘Quantifying CALL: significance, effect size and variation’ in S. Papadima-Sophocleus, L. Bradley & S. Thouësny (eds.) CALL Communities and Culture – short papers from Eurocall 2016 pp.55 – 60 http://files.eric.ed.gov/fulltext/ED572012.pdf

Dunlosky, J., Rawson, K.A., Marsh, E.J., Nathan, M.J. & Willingham, D.T. 2013. ‘Improving Students’ Learning With Effective Learning Techniques’ Psychological Science in the Public Interest 14 / 1: 4 – 58

Hattie, J.A.C. 2009. Visible Learning. Abingdon, Oxon.: Routledge

Howatt, A.P.R. 2004. A History of English Language Teaching 2nd edition. Oxford: Oxford University Press

Khezrlou, S., Ellis, R. & K. Sadeghi 2017. ‘Effects of computer-assisted glosses on EFL learners’ vocabulary acquisition and reading comprehension in three learning conditions’ System 65: 104 – 116

Laufer, B. & Shmueli, K. 1997. ‘Memorizing new words: Does teaching have anything to do with it?’ RELC Journal 28 / 1: 89 – 108

Nation, I.S.P. 2013. Learning Vocabulary in Another Language. Cambridge: Cambridge University Press

Plonsky, L. & Ziegler, N. 2016. ‘The CALL–SLA interface: insights from a second-order synthesis’ Language Learning & Technology 20 / 2: 17 – 37

Simpson, A. 2017. ‘The misdirection of public policy: Comparing and combining standardised effect sizes’ Journal of Education Policy, 32 / 4: 450-466

Taylor, A. M. 2006. ‘The effects of CALL versus traditional L1 glosses on L2 reading comprehension’. CALICO Journal, 23, 309–318.

Taylor, A. M. 2009. ‘CALL-based versus paper-based glosses: Is there a difference in reading comprehension?’ CALICO Journal, 23, 147–160.

Taylor, A. M. 2013. ‘CALL versus paper: In which context are L1 glosses more effective?’ CALICO Journal, 30, 63-8

Every now and then, someone recommends that I take a look at a flashcard app. It’s often interesting to see what developers have done with design, gamification and UX features, but the content is almost invariably awful. Most recently, I was encouraged to look at Word Pash. The screenshots below are from their promotional video.

[Four screenshots from the Word Pash promotional video]

The content problems are immediately apparent: an apparently random selection of target items, an apparently random mix of high and low frequency items, unidiomatic language examples, along with definitions and distractors that are less frequent than the target item. I don’t know if these are representative of the rest of the content. The examples seem to come from ‘Stage 1 Level 3’, whatever that means. (My confidence in the product was also damaged by the fact that the Word Pash website includes one testimonial from a certain ‘Janet Reed – Proud Mom’, whose son ‘was able to increase his score and qualify for academic scholarships at major universities’ after using the app. The picture accompanying ‘Janet Reed’ is a free stock image from Pexels and ‘Janet Reed’ is presumably fictional.)

According to the website, ‘WordPash is a free-to-play mobile app game for everyone in the global audience whether you are a 3rd grader or PhD, wordbuff or a student studying for their SATs, foreign student or international business person, you will become addicted to this fast paced word game’. On the basis of the promotional video, the app couldn’t be less appropriate for English language learners. It seems unlikely that it would help anyone improve their ACT or SAT test scores. The suggestion that the vocabulary development needs of 9-year-olds and doctoral students are comparable is pure chutzpah.

The deliberate study of more or less random words may be entertaining, but it’s unlikely to lead to very much in practical terms. For general purposes, the deliberate learning of the highest frequency words, up to about a frequency ranking of #7500, makes sense, because there’s a reasonably high probability that you’ll come across these items again before you’ve forgotten them. Beyond that frequency level, the value of the acquisition of an additional 1000 words tails off very quickly. Adding 1000 words from frequency ranking #8000 to #9000 is likely to result in an increase in lexical understanding of general purpose texts of about 0.2%. When we get to frequency ranks #19,000 to #20,000, the gain in understanding decreases to 0.01%[1]. In other words, deliberate vocabulary learning needs to be targeted. The data is relatively recent, but the principle goes back to at least the middle of the last century, when Michael West argued that a principled approach to vocabulary development should be driven by a comparison of the usefulness of a word and its ‘learning cost’[2]. Three hundred years before that, Comenius had articulated something very similar: ‘in compiling vocabularies, my […] concern was to select the words in most frequent use’[3].
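To make the arithmetic of diminishing returns concrete, here is a rough worked example. The coverage figures for the first two bands are illustrative approximations in the spirit of Nation’s tables, not his actual numbers; the 0.2% and 0.01% figures are the ones cited above, and the ‘learning cost’ per band is invented purely for the sake of West’s usefulness-versus-cost comparison.

coverage_gain = {            # extra % of running words covered per 1,000 words learned
    "1-1,000": 78.0,         # illustrative approximation
    "1,001-2,000": 8.0,      # illustrative approximation
    "8,001-9,000": 0.2,      # figure cited above
    "19,001-20,000": 0.01,   # figure cited above
}
hours_per_band = 50          # invented 'learning cost' per 1,000 words

for band, gain in coverage_gain.items():
    print(f"{band}: +{gain}% coverage, {gain / hours_per_band:.4f}% per study hour")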

I’ll return to ‘general purposes’ later in this post, but, for now, we should remember that very few language learners actually study a language for general purposes. Globally, the vast majority of English language learners study English in an academic (school) context and their immediate needs are usually exam-specific. For them, general purpose frequency lists are unlikely to be adequate. If they are studying with a coursebook and are going to be tested on the lexical content of that book, they will need to use the wordlist that matches the book. Increasingly, publishers make such lists available and content producers for vocabulary apps like Quizlet and Memrise often use them. Many examinations, both national and international, also have accompanying wordlists. Examples of such lists produced by examination boards include the Cambridge English young learners’ exams (Starters, Movers and Flyers) and Cambridge English Preliminary. Other exams do not have official word lists, but reasonably reliable lists have been produced by third parties. Examples include Cambridge First, IELTS and SAT. There are, in addition, well-researched wordlists for academic English, including the Academic Word List (AWL) and the Academic Vocabulary List (AVL). All of these make sensible starting points for deliberate vocabulary learning.

When we turn to other, out-of-school learners, the number of reasons for studying English is huge. Different learners have different lexical needs, and working with a general purpose frequency list may be, at least in part, a waste of time. EFL and ESL learners are likely to have very different needs, as will EFL and ESP learners, as will older and younger learners, learners in different parts of the world, learners who will find themselves in English-speaking countries and those who won’t, etc., etc. For some of these demographics, specialised corpora (from which frequency-based wordlists can be drawn) exist. For most learners, though, the ideal list simply does not exist. Either it will have to be created (requiring a significant amount of time and expertise[4]) or an available best-fit will have to suffice. Paul Nation, in his recent ‘Making and Using Word Lists for Language Learning and Testing’ (John Benjamins, 2016), includes a useful chapter on critiquing wordlists. For anyone interested in better understanding the issues surrounding the development and use of wordlists, three good articles are freely available online. These are:

Lessard-Clouston, M. 2012 / 2013. ‘Word Lists for Vocabulary Learning and Teaching’ The CATESOL Journal 24.1: 287- 304

Lessard-Clouston, M. 2016. ‘Word lists and vocabulary teaching: options and suggestions’ Cornerstone ESL Conference 2016

Sorell, C. J. 2013. A study of issues and techniques for creating core vocabulary lists for English as an International Language. Doctoral thesis.

But, back to ‘general purposes’ … Frequency lists are the obvious starting point for preparing a wordlist for deliberate learning, but they are very problematic. Frequency rankings depend on the corpus on which they are based and, since these are different, rankings vary from one list to another. Even drawing on just one corpus, rankings can be a little strange. In the British National Corpus, for example, ‘May’ (the month) is about twice as frequent as ‘August’[5], but we would be foolish to infer from this that the learning of ‘May’ should be prioritised over the learning of ‘August’. An even more striking example from the same corpus is the fact that ‘he’ is about twice as frequent as ‘she’[6]: should, therefore, ‘he’ be learnt before ‘she’?
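The point is easy to check against whatever corpus you have to hand. The sketch below uses NLTK’s copy of the (dated, American) Brown corpus rather than the BNC, so the figures will differ, but a similar he / she imbalance shows up; note, too, that lowercasing everything conflates the month ‘May’ with the modal verb, which is itself one of the compilers’ judgement calls discussed below.

import nltk
nltk.download("brown", quiet=True)
from nltk.corpus import brown

# Count all word forms, lowercased (so 'May' and 'may' are merged).
freq = nltk.FreqDist(w.lower() for w in brown.words())
for a, b in [("he", "she"), ("may", "august")]:
    print(f"{a}: {freq[a]:,}  {b}: {freq[b]:,}  ratio: {freq[a] / freq[b]:.1f}")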

List compilers have to make a number of judgement calls in their work. There is not space here to consider these in detail, but two particularly tricky questions concerning the way that words are counted may be mentioned: Is a verb like ‘list’, with two different and unrelated meanings, one word or two? Should inflected forms be considered as separate words? The judgements are not usually informed by considerations of learners’ needs. Learners will probably best approach vocabulary development by building their store of word senses: attempting to learn all the meanings and related forms of any given word is unlikely to be either useful or successful.
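The inflected-forms question is easy to illustrate. A lemmatiser collapses ‘lists’, ‘listed’ and ‘listing’ into a single headword, and WordNet then shows how many senses shelter under that headword; whether a learner should treat all of this as ‘one word’ is exactly the judgement call at issue. A minimal sketch with NLTK:

import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

lemmatise = WordNetLemmatizer().lemmatize
for form in ["lists", "listed", "listing"]:
    print(form, "->", lemmatise(form, pos="v"))   # all collapse to 'list'

# One headword, many unrelated senses (noun and verb).
print(len(wordnet.synsets("list")), "WordNet senses for 'list'")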

Frequency lists, in other words, are not statements of scientific ‘fact’: they are interpretative documents. They have been compiled for descriptive purposes, not as ways of structuring vocabulary learning, and it cannot be assumed they will necessarily be appropriate for a purpose for which they were not designed.

A further major problem concerns the corpus on which the frequency list is based. Large databases, such as the British National Corpus or the Corpus of Contemporary American English, are collections of language used by native speakers in certain parts of the world, usually of a restricted social class. As such, they are of relatively little value to learners who will be using English in contexts that are not covered by the corpus. A context where English is a lingua franca is one such example.

A different kind of corpus is the Cambridge Learner Corpus (CLC), a collection of exam scripts produced by candidates in Cambridge exams. This has led to the development of the English Vocabulary Profile (EVP), where word senses are tagged as corresponding to particular levels in the Common European Framework scale. At first glance, this looks like a good alternative to frequency lists based on native-speaker corpora. But closer consideration reveals many problems. The design of examination tasks inevitably results in the production of language of a very different kind from that produced in other contexts. Many high frequency words simply do not appear in the CLC because it is unlikely that a candidate would use them in an exam. Other items are very frequent in this corpus just because they are likely to be produced in examination tasks. Unsurprisingly, frequency rankings in the EVP do not correlate very well with frequency rankings from other corpora. The EVP, then, like other frequency lists, can only serve, at best, as a rough guide for the drawing up of target item vocabulary lists in general purpose apps or coursebooks[7].
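Claims about weak correlation between rankings can be checked with a rank correlation coefficient. The sketch below uses scipy’s Spearman implementation on two invented rank orderings of the same ten words; to do this for real, the EVP’s data and a reference corpus would first have to be obtained and aligned.

from scipy.stats import spearmanr

words = ["make", "take", "holiday", "exam", "homework",
         "environment", "moreover", "nevertheless", "lorry", "sofa"]
corpus_rank = [1, 2, 9, 10, 8, 4, 5, 6, 7, 3]    # invented general-corpus ranks
learner_rank = [2, 1, 3, 4, 5, 9, 10, 8, 6, 7]   # invented learner-corpus ranks

rho, p = spearmanr(corpus_rank, learner_rank)
print(f"Spearman's rho = {rho:.2f} (p = {p:.2f})")  # low rho = the rankings disagree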

There is no easy solution to the problems involved in devising suitable lexical content for the ‘global audience’. Tagging words to levels (i.e. grouping them into frequency bands) will always be problematic, unless very specific user groups are identified. Writers, like myself, of general purpose English language teaching materials are justifiably irritated by some publishers’ insistence on allocating words to levels with numerical values. The policy, taken to extremes (as is increasingly the case), has little to recommend it in linguistic terms. But it’s still a whole lot better than the aleatory content of apps like Word Pash.

[1] See Nation, I.S.P. 2013. Learning Vocabulary in Another Language 2nd edition. (Cambridge: Cambridge University Press) p. 21 for statistical tables. See also Nation, P. & R. Waring 1997. ‘Vocabulary size, text coverage and word lists’ in Schmitt & McCarthy (eds.) 1997. Vocabulary: Description, Acquisition and Pedagogy. (Cambridge: Cambridge University Press) pp. 6 -19

[2] See Kelly, L.G. 1969. 25 Centuries of Language Teaching. (Rowley, Mass.: Newbury House) p.206 for a discussion of West’s ideas.

[3] Kelly, L.G. 1969. 25 Centuries of Language Teaching. (Rowley, Mass.: Newbury House) p. 184

[4] See Timmis, I. 2015. Corpus Linguistics for ELT (Abingdon: Routledge) for practical advice on doing this.

[5] Nation, I.S.P. 2016. Making and Using Word Lists for Language Learning and Testing. (Amsterdam: John Benjamins) p.58

[6] Taylor, J.R. 2012. The Mental Corpus. (Oxford: Oxford University Press) p.151

[7] For a detailed critique of the limitations of using the CLC as a guide to syllabus design and textbook development, see Swan, M. 2014. ‘A Review of English Profile Studies’ ELTJ 68/1: 89-96

In December last year, I posted a wish list for vocabulary (flashcard) apps. At the time, I hadn’t read a couple of key research texts on the subject. It’s time for an update.

First off, there’s an article called ‘Intentional Vocabulary Learning Using Digital Flashcards’ by Hsiu-Ting Hung. It’s available online here. Given the lack of empirical research into the use of digital flashcards, it’s an important article and well worth a read. Its basic conclusion is that digital flashcards are more effective as a learning tool than printed word lists. No great surprises there, but of more interest, perhaps, are the recommendations that (1) ‘students should be educated about the effective use of flashcards (e.g. the amount and timing of practice), and this can be implemented through explicit strategy instruction in regular language courses or additional study skills workshops’ (Hung, 2015: 111), and (2) that digital flashcards can be usefully ‘repurposed for collaborative learning tasks’ (Hung, ibid.).

However, what really grabbed my attention was an article by Tatsuya Nakata. Nakata’s research is of particular interest to anyone interested in vocabulary learning, but especially so to those with an interest in digital possibilities. A number of his research articles can be freely accessed via his page at ResearchGate, but the one I am interested in is called ‘Computer-assisted second language vocabulary learning in a paired-associate paradigm: a critical investigation of flashcard software’. Don’t let the title put you off. It’s a review of a pile of web-based flashcard programs: since the article is already five years old, many of the programs have either changed or disappeared, but the critical approach he takes is more or less as valid now as it was then (whether we’re talking about web-based stuff or apps).

Nakata divides his evaluation criteria into two broad groups.

Flashcard creation and editing

(1) Flashcard creation: Can learners create their own flashcards?

(2) Multilingual support: Can the target words and their translations be created in any language?

(3) Multi-word units: Can flashcards be created for multi-word units as well as single words?

(4) Types of information: Can various kinds of information be added to flashcards besides the word meanings (e.g. parts of speech, contexts, or audios)?

(5) Support for data entry: Does the software support data entry by automatically supplying information about lexical items such as meaning, parts of speech, contexts, or frequency information from an internal database or external resources?

(6) Flashcard set: Does the software allow learners to create their own sets of flashcards?

Learning

(1) Presentation mode: Does the software have a presentation mode, where new items are introduced and learners familiarise themselves with them?

(2) Retrieval mode: Does the software have a retrieval mode, which asks learners to recall or choose the L2 word form or its meaning?

(3) Receptive recall: Does the software ask learners to produce the meanings of target words?

(4) Receptive recognition: Does the software ask learners to choose the meanings of target words?

(5) Productive recall: Does the software ask learners to produce the target word forms corresponding to the meanings provided?

(6) Productive recognition: Does the software ask learners to choose the target word forms corresponding to the meanings provided?

(7) Increasing retrieval effort: For a given item, does the software arrange exercises in the order of increasing difficulty?

(8) Generative use: Does the software encourage generative use of words, where learners encounter or use previously met words in novel contexts?

(9) Block size: Can the number of words studied in one learning session be controlled and altered?

(10) Adaptive sequencing: Does the software change the sequencing of items based on learners’ previous performance on individual items?

(11) Expanded rehearsal: Does the software help implement expanded rehearsal, where the intervals between study trials are gradually increased as learning proceeds? (Nakata, T. (2011): ‘Computer-assisted second language vocabulary learning in a paired-associate paradigm: a critical investigation of flashcard software’ Computer Assisted Language Learning, 24:1, 17-38)
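To make criterion (11) concrete, here is a minimal sketch of expanded rehearsal: the interval before the next retrieval grows after each successful recall and resets after a failure. The doubling schedule is an illustrative choice of mine, not something Nakata prescribes.

def next_interval_days(current, correct):
    # Expand the interval on a successful retrieval; reset it on a failure.
    return current * 2 if correct else 1.0

interval = 1.0
for outcome in [True, True, True, False, True]:
    interval = next_interval_days(interval, outcome)
    print(f"{'right' if outcome else 'wrong':>5} -> next review in {interval:g} day(s)")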

It’s a rather different list from my own (there’s nothing I would disagree with here), because mine is more general and his is exclusively oriented towards learning principles. Nakata makes the point towards the end of the article that it would ‘be useful to investigate learners’ reactions to computer-based flashcards to examine whether they accept flashcard programs developed according to learning principles’ (p. 34). It’s far from clear, he points out, that conformity to learning principles is at the top of learners’ agendas. More than just users’ feelings about computer-based flashcards in general, a key concern will be the fact that there are ‘large individual differences in learners’ perceptions of [any flashcard] program’ (Nakata, T. 2008. ‘English vocabulary learning with word lists, word cards and computers: implications from cognitive psychology research for optimal spaced learning’ ReCALL 20(1), p. 18).

I was trying to make a similar point in another post about motivation and vocabulary apps. In the end, as with any language learning material, research-driven language learning principles can only take us so far. User experience is a far more difficult creature to pin down or to make generalisations about. A user’s reactions to graphics, gamification, uploading time and so on are so powerful and so subjective that learning principles will inevitably play second fiddle. That’s not to say, of course, that Nakata’s questions are not important: it’s merely to wonder whether the bigger question is truly answerable.

Nakata’s research identifies plenty of room for improvement in digital flashcards, and although the article is now quite old, not a lot has changed. Key areas to work on are (1) the provision of generative use of target words, (2) the need to increase retrieval effort, (3) the automatic provision of information about meaning, parts of speech, or contexts (in order to facilitate flashcard creation), and (4) the automatic generation of multiple-choice distractors.
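On point (4), one plausible approach is to draw distractors from words of the same part of speech and a similar frequency to the target, so that the wrong answers are neither obviously easier nor obviously rarer than the right one. A minimal sketch, with an invented toy word pool:

word_pool = [  # (word, part of speech, frequency rank) - invented sample data
    ("scatter", "verb", 5200), ("grasp", "verb", 4800), ("dwell", "verb", 5600),
    ("weep", "verb", 6100), ("lantern", "noun", 7400), ("acquire", "verb", 3100),
]

def distractors(target_rank, pos, n=3):
    # Pick the n same-POS words whose frequency rank is closest to the target's.
    same_pos = [(word, rank) for word, p, rank in word_pool if p == pos]
    same_pos.sort(key=lambda wr: abs(wr[1] - target_rank))
    return [word for word, _ in same_pos[:n]]

print(distractors(target_rank=5000, pos="verb"))  # -> ['scatter', 'grasp', 'dwell']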

In the conclusion of his study, he identifies one flashcard program which is better than all the others. Unsurprisingly, five years down the line, the software he identifies is no longer free, others have changed more rapidly in the intervening period, and who knows what will be out in front next week?

 

About two and a half years ago, when I started writing this blog, there was a lot of hype around adaptive learning and the big data which might drive it. Two and a half years is a long time in technology. A look at Google Trends suggests that interest in adaptive learning has been pretty static for the last couple of years. It’s interesting to note that 3 of the 7 lettered points on this graph are Knewton-related media events (including the most recent, A, which is Knewton’s latest deal with Hachette) and 2 of them concern McGraw-Hill. It would be interesting to know whether these companies follow both parts of Simon Cowell’s dictum of ‘Create the hype, but don’t ever believe it’.

[Graph: Google Trends results for ‘adaptive learning’]

A look at the Hype Cycle (see here for Wikipedia’s entry on the topic and for criticism of the hype of Hype Cycles) of the IT research and advisory firm, Gartner, indicates that both big data and adaptive learning have now slid into the ‘trough of disillusionment’, which means that the market has started to mature, becoming more realistic about how useful the technologies can be for organizations.

A few years ago, the Gates Foundation, one of the leading cheerleaders and financial promoters of adaptive learning, launched its Adaptive Learning Market Acceleration Program (ALMAP) to ‘advance evidence-based understanding of how adaptive learning technologies could improve opportunities for low-income adults to learn and to complete postsecondary credentials’. It’s striking that the program’s aims referred to how such technologies could lead to learning gains, not whether they would. Now, though, with the publication of a report commissioned by the Gates Foundation to analyze the data coming out of the ALMAP Program, things are looking less rosy. The report is inconclusive. There is no firm evidence that adaptive learning systems are leading to better course grades or course completion. ‘The ultimate goal – better student outcomes at lower cost – remains elusive’, the report concludes. Rahim Rajan, a senior program officer for Gates, is clear: ‘There is no magical silver bullet here.’

The same conclusion is being reached elsewhere. A report for the National Education Policy Center (in Boulder, Colorado) concludes: ‘Personalized Instruction, in all its many forms, does not seem to be the transformational technology that is needed, however. After more than 30 years, Personalized Instruction is still producing incremental change. The outcomes of large-scale studies and meta-analyses, to the extent they tell us anything useful at all, show mixed results ranging from modest impacts to no impact. Additionally, one must remember that the modest impacts we see in these meta-analyses are coming from blended instruction, which raises the cost of education rather than reducing it’ (Enyedy, 2014: 15 – see reference at the foot of this post). In the same vein, a recent academic study by Meg Coffin Murray and Jorge Pérez (2015, ‘Informing and Performing: A Study Comparing Adaptive Learning to Traditional Learning’) found that ‘adaptive learning systems have negligible impact on learning outcomes’.

In the latest educational technology plan from the U.S. Department of Education (‘Future Ready Learning: Reimagining the Role of Technology in Education’, 2016), the only mentions of the word ‘adaptive’ are in the context of testing. And the latest OECD report on ‘Students, Computers and Learning: Making the Connection’ (2015) finds, more generally, that information and communication technologies, when they are used in the classroom, have, at best, a mixed impact on student performance.

There is, however, too much money at stake for the earlier hype to disappear completely. Sponsored cheerleading for adaptive systems continues to find its way into blogs and national magazines and newspapers. EdSurge, for example, recently published a report called ‘Decoding Adaptive’ (2016), sponsored by Pearson, that continues to wave the flag. Enthusiastic anecdotes take the place of evidence, but, for all that, it’s a useful read.

In the world of ELT, there are plenty of sales people who want new products which they can call ‘adaptive’ (and gamified, too, please). But it’s striking that three years after I started following the hype, such products are rather thin on the ground. Pearson was the first of the big names in ELT to do a deal with Knewton, and invested heavily in the company. Their relationship remains close. But, to the best of my knowledge, the only truly adaptive ELT product that Pearson offers is the PTE test.

Macmillan signed a contract with Knewton in May 2013 ‘to provide personalized grammar and vocabulary lessons, exam reviews, and supplementary materials for each student’. In December of that year, they talked up their new ‘big tree online learning platform’: ‘Look out for the Big Tree logo over the coming year for more information as to how we are using our partnership with Knewton to move forward in the Language Learning division and create content that is tailored to students’ needs and reactive to their progress.’ I’ve been looking out, but it’s all gone rather quiet on the adaptive / platform front.

In September 2013, it was the turn of Cambridge to sign a deal with Knewton ‘to create personalized learning experiences in its industry-leading ELT digital products for students worldwide’. This year saw the launch of a major new CUP series, ‘Empower’. It has an online workbook with personalized extra practice, but there’s nothing (yet) that anyone would call adaptive. More recently, Cambridge has launched the online version of the 2nd edition of Touchstone. Nothing adaptive there, either.

Earlier this year, Cambridge published The Cambridge Guide to Blended Learning for Language Teaching, edited by Mike McCarthy. It contains a chapter by M.O.Z. San Pedro and R. Baker on ‘Adaptive Learning’. It’s an enthusiastic account of the potential of adaptive learning, but it doesn’t contain a single reference to language learning or ELT!

So, what’s going on? Skepticism is becoming the order of the day. The early hype of people like Knewton’s Jose Ferreira is now understood for what it was. Companies like Macmillan got their fingers badly burnt when they barked up the wrong tree with their ‘Big Tree’ platform.

Noel Enyedy captures a more contemporary understanding when he writes: ‘Personalized Instruction is based on the metaphor of personal desktop computers – the technology of the 80s and 90s. Today’s technology is not just personal but mobile, social, and networked. The flexibility and social nature of how technology infuses other aspects of our lives is not captured by the model of Personalized Instruction, which focuses on the isolated individual’s personal path to a fixed end-point. To truly harness the power of modern technology, we need a new vision for educational technology’ (Enyedy, 2014: 16).

Adaptive solutions aren’t going away, but there is now a much better understanding of what sorts of problems might have adaptive solutions. Testing is certainly one. As the educational technology plan from the U.S. Department of Education (‘Future Ready Learning: Re-imagining the Role of Technology in Education’, 2016) puts it: ‘Computer adaptive testing, which uses algorithms to adjust the difficulty of questions throughout an assessment on the basis of a student’s responses, has facilitated the ability of assessments to estimate accurately what students know and can do across the curriculum in a shorter testing session than would otherwise be necessary’. In ELT, Pearson and EF have adaptive tests that have been well researched and designed.
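At its very simplest, the underlying mechanism is a staircase: move up a difficulty level after a correct answer, down after an incorrect one. Real computer adaptive tests use item response theory rather than anything this crude, so treat the sketch below as the idea only, with invented difficulty levels.

def next_difficulty(current, correct, lo=1, hi=10):
    # One step harder after a right answer, one step easier after a wrong one.
    proposed = current + 1 if correct else current - 1
    return max(lo, min(hi, proposed))

difficulty = 5
for answer in [True, True, False, True, False, False]:
    difficulty = next_difficulty(difficulty, answer)
    print(f"{'correct' if answer else 'wrong':>7} -> next item at difficulty {difficulty}")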

Vocabulary apps which deploy adaptive technology continue to become more sophisticated, although empirical research is lacking. Automated writing tutors with adaptive corrective feedback are also developing fast, and I’ll be writing a post about these soon. Similarly, as speech recognition software improves, we can expect to see better and better automated adaptive pronunciation tutors. But going beyond such applications, there are bigger questions to ask, and answers to these will impact on whatever direction adaptive technologies take. Large platforms (LMSs), with or without adaptive software, are already beginning to look rather dated. Will they be replaced by integrated apps, or are apps themselves going to be replaced by bots (currently riding high in the Hype Cycle)? In language learning and teaching, the future of bots is likely to be shaped by developments in natural language processing (another topic about which I’ll be blogging soon). Nobody really has a clue where the next two and a half years will take us (if anywhere), but it’s becoming increasingly likely that adaptive learning will be only one very small part of it.

 

Enyedy, N. 2014. Personalized Instruction: New Interest, Old Rhetoric, Limited Results, and the Need for a New Direction for Computer-Mediated Learning. Boulder, CO: National Education Policy Center. Retrieved 17.07.16 from http://nepc.colorado.edu/publication/personalized-instruction