

Allowing learners to determine the amount of time they spend studying, and, therefore (in theory at least) the speed of their progress is a key feature of most personalized learning programs. In cases where learners follow a linear path of pre-determined learning items, it is often the only element of personalization that the programs offer. In the Duolingo program that I am using, there are basically only two things that can be personalized: the amount of time I spend studying each day, and the possibility of jumping a number of learning items by ‘testing out’.

Self-regulated learning or self-pacing, as this is commonly referred to, has enormous intuitive appeal. It is clear that different people learn different things at different rates. We’ve known for a long time that ‘the developmental stages of child growth and the individual differences among learners make it impossible to impose a single and ‘correct’ sequence on all curricula’ (Stern, 1983: 439). It therefore follows that it makes even less sense for a group of students (typically determined by age) to be obliged to follow the same curriculum at the same pace in a one-size-fits-all approach. We have probably all experienced, as students, the frustration of being behind, or ahead of, the rest of our colleagues in a class. One student who suffered from the lockstep approach was Sal Khan, founder of the Khan Academy. He has described how he was fed up with having to follow an educational path dictated by his age and how, as a result, individual pacing became an important element in his educational approach (Ferster, 2014: 132-133). As teachers, we have all experienced the challenges of teaching a piece of material that is too hard or too easy for many of the students in the class.

Historical attempts to facilitate self-paced learning

An interest in self-paced learning can be traced back to the growth of mass schooling and age-graded classes in the 19th century. In fact, the ‘factory model’ of education has never existed without critics who saw the inherent problems of imposing uniformity on groups of individuals. These critics were not marginal characters. Charles Eliot (president of Harvard from 1869 – 1909), for example, described uniformity as ‘the curse of American schools’ and argued that ‘the process of instructing students in large groups is a quite sufficient school evil without clinging to its twin evil, an inflexible program of studies’ (Grittner, 1975: 324).

Attempts to develop practical solutions were not uncommon and these are reasonably well-documented. One of the earliest, which ran from 1884 to 1894, was launched in Pueblo, Colorado and was ‘a self-paced plan that required each student to complete a sequence of lessons on an individual basis’ (Januszewski, 2001: 58-59). More ambitious was the Burk Plan (at its peak between 1912 and 1915), named after Frederick Burk of the San Francisco State Normal School, which aimed to allow students to progress through materials (including language instruction materials) at their own pace with only a limited number of teacher presentations (Januszewski, ibid.). Then, there was the Winnetka Plan (1920s), developed by Carlton Washburne, an associate of Frederick Burk and the superintendent of public schools in Winnetka, Illinois, which also ‘allowed learners to proceed at different rates, but also recognised that learners proceed at different rates in different subjects’ (Saettler, 1990: 65). The Winnetka Plan is especially interesting in the way it presaged contemporary attempts to facilitate individualized, self-paced learning. It was described by its developers in the following terms:

A general technique [consisting] of (a) breaking up the common essentials curriculum into very definite units of achievement, (b) using complete diagnostic tests to determine whether a child has mastered each of these units, and, if not, just where his difficulties lie and, (c) the full use of self-instructive, self corrective practice materials. (Washburne, C., Vogel, M. & W.S. Gray. 1926. A Survey of the Winnetka Public Schools. Bloomington, IL: Public School Press)

Not dissimilar was the Dalton (Massachusetts) Plan in the 1920s which also used a self-paced program to accommodate the different ability levels of the children and deployed contractual agreements between students and teachers (something that remains common educational practice around the world). There were many others, both in the U.S. and other parts of the world.

The personalization of learning through self-pacing was not, therefore, a minor interest. Between 1910 and 1924, nearly 500 articles can be documented on the subject of individualization (Grittner, 1975: 328). In just three years (1929 – 1932) of one publication, The Education Digest, there were fifty-one articles dealing with individual instruction and sixty-three entries treating individual differences (Chastain, 1975: 334). Foreign language teaching did not feature significantly in these early attempts to facilitate self-pacing, but see the Burk Plan described above. Only a handful of references to language learning and self-pacing appeared in articles between 1916 and 1924 (Grittner, 1975: 328).

Disappointingly, none of these initiatives lasted long. Both costs and management issues had been significantly underestimated. Plans such as those described above were seen as progress, but not the hoped-for solution. Problems included the fact that the materials themselves were not individualized and instructional methods were too rigid (Pendleton, 1930: 199). However, concomitant with the interest in individualization (mostly, self-pacing), came the advent of educational technology.

Sidney L. Pressey, the inventor of what was arguably the first teaching machine, was inspired by his experiences with schoolchildren in rural Indiana in the 1920s where he ‘was struck by the tremendous variation in their academic abilities and how they were forced to progress together at a slow, lockstep pace that did not serve all students well’ (Ferster, 2014: 52). Although Pressey failed in his attempts to promote his teaching machines, he laid the foundation stones in the synthesizing of individualization and technology.

Pressey may be seen as the direct precursor of programmed instruction, now closely associated with B. F. Skinner (see my post on Behaviourism and Adaptive Learning). It is a quintessentially self-paced approach and is described by John Hattie as follows:

Programmed instruction is a teaching method of presenting new subject matter to students in graded sequence of controlled steps. A book version, for example, presents a problem or issue, then, depending on the student’s answer to a question about the material, the student chooses from optional answers which refers them to particular pages of the book to find out why they were correct or incorrect – and then proceed to the next part of the problem or issue. (Hattie, 2009: 231)

Programmed instruction was mostly used for the teaching of mathematics, but it is estimated that 4% of programmed instruction programs were for foreign languages (Saettler, 1990: 297). It flourished in the 1960s and 1970s, but even by 1968 foreign language instructors were sceptical (Valdman, 1968). A survey carried out by the Center for Applied Linguistics revealed then that only about 10% of foreign language teachers at college and university reported the use of programmed materials in their departments (Valdman, 1968: 1).

Research studies had failed to demonstrate the effectiveness of programmed instruction (Saettler, 1990: 303). Teachers were often resistant and students were often bored, finding ‘ingenious ways to circumvent the program, including the destruction of their teaching machines!’ (Saettler, ibid.).

In the case of language learning, there were other problems. For programmed instruction to have any chance of working, it was necessary to specify rigorously the initial and terminal behaviours of the learner so that the intermediate steps leading from the former to the latter could be programmed. As Valdman (1968: 4) pointed out, this is highly problematic when it comes to languages (a point that I have made repeatedly in this blog). In addition, students missed the personal interaction that conventional instruction offered, got bored and lacked motivation (Valdman, 1968: 10).

Programmed instruction worked best when teachers were very enthusiastic, but perhaps the most significant lesson to be learned from the experiments was that it was ‘a difficult, time-consuming task to introduce programmed instruction’ (Saettler, 1990: 299). It entailed changes to well-established practices and attitudes, and for such changes to succeed there must be consideration of the social, political, and economic contexts. As Saettler (1990: 306) notes, ‘without the support of the community and the entire teaching staff, sustained innovation is unlikely’. In this light, Hattie’s research finding that ‘when comparisons are made between many methods, programmed instruction often comes near the bottom’ (Hattie, 2009: 231) comes as no great surprise.

Just as programmed instruction was in its death throes, the world of language teaching discovered individualization. Launched as a deliberate movement in the early 1970s at the Stanford Conference (Altman & Politzer, 1971), it was a ‘systematic attempt to allow for individual differences in language learning’ (Stern, 1983: 387). Inspired, in part, by the work of Carl Rogers, this ‘humanistic turn’ was a recognition that ‘each learner is unique in personality, abilities, and needs. Education must be personalized to fit the individual; the individual must not be dehumanized in order to meet the needs of an impersonal school system’ (Disick, 1975: 38). In ELT, this movement found many adherents and remains extremely influential to this day.

In language teaching more generally, the movement lost impetus after a few years, ‘probably because its advocates had underestimated the magnitude of the task they had set themselves in trying to match individual learner characteristics with appropriate teaching techniques’ (Stern, 1983: 387). What precisely was meant by individualization was never adequately defined or agreed (a problem that remains to the present time). What was left was self-pacing. In 1975, it was reported that ‘to date the majority of the programs in second-language education have been characterized by a self-pacing format […]. Practice seems to indicate that ‘individualized’ instruction is being defined in the class room as students studying individually’ (Chastain, 1975: 344).

Lessons to be learned

This brief account shows that historical attempts to facilitate self-pacing have largely been characterised by failure. The starting point of all these attempts remains as valid as ever, but it is clear that practical solutions are less than simple. To avoid the insanity of doing the same thing over and over again and expecting different results, we should perhaps try to learn from the past.

One of the greatest challenges that teachers face is dealing with different levels of ability in their classes. In any blended scenario where the online component has an element of self-pacing, the challenge will be magnified as ability differentials are likely to grow rather than decrease as a result of the self-pacing. Bart Simpson hit the nail on the head in a memorable line: ‘Let me get this straight. We’re behind the rest of the class and we’re going to catch up to them by going slower than they are? Coo coo!’ Self-pacing runs into immediate difficulties when it comes up against standardised tests and national or state curriculum requirements. As Ferster observes, ‘the notion of individual pacing [remains] antithetical to […] a graded classroom system, which has been the model of schools for the past century. Schools are just not equipped to deal with students who do not learn in age-processed groups, even if this system is clearly one that consistently fails its students’ (Ferster, 2014: 90-91).

Ability differences are less problematic if the teacher focusses primarily on communicative tasks in F2F time (as opposed to more teaching of language items), but this is a big ‘if’. Many teachers are unsure of how to move towards a more communicative style of teaching, not least in large classes in compulsory schooling. Since there are strong arguments that students would benefit from a more communicative, less transmission-oriented approach anyway, it makes sense to focus institutional resources on equipping teachers with the necessary skills, as well as providing support, before a shift to a blended, more self-paced approach is implemented.

Such issues are less important in private institutions, which are not age-graded, and in self-study contexts. However, even here there may be reasons to proceed cautiously before buying into self-paced approaches. Self-pacing is closely tied to autonomous goal-setting (which I will look at in more detail in another post). Both require a degree of self-awareness at a cognitive and emotional level (McMahon & Oliver, 2001), but not all students have such self-awareness (Magill, 2008). If students do not have the appropriate self-regulatory strategies and are simply left to pace themselves, there is a chance that they will ‘misregulate their learning, exerting control in a misguided or counterproductive fashion and not achieving the desired result’ (Kirschner & van Merriënboer, 2013: 177). Before launching students on a path of self-paced language study, ‘thought needs to be given to the process involved in users becoming aware of themselves and their own understandings’ (McMahon & Oliver, 2001: 1304). Without training and support provided both before and during the self-paced study, the chances of dropping out are high (as we see from the very high attrition rate in language apps).

However well-intentioned, many past attempts to facilitate self-pacing have also suffered from the poor quality of the learning materials. The focus was more on the technology of delivery, and this remains the case today, as many posts on this blog illustrate. Contemporary companies offering language learning programmes show relatively little interest in the content of the learning (take Duolingo as an example). Few app developers show signs of investing in experienced curriculum specialists or materials writers. Glossy photos, contemporary videos, good UX and clever gamification, all of which become dull and repetitive after a while, do not compensate for poorly designed materials.

Over forty years ago, a review of self-paced learning concluded that the evidence on its benefits was inconclusive (Allison, 1975: 5). Nothing has changed since. For some people, in some contexts, for some of the time, self-paced learning may work. Claims that go beyond that cannot be substantiated.


Allison, E. 1975. ‘Self-Paced Instruction: A Review’ The Journal of Economic Education 7 / 1: 5 – 12

Altman, H.B. & Politzer, R.L. (eds.) 1971. Individualizing Foreign Language Instruction: Proceedings of the Stanford Conference, May 6 – 8, 1971. Washington, D.C.: Office of Education, U.S. Department of Health, Education, and Welfare

Chastain, K. 1975. ‘An Examination of the Basic Assumptions of “Individualized” Instruction’ The Modern Language Journal 59 / 7: 334 – 344

Disick, R.S. 1975. Individualizing Language Instruction: Strategies and Methods. New York: Harcourt Brace Jovanovich

Ferster, B. 2014. Teaching Machines. Baltimore: Johns Hopkins University Press

Grittner, F. M. 1975. ‘Individualized Instruction: An Historical Perspective’ The Modern Language Journal 59 / 7: 323 – 333

Hattie, J. 2009. Visible Learning. Abingdon, Oxon.: Routledge

Januszewski, A. 2001. Educational Technology: The Development of a Concept. Englewood, Colorado: Libraries Unlimited

Kirschner, P. A. & van Merriënboer, J. J. G. 2013. ‘Do Learners Really Know Best? Urban Legends in Education’ Educational Psychologist, 48:3, 169-183

Magill, D. S. 2008. ‘What Part of Self-Paced Don’t You Understand?’ University of Wisconsin 24th Annual Conference on Distance Teaching & Learning Conference Proceedings.

McMahon, M. & Oliver, R. 2001. ‘Promoting self-regulated learning in an on-line environment’ in C. Montgomerie & J. Viteli (eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2001 (pp. 1299-1305). Chesapeake, VA: AACE

Pendleton, C. S. 1930. ‘Personalizing English Teaching’ Peabody Journal of Education 7 / 4: 195 – 200

Saettler, P. 1990. The Evolution of American Educational Technology. Denver: Libraries Unlimited

Stern, H.H. 1983. Fundamental Concepts of Language Teaching. Oxford: Oxford University Press

Valdman, A. 1968. ‘Programmed Instruction versus Guided Learning in Foreign Language Acquisition’ Die Unterrichtspraxis / Teaching German 1 / 2: 1 – 14


Every now and then, someone recommends that I take a look at a flashcard app. It’s often interesting to see what developers have done with design, gamification and UX features, but the content is almost invariably awful. Most recently, I was encouraged to look at Word Pash. The screenshots below are from their promotional video.

[Screenshots from the Word Pash promotional video]

The content problems are immediately apparent: an apparently random selection of target items, an apparently random mix of high and low frequency items, unidiomatic language examples, along with definitions and distractors that are less frequent than the target item. I don’t know if these are representative of the rest of the content. The examples seem to come from ‘Stage 1 Level 3’, whatever that means. (My confidence in the product was also damaged by the fact that the Word Pash website includes one testimonial from a certain ‘Janet Reed – Proud Mom’, whose son ‘was able to increase his score and qualify for academic scholarships at major universities’ after using the app. The picture accompanying ‘Janet Reed’ is a free stock image from Pexels and ‘Janet Reed’ is presumably fictional.)

According to the website, ‘WordPash is a free-to-play mobile app game for everyone in the global audience whether you are a 3rd grader or PhD, wordbuff or a student studying for their SATs, foreign student or international business person, you will become addicted to this fast paced word game’. On the basis of the promotional video, the app couldn’t be less appropriate for English language learners. It seems unlikely that it would help anyone improve their ACT or SAT test scores. The suggestion that the vocabulary development needs of 9-year-olds and doctoral students are comparable is pure chutzpah.

The deliberate study of more or less random words may be entertaining, but it’s unlikely to lead to very much in practical terms. For general purposes, the deliberate learning of the highest frequency words, up to about a frequency ranking of #7500, makes sense, because there’s a reasonably high probability that you’ll come across these items again before you’ve forgotten them. Beyond that frequency level, the value of the acquisition of an additional 1000 words tails off very quickly. Adding 1000 words from frequency ranking #8000 to #9000 is likely to result in an increase in lexical understanding of general purpose texts of about 0.2%. When we get to frequency ranks #19,000 to #20,000, the gain in understanding decreases to 0.01%[1]. In other words, deliberate vocabulary learning needs to be targeted. The data is relatively recent, but the principle goes back to at least the middle of the last century when Michael West argued that a principled approach to vocabulary development should be driven by a comparison of the usefulness of a word and its ‘learning cost’[2]. Three hundred years before that, Comenius had articulated something very similar: ‘in compiling vocabularies, my […] concern was to select the words in most frequent use’[3].
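To put rough numbers on the tail-off, here is a minimal sketch in Python. The two percentages are the ones quoted above; everything else (the function names, the 100,000-word text, and the simplifying assumption that all 1000 words in a band are equally frequent) is my own illustration rather than anything taken from Nation's tables.

```python
# Coverage gain per 1000-word band, as quoted in the post (percent of running words).
BAND_GAIN_PERCENT = {
    "#8000-#9000": 0.2,
    "#19,000-#20,000": 0.01,
}


def extra_tokens_understood(text_length: int, gain_percent: float) -> float:
    """Running words in a text of `text_length` tokens that one band accounts for."""
    return text_length * gain_percent / 100


def tokens_between_encounters(gain_percent: float, band_size: int = 1000) -> float:
    """Average running words read before one particular word of the band recurs.

    Assumption (a simplification): every word in the band is equally frequent.
    """
    return band_size / (gain_percent / 100)


if __name__ == "__main__":
    text_length = 100_000  # hypothetical text, roughly the length of a novel
    for band, gain in BAND_GAIN_PERCENT.items():
        covered = round(extra_tokens_understood(text_length, gain))
        spacing = round(tokens_between_encounters(gain))
        print(f"{band}: about {covered} running words per {text_length:,}; "
              f"each word recurs roughly once every {spacing:,} running words")
```

On these assumptions, learning the whole #8000 to #9000 band buys about 200 running words in a 100,000-word text, and any single word from that band turns up roughly once in every 500,000 running words. For the #19,000 to #20,000 band the figures are about 10 running words and once in every 10 million.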

I’ll return to ‘general purposes’ later in this post, but, for now, we should remember that very few language learners actually study a language for general purposes. Globally, the vast majority of English language learners study English in an academic (school) context and their immediate needs are usually exam-specific. For them, general purpose frequency lists are unlikely to be adequate. If they are studying with a coursebook and are going to be tested on the lexical content of that book, they will need to use the wordlist that matches the book. Increasingly, publishers make such lists available and content producers for vocabulary apps like Quizlet and Memrise often use them. Many examinations, both national and international, also have accompanying wordlists. Examples of such lists produced by examination boards include the Cambridge English young learners’ exams (Starters, Movers and Flyers) and Cambridge English Preliminary. Other exams do not have official word lists, but reasonably reliable lists have been produced by third parties. Examples include Cambridge First, IELTS and SAT. There are, in addition, well-researched wordlists for academic English, including the Academic Word List (AWL) and the Academic Vocabulary List (AVL). All of these make sensible starting points for deliberate vocabulary learning.

When we turn to other, out-of-school learners the number of reasons for studying English is huge. Different learners have different lexical needs, and working with a general purpose frequency list may be, at least in part, a waste of time. EFL and ESL learners are likely to have very different needs, as will EFL and ESP learners, as will older and younger learners, learners in different parts of the world, learners who will find themselves in English-speaking countries and those who won’t, etc., etc. For some of these demographics, specialised corpora (from which frequency-based wordlists can be drawn) exist. For most learners, though, the ideal list simply does not exist. Either it will have to be created (requiring a significant amount of time and expertise[4]) or an available best-fit will have to suffice. Paul Nation, in his recent ‘Making and Using Word Lists for Language Learning and Testing’ (John Benjamins, 2016) includes a useful chapter on critiquing wordlists. For anyone interested in better understanding the issues surrounding the development and use of wordlists, three good articles are freely available online. These are:

Lessard-Clouston, M. 2012 / 2013. ‘Word Lists for Vocabulary Learning and Teaching’ The CATESOL Journal 24.1: 287 – 304

Lessard-Clouston, M. 2016. ‘Word lists and vocabulary teaching: options and suggestions’ Cornerstone ESL Conference 2016

Sorell, C. J. 2013. A study of issues and techniques for creating core vocabulary lists for English as an International Language. Doctoral thesis.

But, back to ‘general purposes’ …. Frequency lists are the obvious starting point for preparing a wordlist for deliberate learning, but they are very problematic. Frequency rankings depend on the corpus on which they are based and, since these are different, rankings vary from one list to another. Even drawing on just one corpus, rankings can be a little strange. In the British National Corpus, for example, ‘May’ (the month) is about twice as frequent as ‘August’[5], but we would be foolish to infer from this that the learning of ‘May’ should be prioritised over the learning of ‘August’. An even more striking example from the same corpus is the fact that ‘he’ is about twice as frequent as ‘she’[6]: should, therefore, ‘he’ be learnt before ‘she’?
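The dependence of rankings on the corpus is easy to demonstrate on a toy scale. The sketch below is only an illustration: the two mini-'corpora' are invented, and `frequency_ranking` is a hypothetical helper, not part of any corpus tool.

```python
from collections import Counter
import re


def frequency_ranking(text: str) -> list[str]:
    """Return word forms ordered from most to least frequent."""
    # Lowercasing conflates 'May' (the month) with 'may' (the modal verb):
    # a simplification that a real list compiler would have to think about.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so that ties are stable.
    return sorted(counts, key=lambda w: (-counts[w], w))


# Two invented mini-'corpora' (hypothetical, for illustration only).
news_corpus = "the minister said the vote may pass in May if the house agrees"
chat_corpus = "she said she may come in August if she can and August is hot"

if __name__ == "__main__":
    for name, corpus in [("news", news_corpus), ("chat", chat_corpus)]:
        ranking = frequency_ranking(corpus)
        print(name, "rank of 'may':", ranking.index("may") + 1)
```

The same word form lands near the top of one ranking and near the bottom of the other, simply because of what each collection of texts happens to contain.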

List compilers have to make a number of judgement calls in their work. There is not space here to consider these in detail, but two particularly tricky questions concerning the way that words are chosen may be mentioned: Is a verb like ‘list’, with two different and unrelated meanings, one word or two? Should inflected forms be considered as separate words? The judgements are not usually informed by considerations of learners’ needs. Learners will probably best approach vocabulary development by building their store of word senses: attempting to learn all the meanings and related forms of any given word is unlikely to be either useful or successful.
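A small sketch may make these judgement calls more concrete. It counts the same sentence twice: once treating every inflected form as a separate word, and once folding forms together under a headword. The `LEMMAS` table is a hypothetical, hand-made stand-in for a real lemmatiser, and the example sentence is invented.

```python
from collections import Counter

# Hypothetical, hand-made lemma table; a real compiler would use a full lemmatiser.
LEMMAS = {"lists": "list", "listed": "list", "listing": "list"}


def count_forms(tokens: list[str]) -> Counter:
    """Treat every inflected form as a separate word."""
    return Counter(tokens)


def count_lemmas(tokens: list[str]) -> Counter:
    """Fold inflected forms into their headword before counting."""
    return Counter(LEMMAS.get(token, token) for token in tokens)


tokens = "she listed the lists and the ship lists to port".split()

if __name__ == "__main__":
    print(count_forms(tokens)["lists"], count_forms(tokens)["listed"])
    print(count_lemmas(tokens)["list"])
```

Counted by form, 'lists' appears twice and 'listed' once; counted by headword, 'list' appears three times. Note that the headword count also merges the ship that 'lists' (leans) with the 'lists' that are catalogues, so both of the questions above change the frequency figure that ends up in the ranking.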

Frequency lists, in other words, are not statements of scientific ‘fact’: they are interpretative documents. They have been compiled for descriptive purposes, not as ways of structuring vocabulary learning, and it cannot be assumed they will necessarily be appropriate for a purpose for which they were not designed.

A further major problem concerns the corpus on which the frequency list is based. Large databases, such as the British National Corpus or the Corpus of Contemporary American English, are collections of language used by native speakers in certain parts of the world, usually of a restricted social class. As such, they are of relatively little value to learners who will be using English in contexts that are not covered by the corpus. A context where English is a lingua franca is one such example.

A different kind of corpus is the Cambridge Learner Corpus (CLC), a collection of exam scripts produced by candidates in Cambridge exams. This has led to the development of the English Vocabulary Profile (EVP), where word senses are tagged as corresponding to particular levels in the Common European Framework scale. At first glance, this looks like a good alternative to frequency lists based on native-speaker corpora. But closer consideration reveals many problems. The design of examination tasks inevitably results in the production of language of a very different kind from that produced in other contexts. Many high frequency words simply do not appear in the CLC because it is unlikely that a candidate would use them in an exam. Other items are very frequent in this corpus just because they are likely to be produced in examination tasks. Unsurprisingly, frequency rankings in EVP do not correlate very well with frequency rankings from other corpora. The EVP, then, like other frequency lists, can only serve, at best, as a rough guide for the drawing up of target item vocabulary lists in general purpose apps or coursebooks[7].

There is no easy solution to the problems involved in devising suitable lexical content for the ‘global audience’. Tagging words to levels (i.e. grouping them into frequency bands) will always be problematic, unless very specific user groups are identified. Writers, like myself, of general purpose English language teaching materials are justifiably irritated by some publishers’ insistence on allocating words to levels with numerical values. The policy, taken to extremes (as is increasingly the case), has little to recommend it in linguistic terms. But it’s still a whole lot better than the aleatory content of apps like Word Pash.

[1] See Nation, I.S.P. 2013. Learning Vocabulary in Another Language 2nd edition. (Cambridge: Cambridge University Press) p. 21 for statistical tables. See also Nation, P. & R. Waring 1997. ‘Vocabulary size, text coverage and word lists’ in Schmitt & McCarthy (eds.) 1997. Vocabulary: Description, Acquisition and Pedagogy. (Cambridge: Cambridge University Press) pp. 6 – 19

[2] See Kelly, L.G. 1969. 25 Centuries of Language Teaching. (Rowley, Mass.: Rowley House) p.206 for a discussion of West’s ideas.

[3] Kelly, L.G. 1969. 25 Centuries of Language Teaching. (Rowley, Mass.: Rowley House) p. 184

[4] See Timmis, I. 2015. Corpus Linguistics for ELT (Abingdon: Routledge) for practical advice on doing this.

[5] Nation, I.S.P. 2016. Making and Using Word Lists for Language Learning and Testing. (Amsterdam: John Benjamins) p.58

[6] Taylor, J.R. 2012. The Mental Corpus. (Oxford: Oxford University Press) p.151

[7] For a detailed critique of the limitations of using the CLC as a guide to syllabus design and textbook development, see Swan, M. 2014. ‘A Review of English Profile Studies’ ELTJ 68/1: 89-96

In December last year, I posted a wish list for vocabulary (flashcard) apps. At the time, I hadn’t read a couple of key research texts on the subject. It’s time for an update.

First off, there’s an article called ‘Intentional Vocabulary Learning Using Digital Flashcards’ by Hsiu-Ting Hung. It’s available online here. Given the lack of empirical research into the use of digital flashcards, it’s an important article and well worth a read. Its basic conclusion is that digital flashcards are more effective as a learning tool than printed word lists. No great surprises there, but of more interest, perhaps, are the recommendations that (1) ‘students should be educated about the effective use of flashcards (e.g. the amount and timing of practice), and this can be implemented through explicit strategy instruction in regular language courses or additional study skills workshops’ (Hung, 2015: 111), and (2) that digital flashcards can be usefully ‘repurposed for collaborative learning tasks’ (Hung, ibid.).

However, what really grabbed my attention was an article by Tatsuya Nakata. Nakata’s research is of particular interest to anyone interested in vocabulary learning, but especially so to those with an interest in digital possibilities. A number of his research articles can be freely accessed via his page at ResearchGate, but the one I am interested in is called ‘Computer-assisted second language vocabulary learning in a paired-associate paradigm: a critical investigation of flashcard software’. Don’t let the title put you off. It’s a review of a pile of web-based flashcard programs: since the article is already five years old, many of the programs have either changed or disappeared, but the critical approach he takes is more or less as valid now as it was then (whether we’re talking about web-based stuff or apps).

Nakata divides his criteria for evaluation into two broad groups.

Flashcard creation and editing

(1) Flashcard creation: Can learners create their own flashcards?

(2) Multilingual support: Can the target words and their translations be created in any language?

(3) Multi-word units: Can flashcards be created for multi-word units as well as single words?

(4) Types of information: Can various kinds of information be added to flashcards besides the word meanings (e.g. parts of speech, contexts, or audios)?

(5) Support for data entry: Does the software support data entry by automatically supplying information about lexical items such as meaning, parts of speech, contexts, or frequency information from an internal database or external resources?

(6) Flashcard set: Does the software allow learners to create their own sets of flashcards?

Learning


(1) Presentation mode: Does the software have a presentation mode, where new items are introduced and learners familiarise themselves with them?

(2) Retrieval mode: Does the software have a retrieval mode, which asks learners to recall or choose the L2 word form or its meaning?

(3) Receptive recall: Does the software ask learners to produce the meanings of target words?

(4) Receptive recognition: Does the software ask learners to choose the meanings of target words?

(5) Productive recall: Does the software ask learners to produce the target word forms corresponding to the meanings provided?

(6) Productive recognition: Does the software ask learners to choose the target word forms corresponding to the meanings provided?

(7) Increasing retrieval effort: For a given item, does the software arrange exercises in the order of increasing difficulty?

(8) Generative use: Does the software encourage generative use of words, where learners encounter or use previously met words in novel contexts?

(9) Block size: Can the number of words studied in one learning session be controlled and altered?

(10) Adaptive sequencing: Does the software change the sequencing of items based on learners’ previous performance on individual items?

(11) Expanded rehearsal: Does the software help implement expanded rehearsal, where the intervals between study trials are gradually increased as learning proceeds?

(Nakata, T. (2011): ‘Computer-assisted second language vocabulary learning in a paired-associate paradigm: a critical investigation of flashcard software’ Computer Assisted Language Learning, 24:1, 17-38)

It’s a rather different list from my own (there’s nothing I would disagree with here), because mine is more general and his is exclusively oriented towards learning principles. Nakata makes the point towards the end of the article that it would ‘be useful to investigate learners’ reactions to computer-based flashcards to examine whether they accept flashcard programs developed according to learning principles’ (p. 34). It’s far from clear, he points out, that conformity to learning principles is at the top of learners’ agendas. More than just users’ feelings about computer-based flashcards in general, a key concern will be the fact that there are ‘large individual differences in learners’ perceptions of [any flashcard] program’ (Nakata, T. 2008. ‘English vocabulary learning with word lists, word cards and computers: implications from cognitive psychology research for optimal spaced learning’ ReCALL 20(1), p. 18).

I was trying to make a similar point in another post about motivation and vocabulary apps. In the end, as with any language learning material, research-driven language learning principles can only take us so far. User experience is a far more difficult creature to pin down or to make generalisations about. Users’ reactions to graphics, gamification, uploading time and so on are so powerful and so subjective that learning principles will inevitably play second fiddle. That’s not to say, of course, that Nakata’s questions are not important: it’s merely to wonder whether the bigger question is truly answerable.

Nakata’s research identifies plenty of room for improvement in digital flashcards, and although the article is now quite old, not a lot has changed. Key areas to work on are (1) the provision of generative use of target words, (2) the need to increase retrieval effort, (3) the automatic provision of information about meaning, parts of speech, or contexts (in order to facilitate flashcard creation), and (4) the automatic generation of multiple-choice distractors.

In the conclusion of his study, he identifies one flashcard program which is better than all the others. Unsurprisingly, five years down the line, the software he identifies is no longer free, others have changed more rapidly in the intervening period, and who knows which will be out in front next week?


Ok, let’s be honest here. This post is about teacher training, but ‘development’ sounds more respectful, more humane, more modern. Teacher development (self-initiated, self-evaluated, collaborative and holistic) could be adaptive, but it’s unlikely that anyone will want to spend the money on developing an adaptive teacher development platform any time soon. Teacher training (top-down, pre-determined syllabus and externally evaluated) is another matter. If you’re not too clear about this distinction, see Penny Ur’s article in The Language Teacher.

The main point of adaptive learning tools is to facilitate differentiated instruction. They are, as Pearson’s latest infomercial booklet describes them, ‘educational technologies that can respond to a student’s interactions in real-time by automatically providing the student with individual support’. Differentiation or personalization (or whatever you call it) is, as I’ve written before, the declared goal of almost everyone in educational power these days. What exactly it is may be open to question (see Michael Feldstein’s excellent article), as may be the question of whether or not it is actually such a desideratum (see, for example, this article). But, for the sake of argument, let’s agree that it’s mostly better than one-size-fits-all.

Teachers around the world are being encouraged to adopt a differentiated approach with their students, and they are being encouraged to use technology to do so. It is technology that can help create ‘robust personalized learning environments’ (says the White House). Differentiation for language learners could be facilitated by ‘social networking systems, podcasts, wikis, blogs, encyclopedias, online dictionaries, webinars, online English courses,’ etc. (see Alexandra Chistyakova’s post on eltdiary).

But here’s the crux. If we want teachers to adopt a differentiated approach, they really need to have experienced it themselves in their training. An interesting post on edweek sums this up: If professional development is supposed to lead to better pedagogy that will improve student learning AND we are all in agreement that modeling behaviors is the best way to show people how to do something, THEN why not ensure all professional learning opportunities exhibit the qualities we want classroom teachers to have?

Differentiated teacher development / training is rare. According to the Center for Public Education’s Teaching the Teachers report, almost all teachers participate in ‘professional development’ (PD) throughout the year. However, a majority of those teachers find the PD in which they participate ineffective. Typically, the development is characterised by ‘drive-by’ workshops, one-size-fits-all presentations, ‘been there, done that’ topics, little or no modelling of what is being taught, a focus on rotating fads and a lack of follow-up. This report is not specifically about English language teachers, but it will resonate with many who are working in English language teaching around the world.

The promotion of differentiated teacher development is gaining traction: see here or here, for example, or read Cindy A. Strickland’s ‘Professional Development for Differentiating Instruction’.

Remember, though, that it’s really training, rather than development, that we’re talking about. After all, if one of the objectives is to equip teachers with a skills set that will enable them to become more effective instructors of differentiated learning, this is most definitely ‘training’ (notice the transitivity of the verbs ‘enable’ and ‘equip’!). In this context, a necessary starting point will be some sort of ‘knowledge graph’ (which I’ve written about here). For language teachers, these already exist, including the European Profiling Grid, the Eaquals Framework for Language Teacher Training and Development, the Cambridge English Teaching Framework and the British Council’s Continuing Professional Development Framework (CPD) for Teachers. We can expect these to become more refined and more granularised, and a partial move in this direction is the Cambridge English Digital Framework for Teachers. Once a knowledge graph is in place, the next step will be to tag particular pieces of teacher training content (e.g. webinars, tasks, readings, etc.) to locations in the framework that is being used. It would not be too complicated to engineer dynamic frameworks which could be adapted to individual or institutional needs.

This process will be facilitated by the fact that teacher training content is already being increasingly granularised. Whether it’s an MA in TESOL or a shorter, more practically oriented course, things are getting more and more bite-sized, with credits being awarded to these short bites, as course providers face stiffer competition and respond to market demands.

Classroom practice could also form part of such an adaptive system. One tool that could be deployed would be Visible Classroom, an automated system for providing real-time evaluative feedback for teachers. There is an ‘online dashboard providing teachers with visual information about their teaching for each lesson in real-time. This includes proportion of teacher talk to student talk, number and type of questions, and their talking speed.’ John Hattie, who is behind this project, says that teachers ‘account for about 30% of the variance in student achievement and [are] the largest influence outside of individual student effort.’ Teacher development with a tool like Visible Classroom is ultimately all about measuring teacher performance (against a set of best-practice benchmarks identified by Hattie’s research) in order to improve the learning outcomes of the students.

You may have noticed the direction in which this part of this blog post is going. I began by talking about social networking systems, podcasts, wikis, blogs and so on, and just now I’ve mentioned the summative, credit-bearing possibilities of an adaptive teacher development training programme. It’s a tension that is difficult to resolve. There’s always a paradox in telling anyone that they are going to embark on a self-directed course of professional development. Whoever pays the piper calls the tune and, if an institution decides that it is worth investing significant amounts of money in teacher development, they will want a return for their money. The need for truly personalised teacher development is likely to be overridden by the more pressing need for accountability, which, in turn, typically presupposes pre-determined course outcomes, which can be measured in some way … so that quality (and cost-effectiveness and so on) can be evaluated.

Finally, it’s worth asking if language teaching (any more than language learning) can be broken down into small parts that can be synthesized later into a meaningful and valuable whole. Certainly, there are some aspects of language teaching (such as the ability to use a dashboard on an LMS) which lend themselves to granularisation. But there’s a real danger of losing sight of the forest of teaching if we focus on the individual trees that can be studied and measured.

If you’re going to teach vocabulary, you need to organise it in some way. Almost invariably, this organisation is topical, with words grouped into what are called semantic sets. In coursebooks, the example below (from Rogers, M., Taylore-Knowles, J. & S. Taylor-Knowles. 2010. Open Mind Level 1. London: Macmillan, p.68) is fairly typical.

[Image: vocabulary section from Open Mind Level 1, p. 68]

Coursebooks are almost always organised in a topical way. The example above comes in a unit (of 10 pages), entitled ‘You have talent!’, which contains two main vocabulary sections. It’s unsurprising to find a section called ‘personality adjectives’ in such a unit. What’s more, such an approach lends itself to the requisite, but largely spurious, ‘can-do’ statement in the self-evaluation section: I can talk about people’s positive qualities. We must have clearly identifiable learning outcomes, after all.

There is, undeniably, a certain intuitive logic in this approach. An alternative might entail a radical overhaul of coursebook architecture – this might not be such a bad thing, but might not go down too well in the markets. How else, after all, could the vocabulary strand of the syllabus be organised?

Well, there are a number of ways in which a vocabulary syllabus could be organised. Including the standard approach described above, here are four possibilities:

1 semantic sets (e.g. bee, butterfly, fly, mosquito, etc.)

2 thematic sets (e.g. ‘pets’: cat, hate, flea, feed, scratch, etc.)

3 unrelated sets

4 sets determined by a group of words’ occurrence in a particular text

Before reading further, you might like to guess what research has to say about the relative effectiveness of these four approaches.

The answer depends, to some extent, on the level of the learner. For advanced learners, it appears to make no, or little, difference (Al-Jabri, 2005, cited by Ellis & Shintani, 2014: 106). But, for the vast majority of English language learners (i.e. those at or below B2 level), the research is clear: the most effective way of organising vocabulary items to be learnt is by grouping them into thematic sets (2) or by mixing words together in a semantically unrelated way (3) – not by teaching sets like ‘personality adjectives’. It is surprising how surprising this finding is to so many teachers and materials writers. It goes back at least to 1988 and West’s article on ‘Catenizing’ in ELTJ, which argued that semantic grouping made little sense from a psycho-linguistic perspective. Since then, a large amount of research has taken place. This is succinctly summarised by Paul Nation (2013: 128) in the following terms: Avoid interference from related words. Words which are similar in form (Laufer, 1989) or meaning (Higa, 1963; Nation, 2000; Tinkham, 1993; Tinkham, 1997; Waring, 1997) are more difficult to learn together than they are to learn separately. For anyone who is interested, the most up-to-date review of this research that I can find is in chapter 11 of Barcroft (2015).

The message is clear. So clear that you have to wonder how it is not getting through to materials designers. Perhaps, coursebooks are different. They regularly eschew research findings for commercial reasons. But vocabulary apps? There is rarely, if ever, any pressure on the content-creation side of vocabulary apps (except those that are tied to coursebooks) to follow the popular misconceptions that characterise so many coursebooks. It wouldn’t be too hard to organise vocabulary into thematic sets (like, for example, the approach in the A2 level of Memrise German that I’m currently using). Is it simply because the developers of so many vocabulary apps just don’t know much about language learning?


Barcroft, J. 2015. Lexical Input Processing and Vocabulary Learning. Amsterdam: John Benjamins

Nation, I. S. P. 2013. Learning Vocabulary in Another Language 2nd edition. Cambridge: Cambridge University Press

Ellis, R. & N. Shintani. 2014. Exploring Language Pedagogy through Second Language Acquisition Research. Abingdon, Oxon: Routledge

West, M. 1988. ‘Catenizing’ English Language Teaching Journal 6: 147 – 151

Having spent a lot of time recently looking at vocabulary apps, I decided to put together a Christmas wish list of the features of my ideal vocabulary app. The list is not exhaustive and I’ve given more attention to some features than others. What (apart from testing) have I missed out?

1 Spaced repetition

Since the point of a vocabulary app is to help learners memorise vocabulary items, it is hard to imagine a decent system that does not incorporate spaced repetition. Spaced repetition algorithms offer one well-researched way of improving the brain’s ‘forgetting curve’. These algorithms come in different shapes and sizes, and I am not technically competent to judge which is the most efficient. However, as Peter Ellis Jones, the developer of a flashcard system called CardFlash, points out, efficiency is only one half of the rote memorisation problem. If you are not motivated to learn, the cleverness of the algorithm is moot. Fundamentally, learning software needs to be fun, rewarding, and give a solid sense of progression.
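For readers who want to see what such an algorithm looks like in its barest form, here is a minimal sketch in Python. It is an illustration only, not the algorithm of any app mentioned in this post: the `Card` structure, the doubling factor and the rule of resetting to one day after a wrong answer are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    """One flashcard and its current review interval (hypothetical structure)."""
    word: str
    interval_days: int = 1
    due: date = date(2015, 1, 1)


def review(card: Card, recalled: bool, today: date, factor: float = 2.0) -> Card:
    """Reschedule a card after one review.

    Assumption: a correct answer multiplies the interval by `factor`,
    while a wrong answer resets it to one day. Real algorithms use
    more information than this (e.g. how hard the recall felt).
    """
    if recalled:
        card.interval_days = max(1, round(card.interval_days * factor))
    else:
        card.interval_days = 1
    card.due = today + timedelta(days=card.interval_days)
    return card
```

With these assumptions, a word that is recalled correctly every time comes back after 2, 4, 8, 16 ... days, and a single failure sends it back to the start. Whether that is anywhere near optimal is exactly the question I am not competent to answer.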

2 Quantity, balance and timing of new and ‘old’ items

A spaced repetition algorithm determines the optimum interval between repetitions, but further algorithms will be needed to determine when and with what frequency new items will be added to the deck. Once a system knows how many items a learner needs to learn and the time in which they have to do it, it is possible to determine the timing and frequency of the presentation of new items. But the system cannot know in advance how well an individual learner will learn the items (for any individual, some items will be more readily learnable than others) nor the extent to which learners will live up to their own positive expectations of time spent on-app. As most users of flashcard systems know, it is easy to fall behind, feel swamped and, ultimately, give up. An intelligent system needs to be able to respond to individual variables in order to ensure that the learning load is realistic.
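One very simple way of keeping the load realistic can be sketched as follows. The function and all of its parameters are hypothetical: it assumes that the system knows how many new items remain, how many days are left, how many reviews are due today and roughly how many cards the learner actually gets through in a day, and that reviews of old items always take priority.

```python
def new_items_today(remaining_new: int, days_left: int,
                    reviews_due: int, daily_capacity: int) -> int:
    """Return how many new items to introduce today.

    Assumptions: `daily_capacity` is the number of cards the learner
    realistically completes in a day, and reviews of old items always
    come before new ones.
    """
    if remaining_new <= 0 or days_left <= 0:
        return 0
    # Ceiling division: the even pace needed to finish on time.
    target_pace = -(-remaining_new // days_left)
    # Whatever capacity is left once today's reviews are done.
    spare = max(0, daily_capacity - reviews_due)
    return min(target_pace, spare, remaining_new)
```

A learner with 300 items to cover in 30 days and room for 40 cards a day would get 10 new items on a day with 20 reviews due, but none at all on a day when 50 reviews have piled up. A genuinely intelligent system would also need to adjust `daily_capacity` itself from observed behaviour, which this sketch does not attempt.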

3 Task variety

A standard flashcard system which simply asks learners to indicate whether they ‘know’ a target item before they flip over the card rapidly becomes extremely boring. A system which tests this knowledge soon becomes equally dull. There needs to be a variety of ways in which learners interact with an app, both for reasons of motivation and learning efficiency. It may be the case that, for an individual user, certain task types lead to more rapid gains in learning. An intelligent, adaptive system should be able to capture this information and modify the selection of task types.

Most younger learners and some adult learners will respond well to the inclusion of games within the range of task types. Examples of such games include the puzzles developed by Oliver Rose in his Phrase Maze app to accompany Quizlet practice.

4 Generative use

Memory researchers have long known about the ‘Generation Effect’ (see for example this piece of research from the Journal of Verbal Learning and Verbal Behavior, 1978). Items are better learnt when the learner has to generate, in some (even small) way, the target item, rather than simply reading it. In vocabulary learning, this could be, for example, typing in the target word or, more simply, inserting some missing letters. Systems which incorporate task types that require generative use are likely to result in greater learning gains than simple, static flashcards with target items on one side and definitions or translations on the other.
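The ‘missing letters’ task is about the simplest thing of this kind to build. The sketch below is a hypothetical helper, not taken from any existing app: it blanks out a couple of inner letters and, as a design assumption, always keeps the first and last letters as a cue.

```python
import random
from typing import Optional


def gap_word(word: str, n_gaps: int = 2, seed: Optional[int] = None) -> str:
    """Replace up to `n_gaps` inner letters of `word` with underscores.

    The first and last letters are always kept so that the learner
    has something to go on. `seed` makes the choice repeatable.
    """
    rng = random.Random(seed)
    inner = list(range(1, len(word) - 1))
    hidden = set(rng.sample(inner, min(n_gaps, len(inner))))
    return "".join("_" if i in hidden else ch for i, ch in enumerate(word))
```

Calling `gap_word("remember")` returns the word with two of its six inner letters replaced by underscores. Very short words such as ‘to’ come back unchanged, so a real system would need a different task type for those.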

5 Receptive and productive practice

The most basic digital flashcard systems require learners to understand a target item, or to generate it from a definition or translation prompt. Valuable as this may be, it won’t help learners much to use these items productively, since these systems focus exclusively on meaning. In order to do this, information must be provided about collocation, colligation, register, etc and these aspects of word knowledge will need to be focused on within the range of task types. At the same time, most vocabulary apps that I have seen focus primarily on the written word. Although any good system will offer an audio recording of the target item, and many will offer the learner the option of recording themselves, learners are invariably asked to type in their answers, rather than say them. For the latter, speech recognition technology will be needed. Ideally, too, an intelligent system will compare learner recordings with the audio models and provide feedback in such a way that the learner is guided towards a closer reproduction of the model.

6 Scaffolding and feedback

Most flashcard systems are basically low-stakes, practice self-testing. Research (see, for example, Dunlosky et al’s metastudy ‘Improving Students’ Learning With Effective Learning Techniques: Promising Directions From Cognitive and Educational Psychology’) suggests that, as a learning strategy, practice testing has high utility – indeed, higher utility than other strategies like keyword mnemonics or highlighting. However, an element of tutoring is likely to enhance practice testing, and, for this, scaffolding and feedback will be needed. If, for example, a learner is unable to produce a correct answer, they will probably benefit from being guided towards it through hints, in the same way as a teacher would elicit in a classroom. Likewise, feedback on why an answer is wrong (as opposed to simply being told that you are wrong), followed by encouragement to try again, is likely to enhance learning. Such feedback might, for example, point out that there is perhaps a spelling problem in the learner’s attempted answer, that the attempted answer is in the wrong part of speech, or that it is semantically close to the correct answer but does not collocate with other words in the text. The incorporation of intelligent feedback of this kind will require a number of NLP tools, since it will never be possible for a human item-writer to anticipate all the possible incorrect answers. A current example of intelligent feedback of this kind can be found in the Oxford English Vocabulary Trainer app.
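The spelling case is the easiest of these to approximate without serious NLP. The following is a rough sketch using Python’s standard `difflib` module. The 0.8 similarity threshold and the wording of the messages are assumptions of mine, and the part-of-speech and collocation cases mentioned above are well beyond what it can do.

```python
from difflib import SequenceMatcher


def feedback(attempt: str, target: str, threshold: float = 0.8) -> str:
    """Return a short feedback message for a typed answer.

    Assumption: a similarity ratio at or above `threshold` means the
    learner probably knows the word but has misspelt it. `target` is
    assumed to be a non-empty string.
    """
    a, t = attempt.strip().lower(), target.strip().lower()
    if a == t:
        return "Correct."
    ratio = SequenceMatcher(None, a, t).ratio()
    if ratio >= threshold:
        return "Nearly: check your spelling and try again."
    return f"Not quite. Hint: the word starts with '{t[0]}'."
```

With these settings, ‘goverment’ for ‘government’ is treated as a spelling slip and the learner is invited to try again, whereas ‘state’ gets a first-letter hint instead.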

7 Content

At the very least, a decent vocabulary app will need good definitions and translations (how many different languages?), and these will need to be tagged to the senses of the target items. These will need to be supplemented with all the other information that you find in a good learner’s dictionary: syntactic patterns, collocations, cognates, an indication of frequency, etc. The only way of getting this kind of high-quality content is by paying to license it from a company with expertise in lexicography. It doesn’t come cheap.

There will also need to be example sentences, both to illustrate meaning / use and for deployment in tasks. Dictionary databases can provide some of these, but they cannot be relied on as a source. This is because the example sentences in dictionaries have been selected and edited to accompany the other information provided in the dictionary, and not as items in practice exercises, which have rather different requirements. Once more, the solution doesn’t come cheap: experienced item writers will be needed.

Dictionaries describe and illustrate how words are typically used. But examples of typical usage tend to be as dull as they are forgettable. Learning is likely to be enhanced if examples are cognitively salient: weird examples with odd collocations, for example. Another thing for the item writers to think about.

A further challenge for an app which is not level-specific is that both the definitions and example sentences need to be level-specific. An A1 / A2 learner will need the kind of content that is found in, say, the Oxford Essential dictionary; B2 learners and above will need content from, say, the OALD.

8 Artwork and design

It’s easy enough to find artwork or photos of concrete nouns, but try to find or commission a pair of pictures that differentiate, for example, the adjectives ‘wild’ and ‘dangerous’ … What kind of pictures might illustrate simple verbs like ‘learn’ or ‘remember’? Will such illustrations be clear enough when squeezed into a part of a phone screen? Animations or very short video clips might provide a solution in some cases, but these are more expensive to produce and video files are much heavier.

With a few notable exceptions, such as the British Council’s MyWordBook 2, design in vocabulary apps has been largely forgotten.

9 Importable and personalisable lists

Many learners will want to use a vocabulary app in association with other course material (e.g. coursebooks). Teachers, however, will inevitably want to edit these lists, deleting some items, adding others. Learners will want to do the same. This is a huge headache for app designers. If new items are going to be added to word lists, how will the definitions, example sentences and illustrations be generated? Will the database contain audio recordings of these words? How will these items be added to the practice tasks (if these include task types that go beyond simple double-sided flashcards)? NLP tools are not yet good enough to trawl a large corpus in order to select (and possibly edit) sentences that illustrate the right meaning and which are appropriate for interactive practice exercises. We can personalise the speed of learning and even the types of learning tasks, so long as the target language is predetermined. But as soon as we allow for personalisation of content, we run into difficulties.

10 Gamification

Maintaining motivation to use a vocabulary app is not easy. Gamification may help. Measuring progress against objectives will be a start. Stars and badges and leaderboards may help some users. Rewards may help others. But gamification features need to be built into the heart of the system, into the design and selection of tasks, rather than simply tacked on as an afterthought. They need to be trialled and tweaked, so analytics will be needed.

11 Teacher support

Although the use of vocabulary flashcards is beginning to catch on with English language teachers, teachers need help with ways to incorporate them in the work they do with their students. What can teachers do in class to encourage use of the app? In what ways does app use require teachers to change their approach to vocabulary work in the classroom? Reporting functions can help teachers know about the progress their students are making and provide very detailed information about words that are causing problems. But, as anyone involved in platform-based course materials knows, teachers need a lot of help.

12 And, of course, …

Apps need to be usable with different operating systems. Ideally, they should be (partially) usable offline. Loading times need to be short. They need to be easy and intuitive to use.

It’s unlikely that I’ll be seeing a vocabulary app with all of these features any time soon. Or, possibly, ever. The cost of developing something that could do all this would be extremely high, and there is no indication that there is a market that would be ready to pay the sort of prices that would be needed to cover the costs of development and turn a profit. We need to bear in mind, too, the fact that vocabulary apps can only ever assist in the initial acquisition of vocabulary: apps alone can’t solve the vocabulary learning problem (despite the silly claims of some app developers). The need for meaningful communicative use, extensive reading and listening, will not go away because a learner has been using an app. So, how far can we go in developing better and better vocabulary apps before users decide that a cheap / free app, with all its shortcomings, is actually good enough?

I posted a follow-up to this post in October 2016.

VocApp – a review

Posted: October 28, 2015 in apps

Go to an app store and you’ll find a number of unrelated products called VocApp. One of them, from a Polish-based outfit, has the url. From over 30 products in the catalogue, I selected the free ‘Top 1000 English Words’: this is, after all, the showcase app which will show you how fast and easy you can learn with us (sic). VocApp Founder, Marcin Młodzki, writes that learning languages and mobile devices are my two greatest passions. Unfortunately there wasn’t any language app on the market which satisfied me in 100% (or even in 70%…). Anki, Babel, DuoLingo, Memorize, Quizlet – each of them has some serious disadvantages. So I decided to create my own app. Prof. Ewa Lajer-Burchardt of Harvard University says it’s undoubtedly one of the best flashcard applications for learning foreign languages on the educational market. This is presumably the eminent Ewa Lajer-Burcharth, a Polish art historian and author of Necklines: The Art of Jacques-Louis David After the Terror. So, how does the app stand up? Will users raise their understanding up to 83%? I was impatient to find out.

It’s a flashcard system with spaced repetition. This particular app has target items and audio recordings on one side of the flashcard, definitions in English, along with illustrations, on the other. It is, the makers say, multisensory. Users are then given two self-evaluation options.


And that, I’m afraid, is about all there is to say. Apart, that is, from the content. Many of the definitions have been culled from Wiktionary, not perhaps the best source of definitions for A1 / A2 learners. Others appear to have been made up in-house. Here is an opportunity to raise your own understanding by up to 83%. Look at the VocApp definitions below and see if you can guess what the target word is (answers below[i]).

1 a piece of a whole

2 a) a kind of box b) a formal word for a situation

3 something people do every day e.g. from 10 o’clock to 4 o’clock to get money

4 a group of people who deal with politics and who give new rules

5 when we are born our life begins, when we die our life comes to an end.

6 an object

7 a) where the cars drive b) a method of doing something

8 The place where we live, not only the Earth, everything which exists; ‘world’ is a general world

9 a location of something

10 a) 24 hours b) when the sun is up, not night

Sorry, Marcin. I’m afraid your app didn’t satisfy me in 100% (or even in 70%…).

[i] Answers: 1 part 2 case 3 work 4 government 5 life 6 thing 7 way 8 world 9 place 10 day

MosaLingua (with the obligatory capital letter in the middle) is a vocabulary app, available for iOS and Android. There are packages for a number of languages and English variations include general English, business English, vocabulary for TOEFL and vocabulary for TOEIC. The company follows the freemium model, with free ‘Lite’ versions and fuller content selling for €4.99. I tried the ‘Lite’ general English app, opting for French as my first language. Since the app is translation-based, you need to have one of the language pairings that are on offer (the other languages are currently Italian, Spanish, Portuguese and German).

The app I looked at is basically a phrase book with spaced repetition. Even though this particular app was general English, it appeared to be geared towards the casual business traveller. It uses the same algorithm as Anki, and users are taken through a sequence of (1) listening to an audio recording of the target item (word or phrase) along with the possibility of comparing a recording of yourself with the recording provided, (2) standard bilingual flashcard practice, (3) a practice stage where you are given the word or phrase in your own language and you have to unscramble words or letters to form the equivalent in English, and (4) a self-evaluation stage where users select from one of four options (“review”, “hard”, “good”, “perfect”) where the choice made will influence the re-presentation of the item within the spaced repetition.

In addition to these words and phrases, there are a number of dialogues where you (1) listen to the dialogue (‘without worrying about understanding everything’), (2) are re-exposed to the dialogue with English subtitles, (3) see it again with subtitles in your own language, (4) practise it with standard flashcards.

The developers seem to be proud of their Mosa Learning Method®: they’ve registered this as a trademark. At its heart is spaced repetition. This is supplemented by what they refer to as ‘Active Recall’, the notion that things are better memorised if the learner has to make some sort of cognitive effort, however minimal, in recalling the target items. The principle is, at least to me, unquestionable, but the realisation (unjumbling words or letters) becomes rather repetitive and, ultimately, tedious. Then, there is what they call ‘metacognition’. Again, this is informed by research, even if the realisation (self-evaluation of learning difficulty into four levels) is extremely limited. Then there is the Pareto principle – the 80-20 rule. I couldn’t understand the explanation of what this has to do with the trademarked method. Here’s the MosaLingua explanation – figure it out for yourself:

Did you know that the 100 most common words in English account for half of the written corpus?

Evidently, you shouldn’t quit after learning only 100 words. Instead, you should concentrate on the most frequently used words and you’ll make spectacular progress. What’s more, globish (global English) has shown that it’s possible to express yourself using only 1500 well-chosen words (which would take less than 3 months with only 10 minutes per day with MosaLingua). Once you’ve acquired this base, MosaLingua proposes specialized vocabulary suited to your needs (the application has over 3000 words).
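A claim like the one about the 100 most common words is, at least, easy to check against any corpus you have to hand. Here is a minimal sketch in Python: the `coverage` function is a hypothetical helper, it assumes the text has already been split into word tokens, and it says nothing about whether the MosaLingua figure is right for any particular corpus.

```python
from collections import Counter


def coverage(tokens, top_n):
    """Share of all running words accounted for by the `top_n`
    most frequent word types (case is ignored)."""
    if not tokens:
        return 0.0
    counts = Counter(t.lower() for t in tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)
```

In the invented ten-word sentence ‘the cat sat on the mat and the dog sat’, the two most frequent words (‘the’ and ‘sat’) account for five of the ten running words, so `coverage(tokens, 2)` gives 0.5.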

Finally, there’s some stuff about motivation and learner psychology. This boils down to: ‘That’s why we offer free learning help via email, presenting the Web’s best resources, as well as tips through bonus material or the learning community on the MosaLingua blog. We’ll give you all the tools you need to develop your own personalized learning method that is adapted to your needs.’ Some of these tips are not at all bad, but there’s precious little in the way of gamification or other forms of easy motivation.

In short, it’s all reasonably respectable, despite the predilection for sciency language in the marketing blurb. But what really differentiates this product from Anki, as the founder, Samuel Michelot, points out, is the content. MosaLingua has lists of vocabulary and phrases that were created by professors. The word ‘professors’ set my alarm bells ringing, and I wasn’t overly reassured when all I could find out about these ‘professors’ was the information about the MosaLingua team.

Despite what some people claim, content is, actually, rather important when it comes to language learning. I’ll leave you with some examples of MosaLingua content (one dialogue and a selection of words / phrases organised by level) and you can make up your own mind.


Hi there, have a seat. What seems to be the problem?

I haven’t been feeling well since this morning. I have a very bad headache and I feel sick.

Do you feel tired? Have you had cold sweats?

Yes, I’m very tired and have had cold sweats. I have been feeling like that since this morning.

Have you been out in the sun?

Yes, this morning I was at the beach with my friends for a couple hours.

OK, it’s nothing serious. It’s just a bad case of sunstroke. You must drink lots of water and rest. I’ll prescribe you something for the headache and some after sun lotion.

Great, thank you, doctor. Bye.

You’re welcome. Bye.

Level 1: could you help me, I would like a …, I need to …, I don’t know, it’s okay, I (don’t) agree, do you speak English, to drink, to sleep, bank, I’m going to call the police

Level 2: I’m French, cheers, can you please repeat that, excuse me how can I get to …, map, turn left, corner, far (from), distance, thief, can you tell me where I can find …

Level 3: what does … mean, I’m learning English, excuse my English, famous, there, here, until, block, from, to turn, street corner, bar, nightclub, I have to be at the airport tomorrow morning

Level 4: OK, I’m thirty (years old), I love this country, how do you say …, what is it, it’s a bit like …, it’s a sort of …, it’s as small / big as …, is it far, where are we, where are we going, welcome, thanks but I can’t, how long have you been here, is this your first trip to England, take care, district / neighbourhood, in front (of)

Level 5: of course, can I ask you a question, you speak very well, I can’t find the way, David this is Julia, we meet at last, I would love to, where do you want to go, maybe another day, I’ll miss you, leave me alone, don’t touch me, what’s you email

Level 6: I’m here on a business trip, I came with some friends, where are the nightclubs, I feel like going to a bar, I can pick you up at your house, let’s go to see a movie, we had a lot of fun, come again, thanks for the invitation

Back in December 2013, in an interview with eltjam, David Liu, COO of the adaptive learning company, Knewton, described how his company’s data analysis could help ELT publishers ‘create more effective learning materials’. He focused on what he calls ‘content efficacy[i]’ (he uses the word ‘efficacy’ five times in the interview), a term which he explains below:

A good example is when we look at the knowledge graph of our partners, which is a map of how concepts relate to other concepts and prerequisites within their product. There may be two or three prerequisites identified in a knowledge graph that a student needs to learn in order to understand a next concept. And when we have hundreds of thousands of students progressing through a course, we begin to understand the efficacy of those said prerequisites, which quite frankly were made by an author or set of authors. In most cases they’re quite good because these authors are actually good in what they do. But in a lot of cases we may find that one of those prerequisites actually is not necessary, and not proven to be useful in achieving true learning or understanding of the current concept that you’re trying to learn. This is interesting information that can be brought back to the publisher as they do revisions, as they actually begin to look at the content as a whole.

One commenter on the post, Tom Ewens, found the idea interesting. It could, potentially, he wrote, give us new insights into how languages are learned much in the same way as how corpora have given us new insights into how language is used. Did Knewton have any plans to disseminate the information publicly, he asked. His question remains unanswered.

At the time, Knewton had just raised $51 million (bringing their total venture capital funding to over $105 million). Now, 16 months later, Knewton have launched their new product, which they are calling Knewton Content Insights. They describe it as ‘the world’s first and only web-based engine to automatically extract statistics comparing the relative quality of content items – enabling us to infer more information about student proficiency and content performance than ever before possible’.

The software analyses particular exercises within the learning content (and particular items within them). It measures the relative difficulty of individual items by, for example, analysing how often a question is answered incorrectly and how many tries it takes each student to answer correctly. It also looks at what they call ‘exhaustion’ – how much content students are using in a particular area – and whether they run out of content. The software can correlate difficulty with exhaustion. Lastly, it analyses what they call ‘assessment quality’ – how well individual questions assess a student’s understanding of a topic.
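To give a sense of how modest the underlying arithmetic can be, here is a rough sketch of the two difficulty measures just mentioned: how often an item is answered incorrectly, and how many tries a student needs before getting it right. This is not Knewton's code or method, which they have not published. The `item_stats` function and the shape of the attempt log are my own assumptions.

```python
from collections import defaultdict


def item_stats(attempts):
    """Summarise item difficulty from an attempt log.

    attempts: iterable of (student, item, correct) tuples in chronological
    order. This log format is hypothetical, chosen only for illustration.
    """
    total = defaultdict(int)   # item -> number of answers given
    wrong = defaultdict(int)   # item -> number of incorrect answers
    tries = defaultdict(int)   # (student, item) -> tries made before solving
    solved = {}                # (student, item) -> tries needed for first correct answer

    for student, item, correct in attempts:
        total[item] += 1
        if not correct:
            wrong[item] += 1
        key = (student, item)
        if key in solved:
            continue           # ignore repeats after the first correct answer
        tries[key] += 1
        if correct:
            solved[key] = tries[key]

    stats = {}
    for item in total:
        needed = [n for (_, i), n in solved.items() if i == item]
        stats[item] = {
            "error_rate": wrong[item] / total[item],
            # None when no student has answered the item correctly yet
            "mean_tries": sum(needed) / len(needed) if needed else None,
        }
    return stats
```

Fed a log of answers, the function returns, for each item, the proportion of wrong answers and the average number of tries to a first correct answer. Whether numbers like these say anything useful about language learning is, of course, the question that matters.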

Knewton’s approach is premised on the idea that learning (in this case language learning) can be broken down into knowledge graphs, in which the information that needs to be learned can be arranged and presented hierarchically. The ‘granular’ concepts are then ‘delivered’ to the learner, and Knewton’s software can optimise the delivery. The first problem, as I explored in a previous post, is that language is a messy, complex system: it doesn’t lend itself terribly well to granularisation. The second problem is that language learning does not proceed in a linear, hierarchical way: it is also messy and complex. The third is that ‘language learning content’ cannot simply be delivered: a process of mediation is unavoidable. Are the people at Knewton unaware of the extensive literature devoted to the differences between synthetic and analytic syllabuses, of the differences between product-oriented and process-oriented approaches? It would seem so.

Knewton’s ‘Content Insights’ can only, at best, provide some sort of insight into the ‘language knowledge’ part of any learning content. It can say nothing about the work that learners do to practise language skills, since these are not susceptible to granularisation: you simply can’t take a piece of material that focuses on reading or listening and analyse its ‘content efficacy at the concept level’. Because of this, I predicted (in the post about Knowledge Graphs) that the likely focus of Knewton’s analytics would be discrete item, sentence-level grammar (typically tenses). It turns out that I was right.

Knewton illustrate their new product with screen shots such as those below.

They give a specific example of the sort of questions their software can answer. It is: do students generally find the present simple tense easier to understand than the present perfect tense? Doh!

It may be the case that Knewton Content Insights can optimise the presentation of this kind of grammar, but optimisation of this presentation and practice is highly unlikely to have any impact on the rate of language acquisition. Students are typically required to study the present perfect at every level from ‘elementary’ upwards. They have to do this, but not because the presentation in, say, Headway is not optimised. What they need is to spend a significantly greater proportion of their time on ‘language use’ and less on ‘language knowledge’. This is not just my personal view: it has been extensively researched, and I am unaware of any dissenting voices.

The number-crunching in Knewton Content Insights is unlikely, therefore, to lead to any actionable insights. It is, however, very likely to lead (as writer colleagues at Pearson and other publishers are finding out) to an obsession with measuring the ‘efficacy’ of material which, quite simply, cannot meaningfully be measured in this way. It is likely to distract from much more pressing issues, notably the question of how we can move further and faster away from peddling sentence-level, discrete-item grammar.

In the long run, it is reasonable to predict that the attempt to optimise the delivery of language knowledge will come to be seen as an attempt to tackle the wrong question. It will make no significant difference to language learners and language learning. In the short term, how much time and money will be wasted?

[i] ‘Efficacy’ is the buzzword around which Pearson has built its materials creation strategy, a strategy which was launched around the same time as this interview. Pearson is a major investor in Knewton.

‘Sticky’ – as in ‘sticky learning’ or ‘sticky content’ (as opposed to ‘sticky fingers’ or a ‘sticky problem’) – is itself fast becoming a sticky word. If you check out ‘sticky learning’ on Google Trends, you’ll see that it suddenly spiked in September 2011, following the slightly earlier appearance of ‘sticky content’. The historical rise in this use of the word coincides with the exponential growth in the number of references to ‘big data’.

I am often asked if adaptive learning really will take off as a big thing in language learning. Will adaptivity itself be a sticky idea? When the question is asked, people mean the big data variety of adaptive learning, rather than the much more limited adaptivity of spaced repetition algorithms, which, I think, is firmly here and here to stay. I can’t answer the question with any confidence, but I recently came across a book which suggests a useful way of approaching the question.

‘From the Ivory Tower to the Schoolhouse’ by Jack Schneider (Harvard Education Press, 2014) investigates the reasons why promising ideas from education research fail to get taken up by practitioners, and why other, less-than-promising ideas, from a research or theoretical perspective, become sticky quite quickly. As an example of the former, Schneider considers Robert Sternberg’s ‘Triarchic Theory’. As an example of the latter, he devotes a chapter to Howard Gardner’s ‘Multiple Intelligences Theory’.

Schneider argues that educational ideas need to possess four key attributes in order for teachers to sit up, take notice and adopt them.

  1. perceived significance: the idea must answer a question central to the profession – offering a big-picture understanding rather than merely one small piece of a larger puzzle
  2. philosophical compatibility: the idea must clearly jibe with closely held [teacher] beliefs like the idea that teachers are professionals, or that all children can learn
  3. occupational realism: it must be possible for the idea to be put easily into immediate use
  4. transportability: the idea needs to find its practical expression in a form that teachers can access and use at the time that they need it – it needs to have a simple core that can travel through pre-service coursework, professional development seminars, independent study and peer networks

To what extent does big data adaptive learning possess these attributes? It certainly comes up trumps with respect to perceived significance. The big question that it attempts to answer is the question of how we can make language learning personalized / differentiated / individualised. As its advocates never cease to remind us, adaptive learning holds out the promise of moving away from a one-size-fits-all approach. The extent to which it can keep this promise is another matter, of course. For it to do so, it will never be enough just to offer different pathways through a digitalised coursebook (or its equivalent). Much, much more content will be needed: at least five or six times the content of a one-size-fits-all coursebook. At the moment, there is little evidence of the necessary investment into content being made (quite the opposite, in fact), but the idea remains powerful nevertheless.

When it comes to philosophical compatibility, adaptive learning begins to run into difficulties. Despite the decades of edging towards more communicative approaches in language teaching, research (e.g. the research into English teaching in Turkey described in a previous post) suggests that teachers still see explanation and explication as key functions of their jobs. They believe that they know their students best and they know what is best for them. Big data adaptive learning challenges these beliefs head on. It is no doubt for this reason that companies like Knewton make such a point of claiming that their technology is there to help teachers. But Jose Ferreira doth protest too much, methinks. Platform-delivered adaptive learning is a direct threat to teachers’ professionalism, their salaries and their jobs.

Occupational realism is more problematic still. Very, very few language teachers around the world have any experience of truly blended learning, and it’s very difficult to envisage precisely what it is that the teacher should be doing in a classroom. Publishers moving towards larger-scale blended adaptive materials know that this is a big problem, and are actively looking at ways of packaging teacher training / teacher development (with a specific focus on blended contexts) into the learner-facing materials that they sell. But the problem won’t go away. Education ministries have a long history of throwing money at technological ‘solutions’ without thinking about obtaining the necessary buy-in from their employees. It is safe to predict that this is something that is unlikely to change. Moreover, learning how to become a blended teacher is much harder than learning, say, how to make good use of an interactive whiteboard. Since there are as many different blended adaptive approaches as there are different educational contexts, there cannot be (irony of ironies) a one-size-fits-all approach to training teachers to make good use of this software.

Finally, how transportable is big data adaptive learning? Not very, is the short answer, and for the same reasons that ‘occupational realism’ is highly problematic.

Looking at things through Jack Schneider’s lens, we might be tempted to come to the conclusion that the future for adaptive learning is a rocky path, at best. But Schneider doesn’t take political or economic considerations into account. Sternberg’s ‘Triarchic Theory’ never had the OECD or the Gates Foundation backing it up. It never had millions and millions of dollars of investment behind it. As we know from political elections (and the big data adaptive learning issue is a profoundly political one), big bucks can buy opinions.

It may also prove to be the case that the opinions of teachers don’t actually matter much. If the big adaptive bucks can win the educational debate at the highest policy-making levels, teachers will be the first victims of the ‘creative disruption’ that adaptivity promises. If you don’t believe me, just look at what is going on in the U.S.

There are causes for concern, but I don’t want to sound too alarmist. Nobody really has a clue whether big data adaptivity will actually work in language learning terms. It remains more of a theory than a research-endorsed practice. And to end on a positive note, regardless of how sticky it proves to be, it might just provide the shot-in-the-arm realisation that language teachers, at their best, are a lot more than competent explainers of grammar or deliverers of gap-fills.