I call Lern Deutsch a vocabulary app, although it’s more of a game than anything else. Developed by the Goethe Institute, the free app was probably designed primarily as a marketing tool rather than a serious attempt to develop an educational language app. It’s available for speakers of Arabic, English, Spanish, Italian, French, Portuguese and Russian. It’s aimed at A1 learners.

Users of the app create an avatar and roam around a virtual city, learning new vocabulary and practising situational language. They can interact in language challenges with other players. As they explore, they earn Goethe coins, collect accessories for their avatars and progress up a leader board.

As they explore the virtual city, populated by other avatars, they find objects that can be clicked on to add to their vocabulary list. They hear a recording of an example sentence containing the target word, with the word gapped and three multiple choice possibilities. They are then required to type the missing word (see the image below). After collecting a certain number of words, they complete exercises which include the following task types:

  • Jumbled sentences
  • Audio recording of individual words and multiple choice selection
  • Gapped sentences with multiple choice answers
  • Dictation
  • Example sentences containing target item and multiple choice pictures
  • Typing sentences which are buried in a string of random letters

The developers have focused their attention on providing variety: engagement and ‘fun’ override other considerations. But how does the app stand up as a language learning tool? Surprisingly, for something developed by the Goethe Institute, it’s less than impressive.

The words that you collect as you navigate the virtual city are all nouns (Hotel, Auto, Mann, Banane, etc), but some (e.g. Sehenswürdigkeit) seem out of level. Any app that uses illustrations as the basic means of conveying meaning runs into problems when it moves away from concrete nouns, but a diet of nouns only (as here) is of necessarily limited value. Other parts of speech are introduced via the example sentences, but no help with meaning is provided, so when you come across the word for ‘egg’, for example, your example sentence is ‘Ich möchte das Frühstück mit Ei.’ It’s all very well embedding the target vocabulary in example sentences that have a functional value, but example sentences are only of value if they are understandable: the app badly needs a look-up function for the surrounding language.

The practice exercises are varied, too, but they also vary in their level of difficulty. It makes sense to do receptive / recognition tasks before productive ones, but there is no evidence that I could see of pedagogical considerations of this kind. Neither does there seem to be any spaced repetition at work: the app is driven by the needs of the game design rather than any learning principles.

It’s unclear to me who the app is for. The functional language that is presented is adult: the situations are adult situations (buying a bed, booking a hotel room, ordering a beer). However, the graphic design and the gamification features are juvenile (adding a pirate patch to your avatar, for example).

The lack of attention to the business of learning is especially striking in the English of the English language version that I used. The number of examples of dodgy English that I came across does not inspire confidence.

  • Quite alright! You win your first Goethe coin.
  • What sightseeings do you spot in the city center and the train station?
  • Have a picknick in the park. You now have a picnic in the park with the musician.
  • You still search for your teacher. Whom do you meet in the park? What do they work?


All in all, it’s an interesting example of a gamified approach to language, and other app developers may find ideas here that they could do something with. It’s of less interest, though, to anyone who wants to learn a bit of German.

Having spent a lot of time recently looking at vocabulary apps, I decided to put together a Christmas wish list of the features of my ideal vocabulary app. The list is not exhaustive and I’ve given more attention to some features than others. What (apart from testing) have I missed out?

1 Spaced repetition

Since the point of a vocabulary app is to help learners memorise vocabulary items, it is hard to imagine a decent system that does not incorporate spaced repetition. Spaced repetition algorithms offer one well-researched way of improving the brain’s ‘forgetting curve’. These algorithms come in different shapes and sizes, and I am not technically competent to judge which is the most efficient. However, as Peter Ellis Jones, the developer of a flashcard system called CardFlash, points out, efficiency is only one half of the rote memorisation problem. If you are not motivated to learn, the cleverness of the algorithm is moot. Fundamentally, learning software needs to be fun, rewarding, and give a solid sense of progression.

2 Quantity, balance and timing of new and ‘old’ items

A spaced repetition algorithm determines the optimum interval between repetitions, but further algorithms will be needed to determine when and with what frequency new items will be added to the deck. Once a system knows how many items a learner needs to learn and the time in which they have to do it, it is possible to determine the timing and frequency of the presentation of new items. But the system cannot know in advance how well an individual learner will learn the items (for any individual, some items will be more readily learnable than others) nor the extent to which learners will live up to their own positive expectations of time spent on-app. As most users of flashcard systems know, it is easy to fall behind, feel swamped and, ultimately, give up. An intelligent system needs to be able to respond to individual variables in order to ensure that the learning load is realistic.
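
One way to picture a load-aware policy of this kind is the minimal Python sketch below. It is an illustration only, not a description of any existing app: the function name `new_items_today`, its parameters and the idea of a fixed `daily_capacity` are all assumptions made for the example. The pace is set by the number of items still to learn and the days left, but new items are held back whenever due reviews already fill the learner’s day.

```python
def new_items_today(items_remaining, days_remaining, reviews_due, daily_capacity):
    """Return how many new items to introduce today (hypothetical policy)."""
    if items_remaining <= 0 or days_remaining <= 0:
        return 0
    # Even pace: remaining items spread over remaining days, rounded up.
    target = -(-items_remaining // days_remaining)
    # Room left in today's session once due reviews are counted.
    spare = max(0, daily_capacity - reviews_due)
    # Never exceed the pace, the spare room, or the items left.
    return min(target, spare, items_remaining)


# A learner with 100 items to learn in 10 days and room for 30 cards a day:
print(new_items_today(100, 10, reviews_due=5, daily_capacity=30))   # on track
print(new_items_today(100, 10, reviews_due=28, daily_capacity=30))  # falling behind
print(new_items_today(100, 10, reviews_due=40, daily_capacity=30))  # swamped
```

A real system would also need to estimate `daily_capacity` from what the learner actually does rather than from what they say they will do, which is exactly the individual variable described above.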

3 Task variety

A standard flashcard system which simply asks learners to indicate whether they ‘know’ a target item before they flip over the card rapidly becomes extremely boring. A system which tests this knowledge soon becomes equally dull. There needs to be a variety of ways in which learners interact with an app, both for reasons of motivation and learning efficiency. It may be the case that, for an individual user, certain task types lead to more rapid gains in learning. An intelligent, adaptive system should be able to capture this information and modify the selection of task types.

Most younger learners and some adult learners will respond well to the inclusion of games within the range of task types. Examples of such games include the puzzles developed by Oliver Rose in his Phrase Maze app to accompany Quizlet practice.

4 Generative use

Memory researchers have long known about the ‘Generation Effect’ (see for example this piece of research from the Journal of Verbal Learning and Verbal Behavior, 1978). Items are better learnt when the learner has to generate, in some (even small) way, the target item, rather than simply reading it. In vocabulary learning, this could be, for example, typing in the target word or, more simply, inserting some missing letters. Systems which incorporate task types that require generative use are likely to result in greater learning gains than simple, static flashcards with target items on one side and definitions or translations on the other.

5 Receptive and productive practice

The most basic digital flashcard systems require learners to understand a target item, or to generate it from a definition or translation prompt. Valuable as this may be, it won’t help learners much to use these items productively, since these systems focus exclusively on meaning. In order to do this, information must be provided about collocation, colligation, register, etc and these aspects of word knowledge will need to be focused on within the range of task types. At the same time, most vocabulary apps that I have seen focus primarily on the written word. Although any good system will offer an audio recording of the target item, and many will offer the learner the option of recording themselves, learners are invariably asked to type in their answers, rather than say them. For the latter, speech recognition technology will be needed. Ideally, too, an intelligent system will compare learner recordings with the audio models and provide feedback in such a way that the learner is guided towards a closer reproduction of the model.

6 Scaffolding and feedback

Most flashcard systems are basically low-stakes, practice self-testing. Research (see, for example, Dunlosky et al’s metastudy ‘Improving Students’ Learning With Effective Learning Techniques: Promising Directions From Cognitive and Educational Psychology’) suggests that, as a learning strategy, practice testing has high utility – indeed, higher utility than other strategies like keyword mnemonics or highlighting. However, an element of tutoring is likely to enhance practice testing, and, for this, scaffolding and feedback will be needed. If, for example, a learner is unable to produce a correct answer, they will probably benefit from being guided towards it through hints, in the same way as a teacher would elicit in a classroom. Likewise, feedback on why an answer is wrong (as opposed to simply being told that you are wrong), followed by encouragement to try again, is likely to enhance learning. Such feedback might, for example, point out that there is perhaps a spelling problem in the learner’s attempted answer, that the attempted answer is in the wrong part of speech, or that it is semantically close to the correct answer but does not collocate with other words in the text. The incorporation of intelligent feedback of this kind will require a number of NLP tools, since it will never be possible for a human item-writer to anticipate all the possible incorrect answers. A current example of intelligent feedback of this kind can be found in the Oxford English Vocabulary Trainer app.
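
To make the idea concrete, here is a deliberately crude Python sketch of feedback that says why an answer is wrong. Everything in it is an assumption made for illustration: the function name `feedback`, the 0.8 similarity threshold, and the hand-made `word_forms` dictionary, which stands in for the NLP tools a real system would need. It uses the standard library’s `difflib.SequenceMatcher` to guess at spelling slips, and it makes no attempt at the collocation check.

```python
from difflib import SequenceMatcher


def feedback(attempt, target, word_forms=None):
    """Return a short feedback message for a learner's typed answer.

    word_forms is a hypothetical lookup of related forms of the target,
    mapping each form to its part of speech, e.g. {"decision": "noun"}.
    """
    attempt = attempt.strip().lower()
    target = target.strip().lower()
    if attempt == target:
        return "correct"
    # A related form of the right word suggests a part-of-speech problem.
    if word_forms and attempt in word_forms:
        return "wrong part of speech: you typed a " + word_forms[attempt]
    # A near-identical string suggests a spelling slip (threshold is a guess).
    if SequenceMatcher(None, attempt, target).ratio() >= 0.8:
        return "check your spelling"
    return "try again"


print(feedback("recieve", "receive"))
print(feedback("decision", "decide", {"decision": "noun"}))
print(feedback("banana", "receive"))
```

Even this toy version shows where the difficulty lies: each branch depends on knowledge about the target item that somebody, or something, has to supply.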

7 Content

At the very least, a decent vocabulary app will need good definitions and translations (how many different languages?), and these will need to be tagged to the senses of the target items. These will need to be supplemented with all the other information that you find in a good learner’s dictionary: syntactic patterns, collocations, cognates, an indication of frequency, etc. The only way of getting this kind of high-quality content is by paying to license it from a company with expertise in lexicography. It doesn’t come cheap.

There will also need to be example sentences, both to illustrate meaning / use and for deployment in tasks. Dictionary databases can provide some of these, but they cannot be relied on as a source. This is because the example sentences in dictionaries have been selected and edited to accompany the other information provided in the dictionary, and not as items in practice exercises, which have rather different requirements. Once more, the solution doesn’t come cheap: experienced item writers will be needed.

Dictionaries describe and illustrate how words are typically used. But examples of typical usage tend to be as dull as they are forgettable. Learning is likely to be enhanced if examples are cognitively salient: weird examples with odd collocations, for example. Another thing for the item writers to think about.

A further challenge for an app which is not level-specific is that both the definitions and example sentences need to be level-specific. An A1 / A2 learner will need the kind of content that is found in, say, the Oxford Essential dictionary; B2 learners and above will need content from, say, the OALD.

8 Artwork and design

It’s easy enough to find artwork or photos of concrete nouns, but try to find or commission a pair of pictures that differentiate, for example, the adjectives ‘wild’ and ‘dangerous’ … What kind of pictures might illustrate simple verbs like ‘learn’ or ‘remember’? Will such illustrations be clear enough when squeezed into a part of a phone screen? Animations or very short video clips might provide a solution in some cases, but these are more expensive to produce and video files are much heavier.

With a few notable exceptions, such as the British Council’s MyWordBook 2, design in vocabulary apps has been largely forgotten.

9 Importable and personalisable lists

Many learners will want to use a vocabulary app in association with other course material (e.g. coursebooks). Teachers, however, will inevitably want to edit these lists, deleting some items, adding others. Learners will want to do the same. This is a huge headache for app designers. If new items are going to be added to word lists, how will the definitions, example sentences and illustrations be generated? Will the database contain audio recordings of these words? How will these items be added to the practice tasks (if these include task types that go beyond simple double-sided flashcards)? NLP tools are not yet good enough to trawl a large corpus in order to select (and possibly edit) sentences that illustrate the right meaning and which are appropriate for interactive practice exercises. We can personalise the speed of learning and even the types of learning tasks, so long as the target language is predetermined. But as soon as we allow for personalisation of content, we run into difficulties.

10 Gamification

Maintaining motivation to use a vocabulary app is not easy. Gamification may help. Measuring progress against objectives will be a start. Stars and badges and leaderboards may help some users. Rewards may help others. But gamification features need to be built into the heart of the system, into the design and selection of tasks, rather than simply tacked on as an afterthought. They need to be trialled and tweaked, so analytics will be needed.

11 Teacher support

Although the use of vocabulary flashcards is beginning to catch on with English language teachers, teachers need help with ways to incorporate them in the work they do with their students. What can teachers do in class to encourage use of the app? In what ways does app use require teachers to change their approach to vocabulary work in the classroom? Reporting functions can help teachers know about the progress their students are making and provide very detailed information about words that are causing problems. But, as anyone involved in platform-based course materials knows, teachers need a lot of help.

12 And, of course, …

Apps need to be usable with different operating systems. Ideally, they should be (partially) usable offline. Loading times need to be short. They need to be easy and intuitive to use.

It’s unlikely that I’ll be seeing a vocabulary app with all of these features any time soon. Or, possibly, ever. The cost of developing something that could do all this would be extremely high, and there is no indication that there is a market that would be ready to pay the sort of prices that would be needed to cover the costs of development and turn a profit. We need to bear in mind, too, the fact that vocabulary apps can only ever assist in the initial acquisition of vocabulary: apps alone can’t solve the vocabulary learning problem (despite the silly claims of some app developers). The need for meaningful communicative use, extensive reading and listening, will not go away because a learner has been using an app. So, how far can we go in developing better and better vocabulary apps before users decide that a cheap / free app, with all its shortcomings, is actually good enough?

I posted a follow up to this post in October 2016.

Decent research into adaptive learning remains very thin on the ground. Disappointingly, the Journal of Learning Analytics has only managed one issue so far in 2015, compared to three in 2014. But I recently came across an article in Vol. 18 (pp. 111 – 125) of Informing Science: the International Journal of an Emerging Transdiscipline entitled Informing and performing: A study comparing adaptive learning to traditional learning by Murray, M. C., & Pérez, J. of Kennesaw State University.

The article is worth reading, not least because of the authors’ digestible review of adaptive learning theory and their discussion of levels of adaptation, including a handy diagram (see below) which they have reproduced from a white paper by Tyton Partners ‘Learning to Adapt: Understanding the Adaptive Learning Supplier Landscape’. Murray and Pérez make clear that adaptive learning theory is closely connected to the belief that learning is improved when instruction is personalized — adapted to individual learning styles, but their approach is surprisingly uncritical. They write, for example, that the general acceptance of learning styles is evidenced in recommended teaching strategies in nearly every discipline, and learning styles continue to inform the evolution of adaptive learning systems, and quote from the much-quoted Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008) Learning styles: concepts and evidence, Psychological Science in the Public Interest, 9, 105–119. But Pashler et al concluded that the current evidence supporting the use of learning style-matched approaches is virtually non-existent (see here for a review of Pashler et al). And, in the world of ELT, an article in the latest edition of ELTJ by Carol Lethaby and Patricia Harries disses learning styles and other neuromyths. Given the close connection between adaptive learning theory and learning styles, one might reasonably predict that a comparative study of adaptive learning and traditional learning would not come out with much evidence in support of the former.

Murray and Pérez set out, anyway, to explore the hypothesis that adapting instruction to an individual’s learning style results in better learning outcomes. Their study compared adaptive and traditional methods in a university-level digital literacy course. Their conclusion? This study and a few others like it indicate that today’s adaptive learning systems have negligible impact on learning outcomes.

I was, however, more interested in the comments which followed this general conclusion. They point out that learning outcomes are only one measure of quality. Others, such as student persistence and engagement, they claim, can be positively affected by the employment of adaptive systems. I am not convinced. I think it’s simply far too soon to be able to judge this, and we need to wait quite some time for novelty effects to wear off. Murray and Pérez provide two references in support of their claim. One is an article by Josh Jarrett, Bigfoot, Goldilocks, and Moonshots: A Report from the Frontiers of Personalized Learning in Educause. Jarrett is Deputy Director for Postsecondary Success at the Bill & Melinda Gates Foundation and Educause is significantly funded by the Gates Foundation. Not, therefore, an entirely unbiased and trustworthy source. The other is a journalistic piece in Forbes. It’s by Tim Zimmer, entitled Rethinking higher ed: A case for adaptive learning and it reads like an advert. Zimmer is a ‘CCAP contributor’. CCAP is the Centre for College Affordability and Productivity, a libertarian, conservative foundation with a strong privatization agenda. Not, therefore, a particularly reliable source, either.

Despite their own findings, Murray and Pérez follow up their claim about student persistence and engagement with what they describe as a more compelling still argument for adaptive learning. This, they say, is the intuitively appealing case for adaptive learning systems as engines with which institutions can increase access and reduce costs. Ah, now we’re getting to the point!

VocApp – a review

Posted: October 28, 2015 in apps

Go to an app store and you’ll find a number of unrelated products called VocApp. One of them, from a Polish-based outfit, has the url. From over 30 products in the catalogue, I selected the free ‘Top 1000 English Words’: this is, after all, the showcase app which will show you how fast and easy you can learn with us (sic). VocApp Founder, Marcin Młodzki, writes that learning languages and mobile devices are my two greatest passions. Unfortunately there wasn’t any language app on the market which satisfied me in 100% (or even in 70%…). Anki, Babel, DuoLingo, Memorize, Quizlet – each of them has some serious disadvantages. So I decided to create my own app. Prof. Ewa Lajer-Burchardt of Harvard University says it’s undoubtedly one of the best flashcard applications for learning foreign languages on the educational market. This is presumably the eminent Ewa Lajer-Burcharth, a Polish art historian and author of Necklines: The Art of Jacques-Louis David After the Terror. So, how does the app stand up? Will users raise their understanding up to 83%? I was impatient to find out.

It’s a flashcard system with spaced repetition. This particular app has target items and audio recordings on one side of the flashcard, definitions in English, along with illustrations, on the other. It is, the makers say, multisensory. Users are then given two self-evaluation options.


And that, I’m afraid, is about all there is to say. Apart, that is, from the content. Many of the definitions have been culled from Wiktionary, not perhaps the best source of definitions for A1 / A2 learners. Others appear to have been made up in-house. Here is an opportunity to raise your own understanding by up to 83%. Look at the VocApp definitions below and see if you can guess what the target word is (answers below[i]).

1 a piece of a whole

2 a) a kind of box b) a formal word for a situation

3 something people do every day e.g. from 10 o’clock to 4 o’clock to get money

4 a group of people who deal with politics and who give new rules

5 when we are born our life begins, when we die our life comes to an end.

6 an object

7 a) where the cars drive b) a method of doing something

8 The place where we live, not only the Earth, everything which exists; ‘world’ is a general world

9 a location of something

10 a) 24 hours b) when the sun is up, not night

Sorry, Marcin. I’m afraid your app didn’t satisfy me in 100% (or even in 70%…).

[i] Answers: 1 part 2 case 3 work 4 government 5 life 6 thing 7 way 8 world 9 place 10 day

MosaLingua (with the obligatory capital letter in the middle) is a vocabulary app, available for iOS and Android. There are packages for a number of languages and English variations include general English, business English, vocabulary for TOEFL and vocabulary for TOEIC. The company follows the freemium model, with free ‘Lite’ versions and fuller content selling for €4.99. I tried the ‘Lite’ general English app, opting for French as my first language. Since the app is translation-based, you need to have one of the language pairings that are on offer (the other languages are currently Italian, Spanish, Portuguese and German).

The app I looked at is basically a phrase book with spaced repetition. Even though this particular app was general English, it appeared to be geared towards the casual business traveller. It uses the same algorithm as Anki, and users are taken through a sequence of (1) listening to an audio recording of the target item (word or phrase) along with the possibility of comparing a recording of yourself with the recording provided, (2) standard bilingual flashcard practice, (3) a practice stage where you are given the word or phrase in your own language and you have to unscramble words or letters to form the equivalent in English, and (4) a self-evaluation stage where users select from one of four options (“review”, “hard”, “good”, “perfect”) where the choice made will influence the re-presentation of the item within the spaced repetition.
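
For readers who want a feel for how a four-way self-evaluation can drive the spacing, here is a minimal Python sketch in the general style of ease-factor schedulers. It is emphatically not MosaLingua’s or Anki’s actual code: the `Card` class, the `reschedule` function and every number in it (the starting ease of 2.5, the multipliers, the floor of 1.3) are assumptions chosen for illustration. Only the four option names come from the app.

```python
from dataclasses import dataclass


@dataclass
class Card:
    interval_days: float = 1.0  # days until the item is shown again
    ease: float = 2.5           # growth factor applied on a "good" answer


def reschedule(card, rating):
    """Return a new Card whose interval reflects the learner's self-rating."""
    if rating == "review":
        # Forgotten: start again tomorrow and make the card "harder".
        return Card(1.0, max(1.3, card.ease - 0.2))
    if rating == "hard":
        # Remembered with difficulty: lengthen the gap only slightly.
        return Card(card.interval_days * 1.2, max(1.3, card.ease - 0.15))
    if rating == "good":
        # Remembered: multiply the gap by the card's ease.
        return Card(card.interval_days * card.ease, card.ease)
    if rating == "perfect":
        # Remembered easily: add a bonus and make the card "easier".
        return Card(card.interval_days * card.ease * 1.3, card.ease + 0.15)
    raise ValueError("unknown rating: " + rating)


card = Card()
for rating in ("review", "hard", "good", "perfect"):
    print(rating, reschedule(card, rating).interval_days)
```

The point of the sketch is simply that the learner’s choice changes two things at once: when the item comes back, and how quickly its intervals will grow in future.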

In addition to these words and phrases, there are a number of dialogues where you (1) listen to the dialogue (‘without worrying about understanding everything’), (2) are re-exposed to the dialogue with English subtitles, (3) see it again with subtitles in your own language, (4) practise it with standard flashcards.

The developers seem to be proud of their Mosa Learning Method®: they’ve registered this as a trademark. At its heart is spaced repetition. This is supplemented by what they refer to as ‘Active Recall’, the notion that things are better memorised if the learner has to make some sort of cognitive effort, however minimal, in recalling the target items. The principle is, at least to me, unquestionable, but the realisation (unjumbling words or letters) becomes rather repetitive and, ultimately, tedious. Then, there is what they call ‘metacognition’. Again, this is informed by research, even if the realisation (self-evaluation of learning difficulty into four levels) is extremely limited. Then there is the Pareto principle – the 80-20 rule. I couldn’t understand the explanation of what this has to do with the trademarked method. Here’s the MosaLingua explanation – figure it out for yourself:

Did you know that the 100 most common words in English account for half of the written corpus?

Evidently, you shouldn’t quit after learning only 100 words. Instead, you should concentrate on the most frequently used words and you’ll make spectacular progress. What’s more, globish (global English) has shown that it’s possible to express yourself using only 1500 well-chosen words (which would take less than 3 months with only 10 minutes per day with MosaLingua). Once you’ve acquired this base, MosaLingua proposes specialized vocabulary suited to your needs (the application has over 3000 words).

Finally, there’s some stuff about motivation and learner psychology. This boils down to That’s why we offer free learning help via email, presenting the Web’s best resources, as well as tips through bonus material or the learning community on the MosaLingua blog. We’ll give you all the tools you need to develop your own personalized learning method that is adapted to your needs. Some of these tips are not at all bad, but there’s precious little in the way of gamification or other forms of easy motivation.

In short, it’s all reasonably respectable, despite the predilection for sciency language in the marketing blurb. But what really differentiates this product from Anki, as the founder, Samuel Michelot, points out is the content. Mosalingua has lists of vocabulary and phrases that were created by professors. The word ‘professors’ set my alarm bells ringing, and I wasn’t overly reassured when all I could find out about these ‘professors’ was the information about the MosaLingua team.

Despite what some people claim, content is, actually, rather important when it comes to language learning. I’ll leave you with some examples of MosaLingua content (one dialogue and a selection of words / phrases organised by level) and you can make up your own mind.


Hi there, have a seat. What seems to be the problem?

I haven’t been feeling well since this morning. I have a very bad headache and I feel sick.

Do you feel tired? Have you had cold sweats?

Yes, I’m very tired and have had cold sweats. I have been feeling like that since this morning.

Have you been out in the sun?

Yes, this morning I was at the beach with my friends for a couple hours.

OK, it’s nothing serious. It’s just a bad case of sunstroke. You must drink lots of water and rest. I’ll prescribe you something for the headache and some after sun lotion.

Great, thank you, doctor. Bye.

You’re welcome. Bye.

Level 1: could you help me, I would like a …, I need to …, I don’t know, it’s okay, I (don’t) agree, do you speak English, to drink, to sleep, bank, I’m going to call the police

Level 2: I’m French, cheers, can you please repeat that, excuse me how can I get to …, map, turn left, corner, far (from), distance, thief, can you tell me where I can find …

Level 3: what does … mean, I’m learning English, excuse my English, famous, there, here, until, block, from, to turn, street corner, bar, nightclub, I have to be at the airport tomorrow morning

Level 4: OK, I’m thirty (years old), I love this country, how do you say …, what is it, it’s a bit like …, it’s a sort of …, it’s as small / big as …, is it far, where are we, where are we going, welcome, thanks but I can’t, how long have you been here, is this your first trip to England, take care, district / neighbourhood, in front (of)

Level 5: of course, can I ask you a question, you speak very well, I can’t find the way, David this is Julia, we meet at last, I would love to, where do you want to go, maybe another day, I’ll miss you, leave me alone, don’t touch me, what’s you email

Level 6: I’m here on a business trip, I came with some friends, where are the nightclubs, I feel like going to a bar, I can pick you up at your house, let’s go to see a movie, we had a lot of fun, come again, thanks for the invitation

In ELT circles, ‘behaviourism’ is a boo word. In the standard history of approaches to language teaching (characterised as a ‘procession of methods’ by Hunter & Smith 2012: 432[1]), there were the bad old days of behaviourism until Chomsky came along, savaged the theory in his review of Skinner’s ‘Verbal Behavior’, and we were all able to see the light. In reality, of course, things weren’t quite like that. The debate between Chomsky and the behaviourists is far from over, behaviourism was not the driving force behind the development of audiolingual approaches to language teaching, and audiolingualism is far from dead. For an entertaining and eye-opening account of something much closer to reality, I would thoroughly recommend a post on Russ Mayne’s Evidence Based ELT blog, along with the discussion which follows it. For anyone who would like to understand what behaviourism is, was, and is not (before they throw the term around as an insult), I’d recommend John A. Mills’ ‘Control: A History of Behavioral Psychology’ (New York University Press, 1998) and John Staddon’s ‘The New Behaviorism 2nd edition’ (Psychology Press, 2014).

There is a close connection between behaviourism and adaptive learning. Audrey Watters, no fan of adaptive technology, suggests that ‘any company touting adaptive learning software’ has been influenced by Skinner. In a more extended piece, ‘Education Technology and Skinner’s Box’, Watters explores further her problems with Skinner and the educational technology that has been inspired by behaviourism. But writers much more sympathetic to adaptive learning also see close connections to behaviourism. ‘The development of adaptive learning systems can be considered as a transformation of teaching machines,’ write Kara & Sevim[2] (2013: 114 – 117), although they go on to point out the differences between the two. Vendors of adaptive learning products, like DreamBox Learning©, are not shy of associating themselves with behaviourism: ‘Adaptive learning has been with us for a while, with its history of adaptive learning rooted in cognitive psychology, beginning with the work of behaviorist B.F. Skinner in the 1950s, and continuing through the artificial intelligence movement of the 1970s.’

That there is a strong connection between adaptive learning and behaviourism is indisputable, but I am not interested in attempting to establish the strength of that connection. This would, in any case, be an impossible task without some reductionist definition of both terms. Instead, my interest here is to explore some of the parallels between the two, and, in the spirit of the topic, I’d like to do this by comparing the behaviours of behaviourists and adaptive learning scientists.

Data and theory

Both behaviourism and adaptive learning (in its big data form) are centrally concerned with behaviour – capturing and measuring it in an objective manner. In both, experimental observation and the collection of ‘facts’ (physical, measurable, behavioural occurrences) precede any formulation of theory. John Mills’ description of behaviourists could apply equally well to adaptive learning scientists: theory construction was a seesaw process whereby one began with crude outgrowths from observations and slowly created one’s theory in such a way that one could make more and more precise observations, building those observations into the theory at each stage. No behaviourist ever considered the possibility of taking existing comprehensive theories of mind and testing or refining them.[3]

Positivism and the panopticon

Both behaviourism and adaptive learning are pragmatically positivist, believing that truth can be established by the study of facts. J. B. Watson, the founding father of behaviourism whose article ‘Psychology as the Behaviorist Views It’ set the behaviourist ball rolling, believed that experimental observation could ‘reveal everything that can be known about human beings’[4]. Jose Ferreira of Knewton has made similar claims: We get five orders of magnitude more data per user than Google does. We get more data about people than any other data company gets about people, about anything — and it’s not even close. We’re looking at what you know, what you don’t know, how you learn best. […] We know everything about what you know and how you learn best because we get so much data. Digital data analytics offer something that Watson couldn’t have imagined in his wildest dreams, but he would have approved.

The revolutionary science

Big data (and the adaptive learning which is a part of it) is presented as a game-changer: The era of big data challenges the way we live and interact with the world. […] Society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what. This overturns centuries of established practices and challenges our most basic understanding of how to make decisions and comprehend reality[5]. But the reverence for technology and the ability to reach understandings of human beings by capturing huge amounts of behavioural data was adumbrated by Watson a century before big data became a widely used term. Watson’s 1913 lecture at Columbia University was ‘a clear pitch’[6] for the supremacy of behaviourism, and its potential as a revolutionary science.

Prediction and control

The fundamental point of both behaviourism and adaptive learning is the same. The research practices and the theorizing of American behaviourists until the mid-1950s, writes Mills[7], were driven by the intellectual imperative to create theories that could be used to make socially useful predictions. Predictions are only useful to the extent that they can be used to manipulate behaviour. Watson states this very baldly: the theoretical goal of psychology is the prediction and control of behaviour[8]. Contemporary iterations of behaviourism, such as behavioural economics or nudge theory (see, for example, Thaler & Sunstein’s best-selling ‘Nudge’, Penguin Books, 2008), or the British government’s Behavioural Insights Unit, share the same desire to divert individual activity towards goals (selected by those with power), ‘without either naked coercion or democratic deliberation’[9]. Jose Ferreira of Knewton has an identical approach: We can predict failure in advance, which means we can pre-remediate it in advance. We can say, “Oh, she’ll struggle with this, let’s go find the concept from last year’s materials that will help her not struggle with it.” Like the behaviourists, Ferreira makes grand claims about the social usefulness of his predict-and-control technology: The end is a really simple mission. Only 22% of the world finishes high school, and only 55% finish sixth grade. Those are just appalling numbers. As a species, we’re wasting almost four-fifths of the talent we produce. […] I want to solve the access problem for the human race once and for all.


Because they rely on capturing large amounts of personal data, both behaviourism and adaptive learning quickly run into ethical problems. Even where informed consent is used, the subjects must remain partly ignorant of exactly what is being tested, or else there is the fear that they might adjust their behaviour accordingly. The goal is to minimise conscious understanding of what is going on[10]. For adaptive learning, the ethical problem is much greater because of the impossibility of ensuring the security of this data. Everything is hackable.


Behaviourism was seen as a godsend by the world of advertising. J. B. Watson, after a front-page scandal about his affair with a student, and losing his job at Johns Hopkins University, quickly found employment on Madison Avenue. ‘Scientific advertising’, as practised by the Mad Men from the 1920s onwards, was based on behaviourism. The use of data analytics by Google, Amazon, et al. is a direct descendant of scientific advertising, so it is richly appropriate that adaptive learning is the child of data analytics.

[1] Hunter, D. and Smith, R. (2012) ‘Unpacking the past: “CLT” through ELTJ keywords’. ELT Journal, 66/4: 430-439.

[2] Kara, N. & Sevim, N. (2013) ‘Adaptive learning systems: beyond teaching machines’. Contemporary Educational Technology, 4(2): 108-120

[3] Mills, J. A. (1998) Control: A History of Behavioral Psychology. New York: New York University Press, p.5

[4] Davies, W. (2015) The Happiness Industry. London: Verso. p.91

[5] Mayer-Schönberger, V. & Cukier, K. (2013) Big Data. London: John Murray, p.7

[6] Davies, W. (2015) The Happiness Industry. London: Verso. p.87

[7] Mills, J. A. (1998) Control: A History of Behavioral Psychology. New York: New York University Press, p.2

[8] Watson, J. B. (1913) ‘Psychology as the Behaviorist Views It’. Psychological Review, 20: 158

[9] Davies, W. (2015) The Happiness Industry. London: Verso. p.88

[10] Davies, W. (2015) The Happiness Industry. London: Verso. p.92

The School of Tomorrow will pay far more attention to individuals than the schools of the past. Each child will be studied and measured repeatedly from many angles, both as a basis of prescriptions for treatment and as a means of controlling development. The new education will be scientific in that it will rest on a fact basis. All development of knowledge and skill will be individualized, and classroom practice and recitation as they exist today in conventional schools will largely disappear. […] Experiments in laboratories and in schools of education [will discover] what everyone should know and the best way to learn essential elements.

This is not, you may be forgiven for thinking, from a Knewton blog post. It was written in 1924 and comes from Otis W. Caldwell & Stuart A. Courtis Then and Now in Education, 1845: 1923 (New York: Appleton) and is cited in Petrina, S. 2002. ‘Getting a Purchase on “The School of Tomorrow” and its Constituent Commodities: Histories and Historiographies of Technologies’ History of Education Quarterly, Vol. 42, No. 1 (Spring, 2002), pp. 75-111.

In the same year that Caldwell and Courtis predicted the School of Tomorrow, Sidney Pressey ‘contrived an intelligence testing machine, which he transformed during 1924-1934 into an “Automatic Teacher.” His machine automated and individualized routine classroom processes such as testing and drilling. It could reduce the burden of testing and scoring for teachers and therapeutically treat students after examination and diagnosis’ (Petrina, p. 99). Six years later, the ‘Automatic Teacher’ was recognised as a commercial failure. For more on Pressey’s machine (including a video of Pressey demonstrating it), see Audrey Watters’ excellent piece.

Caldwell, Courtis and Pressey are worth bearing in mind when you read the predictions of people like Knewton’s Jose Ferreira. Here are a few of his ‘Then and Now’ predictions:

“Online learning” will soon be known simply as “learning.” All of the world’s education content is being digitized right now, and that process will be largely complete within five years. (01.09.2010)

There will soon be lots of wonderful adaptive learning apps: adaptive quizzing apps, flashcard apps, textbook apps, simulation apps — if you can imagine it, someone will make it. In a few years, every education app will be adaptive. Everyone will be an adaptive learning app maker. (23.04.13)

Right now about 22 percent of the people in the world graduate high school or the equivalent. That’s pathetic. In one generation we could get close to 100 percent, almost for free. (19.07.13)

95% of materials (textbooks, software, etc used for classes, tutoring, corp training…) will be purely online in 5-10 years. That’s a $200B global industry. And people predict that 50% of higher ed and 25% of K-12 will eventually be purely online classes. If so, that would create a new, $3 trillion or so industry. (25.11.2013)

Flovoco is a vocabulary app developed by ELTjam. This review was written by Mike Harrison and first appeared on his blog. After the review, I’ve added a response by Jo Sayers (of ELTjam). Many thanks to Mike for allowing me to repost his review, and thanks to Jo for allowing me to repost his comment.

I first became aware of this mobile application around July 2014, when ELTjam first posted about their product development. There was a fair amount of heat – the Edtech-meets-ELT specialists Nick Robinson, Laurie Harrison, Tim Gifford, and newbie at the time Jo Sayers, pitched us their product-in-waiting. A year or so later, I saw a presentation at IATEFL in Manchester where Jo talked about reviewing educational and ELT apps. And so to this review of an ELTjam ELT app.

This review follows a model presented by Jo, reviewing the app across four categories: pedagogy and methodology; instructional design; user experience; cost and access.

The version of the app I’m reviewing is 1.0 in the Apple AppStore, and I’m working on an iPhone 5S.

Note – the app is only available in Spanish and English at the moment.

Initial impressions

But first, it is almost impossible to comment on an app without some ‘at first glance’ context, so here it is. I initially thought that ELTjam’s marketing was a little audacious – creating a landing page for an app that didn’t exist – but following conversations recently I now realise this is fairly common practice, and a good way to generate leads for email lists and such. I do still balk a little at the audacity of fanfare around the app (more on that to come under ‘pedagogy and methodology’). On opening the app for the first time, I was impressed by how slick it looked and felt (and more on that under ‘user experience’!) – I have to hold my hands up and say that I have yet to be blown away by an educational app, whether designed for ELT or more general educational fields. Let’s see whether this will change.

Pedagogy and methodology


The overarching principle behind Flovoco is that words are important. And that in order to get better at using a language, the best thing to do is bump up what you know about the most common words in that language. So far, so good. Flovoco aims to help learners learn more about words by presenting them with a number of activities focusing on different information about a given word – its meaning, pronunciation, how it might be used in a phrase, and then again with words deemed to be confusing. These different areas are named as Translation, Listening, Usage, and Confused Words – and they are presented as ‘levels’, Translation being the first (easiest?) level and Confused Words being the fourth level (and most difficult of the four?). This all seems fairly logical.

But then it all starts to go a bit wrong. In the Translation activity, words are presented with possible matches that aren’t even the same part of speech. Alright, wrong answers are clearly marked with a red light and wrong buzzer sound – but this is behaviourism at best. At worst this may actually confuse learners. Not to mention that this only works by looking at a single meaning of a word – so there won’t be many colloquial meanings and translations included here.

The Listening level does the same thing. Words that are presented alongside each other are completely different. Sometimes phrases are presented, but the audio comprises only one word in the phrase. For example, you might see ‘pay for something’ but only hear ‘pay’. There may be less potential for confusion among words here, but having such a mismatch between the audio and text may be more problematic.

The next two levels are essentially the same – gap fills featuring the words that are being studied. Level three, Usage, again often presents possible answers that are completely different parts of speech. Confused Words is more of the same, but this time the words presented in the three answer slots are potentially easily confused – this seems to be focusing mainly on spelling and/or pronunciation.

Overall, pedagogically Flovoco has a noble aim – looking at the different things it’s useful to know about a word – but in practice it’s a bit confused. Methodologically, the structure of feedback is very behaviouristic. Didn’t we leave Skinner behind a while back?

Instructional design


Admittedly I’m getting my head around the jargon of app development here, but I can’t see this. There isn’t any adaptivity (is that a word?) built into the app. You just work your way up the levels, aiming to get all 500 words into the Your Word Collection at the top level. Nothing adapts down if I keep getting a word wrong – there’s no help other than hoping that I’ll see the red ‘wrong’ button highlight enough times to learn. If I’m racing through the words, there isn’t any extension to what can be done with the words. Maybe there is space for some kind of freer practice, perhaps based around the community of language learners that ELTjam probably hope to cultivate with the app.


User experience


It looks pretty. It’s fairly easy to tap-and-go in getting started working with the app. There aren’t too many (any?) instructions, so it does rather rely on learners using the app to work things out by trial and error. The user Profile screen is relatively clean and looks straightforward. But I haven’t worked out exactly how the Daily Word Goal works (maybe because I don’t use the app regularly) and navigating the Profile screens is a little laggy.

There is a playful feel to the app, from the Flovoco logo font to the coloured circles and stars representing the different levels. If they wanted to make learning words look like a game, that’s the impression you get.

Cost and access

Free. Spanish and English only. iOS only (the website says an Android version is in the works).

Technical requirements (iOS):

Category: Education

Updated: Feb 25, 2015

Version: 1.1.4

Size: 21.1 MB

Compatibility: Requires iOS 7.1 or later. Compatible with iPhone, iPad, and iPod touch. This app is optimized for iPhone 5.

Not yet available on Google Play.

You can get the iOS version from here:

Overall score: 2.5 out of 5


This app certainly got people talking when ELTjam first posted about it, and there is the intent to do something different in the world of ELT apps. But from what I see so far, putting it into practice leaves a fair amount to be desired. Further thought needs to be given to plausible distractors, the focus on word-level meaning may need to be expanded to make this truly different, and sorting out some kind of support/challenge for more and less able learners is something I think is quite key.

Flovoco a go? There is some potential for this app to be really good. But sorry Flovoco, at the moment it’s a no from me.

The response from Jo Sayers:

Hi Mike,

Firstly, thanks very much for writing a review; we really appreciate the feedback and reviews like this will help us as we build Flovoco to a state where both learners and industry professionals can clearly see the value it delivers. This is still our v1.0, as you point out, and so there are many things that we had to do in a way that was good enough to ship it, but necessarily not as finalised as we’d want it to be in an ideal world. The hope was that we would test some key assumptions with this version and learn things that will help us to build it out in a direction that sees it add real value to learners.

The review matches well with a lot of what we have already identified as the strengths and weaknesses of the app and we’re really happy that there are enough positive things in this early version that it already gets 2.5 out of 5. There are a few comments in the review that I wanted to respond to though:

The behaviourist approach. Our intention with the app is not to offer a complete language learning solution, rather to offer a way of quickly acquiring key vocabulary that will act as a foundation for other language development. We feel that lexical acquisition is an area that lends itself well to a more systematic (behaviourist?) approach, and also helps to achieve the flow state that we were aiming for.

Multiple senses. We made a decision to focus only on the primary senses of the words in this version to avoid the complexity of having to demonstrate what is a second or other sense of a word that they’ve encountered before. What you’re actually pushing through the levels in the app is a single sense of a word, rather than a word itself.

The distractors. The distractors for levels 1 and 2 were chosen automatically by an algorithm. I’m not sure I agree that they should all be the same part of speech. This would actually have been very straightforward for us to achieve programmatically, but we felt that it wasn’t any more pedagogically sound than mixing parts of speech. In level 2, the algorithm choosing the distractors selected words with the same initial letter and as close to the same total number of letters as possible, but as the distractors are chosen only from the 500 words in the initial pool this reduced their effectiveness. We have a plan for the next version which should make the selection much more effective. What we wanted to test here, ultimately, was that the algorithm approach worked and didn’t interfere with the learner’s ability to complete the activities.
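The level 2 rule described here (same initial letter, closest possible length, drawn only from the word pool) can be sketched in a few lines. This is an illustrative reconstruction and not ELTjam's code: the function name `pick_distractors`, the tie-breaking rule and the tiny word pool are all assumptions made for the example.

```python
def pick_distractors(target, pool, n=2):
    """Pick up to n distractors that share the target's first letter,
    preferring words whose length is closest to the target's."""
    target_l = target.lower()
    candidates = [
        w for w in pool
        if w.lower() != target_l and w[0].lower() == target_l[0]
    ]
    # Closest length first; ties broken alphabetically (an assumption)
    candidates.sort(key=lambda w: (abs(len(w) - len(target)), w.lower()))
    return candidates[:n]


# Hypothetical pool standing in for the app's 500 words
pool = ["pay", "play", "plan", "people", "make", "do", "paper", "put"]
print(pick_distractors("pay", pool))
print(pick_distractors("do", pool))
```

With a small pool, some targets have few or no candidates sharing their first letter, which is the limitation Jo mentions.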

Levels 3 and 4 being the same. Level 3 focuses on collocation by keeping the target word in the sentence (in bold) and removing a collocate; the distractors are, where possible, similar words that don’t collocate with the target word. Level 4 initially aimed to focus on derived forms of the target word and in this case the target itself is removed and other words which are similar to it act as distractors. As a lot of the words at A1 don’t have strong and obvious collocates and many don’t have derived and inflected forms this was more challenging than it would be at higher levels with fewer functional words and lower frequency content.

Instructional design. Yes, there is no adaptivity. But if you get words wrong they move to the level below, and if you get them right they move up to the next level. In terms of the feedback, the word list on the results page gives a bit of additional information about the words that you have seen. We have plans to incorporate in some community aspects to the product too. However, our plan is not to build that up within ELTjam. As I have discussed in this post for a product website, we have made a shift away from offering this directly to consumers, and now see this as an opportunity to work with publishers and offer the product to their learners through partnerships with them. The B2C version acts as a way of gaining market validation.
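The movement of words between levels described here can be sketched as below. The level names come from the review; everything else is an assumption for illustration, in particular that a single answer moves a word one step and that words cannot drop below the first level or rise above the collection.

```python
# Level names as given in the review; the final entry is the "top level"
LEVELS = ["Translation", "Listening", "Usage", "Confused Words",
          "Your Word Collection"]


def next_level(current, correct):
    """Return the index of the level a word moves to after one answer."""
    if correct:
        return min(current + 1, len(LEVELS) - 1)  # cannot go past the top
    return max(current - 1, 0)                    # cannot drop below level 1


# Follow one word through a hypothetical run of answers
level = 0
for answer in [True, True, False, True]:
    level = next_level(level, answer)
print(LEVELS[level])
```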

Daily word goal. Yes, there are bugs in this which affect the calendar and the lag you mentioned. A new version was built out last month and we should be able to ship the updated version soon.

I hope that helps to put a few of the things you noted into context. It’s also worth pointing out that during the first two months of the app being out and promoted (we are no longer promoting this version as we have learned the things we hoped to) the average session length was over 9 minutes; way more than average in most industries and even more than the average for music apps, according to this data. We also had around 8% of sessions that were over 30 minutes long. So there were a core group of people who were incredibly motivated to use the app. And given that very little is done to incentivise returning to the app (no push notifications, leaderboards, 2 player games etc.) we were really pleased to see that around 8% of learners came back for an 8th visit. So, while there are definitely things to improve, the core offering resonates with learners in what is clearly a very busy and highly competitive marketplace.

Thanks again for the review, it would be really interesting to see what your thoughts are on future iterations. Looking forward to chatting it over next time we meet.


Adaptive learning providers make much of their ability to provide learners with personalised feedback and to provide teachers with dashboard feedback on the performance of both individuals and groups. All well and good, but my interest here is in the automated feedback that software could provide on very specific learning tasks. Scott Thornbury, in a recent talk, ‘Ed Tech: The Mouse that Roared?’, listed six ‘problems’ of language acquisition that educational technology for language learning needs to address. One of these he framed as follows: ‘The feedback problem, i.e. how does the learner get optimal feedback at the point of need?’, and suggested that technological applications ‘have some way to go.’ He was referring, not to the kind of feedback that dashboards can provide, but to the kind of feedback that characterises a good language teacher: corrective feedback (CF) – the way that teachers respond to learner utterances (typically those containing errors, but not necessarily restricted to these) in what Ellis and Shintani call ‘form-focused episodes’[1]. These responses may include a direct indication that there is an error, a reformulation, a request for repetition, a request for clarification, an echo with questioning intonation, etc. Basically, they are correction techniques.

These days, there isn’t really any debate about the value of CF. There is a clear research consensus that it can aid language acquisition. Discussing learning in more general terms, Hattie[2] claims that ‘the most powerful single influence enhancing achievement is feedback’. The debate now centres around the kind of feedback, and when it is given. Interestingly, evidence[3] has been found that CF is more effective in the learning of discrete items (e.g. some grammatical structures) than in communicative activities. Since it is precisely this kind of approach to language learning that we are more likely to find in adaptive learning programs, it is worth exploring further.

What do we know about CF in the learning of discrete items? First of all, it works better when it is explicit than when it is implicit (Li, 2010), although this needs to be nuanced. In immediate post-tests, explicit CF is better than implicit variations. But over a longer period of time, implicit CF provides better results. Secondly, formative feedback (as opposed to right / wrong testing-style feedback) strengthens retention of the learning items: this typically involves the learner repairing their error, rather than simply noticing that an error has been made. This is part of what cognitive scientists[4] sometimes describe as the ‘generation effect’. Whilst learners may benefit from formative feedback without repairing their errors, Ellis and Shintani (2014: 273) argue that the repair may result in ‘deeper processing’ and, therefore, assist learning. Thirdly, there is evidence that some delay in receiving feedback aids subsequent recall, especially over the longer term. Ellis and Shintani (2014: 276) suggest that immediate CF may ‘benefit the development of learners’ procedural knowledge’, while delayed CF is ‘perhaps more likely to foster metalinguistic understanding’. You can read a useful summary of a meta-analysis of feedback effects in online learning here, or you can buy the whole article here.

I have yet to see an online language learning program which can do CF well, but I think it’s a matter of time before things improve significantly. First of all, at the moment, feedback is usually immediate, or almost immediate. This is unlikely to change, for a number of reasons – foremost among them being the pride that ed tech takes in providing immediate feedback, and the fact that online learning is increasingly being conceptualised and consumed in bite-sized chunks, something you do on your phone between doing other things. What will change in better programs, however, is that feedback will become more formative. As things stand, tasks are usually of a very closed variety, with drag-and-drop being one of the most popular. Only one answer is possible and feedback is usually of the right / wrong-and-here’s-the-correct-answer kind. But tasks of this kind are limited in their value, and, at some point, tasks are needed where more than one answer is possible.

Here’s an example of a translation task from Duolingo, where a simple sentence could be translated into English in quite a large number of ways.

Decontextualised as it is, the sentence could be translated in the way that I have done it, although it’s unlikely. The feedback, however, is of relatively little help to the learner, who would benefit from guidance of some sort. The simple reason that Duolingo doesn’t offer useful feedback is that the programme is static. It has been programmed to accept certain answers (e.g. in this case both the present simple and the present continuous are acceptable), but everything else will be rejected. Why? Because it would take too long and cost too much to anticipate and enter in all the possible answers. Why doesn’t it offer formative feedback? Because in order to do so, it would need to identify the kind of error that has been made. If we can identify the kind of error, we can make a reasonable guess about the cause of the error, and select appropriate CF … this is what good teachers do all the time.

Analysing the kind of error that has been made is the first step in providing appropriate CF, and it can be done, with increasing accuracy, by current technology, but it requires a lot of computing. Let’s take spelling as a simple place to start. If you enter ‘I am makeing a basket for my mother’ in the Duolingo translation above, the program tells you ‘Nice try … there’s a typo in your answer’. Given the configuration of keyboards, it is highly unlikely that this is a typo. It’s a simple spelling mistake and teachers recognise it as such because they see it so often. For software to achieve the same insight, it would need, as a start, to trawl a large English dictionary database and a large tagged database of learner English. The process is quite complicated, but it’s perfectly doable, and learners could be provided with CF in the form of a ‘spelling hint’.
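The keyboard argument can be made concrete with a rough heuristic. This is only a sketch under stated assumptions and is far cruder than the dictionary-plus-learner-corpus approach described above. It handles a single substituted or inserted letter, it treats the QWERTY layout as a simple grid that ignores the stagger between rows, and it calls the error a ‘typo’ only when the stray key sits next to a key the learner plausibly meant to press. The names `adjacent` and `classify` are invented for the example.

```python
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
# Map each letter to its (row, column) on a simplified QWERTY grid
POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}


def adjacent(a, b):
    """True if two letters are the same key or neighbours on the grid."""
    if a not in POS or b not in POS:
        return False
    (r1, c1), (r2, c2) = POS[a], POS[b]
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1


def classify(attempt, target):
    """Label a one-letter error as 'typo', 'spelling' or 'other'."""
    attempt, target = attempt.lower(), target.lower()
    if attempt == target:
        return "correct"
    # Find the first position where the two words differ
    i = 0
    while i < min(len(attempt), len(target)) and attempt[i] == target[i]:
        i += 1
    if len(attempt) == len(target) and attempt[i + 1:] == target[i + 1:]:
        # Substitution: a slip is plausible if the wrong key is next to the right one
        return "typo" if adjacent(attempt[i], target[i]) else "spelling"
    if len(attempt) == len(target) + 1 and attempt[i + 1:] == target[i:]:
        # Insertion: a slip is plausible if the extra key is next to a neighbouring letter
        around = attempt[max(i - 1, 0):i] + attempt[i + 1:i + 2]
        return "typo" if any(adjacent(attempt[i], ch) for ch in around) else "spelling"
    return "other"


print(classify("makeing", "making"))
print(classify("makinh", "making"))
```

On this grid the extra ‘e’ in ‘makeing’ is nowhere near ‘k’ or ‘i’, so the sketch labels it a spelling mistake, whereas ‘makinh’ (‘h’ is next to ‘g’) is labelled a typo.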

Rather more difficult is the error illustrated in my first screen shot. What’s the cause of this ‘error’? Teachers know immediately that this is probably a classic confusion of ‘do’ and ‘make’. They know that the French verb ‘faire’ can be translated into English as ‘make’ or ‘do’ (among other possibilities), and the error is a common language transfer problem. Software could do the same thing. It would need a large corpus (to establish that ‘make’ collocates with ‘a basket’ more often than ‘do’), a good bilingualised dictionary (plenty of these now exist), and a tagged database of learner English. Again, appropriate automated feedback could be provided in the form of some sort of indication that ‘faire’ is only sometimes translated as ‘make’.
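The corpus step can be sketched very roughly as follows. Everything here is a stand-in: the four sentences are invented for the example and are not corpus data, the lemma table covers only the two verbs in question, and co-occurrence within a sentence is used as a crude proxy for collocation. The bilingual dictionary lookup and the learner-English database are left out altogether.

```python
from collections import Counter

# Hypothetical lemma table covering only 'do' and 'make'
LEMMAS = {
    "do": "do", "does": "do", "doing": "do", "did": "do", "done": "do",
    "make": "make", "makes": "make", "making": "make", "made": "make",
}


def collocation_counts(corpus, noun):
    """Count verb lemmas appearing in sentences that contain the noun."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().rstrip(".").split()
        if noun in words:
            counts.update(LEMMAS[w] for w in words if w in LEMMAS)
    return counts


def transfer_hint(learner_verb, noun, corpus):
    """Return a hint if another verb goes with the noun more often."""
    counts = collocation_counts(corpus, noun)
    used = LEMMAS.get(learner_verb.lower())
    if used is None or not counts:
        return None
    best, _ = counts.most_common(1)[0]
    if best != used and counts[best] > counts[used]:
        return f"With '{noun}', '{best}' is more common than '{used}'."
    return None


# Invented sentences standing in for a real corpus
TOY_CORPUS = [
    "She is making a basket.",
    "They made a basket from willow.",
    "He makes a basket every week.",
    "I did my homework.",
]
print(transfer_hint("doing", "basket", TOY_CORPUS))
```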

These are both relatively simple examples, but it’s easy to think of others that are much more difficult to analyse automatically. Duolingo rejects ‘I am making one basket for my mother’: it’s not very plausible, but it’s not wrong. Teachers know why learners do this (again, it’s probably a transfer problem) and know how to respond (perhaps by saying something like ‘Only one?’). Duolingo also rejects ‘I making a basket for my mother’ (a common enough error), but is unable to provide any help beyond the correct answer. Automated CF could, however, be provided in both cases if more tools are brought into play. Multiple parsing machines (one is rarely accurate enough on its own) and semantic analysis will be needed. Both the range and the complexity of the available tools are increasing so rapidly (see here for the sort of research that Google is doing and here for an insight into current applications of this research in language learning) that Duolingo-style right / wrong feedback will very soon seem positively antediluvian.

One further development is worth mentioning here, and it concerns feedback and gamification. Teachers know from the way that most learners respond to written CF that they are usually much more interested in knowing what they got right or wrong, rather than the reasons for this. Most students are more likely to spend more time looking at the score at the bottom of a corrected piece of written work than at the laborious annotations of the teacher throughout the text. Getting students to pay close attention to the feedback we provide is not easy. Online language learning systems with gamification elements, like Duolingo, typically reward learners for getting things right, and getting things right in the fewest attempts possible. They encourage learners to look for the shortest or cheapest route to finding the correct answers: learning becomes a sexed-up form of test. If, however, the automated feedback is good, this sort of gamification encourages the wrong sort of learning behaviour. Gamification designers will need to shift their attention away from the current concern with right / wrong, and towards ways of motivating learners to look at and respond to feedback. It’s tricky, because you want to encourage learners to take more risks (and reward them for doing so), but it makes no sense to penalise them for getting things right. The probable solution is to have a dual points system: one set of points for getting things right, another for employing positive learning strategies.
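A dual points system of this kind might be sketched as two separate tallies. The behaviours rewarded on the ‘strategy’ side (opening the feedback, retrying after an error) and the point values are purely illustrative assumptions; deciding what counts as a positive learning strategy is the hard part, and it is not solved here.

```python
from dataclasses import dataclass


@dataclass
class Score:
    accuracy: int = 0  # points for getting things right
    strategy: int = 0  # points for positive learning strategies


def award(score, correct, read_feedback, retried_after_error):
    """Update both tallies after one task; weights are arbitrary placeholders."""
    if correct:
        score.accuracy += 10
    if read_feedback:
        score.strategy += 5
    if retried_after_error:
        score.strategy += 5
    return score


# A learner who got the answer wrong but engaged with the feedback
s = award(Score(), correct=False, read_feedback=True, retried_after_error=True)
print(s)
```

Keeping the tallies separate means that a wrong answer followed by real engagement with the feedback still earns something, without taking anything away from learners who simply get the answer right.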

The provision of automated ‘optimal feedback at the point of need’ may not be quite there yet, but it seems we’re on the way for some tasks in discrete-item learning. There will probably always be some teachers who can outperform computers in providing appropriate feedback, in the same way that a few top chess players can beat ‘Deep Blue’ and its scions. But the rest of us had better watch our backs: in the provision of some kinds of feedback, computers are catching up with us fast.

[1] Ellis, R. & N. Shintani (2014) Exploring Language Pedagogy through Second Language Acquisition Research. Abingdon: Routledge p. 249

[2] Hattie, J. (2009) Visible Learning. Abingdon: Routledge p.12

[3] Li, S. (2010) ‘The effectiveness of corrective feedback in SLA: a meta-analysis’ Language Learning 60 / 2: 309-365

[4] Brown, P. C., Roediger, H. L. & McDaniel, M. A. (2014) Make It Stick. Cambridge, Mass.: Belknap Press

Back in December 2013, in an interview with eltjam, David Liu, COO of the adaptive learning company, Knewton, described how his company’s data analysis could help ELT publishers ‘create more effective learning materials’. He focused on what he calls ‘content efficacy[i]’ (he uses the word ‘efficacy’ five times in the interview), a term which he explains below:

A good example is when we look at the knowledge graph of our partners, which is a map of how concepts relate to other concepts and prerequisites within their product. There may be two or three prerequisites identified in a knowledge graph that a student needs to learn in order to understand a next concept. And when we have hundreds of thousands of students progressing through a course, we begin to understand the efficacy of those said prerequisites, which quite frankly were made by an author or set of authors. In most cases they’re quite good because these authors are actually good in what they do. But in a lot of cases we may find that one of those prerequisites actually is not necessary, and not proven to be useful in achieving true learning or understanding of the current concept that you’re trying to learn. This is interesting information that can be brought back to the publisher as they do revisions, as they actually begin to look at the content as a whole.

One commenter on the post, Tom Ewens, found the idea interesting. It could, potentially, he wrote, give us new insights into how languages are learned, in much the same way as corpora have given us new insights into how language is used. Did Knewton have any plans to disseminate the information publicly, he asked. His question remains unanswered.

At the time, Knewton had just raised $51 million (bringing their total venture capital funding to over $105 million). Now, 16 months later, Knewton have launched their new product, which they are calling Knewton Content Insights. They describe it as the world’s first and only web-based engine to automatically extract statistics comparing the relative quality of content items — enabling us to infer more information about student proficiency and content performance than ever before possible.

The software analyses particular exercises within the learning content (and particular items within them). It measures the relative difficulty of individual items by, for example, analysing how often a question is answered incorrectly and how many tries it takes each student to answer correctly. It also looks at what they call ‘exhaustion’ – how much content students are using in a particular area – and whether they run out of content. The software can correlate difficulty with exhaustion. Lastly, it analyses what they call ‘assessment quality’ – how well individual questions assess a student’s understanding of a topic.
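
As a rough illustration of the kind of number-crunching described here, the following sketch derives two simple difficulty measures from a log of attempts. It is not Knewton’s method, which has not been published in this detail: the log format, the item names and the `item_difficulty` function are all hypothetical, and the sketch assumes only that difficulty can be approximated by how often the first attempt fails and how many tries students need on average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical attempt log: (student, item, tries needed to answer correctly).
attempts = [
    ("ana", "present_simple_q1", 1),
    ("ben", "present_simple_q1", 1),
    ("cat", "present_simple_q1", 2),
    ("ana", "present_perfect_q1", 3),
    ("ben", "present_perfect_q1", 2),
    ("cat", "present_perfect_q1", 1),
]


def item_difficulty(log):
    """Return {item: (first_try_error_rate, mean_tries)} for each item in the log."""
    tries_by_item = defaultdict(list)
    for _student, item, tries in log:
        tries_by_item[item].append(tries)
    return {
        # A student who needed more than one try got the first attempt wrong.
        item: (sum(t > 1 for t in tries) / len(tries), mean(tries))
        for item, tries in tries_by_item.items()
    }


difficulty = item_difficulty(attempts)
```

With this invented data, the present perfect item comes out as harder than the present simple item on both measures, which is about as much as figures of this kind can say.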

Knewton’s approach is premised on the idea that learning (in this case language learning) can be broken down into knowledge graphs, in which the information that needs to be learned can be arranged and presented hierarchically. The ‘granular’ concepts are then ‘delivered’ to the learner, and Knewton’s software can optimise the delivery. The first problem, as I explored in a previous post, is that language is a messy, complex system: it doesn’t lend itself terribly well to granularisation. The second problem is that language learning does not proceed in a linear, hierarchical way: it is also messy and complex. The third is that ‘language learning content’ cannot simply be delivered: a process of mediation is unavoidable. Are the people at Knewton unaware of the extensive literature devoted to the differences between synthetic and analytic syllabuses, of the differences between product-oriented and process-oriented approaches? It would seem so.

Knewton’s ‘Content Insights’ can only, at best, provide some sort of insight into the ‘language knowledge’ part of any learning content. It can say nothing about the work that learners do to practise language skills, since these are not susceptible to granularisation: you simply can’t take a piece of material that focuses on reading or listening and analyse its ‘content efficacy at the concept level’. Because of this, I predicted (in the post about Knowledge Graphs) that the likely focus of Knewton’s analytics would be discrete item, sentence-level grammar (typically tenses). It turns out that I was right.

Knewton illustrate their new product with screen shots such as those below.

They give a specific example of the sort of questions their software can answer. It is: do students generally find the present simple tense easier to understand than the present perfect tense? Doh!

It may be the case that Knewton Content Insights can optimise the presentation of this kind of grammar, but optimisation of this presentation and practice is highly unlikely to have any impact on the rate of language acquisition. Students are typically required to study the present perfect at every level from ‘elementary’ upwards. The reason they have to do this is not that the presentation in, say, Headway is not optimised. What they need is to spend a significantly greater proportion of their time on ‘language use’ and less on ‘language knowledge’. This is not just my personal view: it has been extensively researched, and I am unaware of any dissenting voices.

The number-crunching in Knewton Content Insights is unlikely, therefore, to lead to any actionable insights. It is, however, very likely to lead (as writer colleagues at Pearson and other publishers are finding out) to an obsession with measuring the ‘efficacy’ of material which, quite simply, cannot meaningfully be measured in this way. It is likely to distract from much more pressing issues, notably the question of how we can move further and faster away from peddling sentence-level, discrete-item grammar.

In the long run, it is reasonable to predict that the attempt to optimise the delivery of language knowledge will come to be seen as an attempt to tackle the wrong question. It will make no significant difference to language learners and language learning. In the short term, how much time and money will be wasted?

[i] ‘Efficacy’ is the buzzword around which Pearson has built its materials creation strategy, a strategy which was launched around the same time as this interview. Pearson is a major investor in Knewton.