Posts Tagged ‘Duolingo’


Posted: December 1, 2016 in ed tech


Here’s a conversation I had recently with Cleverbot.


User (me): Hi Cleverbot. What are your views on the future of chatbots in language learning?

Cleverbot: Are you talking to yourself?

User: No, I’m asking you a question.

Cleverbot: Then what’s the question again? I forgot.

User: What are your views on the future of chatbots in language learning?

Cleverbot: Don’t you want me to finish my story?

User: No, I want you to answer my question.

Cleverbot: What is your question?

User: Do you think there is a future for chatbots in language education?

Cleverbot: Sure, if you are talking about the moon.

I also tried to have a conversation with Alice, the avatar from EFL Classroom Bot, listed by Larry Ferlazzo as one of ‘the best online chatbots for practicing English’. I didn’t get any more sense out of her than out of Cleverbot.

Chatbots, apparently, are the next big thing. Again. David Mattin, head of trends and insights, writes (in the September 2016 issue of ‘Business Life’) that ‘the chatbot revolution is coming’ and that chatbots are a step towards the dream of an interface between user and technology that is so intuitive that the interface ‘simply fades away’. Chatbots have been around for some time. Remember Clippy – the Microsoft Office bot in the late 1990s – which you had to disable in order to stop yourself punching your computer screen? Since then, bots have become ubiquitous. There have been problems, such as Microsoft’s Tay bot, which had to be taken down after sixteen hours earlier this year when, after interacting with other Twitter users, it developed into an abusive Nazi. But chatbots aren’t going away and you’ve probably interacted with one to book a taxi, order food or attempt to talk to your bank. In September this year, the Guardian described them as ‘the talk of the town’ and ‘hot property in Silicon Valley’.

The real interest in chatbots is not, however, in the ‘exciting interface’ possibilities (both user interface and user experience remain pretty crude), but in the fact that they are leaner, that they sit comfortably with the things we actually do on a phone, and that they offer a way of cutting out the high fees that developers have to pay to app stores. After so many start-up failures, chatbots offer a glimmer of financial hope to developers.

It’s no surprise, of course, to find the world of English language teaching beginning to sit up and take notice of this technology. A 2012 article by Ben Lehtinen in PeerSpectives enthuses about the possibilities in English language learning and reports the positive feedback of the author’s own students. ELTJam, so often so quick off the mark, developed an ELT Bot over the course of a hackathon weekend in March this year. Disappointingly, it wasn’t really a bot – more a case of humans pretending to be a bot pretending to be humans – but it probably served its exploratory purpose. And a few months ago Duolingo began incorporating bots. These are currently only available for French, Spanish and German learners in the iPhone app, so I haven’t been able to try them out and evaluate them. According to an infomercial in TechCrunch, ‘to make talking to the bots a bit more compelling, the company tried to give its different bots a bit of personality. There’s Chef Robert, Renee the Driver and Officer Ada, for example. They will react differently to your answers (and correct you as necessary), but for the most part, the idea here is to mimic a real conversation. These bots also allow for a degree of flexibility in your answers that most language-learning software simply isn’t designed for. There are plenty of ways to greet somebody, for example, but most services will often only accept a single answer. When you’re totally stumped for words, though, Duolingo offers a ‘help my reply’ button with a few suggested answers.’ In the last twelve months or so, Duolingo has considerably improved its ability to recognise multiple correct ways of expressing a particular idea, and to recognise alternative answers to its translation tasks. However, I’m highly sceptical about its ability to mimic a real conversation any better than Cleverbot or Alice the EFL Bot, or its ability to provide systematically useful corrections.

My reasons lie in the current limitations of AI and NLP (Natural Language Processing). In a nutshell, we simply don’t know how to build a machine that can truly understand human language. Limited exchanges in restricted domains can be done pretty well (such as ELIZA, the early chatbot that did a good job of simulating an encounter with an evasive therapist, or, more recently, ordering a taco and having a meaningless, but flirty, conversation with a bot), but despite recent advances in semantic computing, we’re a long way from anything that can mimic a real conversation. As Audrey Watters puts it, we’re not even close.

When it comes to identifying language errors made by language learners, we’re not really much better off. Apps like Grammarly are not bad at identifying grammatical errors (though not good enough to be reliable), but they are pretty hopeless at dealing with lexical appropriacy. Much more reliable feedback to learners can be offered when the software is trained on particular topics and text types. Write & Improve does this with a relatively small selection of Cambridge English examination tasks, but a free conversation …? Forget it.

So, how might chatbots be incorporated into language teaching / learning? A blog post from December 2015 entitled AI-powered chatbots and the future of language learning suggests one plausible possibility. Using an existing messenger service, such as WhatsApp or Telegram, an adaptive chatbot would send tasks (such as participation in a conversation thread with a predetermined topic, register, etc., or pronunciation practice or translation exercises) to a learner, provide feedback and record the work for later recycling. At the same time, the bot could send out reminders of work that needs to be done or administrative tasks that must be completed.
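The architecture suggested in that blog post can be sketched in a few lines of Python. This is only an illustration of the idea, not of any real messenger API: the class and method names below are all invented, and a production system would sit behind the WhatsApp or Telegram bot interfaces rather than returning strings.

```python
class LearningBot:
    """Toy sketch of the messenger-based tutor bot described above.
    'Sending' a message here just means returning a string."""

    def __init__(self):
        self.task_queue = []   # tasks waiting to be sent, with due times
        self.record = []       # completed work, kept for later recycling

    def schedule(self, task, due):
        """Queue a task (conversation thread, pronunciation practice,
        translation exercise, or an administrative reminder)."""
        self.task_queue.append({"task": task, "due": due})

    def send_due_tasks(self, now):
        """Return the messages the bot would push to the learner now."""
        due = [t for t in self.task_queue if t["due"] <= now]
        self.task_queue = [t for t in self.task_queue if t["due"] > now]
        return ["Task: " + t["task"] for t in due]

    def log_response(self, task, response, feedback):
        # Store the exchange so that problem items can be recycled later.
        self.record.append({"task": task, "response": response,
                            "feedback": feedback})

    def items_for_recycling(self):
        """Tasks the learner got wrong, to be re-sent in future sessions."""
        return [r["task"] for r in self.record if r["feedback"] != "correct"]
```

The interesting part, of course, is everything this sketch leaves out: generating the feedback itself.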

Kat Robb has written a very practical article about using instant messaging in English language classrooms. Her ideas are interesting (although I find the idea of students in a F2F classroom messaging each other slightly bizarre) and it’s easy to imagine ways in which her activities might be augmented with chatbot interventions. The Write & Improve app, mentioned above, could deploy a chatbot interface to give feedback instead of the flat (and, in my opinion, perfectly adequate) pop-up boxes currently in use. Come to think of it, more or less any digital language learning tool could be pimped up with a bot: countless variations can be envisioned.

But the overwhelming question is: would it be worth it? Bots are not likely, any time soon, to revolutionise language learning. What they might just do, however, is help to further reduce language teaching to a series of ‘mechanical and scripted gestures’. More certain is that a lot of money will be thrown down the post-truth edtech drain. Then, in the not too distant future, this latest piece of edtech will fall into the trough of disillusionment, to be replaced by the latest latest thing.



Adaptive learning providers make much of their ability to provide learners with personalised feedback and to provide teachers with dashboard feedback on the performance of both individuals and groups. All well and good, but my interest here is in the automated feedback that software could provide on very specific learning tasks. Scott Thornbury, in a recent talk, ‘Ed Tech: The Mouse that Roared?’, listed six ‘problems’ of language acquisition that educational technology for language learning needs to address. One of these he framed as follows: ‘The feedback problem, i.e. how does the learner get optimal feedback at the point of need?’, and suggested that technological applications ‘have some way to go.’ He was referring, not to the kind of feedback that dashboards can provide, but to the kind of feedback that characterises a good language teacher: corrective feedback (CF) – the way that teachers respond to learner utterances (typically those containing errors, but not necessarily restricted to these) in what Ellis and Shintani call ‘form-focused episodes’[1]. These responses may include a direct indication that there is an error, a reformulation, a request for repetition, a request for clarification, an echo with questioning intonation, etc. Basically, they are correction techniques.

These days, there isn’t really any debate about the value of CF. There is a clear research consensus that it can aid language acquisition. Discussing learning in more general terms, Hattie[2] claims that ‘the most powerful single influence enhancing achievement is feedback’. The debate now centres around the kind of feedback, and when it is given. Interestingly, evidence[3] has been found that CF is more effective in the learning of discrete items (e.g. some grammatical structures) than in communicative activities. Since it is precisely this kind of approach to language learning that we are more likely to find in adaptive learning programs, it is worth exploring further.

What do we know about CF in the learning of discrete items? First of all, it works better when it is explicit than when it is implicit (Li, 2010), although this needs to be nuanced. In immediate post-tests, explicit CF is better than implicit variations. But over a longer period of time, implicit CF provides better results. Secondly, formative feedback (as opposed to right / wrong testing-style feedback) strengthens retention of the learning items: this typically involves the learner repairing their error, rather than simply noticing that an error has been made. This is part of what cognitive scientists[4] sometimes describe as the ‘generation effect’. Whilst learners may benefit from formative feedback without repairing their errors, Ellis and Shintani (2014: 273) argue that the repair may result in ‘deeper processing’ and, therefore, assist learning. Thirdly, there is evidence that some delay in receiving feedback aids subsequent recall, especially over the longer term. Ellis and Shintani (2014: 276) suggest that immediate CF may ‘benefit the development of learners’ procedural knowledge’, while delayed CF is ‘perhaps more likely to foster metalinguistic understanding’. You can read a useful summary of a meta-analysis of feedback effects in online learning here, or you can buy the whole article here.

I have yet to see an online language learning program which can do CF well, but I think it’s a matter of time before things improve significantly. First of all, at the moment, feedback is usually immediate, or almost immediate. This is unlikely to change, for a number of reasons – foremost among them being the pride that ed tech takes in providing immediate feedback, and the fact that online learning is increasingly being conceptualised and consumed in bite-sized chunks, something you do on your phone between doing other things. What will change in better programs, however, is that feedback will become more formative. As things stand, tasks are usually of a very closed variety, with drag-and-drop being one of the most popular. Only one answer is possible and feedback is usually of the right / wrong-and-here’s-the-correct-answer kind. But tasks of this kind are limited in their value, and, at some point, tasks are needed where more than one answer is possible.

Here’s an example of a translation task from Duolingo, where a simple sentence could be translated into English in quite a large number of ways.

Decontextualised as it is, the sentence could be translated in the way that I have done it, although it’s unlikely. The feedback, however, is of relatively little help to the learner, who would benefit from guidance of some sort. The simple reason that Duolingo doesn’t offer useful feedback is that the programme is static. It has been programmed to accept certain answers (e.g. in this case both the present simple and the present continuous are acceptable), but everything else will be rejected. Why? Because it would take too long and cost too much to anticipate and enter in all the possible answers. Why doesn’t it offer formative feedback? Because in order to do so, it would need to identify the kind of error that has been made. If we can identify the kind of error, we can make a reasonable guess about the cause of the error, and select appropriate CF … this is what good teachers do all the time.
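The ‘static’ checking described here can be illustrated in a few lines of Python. The accepted answers and the feedback string below are my invention, not Duolingo’s actual data, but the logic is the point: anything outside the pre-entered list gets the same unhelpful response.

```python
# Only pre-entered answers pass; everything else is simply 'wrong'.
ACCEPTED = {
    "i make a basket for my mother",
    "i am making a basket for my mother",
}

def normalise(answer):
    """Lower-case and collapse whitespace before comparison."""
    return " ".join(answer.lower().split())

def check(answer):
    if normalise(answer) in ACCEPTED:
        return "Correct!"
    # No analysis of *why* the answer failed: every rejection,
    # whatever its cause, produces the same feedback.
    return "Wrong. Correct answer: I am making a basket for my mother."
```

A plausible-but-unanticipated translation and a garbled one are indistinguishable to this kind of checker, which is exactly the problem discussed above.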

Analysing the kind of error that has been made is the first step in providing appropriate CF, and it can be done, with increasing accuracy, by current technology, but it requires a lot of computing. Let’s take spelling as a simple place to start. If you enter ‘I am makeing a basket for my mother’ in the Duolingo translation above, the program tells you ‘Nice try … there’s a typo in your answer’. Given the configuration of keyboards, it is highly unlikely that this is a typo. It’s a simple spelling mistake and teachers recognise it as such because they see it so often. For software to achieve the same insight, it would need, as a start, to trawl a large English dictionary database and a large tagged database of learner English. The process is quite complicated, but it’s perfectly do-able, and learners could be provided with CF in the form of a ‘spelling hint’.
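As a sketch of the first step, a Levenshtein edit-distance check against a word list can separate a probable spelling slip from a simply unknown word. The tiny DICTIONARY below stands in for the large dictionary database mentioned above; everything else is standard.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,            # deletion
                            curr[j - 1] + 1,        # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

# Stand-in for a full English dictionary database.
DICTIONARY = {"i", "am", "a", "for", "my", "make", "making",
              "basket", "mother"}

def spelling_hint(word):
    """Return a hint if the word is close to a known word, else None."""
    w = word.lower()
    if w in DICTIONARY:
        return None
    close = [d for d in DICTIONARY if edit_distance(w, d) <= 2]
    if close:
        best = min(close, key=lambda d: edit_distance(w, d))
        return f"Spelling hint: did you mean {best!r}?"
    return None
```

With ‘makeing’, the nearest dictionary word is one edit away, so the software has grounds for offering a spelling hint rather than calling it a typo.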

Rather more difficult is the error illustrated in my first screen shot. What’s the cause of this ‘error’? Teachers know immediately that this is probably a classic confusion of ‘do’ and ‘make’. They know that the French verb ‘faire’ can be translated into English as ‘make’ or ‘do’ (among other possibilities), and the error is a common language transfer problem. Software could do the same thing. It would need a large corpus (to establish that ‘make’ collocates with ‘a basket’ more often than ‘do’), a good bilingualised dictionary (plenty of these now exist), and a tagged database of learner English. Again, appropriate automated feedback could be provided in the form of some sort of indication that ‘faire’ is only sometimes translated as ‘make’.
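A toy version of that collocation check might look like this. The corpus counts are invented stand-ins for frequencies from a real reference corpus, and the hint text is mine, but the mechanism (compare how strongly each candidate verb collocates with the noun) is the one described above.

```python
# Invented frequencies standing in for a large reference corpus.
COLLOCATION_COUNTS = {
    ("make", "basket"): 1240,
    ("do", "basket"): 3,
    ("do", "homework"): 5230,
    ("make", "homework"): 12,
}

def better_verb(verb, noun, alternatives=("make", "do")):
    """Return the verb that collocates more strongly with the noun,
    or None if the learner's choice is already the best one."""
    best = max(alternatives,
               key=lambda v: COLLOCATION_COUNTS.get((v, noun), 0))
    if best != verb and (COLLOCATION_COUNTS.get((best, noun), 0) >
                         COLLOCATION_COUNTS.get((verb, noun), 0)):
        return best
    return None

def transfer_hint(verb, noun, l1_verb="faire"):
    """Frame the correction as an L1 transfer hint, as suggested above."""
    suggestion = better_verb(verb, noun)
    if suggestion:
        return (f"'{l1_verb}' is only sometimes translated as '{verb}': "
                f"with '{noun}', English prefers '{suggestion}'.")
    return None
```

A real system would also need the bilingualised dictionary and tagged learner corpus to decide *when* a transfer explanation is appropriate; this sketch only covers the collocation half.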

These are both relatively simple examples, but it’s easy to think of others that are much more difficult to analyse automatically. Duolingo rejects ‘I am making one basket for my mother’: it’s not very plausible, but it’s not wrong. Teachers know why learners do this (again, it’s probably a transfer problem) and know how to respond (perhaps by saying something like ‘Only one?’). Duolingo also rejects ‘I making a basket for my mother’ (a common enough error), but is unable to provide any help beyond the correct answer. Automated CF could, however, be provided in both cases if more tools are brought into play. Multiple parsing machines (one is rarely accurate enough on its own) and semantic analysis will be needed. Both the range and the complexity of the available tools are increasing so rapidly (see here for the sort of research that Google is doing and here for an insight into current applications of this research in language learning) that Duolingo-style right / wrong feedback will very soon seem positively antediluvian.
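For the ‘I making’ error, even a crude hand-written rule gets beyond right / wrong feedback, though it also shows why the parsers and taggers mentioned above are needed: a pattern this naive would mis-flag sentences like ‘I sing’, where the verb happens to end in -ing.

```python
AUXILIARIES = {"i": "am", "you": "are", "he": "is", "she": "is",
               "it": "is", "we": "are", "they": "are"}

def missing_auxiliary_hint(sentence):
    """Flag the common 'I making...' pattern: a subject pronoun followed
    directly by an -ing form, with no auxiliary between them.
    Deliberately naive: a real system would need POS tagging and
    multiple parsers to avoid false alarms (e.g. 'I sing')."""
    words = sentence.lower().rstrip(".!?").split()
    for subj, nxt in zip(words, words[1:]):
        if subj in AUXILIARIES and nxt.endswith("ing"):
            aux = AUXILIARIES[subj]
            return f"Did you forget '{aux}'? ({subj} {aux} {nxt} ...)"
    return None
```

The hint text is my invention; the point is that identifying the error type makes a targeted prompt possible, where Duolingo can only show the correct answer.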

One further development is worth mentioning here, and it concerns feedback and gamification. Teachers know from the way that most learners respond to written CF that they are usually much more interested in knowing what they got right or wrong than in the reasons for this. Most students spend more time looking at the score at the bottom of a corrected piece of written work than at the laborious annotations of the teacher throughout the text. Getting students to pay close attention to the feedback we provide is not easy. Online language learning systems with gamification elements, like Duolingo, typically reward learners for getting things right, and for getting things right in the fewest attempts possible. They encourage learners to look for the shortest or cheapest route to finding the correct answers: learning becomes a sexed-up form of test. If, however, the automated feedback is good, this sort of gamification encourages the wrong sort of learning behaviour. Gamification designers will need to shift their attention away from the current concern with right / wrong, and towards ways of motivating learners to look at and respond to feedback. It’s tricky, because you want to encourage learners to take more risks (and reward them for doing so), but it makes no sense to penalise them for getting things right. The probable solution is to have a dual points system: one set of points for getting things right, another for employing positive learning strategies.
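That dual points system might be sketched like this. The point values and the behaviour flags are, of course, invented: what matters is that risk-taking, reading the feedback and asking for help earn strategy points rather than costing accuracy points.

```python
def score_attempt(correct, used_hint, reviewed_feedback, risky_structure):
    """Toy dual scoring: accuracy points for getting things right,
    separate 'strategy' points for behaviours we want to encourage."""
    accuracy = 10 if correct else 0
    strategy = 0
    if reviewed_feedback:
        strategy += 5   # looked at and responded to the feedback
    if risky_structure:
        strategy += 5   # attempted something beyond the safe minimum
    if used_hint:
        strategy += 2   # asking for help is a strategy, not a penalty
    return {"accuracy": accuracy, "strategy": strategy}
```

Under this scheme, a wrong answer produced by an ambitious attempt, followed by careful attention to the feedback, still scores well on the strategy track.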

The provision of automated ‘optimal feedback at the point of need’ may not be quite there yet, but it seems we’re on the way for some tasks in discrete-item learning. There will probably always be some teachers who can outperform computers in providing appropriate feedback, in the same way that a few top chess players can beat ‘Deep Blue’ and its scions. But the rest of us had better watch our backs: in the provision of some kinds of feedback, computers are catching up with us fast.

[1] Ellis, R. & N. Shintani (2014) Exploring Language Pedagogy through Second Language Acquisition Research. Abingdon: Routledge p. 249

[2] Hattie, J. (2009) Visible Learning. Abingdon: Routledge p. 12

[3] Li, S. (2010) ‘The effectiveness of corrective feedback in SLA: a meta-analysis’ Language Learning 60/2: 309-365

[4] Brown, P.C., Roediger, H.L. & McDaniel, M.A. (2014) Make It Stick. Cambridge, Mass.: Belknap Press

There are a number of reasons why we sometimes need to describe a person’s language competence using a single number. Most of these are connected to the need for a shorthand to differentiate people, in summative testing or in job selection, for example. Numerical (or grade) allocation of this kind is so common (and especially in times when accountability is greatly valued) that it is easy to believe that this number is an objective description of a concrete entity, rather than a shorthand description of an abstract concept. In the process, the abstract concept (language competence) becomes reified and there is a tendency to stop thinking about what it actually is.

Language is messy. It’s a complex, adaptive system of communication which has a fundamentally social function. As Diane Larsen-Freeman and others have argued, ‘patterns of use strongly affect how language is acquired, is used, and changes. These processes are not independent of one another but are facets of the same complex adaptive system. […] The system consists of multiple agents (the speakers in the speech community) interacting with one another [and] the structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms.’

As such, competence in language use is difficult to measure. There are ways of capturing some of it (think of the pages and pages of competency statements in the Common European Framework), but there has always been something deeply unsatisfactory about documents of this kind. How, for example, are we supposed to differentiate, exactly and objectively, between, say, ‘can participate fully in an interview’ (C1) and ‘can carry out an effective, fluent interview’ (B2)? The short answer is that we can’t. There are too many of these descriptors anyway, and, even if we did attempt to use such a detailed tool to describe language competence, we would still be left with a very incomplete picture. There is at least one whole book devoted to attempts to test the untestable in language education (edited by Amos Paran and Lies Sercu, Multilingual Matters, 2010).

So, here is another reason why we are tempted to use shorthand numerical descriptors (such as A1, A2, B1, etc.) to describe something which is very complex and abstract (‘overall language competence’) and to reify this abstraction in the process. From there, it is a very short step to making things even more numerical, more scientific-sounding. Number-creep in recent years has brought us the Pearson Global Scale of English which can place you at a precise point on a scale from 10 to 90. Not to be outdone, Cambridge English Language Assessment now has a scale that runs from 80 points to 230, although Cambridge does, at least, allocate individual scores for four language skills.

As the title of this post suggests (in its reference to Stephen Jay Gould’s The Mismeasure of Man), I am suggesting that there are parallels between attempts to measure language competence and the sad history of attempts to measure ‘general intelligence’. Both are guilty of the twin fallacies of reification and ranking – the ordering of complex information as a gradual ascending scale. These conceptual fallacies then lead us, through the way that they push us to think about language, into making further conceptual errors about language learning. We start to confuse language testing with the ways that language learning can be structured.

We begin to granularise language. We move inexorably away from difficult-to-measure hazy notions of language skills towards what, on the surface at least, seem more readily measurable entities: words and structures. We allocate to them numerical values on our testing scales, so that an individual word can be deemed to be higher or lower on the scale than another word. And then we have a syllabus, a synthetic syllabus, that lends itself to digital delivery and adaptive manipulation. We find ourselves in a situation where materials writers for Pearson, writing for a particular ‘level’, are only allowed to use vocabulary items and grammatical structures that correspond to that ‘level’. We find ourselves, in short, in a situation where the acquisition of a complex and messy system is described as a linear, additive process. Here’s an example from the Pearson website: ‘If you score 29 on the scale, you should be able to identify and order common food and drink from a menu; at 62, you should be able to write a structured review of a film, book or play. And because the GSE is so granular in nature, you can conquer smaller steps more often; and you are more likely to stay motivated as you work towards your goal.’ It’s a nonsense, a nonsense that is dictated by the needs of testing and adaptive software, but the sciency-sounding numbers help to hide the conceptual fallacies that lie beneath.

Perhaps, though, this doesn’t matter too much for most language learners. In the early stages of language learning (where most language learners are to be found), there are countless millions of people who don’t seem to mind the granularised programmes of Duolingo or Rosetta Stone, or the Grammar McNuggets of coursebooks. In these early stages, anything seems to be better than nothing, and the testing is relatively low-stakes. But as a learner’s interlanguage becomes more complex, and as the language she needs to acquire becomes more complex, attempts to granularise it and to present it in a linearly additive way become more problematic. It is for this reason, I suspect, that the appeal of granularised syllabuses declines so rapidly the more progress a learner makes. It comes as no surprise that, the further up the scale you get, the more that both teachers and learners want to get away from pre-determined syllabuses in coursebooks and software.

Adaptive language learning software is continuing to gain traction in the early stages of learning, in the initial acquisition of basic vocabulary and structures and in coming to grips with a new phonological system. It will almost certainly gain even more. But the challenge for the developers and publishers will be to find ways of making adaptive learning work for more advanced learners. Can it be done? Or will the mismeasure of language make it impossible?

The cheer-leading for big data in education continues unabated. Almost everything you read online on the subject is an advertisement, usually disguised as a piece of news or a blog post, but which can invariably be traced back to an organisation with a vested interest in digital disruption. A typical example is this advergraphic which comes under a banner that reads ‘Big Data Improves Education’. The site, Datafloq, is selling itself as ‘the one-stop-shop around Big Data.’ Their ‘vision’ is ‘Connecting Data and People and [they] aim to achieve that by spurring the understanding, acceptance and application of Big Data in order to drive innovation and economic growth.’

Critical voices are rare, but growing. There’s a very useful bibliography of recent critiques here. And in the world of English language teaching, I was pleased to see that there’s a version of Gavin Dudeney’s talk, ‘Of Big Data & Little Data’, now up on YouTube. The slides which accompany his talk can be accessed here.

His main interest is in reclaiming the discourse of edtech in ELT, in moving away from the current obsession with numbers, and in returning the focus to what he calls ‘old edtech’ – the everyday technological practices of the vast majority of ELT practitioners.

It’s a stimulating and deadpan-entertaining talk and well worth 40 minutes of your time. Just fast-forward past the bit where he talks about me.

If you’re interested in hearing more critical voices, you may also like to listen to a series of podcasts, put together by the IATEFL Learning Technologies and Global Issues Special Interest Groups. In the first of these, I interview Neil Selwyn and, in the second, Lindsay Clandfield interviews Audrey Watters of Hack Education.


Duolingo testing

Posted: September 6, 2014 in testing

After a break of two years, I recently returned to Duolingo in an attempt to build my German vocabulary. The attempt lasted a week. A few small things had changed, but the essentials had not, and my amusement at translating sentences like The duck eats oranges, A red dog wears white clothes or The fly is important soon turned to boredom and irritation. There are better, free ways of building vocabulary in another language.

Whilst little is new in the learning experience of Duolingo, there are significant developments at the company. The first of these is a new funding round in which they raised a further $20 million, bringing total investment to close to $40 million. Duolingo now has more than 25 million users, half of whom are described as ‘active’, and, according to Luis von Ahn, the company’s founder, their ambition is to dominate the language learning market. Approaching their third anniversary, though, Duolingo will need, before long, to turn a profit or, at least, to break even. The original plan, to use the language data generated by users of the site to power a paying translation service, is beginning to bear fruit, with contracts with CNN and BuzzFeed. But Duolingo is going to need other income streams. This may well be part of the reason behind their decision to develop and launch their own test.

Duolingo’s marketing people, however, are trying to get another message across: ‘Every year, over 30 million job seekers and students around the world are forced to take a test to prove that they know English in order to apply for a job or school. For some, these tests can cost their family an entire month’s salary. And not only that, taking them typically requires traveling to distant examination facilities and waiting weeks for the results. We believe there should be a better way. This is why today I’m proud to announce the beta release of the Duolingo Test Center, which was created to give everyone equal access to jobs and educational opportunities. Now anyone can conveniently certify their English skills from home, on their mobile device, and for only $20. That’s 1/10th the cost of existing tests.’ Talking the creative disruption talk, Duolingo wants to break into the “archaic” industry of language proficiency tests. Basically, then, they want to make the world a better place. I seem to have heard this kind of thing before.

The tests will cost $20. Gina Gotthilf, Duolingo’s head of marketing, explains the pricing strategy: ‘We came up with the smallest value that works for us and that a lot of people can pay.’ Duolingo’s main markets are now the BRICS countries. In China, for example, 1.5 million people signed up with Duolingo in just one week in April of this year, according to @TECHINASIA. Besides China, Duolingo has expanded into India, Japan, Korea, Taiwan, Hong Kong, Vietnam and Indonesia this year. (Brazil already has 2.4 million users, and there are 1.5 million in Mexico.) That’s a lot of potential customers.

So, what do you get for your twenty bucks? Not a lot, is the short answer. The test lasts about 18 minutes. There are four sections, and adaptive software analyses the testee’s responses to determine the level of difficulty of subsequent questions. The first section requires users to select real English words from a list which includes invented words. The second is a short dictation, the third is a gapfill, and the fourth is a read-aloud task which is recorded and compared to a native-speaker norm. That’s it.
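The adaptivity described here can be sketched as a simple ‘staircase’ procedure: each correct answer moves the testee up a difficulty level, each wrong answer moves them down. Real computer-adaptive tests use item response theory rather than this toy rule, and the item bank below is invented, but the principle of adjusting difficulty to responses is the same.

```python
def run_adaptive_section(items_by_level, answers_correct, start_level=3):
    """Toy staircase adaptivity over a bank of items keyed by level.
    Returns the final level estimate and the sequence administered."""
    level = start_level
    administered = []
    for correct in answers_correct:
        administered.append((level, items_by_level[level]))
        if correct:
            level = min(level + 1, max(items_by_level))  # harder next
        else:
            level = max(level - 1, min(items_by_level))  # easier next
    return level, administered
```

Even this crude version converges on a rough level estimate in far fewer items than a fixed-form test, which is how an 18-minute test can claim to place testees at all.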

Duolingo claims that the test scores correlate very well with TOEFL, but the claim is based on a single study by a University of Pittsburgh professor that was sponsored by Duolingo. Will further studies replicate the findings? I, for one, wouldn’t bet on it, but I won’t insult your intelligence by explaining my reasons. Test validity and reliability, then, remain to be proved, but even John Lehoczky, interim executive vice president of Carnegie Mellon University (Duolingo was developed by researchers from Carnegie Mellon’s computer science department), acknowledges that ‘at this point [the test] is not a fit vehicle for undergraduate admissions’.

Even more of a problem than validity and reliability, however, is the question of security. The test is delivered via the web or smartphone apps (Android and iOS). Testees have to provide photo ID and a photo taken on the device they are using. There are various rules (they must be alone, no headphones, etc) and a human proctor reviews the test after it has been completed. This is unlikely to impress bodies like the British immigration authorities, which recently refused to recognise online TOEFL and TOEIC qualifications after a BBC documentary revealed ‘systematic fraud’ in the taking of these tests.

There will always be a market of sorts for valueless qualifications (think, for example, of all the cheap TEFL courses that can be taken online), but to break into the monopoly of TOEFL and IELTS (and soon perhaps Pearson), Duolingo will need to deal with the issues of validity, reliability and security. If they don’t, few – if any – institutions of higher education will recognise the test. But if they do, they’ll need to spend more money: a team of applied linguists with expertise in testing would be a good start, and serious proctoring doesn’t come cheap. Will they be able to do this and keep the price down to $20?



Personalization is one of the key leitmotifs in current educational discourse. The message is clear: personalization is good, one-size-fits-all is bad. ‘How to personalize learning and how to differentiate instruction for diverse classrooms are two of the great educational challenges of the 21st century,’ write Trilling and Fadel, leading lights in the Partnership for 21st Century Skills (P21)[1]. Barack Obama has repeatedly sung the praises of, and the need for, personalized learning and his policies are fleshed out by his Secretary of Education, Arne Duncan, in speeches and on the White House blog: ‘President Obama described the promise of personalized learning when he launched the ConnectED initiative last June. Technology is a powerful tool that helps create robust personalized learning environments.’ In the UK, personalized learning has been government mantra for over 10 years. The EU, UNESCO, OECD, the Gates Foundation – everyone, it seems, is singing the same tune.

Personalization, we might all agree, is a good thing. How could it be otherwise? No one these days is going to promote depersonalization or impersonalization in education. What exactly it means, however, is less clear. According to a UNESCO Policy Brief[2], the term was first used in the context of education in the 1970s by Victor García Hoz, a senior Spanish educationalist and member of Opus Dei at the University of Madrid. This UNESCO document then points out that ‘unfortunately, up to this date there is no single definition of this concept’.

In ELT, the term has been used in a very wide variety of ways. These range from the far-reaching ideas of people like Gertrude Moskowitz, who advocated a fundamentally learner-centred form of instruction, to the much more banal practice of getting students to produce a few personalized examples of an item of grammar they have just studied. See Scott Thornbury’s A-Z blog for an interesting discussion of personalization in ELT.

As with education in general, and ELT in particular, ‘personalization’ is also bandied around the adaptive learning table. Duolingo advertises itself as the opposite of one-size-fits-all, and as an online equivalent of the ‘personalized education you can get from a small classroom teacher or private tutor’. Babbel offers a ‘personalized review manager’ and Rosetta Stone’s Classroom online solution allows educational institutions ‘to shift their language program away from a ‘one-size-fits-all-curriculum’ to a more individualized approach’. As far as I can tell, the personalization in these examples is extremely restricted. The language syllabus is fixed and although users can take different routes up the ‘skills tree’ or ‘knowledge graph’, they are totally confined by the pre-determination of those trees and graphs. This is no more personalized learning than asking students to make five true sentences using the present perfect. Arguably, it is even less!

This is not, in any case, the kind of personalization that Obama, the Gates Foundation, Knewton, et al have in mind when they conflate adaptive learning with personalization. Their definition is much broader and summarised in the US National Education Technology Plan of 2010: ‘Personalized learning means instruction is paced to learning needs, tailored to learning preferences, and tailored to the specific interests of different learners. In an environment that is fully personalized, the learning objectives and content as well as the method and pace may all vary (so personalization encompasses differentiation and individualization).’ What drives this is the big data generated by the students’ interactions with the technology (see ‘Part 4: big data and analytics’ of ‘The Guide’ on this blog).

What remains unclear is exactly how this might work in English language learning. Adaptive software can only personalize to the extent that the content of an English language learning programme allows it to do so. It may be true that each student using adaptive software ‘gets a more personalised experience no matter whose content the student is consuming’, as Knewton’s David Liu puts it. But the potential for any really meaningful personalization depends crucially on the nature and extent of this content, along with the possibility of variable learning outcomes. For this reason, we are not likely to see any truly personalized large-scale adaptive learning programs for English any time soon.

Nevertheless, technology is now central to personalized language learning. A good learning platform, which allows learners to connect to ‘social networking systems, podcasts, wikis, blogs, encyclopedias, online dictionaries, webinars, online English courses, various apps’, etc (see Alexandra Chistyakova’s eltdiary), means that personalization could be more easily achieved.

For the time being, at least, adaptive learning systems would seem to work best for ‘those things that can be easily digitized and tested like math problems and reading passages’, writes Barbara Bray. Or low-level vocabulary and grammar McNuggets, we might add. Ideal for, say, ‘English Grammar in Use’. But meaningfully personalized language learning?


‘Personalized learning’ sounds very progressive, a utopian educational horizon, and it sounds like it ought to be the future of ELT (as Cleve Miller argues). It also sounds like a pretty good slogan on which to hitch the adaptive bandwagon. But somehow, just somehow, I suspect that when it comes to adaptive learning we’re more likely to see more testing, more data collection and more depersonalization.

[1] Trilling, B. & Fadel, C. 2009 21st Century Skills (San Francisco: Wiley) p.33

[2] Personalized learning: a new ICT-enabled education approach, UNESCO Institute for Information Technologies in Education, Policy Brief March 2012


busuu is an online language learning service. I did not refer to it in the ‘guide’ because it does not seem to use any adaptive learning software yet, but this is set to change. According to founder Bernhard Niesner, the company is already working on incorporating adaptive software.

A few statistics will show the significance of busuu. The site currently has over 40 million users (El Pais, 8 February 2014) and is growing by 40,000 a day. The basic service is free, but the premium service costs €69.99 a year. The company will not give detailed user statistics, but says that ‘hundreds of thousands’ are paying for the premium service, that turnover was a 7-figure number last year and that it will rise to 8 figures this year.

It is easy to understand why traditional publishers might be worried about competition like busuu and why they are turning away from print-based courses.

Busuu offers 12 languages, but, as a translation-based service, any one of these languages can only be studied if you speak one of the other languages on offer. The levels of the different courses are tagged to the CEFR.


In some ways, busuu is not so different from competitors like Duolingo. Students are presented with bilingual vocabulary sets, accompanied by pictures, which are tested in a variety of ways. As with Duolingo, some of this is a little strange. For German at level A1, I did a vocabulary set on ‘pets’ which presented the German words for a ferret, a tortoise and a guinea-pig, among others. There are dialogues, both written and recorded, that are sometimes surreal.

Child: Mum, look over there, there’s a dog without a collar, can we take it?

Mother: No, darling, our house is too small to have a dog.

Child: Mum your bedroom is very big, it can sleep with dad and you.

Mother: Come on, I’ll buy you a toy dog.

The dialogues are followed up by multiple choice questions which test your memory of the dialogue. There are also writing exercises where you are given a picture from National Geographic and asked to write about it. It’s not always clear what one is supposed to write. What would you say about a photo that showed a large number of parachutes in the sky, beyond ‘I can see a lot of parachutes’?

There are also many gamification elements. There is a learning carrot where you can set your own learning targets and users can earn ‘busuuberries’ which can then be traded in for animations in a ‘language garden’.


But in one significant respect, busuu differs from its competitors. It combines the usual vocabulary, grammar and dialogue work with social networking. Users can interact through text or video, and feedback on written work comes from other users. My own experience with this was mixed, but the potential is clear. Feedback on other learners’ work is encouraged by the awarding of ‘busuuberries’.

We will have to wait and see what busuu does with adaptive software and what it will do with the big data it is generating. For the moment, its interest lies in illustrating what could be done with a learning platform and adaptive software. The big ELT publishers know they have a new kind of competition and, with a lot more money to invest than busuu, we have to assume that what they will launch a few years from now will do everything that busuu does, and more. Meanwhile, busuu are working on site redesign and adaptivity. They would do well, too, to sort out their syllabus!

‘Adaptive learning’ can mean slightly different things to different people. According to one provider of adaptive learning software (Smart Sparrow), it is ‘an online learning and teaching medium that uses an Intelligent Tutoring System to adapt online learning to the student’s level of knowledge. Adaptive eLearning provides students with customised educational content and the unique feedback that they need, when they need it.’ Essentially, it is software that analyzes the work that a student is doing online, and tailors further learning tasks to the individual learner’s needs (as analyzed by the software).
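In its most basic form, the loop described above can be sketched in a few lines of code. Everything here – the skill names, the 0-to-1 ‘mastery’ score and the fixed step size – is an invented illustration of the general idea, not any vendor’s actual algorithm.

```python
# A toy adaptive loop: update a per-skill mastery estimate after each
# answer, then serve the learner more practice on their weakest skill.
# The heuristic and numbers are illustrative assumptions only.

def update_mastery(mastery, correct, step=0.2):
    """Nudge a 0-1 mastery estimate up or down after one answer."""
    return min(1.0, mastery + step) if correct else max(0.0, mastery - step)

def next_task(mastery_by_skill):
    """Pick the skill the learner currently knows least well."""
    return min(mastery_by_skill, key=mastery_by_skill.get)

mastery = {"present simple": 0.6, "articles": 0.3, "past simple": 0.5}
mastery["articles"] = update_mastery(mastery["articles"], correct=True)
print(next_task(mastery))  # prints the weakest skill: articles
```

Commercial systems replace the crude step update with statistical models of learner knowledge, but the select-test-update cycle is the same.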

A relatively simple example of adaptive language learning is Duolingo, a free online service that currently offers seven languages, including English, with over 10 million users in November 2013. Learners progress through a series of translation, dictation and multiple choice exercises that are organised into a ‘skill tree’ of vocabulary and grammar areas. Because translation plays such a central role, the program is only suitable for speakers of one of the languages on offer in combination with one of the other languages on offer. Duolingo’s own blog describes the approach in the following terms: ‘Every time you finish a Duolingo lesson, translation, test, or practice session, you provide valuable data about what you know and what you’re struggling with. Our system uses this info to plan future lessons and select translation tasks specifically for your skills and needs. Similar to how an online store uses your previous purchases to customize your shopping experience, Duolingo uses your learning history to customize your learning experience.’

Example of a ‘skill tree’

For anyone with a background in communicative language teaching, the experience can be slightly surreal. Examples of sentences that need to be translated include: ‘The dog eats the bird’, ‘the boy has a cow’ and ‘the fly is eating bread’. The system allows you to compete and communicate with other learners, and to win points and rewards (see ‘Gamification’, next post).

Duolingo describes its crowd-sourced, free, adaptive approach as ‘pretty unique’, but unique it is not. It is essentially a kind of memory trainer, and there are a number of these available on the market. One of the best-known is Cerego’s cloud-based iKnow!, which describes itself as a ‘memory management platform’. Particularly strong in Japan, its corporate and individual customers pay a monthly subscription to access its English, Chinese and Japanese language programs. A free trial of some of the products is available, and I experimented with their ‘Erudite English’ program. This presented a series of words, which included ‘defalcate’, ‘fleer’ and ‘kvetch’, through English-only definitions, followed by multiple choice and dictated gap-fill exercises. As with Duolingo, there seemed to be no obvious principle behind the choice of items, and example sentences included things like ‘Michael arrogates a slice of carrot cake, unbeknownst to his sister’ or ‘She found a place in which to posit the flowerpot.’ Based on a user’s performance, Cerego’s algorithms decide which items will be presented, and select the frequency and timing of opportunities for review. The program can be accessed through ordinary computers, as well as iPhone and Android apps. The platform has been designed in such a way as to allow other content to be imported, and then presented and practised in a similar way.

In a similar vein, the Rosetta Stone software also uses spaced repetition to teach grammar and vocabulary. It describes its adaptive learning as ‘Adaptive Recall™’. According to their website, this provides review activities for each lesson ‘at intervals that are determined by your performance in that review. Exceed the program’s expectations for you and the review gets pushed out further. Fall short and you’ll see it sooner. The program gives you a likely date and automatically notifies you when it’s time to take the review again’. Rosetta Stone has won numerous awards and claims that over 20,000 educational institutions around the world have formed partnerships with them. These include the US military, the University of Barcelona and Harrogate Grammar School in the UK.
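The push-out-or-pull-in behaviour that Rosetta Stone describes can be sketched as a simple expanding-interval scheduler. The growth and shrink factors below are invented for illustration; this is a generic spaced-repetition sketch, not the Adaptive Recall algorithm itself.

```python
from datetime import date, timedelta

# Toy spaced-repetition scheduling: a successful review stretches the
# next interval, a failed one shrinks it. All factors are assumptions.

def next_review(last_interval_days, passed, grow=2.0, shrink=0.5):
    """Return (new interval in days, likely date of the next review)."""
    interval = last_interval_days * (grow if passed else shrink)
    interval = max(1, round(interval))  # never less than a day away
    return interval, date.today() + timedelta(days=interval)

interval, due = next_review(4, passed=True)   # interval doubles to 8 days
interval, due = next_review(4, passed=False)  # interval halves to 2 days
```

Real systems typically derive the factors from a model of memory decay rather than fixed constants, but the exceed-expectations-and-wait-longer logic is the same.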

Slightly more sophisticated than the memory trainers described above is the online preparation program for the GRE (the Graduate Record Examinations, a test for admission into many graduate schools in the US) produced by Barron’s. Although this is not an English language course, it provides a useful example of how simple adaptive learning programs can be taken a few steps further. At the time of writing, it is possible to do a free trial, and this gives a good taste of adaptive learning. Barron’s highlights the way that their software delivers individualized study programs: it is not, they say, a case of ‘one size fits all’. After entering the intended test date, the intended number of hours of study, and a simple self-evaluation of different reasoning skills, a diagnostic test completes the information required to set up a personalized ‘prep plan’. This determines the lessons you will be given. As you progress through the course, the ‘prep plan’ adapts to the work that you do, comparing your performance to that of other students who have taken the course. As it measures your progress and modifies your ‘skill profile’, the order of the lessons and the selection from the 1000+ practice questions can change.