Posts Tagged ‘AI’

Knowble, claim its developers, is a browser extension that will improve English vocabulary and reading comprehension. It also describes itself as an ‘adaptive language learning solution for publishers’. It’s currently in beta and free, and it sounds right up my street, so I decided to give it a run.

Knowble reader

Users are asked to specify a first language (I chose French) and a level (A1 to C2): I chose B1, but this did not seem to impact on anything that subsequently happened. They are then offered a menu of about 30 up-to-date news items, grouped into 5 categories (world, science, business, sport, entertainment). Clicking on one article takes you to the article on the source website. There’s a good selection, including USA Today, CNN, Reuters, the Independent and the Torygraph from Britain, the Times of India, the Independent from Ireland and the Star from Canada. A large number of words are underlined: a single click brings up a translation in the extension box. Double-clicking on all other words will also bring up translations. Apart from that, there is one very short exercise (which has presumably been automatically generated) for each article.

For my trial run, I picked three articles: ‘Woman asks firefighters to help ‘stoned’ raccoon’ (from the BBC, 240 words), ‘Plastic straw and cotton bud ban proposed’ (also from the BBC, 823 words) and ‘London’s first housing market slump since 2009 weighs on UK price growth’ (from the Torygraph, 471 words).


Research suggests that the use of translations, rather than definitions, may lead to more learning gains, but the problem with Knowble is that it relies entirely on Google Translate. Google Translate is fast improving. Take the first sentence of the ‘plastic straw and cotton bud’ article, for example. It’s not a bad translation, but it gets the word ‘bid’ completely wrong, translating it as ‘offre’ (= offer), where ‘tentative’ (= attempt) is needed. So, we can still expect a few problems with Google Translate …

One of the reasons that Google Translate has improved is that it no longer treats individual words as individual lexical items. It analyses groups of words and translates chunks or phrases (see, for example, the way it translates ‘as part of’). It doesn’t do word-for-word translation. Knowble, however, have set their software to ask Google for translations of each word as individual items, so the phrase ‘as part of’ is translated ‘comme’ + ‘partie’ + ‘de’. Whilst this example is comprehensible, problems arise very quickly. ‘Cotton buds’ (‘cotons-tiges’) become ‘coton’ + ‘bourgeon’ (= botanical shoots of cotton). Phrases like ‘in time’, ‘run into’, ‘sleep it off’, ‘take its course’, ‘fire station’ or ‘going on’ (all from the stoned raccoon text) all cause problems. In addition, Knowble are not using any parsing tools, so the system does not identify parts of speech, and further translation errors inevitably appear. In the short article of 240 words, about 10% are wrongly translated. Knowble claim to be using NLP tools, but there’s no sign of it here. They’re just using Google Translate rather badly.
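To see why looking words up one at a time goes wrong, here is a toy sketch in Python. It is not how Google Translate or Knowble actually work: the glossary is a hypothetical hand-made dictionary, the function names are invented for illustration, and ‘dans le cadre de’ is simply one reasonable French rendering of ‘as part of’. The only point is the difference between translating each word in isolation and trying the longest known phrase first.

```python
# Hypothetical toy glossary (English -> French); entries are illustrative only.
GLOSSARY = {
    ("as", "part", "of"): "dans le cadre de",
    ("cotton", "buds"): "cotons-tiges",
    ("as",): "comme",
    ("part",): "partie",
    ("of",): "de",
    ("cotton",): "coton",
    ("buds",): "bourgeon",
}
MAX_PHRASE_LEN = max(len(key) for key in GLOSSARY)


def word_by_word(tokens):
    """Look up every token in isolation (the behaviour criticised above)."""
    return [GLOSSARY.get((t,), t) for t in tokens]


def longest_match(tokens):
    """Greedy lookup: at each position, try the longest known phrase first."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            chunk = tuple(tokens[i:i + n])
            if chunk in GLOSSARY:
                out.append(GLOSSARY[chunk])
                i += n
                break
        else:
            out.append(tokens[i])  # unknown word: leave it untranslated
            i += 1
    return out


print(word_by_word("cotton buds".split()))   # ['coton', 'bourgeon']
print(longest_match("cotton buds".split()))  # ['cotons-tiges']
```

Even this crude phrase-first lookup avoids the ‘coton’ + ‘bourgeon’ problem, although it would still do nothing for errors that depend on part of speech.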

Highlighted items

NLP tools of some kind are presumably being used to select the words that get underlined. Exactly how this works is unclear. On the whole, it seems that very high frequency words are ignored and that lower frequency words are underlined. Here, for example, is the list of words that were underlined in the stoned raccoon text. I’ve compared them with (1) the CEFR levels for these words in the English Profile Text Inspector, and (2) the frequency information from the Macmillan dictionary (more stars = more frequent). In the other articles, some extremely high frequency words were underlined (e.g. price, cost, year) while much lower frequency items were not.

It is, of course, extremely difficult to predict which items of vocabulary a learner will know, even if we have a fairly accurate idea of their level. Personal interests play a significant part, so, for example, some people at even a low level will have no problem with ‘cannabis’, ‘stoned’ and ‘high’, even if these are low frequency. First language, however, is a reasonably reliable indicator as cognates can be expected to be easy. A French speaker will have no problem with ‘appreciate’, ‘unique’ and ‘symptom’. A recommendation engine that can meaningfully personalize vocabulary suggestions will, at the very least, need to consider cognates.
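As a rough illustration of what ‘considering cognates’ might involve, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names, the frequency ranks, the cut-off of 2,000 and the similarity threshold of 0.7 are all invented, and simple string similarity is only a crude stand-in for real cognate detection (it knows nothing about false friends, for instance). It says nothing about how Knowble actually selects words.

```python
from difflib import SequenceMatcher


def looks_like_cognate(english, l1_word, threshold=0.7):
    """Crude cognate test: are the two spellings similar enough?"""
    ratio = SequenceMatcher(None, english.lower(), l1_word.lower()).ratio()
    return ratio >= threshold


def words_to_highlight(words, frequency_rank, l1_translations, known_rank=2000):
    """Keep words that are neither very frequent nor probable cognates."""
    selected = []
    for word in words:
        rank = frequency_rank.get(word)
        if rank is not None and rank <= known_rank:
            continue  # high-frequency word: assume the learner knows it
        l1_word = l1_translations.get(word)
        if l1_word and looks_like_cognate(word, l1_word):
            continue  # probable cognate: assume it is easy for this learner
        selected.append(word)
    return selected


# Hypothetical frequency ranks and French translations, for illustration only.
ranks = {"the": 1, "raccoon": 15000, "symptom": 5000, "unique": 3000}
french = {"raccoon": "raton laveur", "symptom": "symptôme", "unique": "unique"}
print(words_to_highlight(["the", "raccoon", "symptom", "unique"], ranks, french))
# ['raccoon']
```

With a French first language, ‘symptom’ and ‘unique’ drop out of the list, which is the behaviour a first-language-aware selection would need at the very least.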

In short, the selection and underlining of vocabulary items, as it currently stands in Knowble, appears to serve no clear or useful function.

Vocabulary learning

Knowble offers a very short exercise for each article. They are of three types: word completion, dictation and drag and drop (see the example). The rationale for the selection of the target items is unclear, but, in any case, these exercises are tokenistic in the extreme and are unlikely to lead to any significant learning gains. More valuable would be the possibility of exporting items into a spaced repetition flash card system.
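For readers unfamiliar with spaced repetition, the sketch below shows one simple form of it, a Leitner-style box scheme, in Python. It is a minimal illustration under stated assumptions: the intervals, the `Card` fields and the function names are invented for the example, and it does not describe any feature of Knowble or of any particular flashcard app.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Review intervals in days for each Leitner box (illustrative values only).
INTERVALS = [1, 2, 4, 8, 16]


@dataclass
class Card:
    word: str
    translation: str
    box: int = 0          # index into INTERVALS
    due: date = date.min  # a new card is due straight away


def review(card, correct, today):
    """Move the card up one box after a correct answer, back to box 0 after a miss."""
    card.box = min(card.box + 1, len(INTERVALS) - 1) if correct else 0
    card.due = today + timedelta(days=INTERVALS[card.box])
    return card


def due_cards(cards, today):
    """Return the cards that should be reviewed on the given day."""
    return [card for card in cards if card.due <= today]


card = Card("cotton bud", "coton-tige")
review(card, correct=True, today=date(2018, 5, 1))
print(card.box, card.due)  # 1 2018-05-03
```

Items a learner keeps getting right come round less and less often, while forgotten items return to daily review, which is exactly what a one-off exercise attached to a single article cannot offer.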

The claim that Knowble’s ‘learning effect is proven scientifically’ seems to me to be without any foundation. If there has been any proper research, it’s not signposted anywhere. Sure, reading lots of news articles (with a look-up function – if it works reliably) can only be beneficial for language learners, but they can do that with any decent dictionary running in the background.

Similar in many ways to the product I reviewed in my last post, Knowble is another example of a technology-driven product that shows little understanding of language learning.


Last month, I wrote a post about the automated generation of vocabulary learning materials. Yesterday, I got an email from Mike Elchik, inviting me to take a look at the product that his company, WeSpeke, has developed in partnership with CNN. Called, it’s a very regularly updated and wide selection of video clips and texts from CNN, which are then used to ‘automatically create a pedagogically structured, leveled and game-ified English lesson’. Available at the AppStore and Google Play, as well as a desktop version, it’s free. Revenues will presumably be generated through advertising and later sales to corporate clients.

With 6.2 million dollars in funding so far, WeSpeke can leverage some state-of-the-art NLP and AI tools. Co-founder and chief technical adviser of the company is Jaime Carbonell, Director of the Language Technologies Institute at Carnegie Mellon University, described in Wikipedia as one of the gurus of machine learning. I decided to have a closer look.


Users are presented with a menu of CNN content (there were 38 items from yesterday alone). These are tagged with broad categories (Politics, Opinions, Money, Technology, Entertainment, etc.) and given a level, ranging from 1 to 5, although the vast majority of the material is at the two highest levels.


I picked two lessons: a reading text about Mark Zuckerberg’s Congressional hearing (level 5) and a 9 minute news programme of mixed items (level 2 – illustrated above). In both cases, the lesson begins with the text. With the reading, you can click on words to bring up dictionary entries from the Collins dictionary. With the video, you can activate captions and again click on words for definitions. You can also slow down the speed. So far, so good.

There then follows a series of exercises which focus primarily on a set of words that have been automatically selected. This is where the problems began.


It’s far from clear what the levels (1 – 5) refer to. The Zuckerberg text is 930 words long and is rated as B2 by one readability tool. But, using the English Profile Text Inspector, there are 19 types at C1 level, 14 at C2, and 98 which are unlisted. That suggests something substantially higher than B2. The CNN10 video is delivered at breakneck speed (as is often the case with US news shows). Yes, it can be slowed down, but that still won’t help with some passages, such as the one below:

A squirrel recently fell out of a tree in Western New York. Why would that make news? Because she bwoke her widdle leg and needed a widdle cast! Yes, there are casts for squirrels, as you can see in this video from the Orphaned Wildlife Center. A windstorm knocked the animal’s nest out of a tree, and when a woman saw that the baby squirrel was injured, she took her to a local vet. Doctors say she’s going to be just fine in a couple of weeks. Well, why ‘rodent’ she be? She’s been ‘whiskered’ away and cast in both a video and a plaster. And as long as she doesn’t get too ‘squirrelly’ before she heals, she’ll have quite a ‘tail’ to tell.

It’s hard to understand how a text like this got through the algorithms. But, as materials writers know, it is extremely hard to find authentic text that lends itself to language learning at anything below C1. On the evidence here, there is still some way to go before the process of selection can be automated. It may well be the case that CNN simply isn’t a particularly appropriate source.

Target learning items

The primary focus of these lessons is vocabulary learning, and it’s vocabulary learning of a very deliberate kind. Applied linguists are in general agreement that it makes sense for learners to approach the building of their L2 lexicon in a deliberate way (i.e. by studying individual words) for high-frequency items or items that can be identified as having a high surrender value (e.g. items from the AWL for students studying in an EMI context). Once you get to items that are less frequent than, say, the top 8,000 most frequent words, the effort expended in studying new words needs to be offset against their usefulness. Why spend a lot of time studying low frequency words when you’re unlikely to come across them again for some time … and will probably forget them before you do? Vocabulary development at higher levels is better served by extensive reading (and listening), possibly accompanied by glosses.

The target items in the Zuckerberg text were: advocacy, grilled, handicapping, sparked, diagnose, testified, hefty, imminent, deliberative and hesitant. One of these, ‘grilled’, is listed as A2 by English Vocabulary Profile, but that is with its literal, not metaphorical, meaning. Four of them are listed as C2 and the remaining five are off-list. In the CNN10 video, the target items were: strive, humble (verb), amplify, trafficked, enslaved, enacted, algae, trafficking, ink and squirrels. Of these, one is B1, two are C2 and the rest are unlisted. What is the point of studying these essentially random words? Why spend time going through a series of exercises that practise these items? Wouldn’t your time be better spent just doing some more reading? I have no idea how the automated selection of these items takes place, but it’s clear that it’s not working very well.

Practice exercises

There is plenty of variety of task-type but there are, I think, two reasons to query the claim that these lessons are ‘pedagogically structured’. The first is the nature of the practice exercises; the second is the sequencing of the exercises. I’ll restrict my observations to a selection of the tasks.

1. Users are presented with a dictionary definition and an anagrammed target item which they must unscramble. For example:

existing for the purpose of discussing or planning something     VLREDBETEIIA

If you can’t solve the problem, you can always scroll through the text to find the answer. But the problem is in the task design. Dictionary definitions have been written to help language users decode a word. They simply don’t work very well when they are used for another purpose (as prompts for encoding).

2. Users are presented with a dictionary definition for which they must choose one of four words. There are many potential problems here, not the least of which is that definitions are often more complex than the word they are defining, or they present other challenges. As an example: cause to be unpretentious for to humble. On top of that, lexicographers often need or choose to embed the target item in the definition. For example:

a hefty amount of something, especially money, is very large

an event that is imminent, especially an unpleasant one, will happen very soon

When this is the case, it makes no sense to present these definitions and ask learners to find the target item from a list of four.

The two key pieces of content in this product – the CNN texts and the Collins dictionaries – are both less than ideal for their purposes.

3. Users are presented with a box of jumbled words which they must unscramble to form sentences that appeared in the text.


The sentences are usually long and hard to reconstruct. You can scroll through the text to find the answer, but I’m unclear what the point of this would be. The example above contains a mistake (vie instead of vice), but this was one of only two glitches I encountered.

4. Users are asked to select the word that they hear on an audio recording. For example:

squirreling     squirrel     squirreled     squirrels

Given the high level of challenge of both the text and the target items, this was a rather strange exercise to kick off the practice. The meaning has not yet been presented (in a matching / definition task), so what exactly is the point of this exercise?

5. Users are presented with gapped sentences from the text and asked to choose the correct grammatical form of the missing word. Some of these were hard (e.g. adjective order), others were very easy (e.g. some vs any). The example below struck me as plain weird for a lesson at this level.

________ have zero expectation that this Congress is going to make adequate changes. (I or Me ?)

6. At the end of both lessons, there were a small number of questions that tested your memory of the text. If, like me, you couldn’t remember all that much about the text after twenty minutes of vocabulary activities, you can scroll through the text to find the answers. This is not a task type that will develop reading skills: I am unclear what it could possibly develop.
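Returning to the second task type: the problem of definitions that give the answer away is one that could, in principle, be caught automatically. The Python sketch below is a minimal illustration under stated assumptions. The function name is invented, the check only covers a few simple English inflections, and nothing here reflects how the product actually builds its exercises.

```python
import re


def definition_reveals_answer(headword, definition):
    """True if the headword (or a simple inflection of it) appears in its own definition."""
    pattern = r"\b" + re.escape(headword.lower()) + r"(s|d|ed|ing)?\b"
    return re.search(pattern, definition.lower()) is not None


definitions = {
    "hefty": "a hefty amount of something, especially money, is very large",
    "imminent": "an event that is imminent, especially an unpleasant one, will happen very soon",
    "deliberative": "existing for the purpose of discussing or planning something",
}

# Keep only the items whose definitions do not contain the target word.
usable = [w for w, d in definitions.items() if not definition_reveals_answer(w, d)]
print(usable)  # ['deliberative']
```

A filter of this kind would remove the most obvious giveaways from a multiple-choice task, but it would not address the deeper issue that definitions written for decoding make poor prompts for encoding.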


Using the lessons on offer here wouldn’t do a learner (as long as they already had a high level of proficiency) any harm, but it wouldn’t be the most productive use of their time, either. If a learner is motivated to read the text about Zuckerberg, rather than do lots of ‘busy’ work on a very odd set of words with gap-fills and matching tasks, they’d be better advised just to read the text again once or twice. They could use a look-up for words they want to understand and import them into a flashcard system with spaced repetition (the app does have flashcards, but there’s no sign of spaced practice yet). What’s more, they could check out another news website and read / watch other articles on the same subject (perhaps choosing websites with a different slant to CNN) and get valuable narrow-reading practice in this way.

My guess is that the technology has driven the product here, but without answering the fundamental questions about which words it’s appropriate for individual learners to study in a deliberate way and how this is best tackled, it doesn’t take learners very far.






Posted: December 1, 2016 in ed tech


Here’s a conversation I had recently with Cleverbot.


User (me): Hi Cleverbot. What are your views on the future of chatbots in language learning?

Cleverbot: Are you talking to yourself?

User: No, I’m asking you a question.

Cleverbot: Then what’s the question again? I forgot.

User: What are your views on the future of chatbots in language learning?

Cleverbot: Don’t you want me to finish my story?

User: No, I want you to answer my question.

Cleverbot: What is your question?

User: Do you think there is a future for chatbots in language education?

Cleverbot: Sure, if you are talking about the moon.

I also tried to have a conversation with Alice, the avatar from EFL Classroom Bot, listed by Larry Ferlazzo as one of ‘the best online chatbots for practicing English’. I didn’t get any more sense out of her than out of Cleverbot.

Chatbots, apparently, are the next big thing. Again. David Mattin, head of trends and insights at, writes (in the September 2016 issue of ‘Business Life’) that ‘the chatbot revolution is coming’ and that chatbots are a step towards the dream of an interface between user and technology that is so intuitive that the interface ‘simply fades away’. Chatbots have been around for some time. Remember Clippy – the Microsoft Office bot in the late 1990s – which you had to disable in order to stop yourself punching your computer screen? Since then, bots have become ubiquitous. There have been problems, such as Microsoft’s Tay bot that had to be taken down after sixteen hours earlier this year, when, after interacting with other Twitter users, it developed into an abusive Nazi. But chatbots aren’t going away and you’ve probably interacted with one to book a taxi, order food or attempt to talk to your bank. In September this year, the Guardian described them as ‘the talk of the town’ and ‘hot property in Silicon Valley’.

The real interest in chatbots is not, however, in the ‘exciting interface’ possibilities (both user interface and user experience remain pretty crude), but in the fact that they are leaner, sit comfortably with the things we actually do on a phone and offer a way of cutting out the high fees that developers have to pay to app stores. After so many start-up failures, chatbots offer a glimmer of financial hope to developers.

It’s no surprise, of course, to find the world of English language teaching beginning to sit up and take notice of this technology. A 2012 article by Ben Lehtinen in PeerSpectives enthuses about the possibilities in English language learning and reports the positive feedback of the author’s own students. ELTJam, so often so quick off the mark, developed an ELT Bot over the course of a hackathon weekend in March this year. Disappointingly, it wasn’t really a bot – more a case of humans pretending to be a bot pretending to be humans – but it probably served its exploratory purpose. And a few months ago Duolingo began incorporating bots. These are currently only available for French, Spanish and German learners in the iPhone app, so I haven’t been able to try them out and evaluate them. According to an infomercial in TechCrunch, ‘to make talking to the bots a bit more compelling, the company tried to give its different bots a bit of personality. There’s Chef Robert, Renee the Driver and Officer Ada, for example. They will react differently to your answers (and correct you as necessary), but for the most part, the idea here is to mimic a real conversation. These bots also allow for a degree of flexibility in your answers that most language-learning software simply isn’t designed for. There are plenty of ways to greet somebody, for example, but most services will often only accept a single answer. When you’re totally stumped for words, though, Duolingo offers a ‘help my reply’ button with a few suggested answers.’ In the last twelve months or so, Duolingo has considerably improved its ability to recognise multiple correct ways of expressing a particular idea, and its ability to recognise alternative answers to its translation tasks. However, I’m highly sceptical about its ability to mimic a real conversation any better than Cleverbot or Alice the EFL Bot, or its ability to provide systematically useful corrections.

My reasons lie in the current limitations of AI and NLP (Natural Language Processing). In a nutshell, we simply don’t know how to build a machine that can truly understand human language. Limited exchanges in restricted domains can be done pretty well (such as the early chatbot that did a good job of simulating an encounter with an evasive therapist, or, more recently ordering a taco and having a meaningless, but flirty conversation with a bot), but despite recent advances in semantic computing, we’re a long way from anything that can mimic a real conversation. As Audrey Watters puts it, we’re not even close.

When it comes to identifying language errors made by language learners, we’re not really much better off. Apps like Grammarly are not bad at identifying grammatical errors (but not good enough to be reliable), but pretty hopeless at dealing with lexical appropriacy. Much more reliable feedback to learners can be offered when the software is trained on particular topics and text types. Write & Improve does this with a relatively small selection of Cambridge English examination tasks, but a free conversation ….? Forget it.

So, how might chatbots be incorporated into language teaching / learning? A blog post from December 2015 entitled AI-powered chatbots and the future of language learning suggests one plausible possibility. Using an existing messenger service, such as WhatsApp or Telegram, an adaptive chatbot would send tasks (such as participation in a conversation thread with a predetermined topic, register, etc., or pronunciation practice or translation exercises) to a learner, provide feedback and record the work for later recycling. At the same time, the bot could send out reminders of work that needs to be done or administrative tasks that must be completed.

Kat Robb has written a very practical article about using instant messaging in English language classrooms. Her ideas are interesting (although I find the idea of students in a F2F classroom messaging each other slightly bizarre) and it’s easy to imagine ways in which her activities might be augmented with chatbot interventions. The Write & Improve app, mentioned above, could deploy a chatbot interface to give feedback instead of the flat (and, in my opinion, perfectly adequate) pop-up boxes currently in use. Come to think of it, more or less any digital language learning tool could be pimped up with a bot. Countless revisions can be envisioned.

But the overwhelming question is: would it be worth it? Bots are not likely, any time soon, to revolutionise language learning. What they might just do, however, is help to further reduce language teaching to a series of ‘mechanical and scripted gestures’. More certain is that a lot of money will be thrown down the post-truth edtech drain. Then, in the not too distant future, this latest piece of edtech will fall into the trough of disillusionment, to be replaced by the latest latest thing.