Vocabulary apps: a wish list

Posted: December 17, 2015 in apps

Having spent a lot of time recently looking at vocabulary apps, I decided to put together a Christmas wish list of the features of my ideal vocabulary app. The list is not exhaustive and I’ve given more attention to some features than others. What (apart from testing) have I missed out?

1             Spaced repetition

Since the point of a vocabulary app is to help learners memorise vocabulary items, it is hard to imagine a decent system that does not incorporate spaced repetition. Spaced repetition algorithms offer one well-researched way of improving the brain’s ‘forgetting curve’. These algorithms come in different shapes and sizes, and I am not technically competent to judge which is the most efficient. However, as Peter Ellis Jones, the developer of a flashcard system called CardFlash, points out, efficiency is only one half of the rote memorisation problem. If you are not motivated to learn, the cleverness of the algorithm is moot. Fundamentally, learning software needs to be fun, rewarding, and give a solid sense of progression.
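
To make the idea concrete, here is a minimal sketch of one very simple spacing rule. It is not the algorithm of any particular app: the `Card` class, the `review` function and the doubling factor are all assumptions made for illustration. The interval before the next review doubles each time an item is recalled and drops back to one day when it is forgotten.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    item: str               # the vocabulary item on the card
    due: date               # next date on which the card should be shown
    interval_days: int = 1  # current gap between reviews


def review(card: Card, recalled: bool, today: date) -> Card:
    """Return a rescheduled copy of the card after one review."""
    if recalled:
        # Assumed growth factor of 2; real systems tune this per learner/item.
        new_interval = card.interval_days * 2
    else:
        # Forgotten: start again with a one-day gap.
        new_interval = 1
    return Card(card.item, today + timedelta(days=new_interval), new_interval)


card = Card("salient", due=date(2016, 1, 1))
card = review(card, recalled=True, today=date(2016, 1, 1))
print(card.interval_days, card.due)
```

Real systems differ mainly in how they adjust the growth factor, which is where the competing algorithms mentioned above part company.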

2             Quantity, balance and timing of new and ‘old’ items

A spaced repetition algorithm determines the optimum interval between repetitions, but further algorithms will be needed to determine when and with what frequency new items will be added to the deck. Once a system knows how many items a learner needs to learn and the time in which they have to do it, it is possible to determine the timing and frequency of the presentation of new items. But the system cannot know in advance how well an individual learner will learn the items (for any individual, some items will be more readily learnable than others) nor the extent to which learners will live up to their own positive expectations of time spent on-app. As most users of flashcard systems know, it is easy to fall behind, feel swamped and, ultimately, give up. An intelligent system needs to be able to respond to individual variables in order to ensure that the learning load is realistic.
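
One way of keeping the load realistic might be sketched as follows. Reviews that are already due are given priority, and new items are added only if there is study time left over. The function name and all of the default figures (seconds per review, seconds per new item, the daily cap) are assumptions for illustration, not measured values.

```python
def new_items_today(reviews_due: int,
                    minutes_available: float,
                    seconds_per_review: float = 10.0,
                    seconds_per_new_item: float = 30.0,
                    max_new: int = 20) -> int:
    """Decide how many new items to introduce in today's session."""
    budget = minutes_available * 60
    # Time left once all the reviews already due have been done.
    remaining = budget - reviews_due * seconds_per_review
    if remaining <= 0:
        # The learner is already behind: add nothing new today.
        return 0
    return min(max_new, int(remaining // seconds_per_new_item))


print(new_items_today(reviews_due=30, minutes_available=10))
print(new_items_today(reviews_due=100, minutes_available=10))
```

An adaptive system would replace the fixed defaults with figures estimated from the individual learner's own history, including how much time they actually spend on the app rather than how much they planned to.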

3             Task variety

A standard flashcard system which simply asks learners to indicate whether they ‘know’ a target item before they flip over the card rapidly becomes extremely boring. A system which tests this knowledge soon becomes equally dull. There needs to be a variety of ways in which learners interact with an app, both for reasons of motivation and learning efficiency. It may be the case that, for an individual user, certain task types lead to more rapid gains in learning. An intelligent, adaptive system should be able to capture this information and modify the selection of task types.

Most younger learners and some adult learners will respond well to the inclusion of games within the range of task types. Examples of such games include the puzzles developed by Oliver Rose in his Phrase Maze app to accompany Quizlet practice.

4             Generative use

Memory researchers have long known about the ‘Generation Effect’ (see for example this piece of research from the Journal of Verbal Learning and Verbal Behavior, 1978). Items are better learnt when the learner has to generate, in some (even small) way, the target item, rather than simply reading it. In vocabulary learning, this could be, for example, typing in the target word or, more simply, inserting some missing letters. Systems which incorporate task types that require generative use are likely to result in greater learning gains than simple, static flashcards with target items on one side and definitions or translations on the other.
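
A missing-letters task of the kind just described could be generated along these lines. This is only a sketch: `gap_word` and `check` are hypothetical helper names, and blanking every second letter is an arbitrary choice.

```python
def gap_word(word: str, every: int = 2, blank: str = "_") -> str:
    """Blank out one letter in every `every` (use every >= 2), keeping the first."""
    return "".join(
        blank if i % every == 1 and ch.isalpha() else ch
        for i, ch in enumerate(word)
    )


def check(answer: str, target: str) -> bool:
    """Accept the learner's typed answer regardless of case or stray spaces."""
    return answer.strip().lower() == target.lower()


print(gap_word("remember"))
print(check(" Remember", "remember"))
```

Raising `every` leaves more of the word visible, which offers a simple way of grading the difficulty of the task.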

5             Receptive and productive practice

The most basic digital flashcard systems require learners to understand a target item, or to generate it from a definition or translation prompt. Valuable as this may be, it won’t help learners much to use these items productively, since these systems focus exclusively on meaning. In order to do this, information must be provided about collocation, colligation, register, etc., and these aspects of word knowledge will need to be focused on within the range of task types. At the same time, most vocabulary apps that I have seen focus primarily on the written word. Although any good system will offer an audio recording of the target item, and many will offer the learner the option of recording themselves, learners are invariably asked to type in their answers, rather than say them. For the latter, speech recognition technology will be needed. Ideally, too, an intelligent system will compare learner recordings with the audio models and provide feedback in such a way that the learner is guided towards a closer reproduction of the model.

6             Scaffolding and feedback

Most flashcard systems are basically low-stakes practice self-testing. Research (see, for example, Dunlosky et al.’s metastudy ‘Improving Students’ Learning With Effective Learning Techniques: Promising Directions From Cognitive and Educational Psychology’) suggests that, as a learning strategy, practice testing has high utility – indeed, higher utility than other strategies like keyword mnemonics or highlighting. However, an element of tutoring is likely to enhance practice testing, and, for this, scaffolding and feedback will be needed. If, for example, a learner is unable to produce a correct answer, they will probably benefit from being guided towards it through hints, in the same way as a teacher would elicit in a classroom. Likewise, feedback on why an answer is wrong (as opposed to simply being told that you are wrong), followed by encouragement to try again, is likely to enhance learning. Such feedback might, for example, point out that there is perhaps a spelling problem in the learner’s attempted answer, that the attempted answer is in the wrong part of speech, or that it is semantically close to the correct answer but does not collocate with other words in the text. The incorporation of intelligent feedback of this kind will require a number of NLP tools, since it will never be possible for a human item-writer to anticipate all the possible incorrect answers. A current example of intelligent feedback of this kind can be found in the Oxford English Vocabulary Trainer app.
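
The first two kinds of feedback mentioned above (a probable spelling slip, a wrong part of speech) can be sketched very crudely as follows. This is not how the Oxford app or any other product works. The tiny `LEXICON` is hypothetical and stands in for a licensed dictionary database, the spelling threshold of two edits is an assumption, and the third kind of feedback (semantic closeness and collocation) is not attempted because it needs far richer data.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-letter edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a letter
                            curr[j - 1] + 1,      # insert a letter
                            prev[j - 1] + cost))  # substitute a letter
        prev = curr
    return prev[-1]


# Hypothetical mini-lexicon mapping words to their part of speech.
LEXICON = {"danger": "noun", "dangerous": "adjective", "wild": "adjective"}


def feedback(attempt: str, target: str) -> str:
    """Return a short hint explaining why an answer was not accepted."""
    attempt, target = attempt.strip().lower(), target.lower()
    if attempt == target:
        return "correct"
    # A real word of a different class from the target.
    if attempt in LEXICON and LEXICON[attempt] != LEXICON.get(target):
        return "wrong part of speech"
    # Close to the target but not identical: probably a spelling slip.
    if edit_distance(attempt, target) <= 2:
        return "check your spelling"
    return "try again"


print(feedback("dangerus", "dangerous"))
print(feedback("danger", "dangerous"))
```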

7             Content

At the very least, a decent vocabulary app will need good definitions and translations (how many different languages?), and these will need to be tagged to the senses of the target items. These will need to be supplemented with all the other information that you find in a good learner’s dictionary: syntactic patterns, collocations, cognates, an indication of frequency, etc. The only way of getting this kind of high-quality content is by paying to license it from a company with expertise in lexicography. It doesn’t come cheap.

There will also need to be example sentences, both to illustrate meaning / use and for deployment in tasks. Dictionary databases can provide some of these, but they cannot be relied on as a source. This is because the example sentences in dictionaries have been selected and edited to accompany the other information provided in the dictionary, and not as items in practice exercises, which have rather different requirements. Once more, the solution doesn’t come cheap: experienced item writers will be needed.

Dictionaries describe and illustrate how words are typically used. But examples of typical usage tend to be as dull as they are forgettable. Learning is likely to be enhanced if examples are cognitively salient: weird examples with odd collocations, for example. Another thing for the item writers to think about.

A further challenge for an app which is not level-specific is that both the definitions and example sentences need to be level-specific. An A1 / A2 learner will need the kind of content that is found in, say, the Oxford Essential dictionary; B2 learners and above will need content from, say, the OALD.

8             Artwork and design

It’s easy enough to find artwork or photos of concrete nouns, but try to find or commission a pair of pictures that differentiate, for example, the adjectives ‘wild’ and ‘dangerous’ … What kind of pictures might illustrate simple verbs like ‘learn’ or ‘remember’? Will such illustrations be clear enough when squeezed into a part of a phone screen? Animations or very short video clips might provide a solution in some cases, but these are more expensive to produce and video files are much heavier.

With a few notable exceptions, such as the British Council’s MyWordBook 2, design in vocabulary apps has been largely forgotten.

9             Importable and personalisable lists

Many learners will want to use a vocabulary app in association with other course material (e.g. coursebooks). Teachers, however, will inevitably want to edit these lists, deleting some items, adding others. Learners will want to do the same. This is a huge headache for app designers. If new items are going to be added to word lists, how will the definitions, example sentences and illustrations be generated? Will the database contain audio recordings of these words? How will these items be added to the practice tasks (if these include task types that go beyond simple double-sided flashcards)? NLP tools are not yet good enough to trawl a large corpus in order to select (and possibly edit) sentences that illustrate the right meaning and which are appropriate for interactive practice exercises. We can personalise the speed of learning and even the types of learning tasks, so long as the target language is predetermined. But as soon as we allow for personalisation of content, we run into difficulties.

10          Gamification

Maintaining motivation to use a vocabulary app is not easy. Gamification may help. Measuring progress against objectives will be a start. Stars and badges and leaderboards may help some users. Rewards may help others. But gamification features need to be built into the heart of the system, into the design and selection of tasks, rather than simply tacked on as an afterthought. They need to be trialled and tweaked, so analytics will be needed.

11          Teacher support

Although the use of vocabulary flashcards is beginning to catch on with English language teachers, teachers need help with ways to incorporate them in the work they do with their students. What can teachers do in class to encourage use of the app? In what ways does app use require teachers to change their approach to vocabulary work in the classroom? Reporting functions can help teachers know about the progress their students are making and provide very detailed information about words that are causing problems. But, as anyone involved in platform-based course materials knows, teachers need a lot of help.

12          And, of course, …

Apps need to be usable with different operating systems. Ideally, they should be (partially) usable offline. Loading times need to be short. They need to be easy and intuitive to use.

It’s unlikely that I’ll be seeing a vocabulary app with all of these features any time soon. Or, possibly, ever. The cost of developing something that could do all this would be extremely high, and there is no indication that there is a market that would be ready to pay the sort of prices that would be needed to cover the costs of development and turn a profit. We need to bear in mind, too, the fact that vocabulary apps can only ever assist in the initial acquisition of vocabulary: apps alone can’t solve the vocabulary learning problem (despite the silly claims of some app developers). The need for meaningful communicative use, extensive reading and listening, will not go away because a learner has been using an app. So, how far can we go in developing better and better vocabulary apps before users decide that a cheap / free app, with all its shortcomings, is actually good enough?

I posted a follow-up to this post in October 2016.

Comments
  2. Oliver Rose says:

    Hi Philip,

    Thanks for the mention of Phrase Maze – it’s actually had a makeover and the latest iOS version is called PhraseBot, which can be downloaded here:

    https://geo.itunes.apple.com/us/app/phrasebot-for-quizlet/id1029801099?mt=8

    Great points you make about all the various factors to take into account, I’ll drop back later and comment further!

    Cheers,

    Oliver

  3. A great read and a really good summary of where we’re at with vocabulary apps. All the way through you had me dreaming about working on the ultimate vocab app, but then I got to the end … and I have to agree that if we can’t even find a way of funding proper lexicography any more, putting all those incredible skills (and people) together and making money from it seems fairly far-fetched. *Sighs*

  4. mjholmwood says:

    This is a really interesting article. Thanks for writing it! It makes me really happy as I just checked off all your points against our own vocab trainer. OK we don’t score 100%, but we are pretty close :-).

    Seriously though, it would be great if we could offer all students really excellent learning tools.

    Wishing you all a very happy Christmas and prosperous New Year!

    All the best,

    Mark

  5. rbadwan says:

    Do you have a favorite app that you recommend to students? Thinking of the ones I usually recommend, each fits some of the criteria but definitely not all. I’m interested to know which of the apps you looked at fits the most.

  7. We’re just about to release our own vocabulary app, word magic (https://www.wordmagicapp.com), which thankfully seems to meet quite a few of these requirements. It uses word search puzzles with descriptions and images shown once the word is found. I think it’s great because word searches naturally make players have to remember the ‘structure’ of a word before being able to find them really fast. It’s targeted at children but we are working on letting people create their own word searches to help people learn different types of terms like for religious organisations, language teachers, etc. I’d love to hear if anybody has feedback about it. You can grab it for free from http://beta.wordmagicapp.com 🙂

  8. While we don’t meet every criterion on your list – yet – our InferCabulary and WordQuations apps (vocabulary iPad apps) do meet some of the criteria that no other vocabulary apps have. If you would like to check them out I would be glad to give you promo codes from the App Store. Specifically, they are generative in nature because students are asked to infer meaning and generate their own definition first for InferCabulary, before they see our definition. We are currently working on a web-based application, InferCabulary Pro, that will be usable on any platform, allow teachers to pick and choose words, provide data collection and eventually let them create their own words/picture content.

  9. Lida Zlatic says:

    Hi Philip, thanks for the post! When my Spanish class got a laptop cart a few years ago, I made a similar wishlist. Now, I’m working with 2 other teachers to make our wish list into a real program. We have an early prototype that covers most (not yet all) of the things you mention in your blog. If you’d like to give it a try, I’d really appreciate your feedback! You can reach me at zlaticdbk@gmail.com

  10. eflnotes says:

    nice list
    i wonder how current vocab apps measure up to the “Seven Hypotheses Relevant for Developing Multimedia CALL” by Chapelle in 1998
    http://llt.msu.edu/vol2num1/article1/

  11. philipjkerr says:

    Thanks for the comment, Mura. Here are the seven hypotheses and some brief comments about the current state of play:
    1. The linguistic characteristics of target language input need to be made salient.
    Most vocab apps I have seen (there are a few exceptions) limit themselves to target items and definitions / translations, and there is very little in the way of ‘highlighting input in materials to prompt learners to notice particular syntactic forms’.
    2. Learners should receive help in comprehending semantic and syntactic aspects of linguistic input.
    Syntactic aspects are generally not focused on in any systematic or helpful way.
    3. Learners need to have opportunities to produce target language output.
    Not yet. See my comment under point 9 of the post. I’ll be writing more about NLP when I find the time.
    4. Learners need to notice errors in their own output.
    Not yet – see above. Typically, the only output that is required is typing in the target item. Because of this task design, the most common errors are spelling, and most apps cannot even do what Duolingo does and suggest a spelling error has occurred.
    5. Learners need to correct their linguistic output.
    Not yet – see above.
    6. Learners need to engage in target language interaction whose structure can be modified for negotiation of meaning.
    No.
    7. Learners should engage in L2 tasks designed to maximize opportunities for good interaction.
    No.
    Having said the above, I am not sure that Carol Chapelle’s criteria are especially relevant to much language learning software. When she wrote the piece back in 1998, there were plenty of bits of software that could help learners in limited ways (e.g. with their spelling) without attempting to provide the full monty. These programs didn’t meet her criteria, but they didn’t attempt to, either. These days there are plenty of bits of software that can help learners in limited ways (e.g. with pronunciation), but the belief that software can provide the full monty any time soon is shared by very few people, I suspect.

  12. eflnotes says:

    thanks Philip

    i don’t read Chapelle as wanting all dancing all singing software but as applying SLA principles to software, as your list tries to do from mainly cognitive research principles(?)

    and this lack of knowledge by the developers of software is a recurring theme in educational technology not insurmountable but persistent

    ta
    mura

  13. Ryan M says:

    Hi Philip,

    This post is a gold mine of great ideas; I’m glad I found it and look forward to reading more of your work. I am developing WordBrewery.com, a language-learning website and app that will eventually incorporate many of these features. We only teach vocabulary in context through example sentences, and all of our sentences are scraped from news sites around the world and then tested for usefulness by a word-frequency algorithm before being presented to users in a way that is tailored to their individual ability level. If you would like to discuss how some of your insights and research could be built into our program, please email me (admin@wordbrewery.com). We are in active development and collecting as many ideas as we can.

    WordBrewery is building courses for learners of English, Spanish, and 17 other languages that use useful, authentic example sentences to systematically take learners on the most efficient path from their baseline vocabulary to the vocabulary needed for reading fluency. Learners can keep going as long as they want (e.g. study for the TOEFL or the GRE), because we can scrape any text and our sentence database is constantly changing, so our content is inexhaustible. WordBrewery’s courses will work in part as follows: if you have a working vocabulary of the 200 most common English words, our system will attempt to show you a sentence that contains only those words you already know as well as the 201st most common English word. This combines the insights of Zipf’s Law (https://en.wikipedia.org/wiki/Zipf%27s_law#Motivation) and the idea of setting optimal (achievable but challenging) goals.

    As for your suggestions–even collectively, I see them as largely achievable in a single app, and we have already devised a solution for most of them:

    1. Spaced repetition – We are building this into our flashcard program and courses, with Anki (http://ankisrs.net) and Supermemo (https://www.supermemo.com/articles/theory.html) as our basic models, and Khan Academy’s math exercises as our approximate model for user-experience (visually attractive, fun and easy to use, addictive, etc.)

    2. Quantity, balance and timing of new and ‘old’ items – we will accomplish this in our language courses by generally declining to introduce a new word until an existing word is mastered. Each word a learner masters will increase the set of sentences he or she could see.

    3. Task variety – I agree; each additional game is a mini-app in itself, so this is challenging to do well, but we will start with two varieties of questions in our flashcard/quiz module and then gradually consider other games. But I do not believe in studying words in isolation, so learners will never see a word in our program that is not accompanied by one or more example sentences; our unit of study is the sentence, but the outcome and metric is vocabulary knowledge. Our learners will acquire vocabulary by being repeatedly exposed to the same word in different contexts until our spaced-repetition system considers the word mastered.

    4. Generative use – We will incorporate this through cloze (fill-in-the-blank) tests and quizzes requiring recall in addition to recognition.

    P.S. – I feel that the traditional method of showing learners “simple, static flashcards with target items on one side and definitions or translations on the other” is useless. Words exist in context and must be studied in context–so our students study sentences directly and thereby learn words indirectly.

    5. Receptive and productive practice – there are a lot of good ideas here. We are going to eventually treat idioms and common n-grams / collocations as “words” to be learned like any other. We are continually studying ways to use natural language processing technology and text analysis to address the challenges you mention. If you have specific ideas for what you want to see, please contact me.

    6. Scaffolding and feedback – I hadn’t thought of this but will look into the app you mention. Thank you!

    7. Content – we have solved the content problem with respect to example sentences (which are not ancillary in our model, but rather central), and we also show word frequency rank and monolingual definitions as well as English translations. We are working on developing ways to automatically collect some of the other information you list. Definitions and translations are a problem:

    “At the very least, a decent vocabulary app will need good definitions and translations (how many different languages?), and these will need to be tagged to the senses of the target items. These will need to be supplemented with all the other information that you find in a good learner’s dictionary: syntactic patterns, collocations, cognates, an indication of frequency, etc. The only way of getting this kind of high-quality content is by paying to license it from a company with expertise in lexicography. It doesn’t come cheap.”

    We have solved all of the following problems you mention–we have example sentences that are real and relevant rather than dull, machine-generated, or artificial. And learners see sentences that are tailored to their level:

    “There will also need to be example sentences, both to illustrate meaning / use and for deployment in tasks. Dictionary databases can provide some of these, but they cannot be relied on as a source. This is because the example sentences in dictionaries have been selected and edited to accompany the other information provided in the dictionary, and not as items in practice exercises, which have rather different requirements. Once more, the solution doesn’t come cheap: experienced item writers will be needed.

    Dictionaries describe and illustrate how words are typically used. But examples of typical usage tend to be as dull as they are forgettable. Learning is likely to be enhanced if examples are cognitively salient: weird examples with odd collocations, for example. Another thing for the item writers to think about.”

    On this issue of multiple dictionaries–I hadn’t thought about it in these terms, but we will address this by providing one-click links to look up words in sources like Wiktionary, Google, Wikipedia, etc.: “A further challenge for an app which is not level-specific is that both the definitions and example sentences need to be level-specific. An A1 / A2 learner will need the kind of content that is found in, say, the Oxford Essential dictionary; B2 learners and above will need content from, say, the OALD.”

    8. Artwork and design – this is a challenge, but we will allow users to upload images to their word and sentence lists, as they can in Memrise and Anki.

    9. Importable and personalisable lists – WordBrewery learners can save any word or sentence they find to study lists, then export them to Anki or any other program; soon they will be able to use our own spaced-repetition quizzes (which are still in development). Learners can currently personalize content by restricting their set of sentences to categories (sports, news, business, etc.), and we will soon build much more fine-grained categories by collecting sentences that use particular keywords on a topical list.

    10. Gamification – this is difficult, but I think Khan Academy’s math courses do it well, and that will be our model (badges, missions, etc.–no time-wasting frills, just enough to show you that you are making progress). When people ask about this, I sometimes say (only half in jest) that WordBrewery doesn’t need gamification because language itself is a game, and each sentence is a puzzle. But we are also building everything with motivational dips and short attention spans in mind.

    11/12. Teacher support – we have plans to build an instructor dashboard (again, similar to what Khan Academy offers). We are also currently thinking through the issue of analytics and metrics–besides word knowledge, what should we track, and why? I would welcome any insights you or others have on that.

    13. “Apps need to be usable with different operating systems.” (WordBrewery is currently a webapp with a mobile site; both require Internet. We are building Android and iOS apps, and we will find a way to offer some offline content.) “Loading times need to be short. They need to be easy and intuitive to use.”: I’m not sure we are there yet at the beta stage, but we are doing a thorough design and UX overhaul and listening very closely to what our beta testers and prospective customers want.

    Thanks again for your thoughtful post. I welcome you or any of your readers to visit WordBrewery.com to try out our example-based learning technology, then email me at admin@wordbrewery.com if you would like to discuss specific requests and ideas, set up a pilot program, become a beta tester, or receive updates on our development.

    Thank you,
    Ryan from WordBrewery.com

  14. […] December last year, I posted a wish list for vocabulary (flashcard) apps. At the time, I hadn’t read a couple of key research texts on the subject. It’s time […]
