Posts Tagged ‘testing’

It’s international ELT conference season again, with TESOL Chicago having just come to a close and IATEFL Brighton soon to start. I decided to take a look at how the subject of personalized learning will be covered at the second of these. Taking the conference programme , I trawled through looking for references to my topic.

Jing_word_cloudMy first question was: how do conference presenters feel about personalised learning? One way of finding out is by looking at the adjectives that are found in close proximity. This is what you get.

The overall enthusiasm is even clearer when the contexts are looked at more closely. Here are a few examples:

  • inspiring assessment, personalising learning
  • personalised training can contribute to professionalism and […] spark ideas for teacher trainers
  • a personalised educational experience that ultimately improves learner outcomes
  • personalised teacher development: is it achievable?

Particularly striking is the complete absence of anything that suggests that personalized learning might not be a ‘good thing’. The assumption throughout is that personalized learning is desirable and the only question that is asked is how it can be achieved. Unfortunately (and however much we might like to believe that it is a ‘good thing’), there is a serious lack of research evidence which demonstrates that this is the case. I have written about this here and here and here . For a useful summary of the current situation, see Benjamin Riley’s article where he writes that ‘it seems wise to ask what evidence we presently have that personalized learning works. Answer: Virtually none. One remarkable aspect of the personalized-learning craze is how quickly the concept has spread despite the almost total absence of rigorous research in support of it, at least thus far.’

Given that personalized learning can mean so many things and given the fact that people do not have space to define their terms in their conference abstracts, it is interesting to see what other aspects of language learning / teaching it is associated with. The four main areas are as follows (in alphabetical order):

  • assessment (especially formative assessment) / learning outcomes
  • continuous professional development
  • learner autonomy
  • technology / blended learning

The IATEFL TD SIG would appear to be one of the main promoters of personalized learning (or personalized teacher development) with a one-day pre-conference event entitled ‘Personalised teacher development – is it achievable?’ and a ‘showcase’ forum entitled ‘Forum on Effective & personalised: the holy grail of CPD’. Amusingly (but coincidentally, I suppose), the forum takes place in the ‘Cambridge room’ (see below).

I can understand why the SIG organisers may have chosen this focus. It’s something of a hot topic, and getting hotter. For example:

  • Cambridge University Press has identified personalization as one of the ‘six key principles of effective teacher development programmes’ and is offering tailor-made teacher development programmes for institutions.
  • NILE and Macmillan recently launched a partnership whose brief is to ‘curate personalised professional development with an appropriate mix of ‘formal’ and ‘informal’ learning delivered online, blended and face to face’.
  • Pearson has developed the Pearson’s Teacher Development Interactive (TDI) – ‘an interactive online course to train and certify teachers to deliver effective instruction in English as a foreign language […] You can complete each module on your own time, at your own pace from anywhere you have access to the internet.’

These examples do not, of course, provide any explanation for why personalized learning is a hot topic, but the answer to that is simple. Money. Billions and billions, and if you want a breakdown, have a look at the appendix of Monica Bulger’s report, ‘Personalized Learning: The Conversations We’re Not Having’ . Starting with Microsoft and the Gates Foundation plus Facebook and the Chan / Zuckerberg Foundation, dozens of venture philanthropists have thrown unimaginable sums of money at the idea of personalized learning. They have backed up their cash with powerful lobbying and their message has got through. Consent has been successfully manufactured.

PearsonOne of the most significant players in this field is Pearson, who have long been one of the most visible promoters of personalized learning (see the screen capture). At IATEFL, two of the ten conference abstracts which include the word ‘personalized’ are directly sponsored by Pearson. Pearson actually have ten presentations they have directly sponsored or are very closely associated with. Many of these do not refer to personalized learning in the abstract, but would presumably do so in the presentations themselves. There is, for example, a report on a professional development programme in Brazil using TDI (see above). There are two talks about the GSE, described as a tool ‘used to provide a personalised view of students’ language’. The marketing intent is clear: Pearson is to be associated with personalized learning (which is, in turn, associated with a variety of tech tools) – they even have a VP of data analytics, data science and personalized learning.

But the direct funding of the message is probably less important these days than the reinforcement, by those with no vested interests, of the set of beliefs, the ideology, which underpin the selling of personalized learning products. According to this script, personalized learning can promote creativity, empowerment, inclusiveness and preparedness for the real world of work. It sets itself up in opposition to lockstep and factory models of education, and sets learners free as consumers in a world of educational choice. It is a message with which it is hard for many of us to disagree.

manufacturing consentIt is also a marvellous example of propaganda, of the way that consent is manufactured. (If you haven’t read it yet, it’s probably time to read Herman and Chomsky’s ‘Manufacturing Consent: The Political Economy of the Mass Media’.) An excellent account of the way that consent for personalized learning has been manufactured can be found at Benjamin Doxtdator’s blog .

So, a hot topic it is, and its multiple inclusion in the conference programme will no doubt be welcomed by those who are selling ‘personalized’ products. It must be very satisfying to see how normalised the term has become, how it’s no longer necessary to spend too much on promoting the idea, how it’s so associated with technology, (formative) assessment, autonomy and teacher development … since others are doing it for you.

Advertisements

A personalized language learning programme that is worth its name needs to offer a wide variety of paths to accommodate the varying interests, priorities, levels and preferred approaches to learning of the users of the programme. For this to be possible, a huge quantity of learning material is needed (Iwata et al., 2011: 1): the preparation and curation of this material is extremely time-consuming and expensive (despite the pittance that is paid to writers and editors). It’s not surprising, then, that a growing amount of research is being devoted to the exploration of ways of automatically generating language learning material. One area that has attracted a lot of attention is the learning of vocabulary.

Memrise screenshot 2Many simple vocabulary learning tasks are relatively simple to generate automatically. These include matching tasks of various kinds, such as the matching of words or phrases to meanings (either in English or the L1), pictures or collocations, as in many flashcard apps. Doing it well is rather harder: the definitions or translations have to be good and appropriate for learners of the level, the pictures need to be appropriate. If, as is often the case, the lexical items have come from a text or form part of a group of some kind, sense disambiguation software will be needed to ensure that the right meaning is being practised. Anyone who has used flashcard apps knows that the major problem is usually the quality of the content (whether it has been automatically generated or written by someone).

A further challenge is the generation of distractors. In the example here (from Memrise), the distractors have been so badly generated as to render the task more or less a complete waste of time. Distractors must, in some way, be viable alternatives (Smith et al., 2010) but still clearly wrong. That means they should normally be the same part of speech and true cognates should be avoided. Research into the automatic generation of distractors is well-advanced (see, for instance, Kumar at al., 2015) with Smith et al (2010), for example, using a very large corpus and various functions of Sketch Engine (the most well-known corpus query tool) to find collocates and other distractors. Their TEDDCLOG (Testing English with Data-Driven CLOze Generation) system produced distractors that were deemed acceptable 91% of the time. Whilst impressive, there is still a long way to go before human editing / rewriting is no longer needed.

One area that has attracted attention is, of course, tests, and some tasks, such as those in TOEFL (see image). Susanti et al (2015, 2017) were able, given a target word, to automatically generate a reading passage from web sources along with questions of the TOEFL kind. However, only about half of them were considered good enough to be used in actual tests. Again, that is some way off avoiding human intervention altogether, but the automatically generated texts and questions can greatly facilitate the work of human item writers.

toefl task

 

Other tools that might be useful include the University of Nottingham AWL (Academic Word List) Gapmaker . This allows users to type or paste in a text, from which items from the AWL are extracted and replaced as a gap. See the example below. It would, presumably, not be too difficult, to combine this approach with automatic distractor generation and to create multiple choice tasks.

Nottingham_AWL_Gapmaster

WordGapThere are a number of applications that offer the possibility of generating cloze tasks from texts selected by the user (learner or teacher). These have not always been designed with the language learner in mind but one that was is the Android app, WordGap (Knoop & Wilske, 2013). Described by its developers as a tool that ‘provides highly individualized exercises to support contextualized mobile vocabulary learning …. It matches the interests of the learner and increases the motivation to learn’. It may well do all that, but then again, perhaps not. As Knoop & Wilske acknowledge, it is only appropriate for adult, advanced learners and its value as a learning task is questionable. The target item that has been automatically selected is ‘novel’, a word that features in the list Oxford 2000 Keywords (as do all three distractors), and therefore ought to be well below the level of the users. Some people might find this fun, but, in terms of learning, they would probably be better off using an app that made instant look-up of words in the text possible.

More interesting, in my view, is TEDDCLOG (Smith et al., 2010), a system that, given a target learning item (here the focus is on collocations), trawls a large corpus to find the best sentence that illustrates it. ‘Good sentences’ were defined as those which were short (but not too short, or there is not enough useful context, begins with a capital letter and ends with a full stop, has a maximum of two commas; and otherwise contains only the 26 lowercase letters. It must be at a lexical and grammatical level that an intermediate level learner of English could be expected to understand. It must be well-formed and without too much superfluous material. All others were rejected. TEDDCLOG uses Sketch Engine’s GDEX function (Good Dictionary Example Extractor, Kilgarriff et al 2008) to do this.

My own interest in this area came about as a result of my work in the development of the Oxford Vocabulary Trainer . The app offers the possibility of studying both pre-determined lexical items (e.g. the vocabulary list of a coursebook that the learner is using) and free choice (any item could be activated and sent to a learning queue). In both cases, practice takes the form of sentences with the target item gapped. There are a range of hints and help options available to the learner, and feedback is both automatic and formative (i.e. if the supplied answer is not correct, hints are given to push the learner to do better on a second attempt). Leveraging some fairly heavy technology, we were able to achieve a fair amount of success in the automation of intelligent feedback, but what had, at first sight, seemed a lesser challenge – the generation of suitable ‘carrier sentences’, proved more difficult.

The sentences which ‘carry’ the gap should, ideally, be authentic: invented examples often ‘do not replicate the phraseology and collocational preferences of naturally-occurring text’ (Smith et al., 2010). The technology of corpus search tools should allow us to do a better job than human item writers. For that to be the case, we need not only good search tools but a good corpus … and some are better than others for the purposes of language learning. As Fenogenova & Kuzmenko (2016) discovered when using different corpora to automatically generate multiple choice vocabulary exercises, the British Academic Written English corpus (BAWE) was almost 50% more useful than the British National Corpus (BNC). In the development of the Oxford Vocabulary Trainer, we thought we had the best corpus we could get our hands on – the tagged corpus used for the production of the Oxford suite of dictionaries. We could, in addition and when necessary, turn to other corpora, including the BAWE and the BNC. Our requirements for acceptable carrier sentences were similar to those of Smith et al (2010), but were considerably more stringent.

To cut quite a long story short, we learnt fairly quickly that we simply couldn’t automate the generation of carrier sentences with sufficient consistency or reliability. As with some of the other examples discussed in this post, we were able to use the technology to help the writers in their work. We also learnt (rather belatedly, it has to be admitted) that we were trying to find technological solutions to problems that we hadn’t adequately analysed at the start. We hadn’t, for example, given sufficient thought to learner differences, especially the role of L1 (and other languages) in learning English. We hadn’t thought enough about the ‘messiness’ of either language or language learning. It’s possible, given enough resources, that we could have found ways of improving the algorithms, of leveraging other tools, or of deploying additional databases (especially learner corpora) in our quest for a personalised vocabulary learning system. But, in the end, it became clear to me that we were only nibbling at the problem of vocabulary learning. Deliberate learning of vocabulary may be an important part of acquiring a language, but it remains only a relatively small part. Technology may be able to help us in a variety of ways (and much more so in testing than learning), but the dreams of the data scientists (who wrote much of the research cited here) are likely to be short-lived. Experienced writers and editors of learning materials will be needed for the foreseeable future. And truly personalized vocabulary learning, fully supported by technology, will not be happening any time soon.

 

References

Fenogenova, A. & Kuzmenko, E. 2016. Automatic Generation of Lexical Exercises Available online at http://www.dialog-21.ru/media/3477/fenogenova.pdf

Iwata, T., Goto, T., Kojiri, T., Watanabe, T. & T. Yamada. 2011. ‘Automatic Generation of English Cloze Questions Based on Machine Learning’. NTT Technical Review Vol. 9 No. 10 Oct. 2011

Kilgarriff, A. et al. 2008. ‘GDEX: Automatically Finding Good Dictionary Examples in a Corpus.’ In E. Bernal and J. DeCesaris (eds.), Proceedings of the XIII EURALEX International Congress: Barcelona, 15-19 July 2008. Barcelona: l’Institut Universitari de Lingüística Aplicada (IULA) dela Universitat Pompeu Fabra, 425–432.

Knoop, S. & Wilske, S. 2013. ‘WordGap – Automatic generation of gap-filling vocabulary exercises for mobile learning’. Proceedings of the second workshop on NLP for computer-assisted language learning at NODALIDA 2013. NEALT Proceedings Series 17 / Linköping Electronic Conference Proceedings 86: 39–47. Available online at http://www.ep.liu.se/ecp/086/004/ecp13086004.pdf

Kumar, G., Banchs, R.E. & D’Haro, L.F. 2015. ‘RevUP: Automatic Gap-Fill Question Generation from Educational Texts’. Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, 2015, pp. 154–161, Denver, Colorado, June 4, Association for Computational Linguistics

Smith, S., Avinesh, P.V.S. & Kilgariff, A. 2010. ‘Gap-fill tests for Language Learners: Corpus-Driven Item Generation’. Proceedings of ICON-2010: 8th International Conference on Natural Language Processing, Macmillan Publishers, India. Available online at https://curve.coventry.ac.uk/open/file/2b755b39-a0fa-4171-b5ae-5d39568874e5/1/smithcomb2.pdf

Susanti, Y., Iida, R. & Tokunaga, T. 2015. ‘Automatic Generation of English Vocabulary Tests’. Proceedings of 7th International Conference on Computer Supported Education. Available online https://pdfs.semanticscholar.org/aead/415c1e07803756902b859e8b6e47ce312d96.pdf

Susanti, Y., Tokunaga, T., Nishikawa, H. & H. Obari 2017. ‘Evaluation of automatically generated English vocabulary questions’ Research and Practice in Technology Enhanced Learning 12 / 11

 

Introduction

In the last post, I looked at issues concerning self-pacing in personalized language learning programmes. This time, I turn to personalized goal-setting. Most definitions of personalized learning, such as that offered by Next Generation Learning Challenges http://nextgenlearning.org/ (a non-profit supported by Educause, the Gates Foundation, the Broad Foundation, the Hewlett Foundation, among others), argue that ‘the default perspective [should be] the student’s—not the curriculum, or the teacher, and that schools need to adjust to accommodate not only students’ academic strengths and weaknesses, but also their interests, and what motivates them to succeed.’ It’s a perspective shared by the United States National Education Technology Plan 2017 https://tech.ed.gov/netp/ , which promotes the idea that learning objectives should vary based on learner needs, and should often be self-initiated. It’s shared by the massively funded Facebook initiative that is developing software that ‘puts students in charge of their lesson plans’, as the New York Times https://www.nytimes.com/2016/08/10/technology/facebook-helps-develop-software-that-puts-students-in-charge-of-their-lesson-plans.html?_r=0 put it. How, precisely, personalized goal-setting can be squared with standardized, high-stakes testing is less than clear. Are they incompatible by any chance?

In language learning, the idea that learners should have some say in what they are learning is not new, going back, at least, to the humanistic turn in the 1970s. Wilga Rivers advocated ‘giving the students opportunity to choose what they want to learn’ (Rivers, 1971: 165). A few years later, Renee Disick argued that the extent to which a learning programme can be called personalized (although she used the term ‘individualized’) depends on the extent to which learners have a say in the choice of learning objectives and the content of learning (Disick, 1975). Coming more up to date, Penny Ur advocated giving learners ‘a measure of freedom to choose how and what to learn’ (Ur, 1996: 233).

The benefits of personalized goal-setting

Personalized goal-setting is closely related to learner autonomy and learner agency. Indeed, it is hard to imagine any meaningful sense of learner autonomy or agency without some control of learning objectives. Without this control, it will be harder for learners to develop an L2 self. This matters because ‘ultimate attainment in second-language learning relies on one’s agency … [it] is crucial at the point where the individuals must not just start memorizing a dozen new words and expressions but have to decide on whether to initiate a long, painful, inexhaustive, and, for some, never-ending process of self-translation. (Pavlenko & Lantolf, 2000: 169 – 170). Put bluntly, if learners ‘have some responsibility for their own learning, they are more likely to be engaged than if they are just doing what the teacher tells them to’ (Harmer, 2012: 90). A degree of autonomy should lead to increased motivation which, in turn, should lead to increased achievement (Dickinson, 1987: 32; Cordova & Lepper, 1996: 726).

Strong evidence for these claims is not easy to provide, not least since autonomy and agency cannot be measured. However, ‘negative evidence clearly shows that a lack of agency can stifle learning by denying learners control over aspects of the language-learning process’ (Vandergriff, 2016: 91). Most language teachers (especially in compulsory education) have witnessed the negative effects that a lack of agency can generate in some students. Irrespective of the extent to which students are allowed to influence learning objectives, the desirability of agency / autonomy appears to be ‘deeply embedded in the professional consciousness of the ELT community’ (Borg and Al-Busaidi, 2012; Benson, 2016: 341). Personalized goal-setting may not, for a host of reasons, be possible in a particular learning / teaching context, but in principle it would seem to be a ‘good thing’.

Goal-setting and technology

The idea that learners might learn more and better if allowed to set their own learning objectives is hardly new, dating back at least one hundred years to the establishment of Montessori’s first Casa dei Bambini. In language teaching, the interest in personalized learning that developed in the 1970s (see my previous post) led to numerous classroom experiments in personalized goal-setting. These did not result in lasting changes, not least because the workload of teachers became ‘overwhelming’ (Disick, 1975: 128).

Closely related was the establishment of ‘self-access centres’. It was clear to anyone, like myself, who was involved in the setting-up and maintenance of a self-access centre, that they cost a lot, in terms of both money and work (Ur, 2012: 236). But there were also nagging questions about how effective they were (Morrison, 2005). Even more problematic was a bigger question: did they actually promote the learner autonomy that was their main goal?

Post-2000, online technology rendered self-access centres redundant: who needs the ‘walled garden’ of a self-access centre when ‘learners are able to connect with multiple resources and communities via the World Wide Web in entirely individual ways’ (Reinders, 2012)? The cost problem of self-access centres was solved by the web. Readily available now were ‘myriad digital devices, software, and learning platforms offering educators a once-unimaginable array of options for tailoring lessons to students’ needs’ (Cavanagh, 2014). Not only that … online technology promised to grant agency, to ‘empower language learners to take charge of their own learning’ and ‘to provide opportunities for learners to develop their L2 voice’ (Vandergriff, 2016: 32). The dream of personalized learning has become inseparable from the affordances of educational technologies.

It is, however, striking just how few online modes of language learning offer any degree of personalized goal-setting. Take a look at some of the big providers – Voxy, Busuu, Duolingo, Rosetta Stone or Babbel, for example – and you will find only the most token nods to personalized learning objectives. Course providers appear to be more interested in claiming their products are personalized (‘You decide what you want to learn and when!’) than in developing a sufficient amount of content to permit personalized goal-setting. We are left with the ELT equivalent of personalized cans of Coke: a marketing tool.

coke_cans

The problems with personalized goal-setting

Would language learning products, such as those mentioned above, be measurably any better if they did facilitate the personalization of learning objectives in a significant way? Would they be able to promote learner autonomy and agency in a way that self-access centres apparently failed to achieve? It’s time to consider the square quotes that I put around ‘good thing’.

Researchers have identified a number of potential problems with goal-setting. I have already mentioned the problem of reconciling personalized goals and standardized testing. In most learning contexts, educational authorities (usually the state) regulate the curriculum and determine assessment practices. It is difficult to see, as Campbell et al. (Campbell et al., 2007: 138) point out, how such regulation ‘could allow individual interpretations of the goals and values of education’. Most assessment systems ‘aim at convergent outcomes and homogeneity’ (Benson, 2016: 345) and this is especially true of online platforms, irrespective of their claims to ‘personalization’. In weak (typically internal) assessment systems, the potential for autonomy is strongest, but these are rare.

In all contexts, it is likely that personalized goal-setting will only lead to learning gains when a number of conditions are met. The goals that are chosen need to be both specific, measurable, challenging and non-conflicting (Ordóñez et al. 2009: 2-3). They need to be realistic: if not, it is unlikely that self-efficacy (a person’s belief about their own capability to achieve or perform to a certain level) will be promoted (Koda-Dallow & Hobbs, 2005), and without self-efficacy, improved performance is also unlikely (Bandura, 1997). The problem is that many learners lack self-efficacy and are poor self-regulators. These things are teachable / learnable, but require time and support. Many learners need help in ‘becoming aware of themselves and their own understandings’ (McMahon & Oliver, 2001: 1304). If they do not get it, the potential advantages of personalized goal-setting will be negated. As learners become better self-regulators, they will want and need to redefine their learning goals: goal-setting should be an iterative process (Hussey & Smith, 2003: 358). Again, support will be needed. In online learning, such support is not common.

A further problem that has been identified is that goal-setting can discourage a focus on non-goal areas (Ordóñez et al. 2009: 2) and can lead to ‘a focus on reaching the goal rather than on acquiring the skills required to reach it’ (Locke & Latham, 2006: 266). We know that much language learning is messy and incidental. Students do not only learn the particular thing that they are studying at the time (the belief that they do was described by Dewey as ‘the greatest of all pedagogical fallacies’). Goal-setting, even when personalized, runs the risk of promoting tunnel-vision.

The incorporation of personalized goal-setting in online language learning programmes is, in so many ways, a far from straightforward matter. Simply tacking it onto existing programmes is unlikely to result in anything positive: it is not an ‘over-the-counter treatment for motivation’ (Ordóñez et al.:2). Course developers will need to look at ‘the complex interplay between goal-setting and organizational contexts’ (Ordóñez et al. 2009: 16). Motivating students is not simply ‘a matter of the teacher deploying the correct strategies […] it is an intensely interactive process’ (Lamb, M. 2017). More generally, developers need to move away from a positivist and linear view of learning as a technical process where teaching interventions (such as the incorporation of goal-setting, the deployment of gamification elements or the use of a particular algorithm) will lead to predictable student outcomes. As Larry Cuban reminds us, ‘no persuasive body of evidence exists yet to confirm that belief (Cuban, 1986: 88). The most recent research into personalized learning has failed to identify any single element of personalization that can be clearly correlated with improved outcomes (Pane et al., 2015: 28).

In previous posts, I considered learning styles and self-pacing, two aspects of personalized learning that are highly problematic. Personalized goal-setting is no less so.

References

Bandura, A. 1997. Self-efficacy: The exercise of control. New York: W.H. Freeman and Company

Benson, P. 2016. ‘Learner Autonomy’ in Hall, G. (ed.) The Routledge Handbook of English Language Teaching. Abingdon: Routledge. pp.339 – 352

Borg, S. & Al-Busaidi, S. 2012. ‘Teachers’ beliefs and practices regarding learner autonomy’ ELT Journal 66 / 3: 283 – 292

Cavanagh, S. 2014. ‘What Is ‘Personalized Learning’? Educators Seek Clarity’ Education Week http://www.edweek.org/ew/articles/2014/10/22/09pl-overview.h34.html

Cordova, D. I. & Lepper, M. R. 1996. ‘Intrinsic Motivation and the Process of Learning: Beneficial Effects of Contextualization, Personalization, and Choice’ Journal of Educational Psychology 88 / 4: 715 -739

Cuban, L. 1986. Teachers and Machines. New York: Teachers College Press

Dickinson, L. 1987. Self-instruction in Language Learning. Cambridge: Cambridge University Press

Disick, R.S. 1975 Individualizing Language Instruction: Strategies and Methods. New York: Harcourt Brace Jovanovich

Harmer, J. 2012. Essential Teacher Knowledge. Harlow: Pearson Education

Hussey, T. & Smith, P. 2003. ‘The Uses of Learning Outcomes’ Teaching in Higher Education 8 / 3: 357 – 368

Lamb, M. 2017 (in press) ‘The motivational dimension of language teaching’ Language Teaching 50 / 3

Locke, E. A. & Latham, G. P. 2006. ‘New Directions in Goal-Setting Theory’ Current Directions in Psychological Science 15 / 5: 265 – 268

McMahon, M. & Oliver, R. (2001). Promoting self-regulated learning in an on-line environment. In C. Montgomerie & J. Viteli (Eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2001 (pp. 1299-1305). Chesapeake, VA: AACE

Morrison, B. 2005. ‘Evaluating learning gain in a self-access learning centre’ Language Teaching Research 9 / 3: 267 – 293

Ordóñez, L. D., Schweitzer, M. E., Galinsky, A. D. & Bazerman, M. H. 2009. Goals Gone Wild: The Systematic Side Effects of Over-Prescribing Goal Setting. Harvard Business School Working Paper 09-083

Pane, J. F., Steiner, E. D., Baird, M. D. & Hamilton, L. S. 2015. Continued Progress: Promising Evidence on Personalized Learning. Seattle: Rand Corporation

Pavlenko, A. & Lantolf, J. P. 2000. ‘Second language learning as participation and the (re)construction of selves’ In J.P. Lantolf (ed.), Sociocultural Theory and Second Language Learning. Oxford: Oxford University Press, pp. 155 – 177

Reinders, H. 2012. ‘The end of self-access? From walled garden to public park’ ELT World Online 4: 1 – 5

Rivers, W. M. 1971. ‘Techniques for Developing Proficiency in the Spoken Language in an Individualized Foreign Language program’ in Altman, H.B. & Politzer, R.L. (eds.) 1971. Individualizing Foreign Language Instruction: Proceedings of the Stanford Conference, May 6 – 8, 1971. Washington, D.C.: Office of Education, U.S. Department of Health, Education, and Welfare. pp. 165 – 169

Ur, P. 1996. A Course in Language Teaching: Practice and Theory. Cambridge: Cambridge University Press

Ur, P. 2012. A Course in English Language Teaching. Cambridge: Cambridge University Press

Vandergriff, I. Second-language Discourse in the Digital World. 2016. Amsterdam: John Benjamins

Introduction

Allowing learners to determine the amount of time they spend studying, and, therefore (in theory at least) the speed of their progress is a key feature of most personalized learning programs. In cases where learners follow a linear path of pre-determined learning items, it is often the only element of personalization that the programs offer. In the Duolingo program that I am using, there are basically only two things that can be personalized: the amount of time I spend studying each day, and the possibility of jumping a number of learning items by ‘testing out’.

Self-regulated learning or self-pacing, as this is commonly referred to, has enormous intuitive appeal. It is clear that different people learn different things at different rates. We’ve known for a long time that ‘the developmental stages of child growth and the individual differences among learners make it impossible to impose a single and ‘correct’ sequence on all curricula’ (Stern, 1983: 439). It therefore follows that it makes even less sense for a group of students (typically determined by age) to be obliged to follow the same curriculum at the same pace in a one-size-fits-all approach. We have probably all experienced, as students, the frustration of being behind, or ahead of, the rest of our colleagues in a class. One student who suffered from the lockstep approach was Sal Khan, founder of the Khan Academy. He has described how he was fed up with having to follow an educational path dictated by his age and how, as a result, individual pacing became an important element in his educational approach (Ferster, 2014: 132-133). As teachers, we have all experienced the challenges of teaching a piece of material that is too hard or too easy for many of the students in the class.

Historical attempts to facilitate self-paced learning

Charles_W__Eliot_cph_3a02149An interest in self-paced learning can be traced back to the growth of mass schooling and age-graded classes in the 19th century. In fact, the ‘factory model’ of education has never existed without critics who saw the inherent problems of imposing uniformity on groups of individuals. These critics were not marginal characters. Charles Eliot (president of Harvard from 1869 – 1909), for example, described uniformity as ‘the curse of American schools’ and argued that ‘the process of instructing students in large groups is a quite sufficient school evil without clinging to its twin evil, an inflexible program of studies’ (Grittner, 1975: 324).

Attempts to develop practical solutions were not uncommon and these are reasonably well-documented. One of the earliest, which ran from 1884 to 1894, was launched in Pueblo, Colorado and was ‘a self-paced plan that required each student to complete a sequence of lessons on an individual basis’ (Januszewski, 2001: 58-59). More ambitious was the Burk Plan (at its peak between 1912 and 1915), named after Frederick Burk of the San Francisco State Normal School, which aimed to allow students to progress through materials (including language instruction materials) at their own pace with only a limited amount of teacher presentations (Januszewski, ibid.). Then, there was the Winnetka Plan (1920s), developed by Carlton Washburne, an associate of Frederick Burk and the superintendent of public schools in Winnetka, Illinois, which also ‘allowed learners to proceed at different rates, but also recognised that learners proceed at different rates in different subjects’ (Saettler, 1990: 65). The Winnetka Plan is especially interesting in the way it presaged contemporary attempts to facilitate individualized, self-paced learning. It was described by its developers in the following terms:

A general technique [consisting] of (a) breaking up the common essentials curriculum into very definite units of achievement, (b) using complete diagnostic tests to determine whether a child has mastered each of these units, and, if not, just where his difficulties lie and, (c) the full use of self-instructive, self corrective practice materials. (Washburne, C., Vogel, M. & W.S. Gray. 1926. A Survey of the Winnetka Public Schools. Bloomington, IL: Public School Press)

Not dissimilar was the Dalton (Massachusetts) Plan in the 1920s which also used a self-paced program to accommodate the different ability levels of the children and deployed contractual agreements between students and teachers (something that remains common educational practice around the world). There were many others, both in the U.S. and other parts of the world.

The personalization of learning through self-pacing was not, therefore, a minor interest. Between 1910 and 1924, nearly 500 articles can be documented on the subject of individualization (Grittner, 1975: 328). In just three years (1929 – 1932) of one publication, The Education Digest, there were fifty-one articles dealing with individual instruction and sixty-three entries treating individual differences (Chastain, 1975: 334). Foreign language teaching did not feature significantly in these early attempts to facilitate self-pacing, but see the Burk Plan described above. Only a handful of references to language learning and self-pacing appeared in articles between 1916 and 1924 (Grittner, 1975: 328).

Disappointingly, none of these initiatives lasted long. Both costs and management issues had been significantly underestimated. Plans such as those described above were seen as progress, but not the hoped-for solution. Problems included the fact that the materials themselves were not individualized and instructional methods were too rigid (Pendleton, 1930: 199). However, concomitant with the interest in individualization (mostly, self-pacing), came the advent of educational technology.

Sidney L. Pressey, the inventor of what was arguably the first teaching machine, was inspired by his experiences with schoolchildren in rural Indiana in the 1920s where he ‘was struck by the tremendous variation in their academic abilities and how they were forced to progress together at a slow, lockstep pace that did not serve all students well’ (Ferster, 2014: 52). Although Pressey failed in his attempts to promote his teaching machines, he laid the foundation stones in the synthesizing of individualization and technology.Pressey machine

Pressey may be seen as the direct precursor of programmed instruction, now closely associated with B. F. Skinner (see my post on Behaviourism and Adaptive Learning). It is a quintessentially self-paced approach and is described by John Hattie as follows:

Programmed instruction is a teaching method of presenting new subject matter to students in graded sequence of controlled steps. A book version, for example, presents a problem or issue, then, depending on the student’s answer to a question about the material, the student chooses from optional answers which refers them to particular pages of the book to find out why they were correct or incorrect – and then proceed to the next part of the problem or issue. (Hattie, 2009: 231)

Programmed instruction was mostly used for the teaching of mathematics, but it is estimated that 4% of programmed instruction programs were for foreign languages (Saettler, 1990: 297). It flourished in the 1960s and 1970s, but even by 1968 foreign language instructors were sceptical (Valdman, 1968). A survey carried out by the Center for Applied Linguistics revealed then that only about 10% of foreign language teachers at college and university reported the use of programmed materials in their departments. (Valdman, 1968: 1).grolier min max

Research studies had failed to demonstrate the effectiveness of programmed instruction (Saettler, 1990: 303). Teachers were often resistant and students were often bored, finding ‘ingenious ways to circumvent the program, including the destruction of their teaching machines!’ (Saettler, ibid.).

In the case of language learning, there were other problems. For programmed instruction to have any chance of working, it was necessary to specify rigorously the initial and terminal behaviours of the learner so that the intermediate steps leading from the former to the latter could be programmed. As Valdman (1968: 4) pointed out, this is highly problematic when it comes to languages (a point that I have made repeatedly in this blog). In addition, students missed the personal interaction that conventional instruction offered, got bored and lacked motivation (Valdman, 1968: 10).

Programmed instruction worked best when teachers were very enthusiastic, but perhaps the most significant lesson to be learned from the experiments was that it was ‘a difficult, time-consuming task to introduce programmed instruction’ (Saettler, 1990: 299). It entailed changes to well-established practices and attitudes, and for such changes to succeed there must be consideration of the social, political, and economic contexts. As Saettler (1990: 306), notes, ‘without the support of the community and the entire teaching staff, sustained innovation is unlikely’. In this light, Hattie’s research finding that ‘when comparisons are made between many methods, programmed instruction often comes near the bottom’ (Hattie, 2009: 231) comes as no great surprise.

Just as programmed instruction was in its death throes, the world of language teaching discovered individualization. Launched as a deliberate movement in the early 1970s at the Stanford Conference (Altman & Politzer, 1971), it was a ‘systematic attempt to allow for individual differences in language learning’ (Stern, 1983: 387). Inspired, in part, by the work of Carl Rogers, this ‘humanistic turn’ was a recognition that ‘each learner is unique in personality, abilities, and needs. Education must be personalized to fit the individual; the individual must not be dehumanized in order to meet the needs of an impersonal school system’ (Disick, 1975:38). In ELT, this movement found many adherents and remains extremely influential to this day.

In language teaching more generally, the movement lost impetus after a few years, ‘probably because its advocates had underestimated the magnitude of the task they had set themselves in trying to match individual learner characteristics with appropriate teaching techniques’ (Stern, 1983: 387). What precisely was meant by individualization was never adequately defined or agreed (a problem that remains to the present time). What was left was self-pacing. In 1975, it was reported that ‘to date the majority of the programs in second-language education have been characterized by a self-pacing format […]. Practice seems to indicate that ‘individualized’ instruction is being defined in the class room as students studying individually’ (Chastain, 1975: 344).

Lessons to be learned

This brief account shows that historical attempts to facilitate self-pacing have largely been characterised by failure. The starting point of all these attempts remains as valid as ever, but it is clear that practical solutions are less than simple. To avoid the insanity of doing the same thing over and over again and expecting different results, we should perhaps try to learn from the past.

One of the greatest challenges that teachers face is dealing with different levels of ability in their classes. In any blended scenario where the online component has an element of self-pacing, the challenge will be magnified as ability differentials are likely to grow rather than decrease as a result of the self-pacing. Bart Simpson hit the nail on the head in a memorable line: ‘Let me get this straight. We’re behind the rest of the class and we’re going to catch up to them by going slower than they are? Coo coo!’ Self-pacing runs into immediate difficulties when it comes up against standardised tests and national or state curriculum requirements. As Ferster observes, ‘the notion of individual pacing [remains] antithetical to […] a graded classroom system, which has been the model of schools for the past century. Schools are just not equipped to deal with students who do not learn in age-processed groups, even if this system is clearly one that consistently fails its students (Ferster, 2014: 90-91).bart_simpson

Ability differences are less problematic if the teacher focusses primarily on communicative tasks in F2F time (as opposed to more teaching of language items), but this is a big ‘if’. Many teachers are unsure of how to move towards a more communicative style of teaching, not least in large classes in compulsory schooling. Since there are strong arguments that students would benefit from a more communicative, less transmission-oriented approach anyway, it makes sense to focus institutional resources on equipping teachers with the necessary skills, as well as providing support, before a shift to a blended, more self-paced approach is implemented.

Such issues are less important in private institutions, which are not age-graded, and in self-study contexts. However, even here there may be reasons to proceed cautiously before buying into self-paced approaches. Self-pacing is closely tied to autonomous goal-setting (which I will look at in more detail in another post). Both require a degree of self-awareness at a cognitive and emotional level (McMahon & Oliver, 2001), but not all students have such self-awareness (Magill, 2008). If students do not have the appropriate self-regulatory strategies and are simply left to pace themselves, there is a chance that they will ‘misregulate their learning, exerting control in a misguided or counterproductive fashion and not achieving the desired result’ (Kirschner & van Merriënboer, 2013: 177). Before launching students on a path of self-paced language study, ‘thought needs to be given to the process involved in users becoming aware of themselves and their own understandings’ (McMahon & Oliver, 2001: 1304). Without training and support provided both before and during the self-paced study, the chances of dropping out are high (as we see from the very high attrition rate in language apps).

However well-intentioned, many past attempts to facilitate self-pacing have also suffered from the poor quality of the learning materials. The focus was more on the technology of delivery, and this remains the case today, as many posts on this blog illustrate. Contemporary companies offering language learning programmes show relatively little interest in the content of the learning (take Duolingo as an example). Few app developers show signs of investing in experienced curriculum specialists or materials writers. Glossy photos, contemporary videos, good UX and clever gamification, all of which become dull and repetitive after a while, do not compensate for poorly designed materials.

Over forty years ago, a review of self-paced learning concluded that the evidence on its benefits was inconclusive (Allison, 1975: 5). Nothing has changed since. For some people, in some contexts, for some of the time, self-paced learning may work. Claims that go beyond that cannot be substantiated.

References

Allison, E. 1975. ‘Self-Paced Instruction: A Review’ The Journal of Economic Education 7 / 1: 5 – 12

Altman, H.B. & Politzer, R.L. (eds.) 1971. Individualizing Foreign Language Instruction: Proceedings of the Stanford Conference, May 6 – 8, 1971. Washington, D.C.: Office of Education, U.S. Department of Health, Education, and Welfare

Chastain, K. 1975. ‘An Examination of the Basic Assumptions of “Individualized” Instruction’ The Modern Language Journal 59 / 7: 334 – 344

Disick, R.S. 1975 Individualizing Language Instruction: Strategies and Methods. New York: Harcourt Brace Jovanovich

Ferster, B. 2014. Teaching Machines. Baltimore: John Hopkins University Press

Grittner, F. M. 1975. ‘Individualized Instruction: An Historical Perspective’ The Modern Language Journal 59 / 7: 323 – 333

Hattie, J. 2009. Visible Learning. Abingdon, Oxon.: Routledge

Januszewski, A. 2001. Educational Technology: The Development of a Concept. Englewood, Colorado: Libraries Unlimited

Kirschner, P. A. & van Merriënboer, J. J. G. 2013. ‘Do Learners Really Know Best? Urban Legends in Education’ Educational Psychologist, 48:3, 169-183

Magill, D. S. 2008. ‘What Part of Self-Paced Don’t You Understand?’ University of Wisconsin 24th Annual Conference on Distance Teaching & Learning Conference Proceedings.

McMahon, M. & Oliver, R. 2001. ‘Promoting self-regulated learning in an on-line environment’ in C. Montgomerie & J. Viteli (eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2001 (pp. 1299-1305). Chesapeake, VA: AACE

Pendleton, C. S. 1930. ‘Personalizing English Teaching’ Peabody Journal of Education 7 / 4: 195 – 200

Saettler, P. 1990. The Evolution of American Educational Technology. Denver: Libraries Unlimited

Stern, H.H. 1983. Fundamental Concepts of Language Teaching. Oxford: Oxford University Press

Valdman, A. 1968. ‘Programmed Instruction versus Guided Learning in Foreign Language Acquisition’ Die Unterrichtspraxis / Teaching German 1 / 2: 1 – 14

 

by Philip Kerr & Andrew Wickham

from IATEFL 2016 Birmingham Conference Selections (ed. Tania Pattison) Faversham, Kent: IATEFL pp. 75 – 78

ELT publishing, international language testing and private language schools are all industries: products are produced, bought and sold for profit. English language teaching (ELT) is not. It is an umbrella term that is used to describe a range of activities, some of which are industries, and some of which (such as English teaching in high schools around the world) might better be described as public services. ELT, like education more generally, is, nevertheless, often referred to as an ‘industry’.

Education in a neoliberal world

The framing of ELT as an industry is both a reflection of how we understand the term and a force that shapes our understanding. Associated with the idea of ‘industry’ is a constellation of other ideas and words (such as efficacy, productivity, privatization, marketization, consumerization, digitalization and globalization) which become a part of ELT once it is framed as an industry. Repeated often enough, ‘ELT as an industry’ can become a metaphor that we think and live by. Those activities that fall under the ELT umbrella, but which are not industries, become associated with the desirability of industrial practices through such discourse.

The shift from education, seen as a public service, to educational managerialism (where education is seen in industrial terms with a focus on efficiency, free market competition, privatization and a view of students as customers) can be traced to the 1980s and 1990s (Gewirtz, 2001). In 1999, under pressure from developed economies, the General Agreement on Trade in Services (GATS) transformed education into a commodity that could be traded like any other in the marketplace (Robertson, 2006). The global industrialisation and privatization of education continues to be promoted by transnational organisations (such as the World Bank and the OECD), well-funded free-market think-tanks (such as the Cato Institute), philanthro-capitalist foundations (such as the Gates Foundation) and educational businesses (such as Pearson) (Ball, 2012).

Efficacy and learning outcomes

Managerialist approaches to education require educational products and services to be measured and compared. In ELT, the most visible manifestation of this requirement is the current ubiquity of learning outcomes. Contemporary coursebooks are full of ‘can-do’ statements, although these are not necessarily of any value to anyone. Examples from one unit of one best-selling course include ‘Now I can understand advice people give about hotels’ and ‘Now I can read an article about unique hotels’ (McCarthy et al. 2014: 74). However, in a world where accountability is paramount, they are deemed indispensable. The problem from a pedagogical perspective is that teaching input does not necessarily equate with learning uptake. Indeed, there is no reason why it should.

Drawing on the Common European Framework of Reference for Languages (CEFR) for inspiration, new performance scales have emerged in recent years. These include the Cambridge English Scale and the Pearson Global Scale of English. Moving away from the broad six categories of the CEFR, such scales permit finer-grained measurement and we now see individual vocabulary and grammar items tagged to levels. Whilst such initiatives undoubtedly support measurements of efficacy, the problem from a pedagogical perspective is that they assume that language learning is linear and incremental, as opposed to complex and jagged.

Given the importance accorded to the measurement of language learning (or what might pass for language learning), it is unsurprising that attention is shifting towards the measurement of what is probably the most important factor impacting on learning: the teaching. Teacher competency scales have been developed by Cambridge Assessment, the British Council and EAQUALS (Evaluation and Accreditation of Quality Language Services), among others.

The backwash effects of the deployment of such scales are yet to be fully experienced, but the likely increase in the perception of both language learning and teacher learning as the synthesis of granularised ‘bits of knowledge’ is cause for concern.

Digital technology

Digital technology may offer advantages to both English language teachers and learners, but its rapid growth in language learning is the result, primarily but not exclusively, of the way it has been promoted by those who stand to gain financially. In education, generally, and in English language teaching, more specifically, advocacy of the privatization of education is always accompanied by advocacy of digitalization. The global market for digital English language learning products was reported to be $2.8 billion in 2015 and is predicted to reach $3.8 billion by 2020 (Ambient Insight, 2016).

In tandem with the increased interest in measuring learning outcomes, there is fierce competition in the market for high-stakes examinations, and these are increasingly digitally delivered and marked. In the face of this competition and in a climate of digital disruption, companies like Pearson and Cambridge English are developing business models of vertical integration where they can provide and sell everything from placement testing, to courseware (either print or delivered through an LMS), teaching, assessment and teacher training. Huge investments are being made in pursuit of such models. Pearson, for example, recently bought GlobalEnglish, Wall Street English, and set up a partnership with Busuu, thus covering all aspects of language learning from resources provision and publishing to off- and online training delivery.

As regards assessment, the most recent adult coursebook from Cambridge University Press (in collaboration with Cambridge English Language Assessment), ‘Empower’ (Doff, et. Al, 2015) sells itself on a combination of course material with integrated, validated assessment.

Besides its potential for scalability (and therefore greater profit margins), the appeal (to some) of platform-delivered English language instruction is that it facilitates assessment that is much finer-grained and actionable in real time. Digitization and testing go hand in hand.

Few English language teachers have been unaffected by the move towards digital. In the state sectors, large-scale digitization initiatives (such as the distribution of laptops for educational purposes, the installation of interactive whiteboards, the move towards blended models of instruction or the move away from printed coursebooks) are becoming commonplace. In the private sectors, online (or partially online) language schools are taking market share from the traditional bricks-and-mortar institutions.

These changes have entailed modifications to the skill-sets that teachers need to have. Two announcements at this conference reflect this shift. First of all, Cambridge English launched their ‘Digital Framework for Teachers’, a matrix of six broad competency areas organised into four levels of proficiency. Secondly, Aqueduto, the Association for Quality Education and Training Online, was launched, setting itself up as an accreditation body for online or blended teacher training courses.

Teachers’ pay and conditions

In the United States, and likely soon in the UK, the move towards privatization is accompanied by an overt attack on teachers’ unions, rights, pay and conditions (Selwyn, 2014). As English language teaching in both public and private sectors is commodified and marketized it is no surprise to find that the drive to bring down costs has a negative impact on teachers worldwide. Gwynt (2015), for example, catalogues cuts in funding, large-scale redundancies, a narrowing of the curriculum, intensified workloads (including the need to comply with ‘quality control measures’), the deskilling of teachers, dilapidated buildings, minimal resources and low morale in an ESOL department in one British further education college. In France, a large-scale study by Wickham, Cagnol, Wright and Oldmeadow (Linguaid, 2015; Wright, 2016) found that EFL teachers in the very competitive private sector typically had multiple employers, limited or no job security, limited sick pay and holiday pay, very little training and low hourly rates that were deteriorating. One of the principle drivers of the pressure on salaries is the rise of online training delivery through Skype and other online platforms, using offshore teachers in low-cost countries such as the Philippines. This type of training represents 15% in value and up to 25% in volume of all language training in the French corporate sector and is developing fast in emerging countries. These examples are illustrative of a broad global trend.

Implications

Given the current climate, teachers will benefit from closer networking with fellow professionals in order, not least, to be aware of the rapidly changing landscape. It is likely that they will need to develop and extend their skill sets (especially their online skills and visibility and their specialised knowledge), to differentiate themselves from competitors and to be able to demonstrate that they are in tune with current demands. More generally, it is important to recognise that current trends have yet to run their full course. Conditions for teachers are likely to deteriorate further before they improve. More than ever before, teachers who want to have any kind of influence on the way that marketization and industrialization are shaping their working lives will need to do so collectively.

References

Ambient Insight. 2016. The 2015-2020 Worldwide Digital English Language Learning Market. http://www.ambientinsight.com/Resources/Documents/AmbientInsight_2015-2020_Worldwide_Digital_English_Market_Sample.pdf

Ball, S. J. 2012. Global Education Inc. Abingdon, Oxon.: Routledge

Doff, A., Thaine, C., Puchta, H., Stranks, J. and P. Lewis-Jones 2015. Empower. Cambridge: Cambridge University Press

Gewirtz, S. 2001. The Managerial School: Post-welfarism and Social Justice in Education. Abingdon, Oxon.: Routledge

Gwynt, W. 2015. ‘The effects of policy changes on ESOL’. Language Issues 26 / 2: 58 – 60

McCarthy, M., McCarten, J. and H. Sandiford 2014. Touchstone 2 Student’s Book Second Edition. Cambridge: Cambridge University Press

Linguaid, 2015. Le Marché de la Formation Langues à l’Heure de la Mondialisation. Guildford: Linguaid

Robertson, S. L. 2006. ‘Globalisation, GATS and trading in education services.’ published by the Centre for Globalisation, Education and Societies, University of Bristol, Bristol BS8 1JA, UK at http://www.bris.ac.uk/education/people/academicStaff/edslr/publications/04slr

Selwyn, N. 2014. Distrusting Educational Technology. New York: Routledge

Wright, R. 2016. ‘My teacher is rich … or not!’ English Teaching Professional 103: 54 – 56

 

 

Every now and then, someone recommends me to take a look at a flashcard app. It’s often interesting to see what developers have done with design, gamification and UX features, but the content is almost invariably awful. Most recently, I was encouraged to look at Word Pash. The screenshots below are from their promotional video.

word-pash-1 word-pash-2 word-pash-3 word-pash-4

The content problems are immediately apparent: an apparently random selection of target items, an apparently random mix of high and low frequency items, unidiomatic language examples, along with definitions and distractors that are less frequent than the target item. I don’t know if these are representative of the rest of the content. The examples seem to come from ‘Stage 1 Level 3’, whatever that means. (My confidence in the product was also damaged by the fact that the Word Pash website includes one testimonial from a certain ‘Janet Reed – Proud Mom’, whose son ‘was able to increase his score and qualify for academic scholarships at major universities’ after using the app. The picture accompanying ‘Janet Reed’ is a free stock image from Pexels and ‘Janet Reed’ is presumably fictional.)

According to the website, ‘WordPash is a free-to-play mobile app game for everyone in the global audience whether you are a 3rd grader or PhD, wordbuff or a student studying for their SATs, foreign student or international business person, you will become addicted to this fast paced word game’. On the basis of the promotional video, the app couldn’t be less appropriate for English language learners. It seems unlikely that it would help anyone improve their ACT or SAT test scores. The suggestion that the vocabulary development needs of 9-year-olds and doctoral students are comparable is pure chutzpah.

The deliberate study of more or less random words may be entertaining, but it’s unlikely to lead to very much in practical terms. For general purposes, the deliberate learning of the highest frequency words, up to about a frequency ranking of #7500, makes sense, because there’s a reasonably high probability that you’ll come across these items again before you’ve forgotten them. Beyond that frequency level, the value of the acquisition of an additional 1000 words tails off very quickly. Adding 1000 words from frequency ranking #8000 to #9000 is likely to result in an increase in lexical understanding of general purpose texts of about 0.2%. When we get to frequency ranks #19,000 to #20,000, the gain in understanding decreases to 0.01%[1]. In other words, deliberate vocabulary learning needs to be targeted. The data is relatively recent, but the principle goes back to at least the middle of the last century when Michael West argued that a principled approach to vocabulary development should be driven by a comparison of the usefulness of a word and its ‘learning cost’[2]. Three hundred years before that, Comenius had articulated something very similar: ‘in compiling vocabularies, my […] concern was to select the words in most frequent use[3].

I’ll return to ‘general purposes’ later in this post, but, for now, we should remember that very few language learners actually study a language for general purposes. Globally, the vast majority of English language learners study English in an academic (school) context and their immediate needs are usually exam-specific. For them, general purpose frequency lists are unlikely to be adequate. If they are studying with a coursebook and are going to be tested on the lexical content of that book, they will need to use the wordlist that matches the book. Increasingly, publishers make such lists available and content producers for vocabulary apps like Quizlet and Memrise often use them. Many examinations, both national and international, also have accompanying wordlists. Examples of such lists produced by examination boards include the Cambridge English young learners’ exams (Starters, Movers and Flyers) and Cambridge English Preliminary. Other exams do not have official word lists, but reasonably reliable lists have been produced by third parties. Examples include Cambridge First, IELTS and SAT. There are, in addition, well-researched wordlists for academic English, including the Academic Word List (AWL)  and the Academic Vocabulary List  (AVL). All of these make sensible starting points for deliberate vocabulary learning.

When we turn to other, out-of-school learners the number of reasons for studying English is huge. Different learners have different lexical needs, and working with a general purpose frequency list may be, at least in part, a waste of time. EFL and ESL learners are likely to have very different needs, as will EFL and ESP learners, as will older and younger learners, learners in different parts of the world, learners who will find themselves in English-speaking countries and those who won’t, etc., etc. For some of these demographics, specialised corpora (from which frequency-based wordlists can be drawn) exist. For most learners, though, the ideal list simply does not exist. Either it will have to be created (requiring a significant amount of time and expertise[4]) or an available best-fit will have to suffice. Paul Nation, in his recent ‘Making and Using Word Lists for Language Learning and Testing’ (John Benjamins, 2016) includes a useful chapter on critiquing wordlists. For anyone interested in better understanding the issues surrounding the development and use of wordlists, three good articles are freely available online. These are:making-and-using-word-lists-for-language-learning-and-testing

Lessard-Clouston, M. 2012 / 2013. ‘Word Lists for Vocabulary Learning and Teaching’ The CATESOL Journal 24.1: 287- 304

Lessard-Clouston, M. 2016. ‘Word lists and vocabulary teaching: options and suggestions’ Cornerstone ESL Conference 2016

Sorell, C. J. 2013. A study of issues and techniques for creating core vocabulary lists for English as an International Language. Doctoral thesis.

But, back to ‘general purposes’ …. Frequency lists are the obvious starting point for preparing a wordlist for deliberate learning, but they are very problematic. Frequency rankings depend on the corpus on which they are based and, since these are different, rankings vary from one list to another. Even drawing on just one corpus, rankings can be a little strange. In the British National Corpus, for example, ‘May’ (the month) is about twice as frequent as ‘August’[5], but we would be foolish to infer from this that the learning of ‘May’ should be prioritised over the learning of ‘August’. An even more striking example from the same corpus is the fact that ‘he’ is about twice as frequent as ‘she’[6]: should, therefore, ‘he’ be learnt before ‘she’?

List compilers have to make a number of judgement calls in their work. There is not space here to consider these in detail, but two particularly tricky questions concerning the way that words are chosen may be mentioned: Is a verb like ‘list’, with two different and unrelated meanings, one word or two? Should inflected forms be considered as separate words? The judgements are not usually informed by considerations of learners’ needs. Learners will probably best approach vocabulary development by building their store of word senses: attempting to learn all the meanings and related forms of any given word is unlikely to be either useful or successful.

Frequency lists, in other words, are not statements of scientific ‘fact’: they are interpretative documents. They have been compiled for descriptive purposes, not as ways of structuring vocabulary learning, and it cannot be assumed they will necessarily be appropriate for a purpose for which they were not designed.

A further major problem concerns the corpus on which the frequency list is based. Large databases, such as the British National Corpus or the Corpus of Contemporary American English, are collections of language used by native speakers in certain parts of the world, usually of a restricted social class. As such, they are of relatively little value to learners who will be using English in contexts that are not covered by the corpus. A context where English is a lingua franca is one such example.

A different kind of corpus is the Cambridge Learner Corpus (CLC), a collection of exam scripts produced by candidates in Cambridge exams. This has led to the development of the English Vocabulary Profile (EVP) , where word senses are tagged as corresponding to particular levels in the Common European Framework scale. At first glance, this looks like a good alternative to frequency lists based on native-speaker corpora. But closer consideration reveals many problems. The design of examination tasks inevitably results in the production of language of a very different kind from that produced in other contexts. Many high frequency words simply do not appear in the CLC because it is unlikely that a candidate would use them in an exam. Other items are very frequent in this corpus just because they are likely to be produced in examination tasks. Unsurprisingly, frequency rankings in EVP do not correlate very well with frequency rankings from other corpora. The EVP, then, like other frequency lists, can only serve, at best, as a rough guide for the drawing up of target item vocabulary lists in general purpose apps or coursebooks[7].

There is no easy solution to the problems involved in devising suitable lexical content for the ‘global audience’. Tagging words to levels (i.e. grouping them into frequency bands) will always be problematic, unless very specific user groups are identified. Writers, like myself, of general purpose English language teaching materials are justifiably irritated by some publishers’ insistence on allocating words to levels with numerical values. The policy, taken to extremes (as is increasingly the case), has little to recommend it in linguistic terms. But it’s still a whole lot better than the aleatory content of apps like Word Pash.

[1] See Nation, I.S.P. 2013. Learning Vocabulary in Another Language 2nd edition. (Cambridge: Cambridge University Press) p. 21 for statistical tables. See also Nation, P. & R. Waring 1997. ‘Vocabulary size, text coverage and word lists’ in Schmitt & McCarthy (eds.) 1997. Vocabulary: Description, Acquisition and Pedagogy. (Cambridge: Cambridge University Press) pp. 6 -19

[2] See Kelly, L.G. 1969. 25 Centuries of Language Teaching. (Rowley, Mass.: Rowley House) p.206 for a discussion of West’s ideas.

[3] Kelly, L.G. 1969. 25 Centuries of Language Teaching. (Rowley, Mass.: Rowley House) p. 184

[4] See Timmis, I. 2015. Corpus Linguistics for ELT (Abingdon: Routledge) for practical advice on doing this.

[5] Nation, I.S.P. 2016. Making and Using Word Lists for Language Learning and Testing. (Amsterdam: John Benjamins) p.58

[6] Taylor, J.R. 2012. The Mental Corpus. (Oxford: Oxford University Press) p.151

[7] For a detailed critique of the limitations of using the CLC as a guide to syllabus design and textbook development, see Swan, M. 2014. ‘A Review of English Profile Studies’ ELTJ 68/1: 89-96

I have been putting in a lot of time studying German vocabulary with Memrise lately, but this is not a review of the Memrise app. For that, I recommend you read Marek Kiczkowiak’s second post on this app. Like me, he’s largely positive, although I am less enthusiastic about Memrise’s USP, the use of mnemonics. It’s not that mnemonics don’t work – there’s a lot of evidence that they do: it’s just that there is little or no evidence that they’re worth the investment of time.

Time … as I say, I have been putting in the hours. Every day, for over a month, averaging a couple of hours a day, it’s enough to get me very near the top of the leader board (which I keep a very close eye on) and it means that I am doing more work than 99% of other users. And, yes, my German is improving.

Putting in the time is the sine qua non of any language learning and a well-designed app must motivate users to do this. Relevant content will be crucial, as will satisfactory design, both visual and interactive. But here I’d like to focus on the two other key elements: task design / variety and gamification.

Memrise offers a limited range of task types: presentation cards (with word, phrase or sentence with translation and audio recording), multiple choice (target item with four choices), unscrambling letters or words, and dictation (see below).

Screenshot_2016-05-24-08-10-42Screenshot_2016-05-24-08-10-57Screenshot_2016-05-24-08-11-24Screenshot_2016-05-24-08-11-45Screenshot_2016-05-24-08-12-51Screenshot_2016-05-24-08-13-44

As Marek writes, it does get a bit repetitive after a while (although less so than thumbing through a pack of cardboard flashcards). The real problem, though, is that there are only so many things an app designer can do with standard flashcards, if they are to contribute to learning. True, there could be a few more game-like tasks (as with Quizlet), races against the clock as you pop word balloons or something of the sort, but, while these might, just might, help with motivation, these games rarely, if ever, contribute much to learning.

What’s more, you’ll get fed up with the games sooner or later if you’re putting in serious study hours. Even if Memrise were to double the number of activity types, I’d have got bored with them by now, in the same way I got bored with the Quizlet games. Bear in mind, too, that I’ve only done a month: I have at least another two months to go before I finish the level I’m working on. There’s another issue with ‘fun’ activities / games which I’ll come on to later.

The options for task variety in vocabulary / memory apps are therefore limited. Let’s look at gamification. Memrise has leader boards (weekly, monthly, ‘all time’), streak badges, daily goals, email reminders and (in the laptop and premium versions) a variety of graphs that allow you to analyse your study patterns. Your degree of mastery of learning items is represented by a growing flower that grows leaves, flowers and withers. None of this is especially original or different from similar apps.

Screenshot_2016-05-24-19-17-14The trouble with all of this is that it can only work for a certain time and, for some people, never. There’s always going to be someone like me who can put in a couple of hours a day more than you can. Or someone, in my case, like ‘Nguyenduyha’, who must be doing about four hours a day, and who, I know, is out of my league. I can’t compete and the realisation slowly dawns that my life would be immeasurably sadder if I tried to.

Having said that, I have tried to compete and the way to do so is by putting in the time on the ‘speed review’. This is the closest that Memrise comes to a game. One hundred items are flashed up with four multiple choices and these are against the clock. The quicker you are, the more points you get, and if you’re too slow, or you make a mistake, you lose a life. That’s how you gain lots of points with Memrise. The problem is that, at best, this task only promotes receptive knowledge of the items, which is not what I need by this stage. At worst, it serves no useful learning function at all because I have learnt ways of doing this well which do not really involve me processing meaning at all. As Marek says in his post (in reference to Quizlet), ‘I had the feeling that sometimes I was paying more attention to ‘winning’ the game and scoring points, rather than to the words on the screen.’ In my case, it is not just a feeling: it’s an absolute certainty.

desktop_dashboard

Sadly, the gamification is working against me. The more time I spend on the U-Bahn doing Memrise, the less time I spend reading the free German-language newspapers, the less time I spend eavesdropping on conversations. Two hours a day is all I have time for for my German study, and Memrise is eating it all up. I know that there are other, and better, ways of learning. In order to do what I know I should be doing, I need to ignore the gamification. For those, more reasonable, students, who can regularly do their fifteen minutes a day, day in – day out, the points and leader boards serve no real function at all.

Cheating at gamification, or gaming the system, is common in app-land. A few years ago, Memrise had to take down their leader board when they realised that cheating was taking place. There’s an inexorable logic to this: gamification is an attempt to motivate by rewarding through points, rather than the reward coming from the learning experience. The logic of the game overtakes itself. Is ‘Nguyenduyha’ cheating, or do they simply have nothing else to do all day? Am I cheating by finding time to do pointless ‘speed reviews’ that earn me lots of points?

For users like myself, then, gamification design needs to be a delicate balancing act. For others, it may be largely an irrelevance. I’ve been working recently on a general model of vocabulary app design that looks at two very different kinds of user. On the one hand, there are the self-motivated learners like myself or the millions of other who have chosen to use self-study apps. On the other, there are the millions of students in schools and colleges, studying English among other subjects, some of whom are now being told to use the vocabulary apps that are beginning to appear packaged with their coursebooks (or other learning material). We’ve never found entirely satisfactory ways of making these students do their homework, and the fact that this homework is now digital will change nothing (except, perhaps, in the very, very short term). The incorporation of games and gamification is unlikely to change much either: there will always be something more interesting and motivating (and unconnected with language learning) elsewhere.

Teachers and college principals may like the idea of gamification (without having really experienced it themselves) for their students. But more important for most of them is likely to be the teacher dashboard: the means by which they can check that their students are putting the time in. Likewise, they will see the utility of automated email reminders that a student is not working hard enough to meet their learning objectives, more and more regular tests that contribute to overall course evaluation, comparisons with college, regional or national benchmarks. Technology won’t solve the motivation issue, but it does offer efficient means of control.

Adaptive learning providers make much of their ability to provide learners with personalised feedback and to provide teachers with dashboard feedback on the performance of both individuals and groups. All well and good, but my interest here is in the automated feedback that software could provide on very specific learning tasks. Scott Thornbury, in a recent talk, ‘Ed Tech: The Mouse that Roared?’, listed six ‘problems’ of language acquisition that educational technology for language learning needs to address. One of these he framed as follows: ‘The feedback problem, i.e. how does the learner get optimal feedback at the point of need?’, and suggested that technological applications ‘have some way to go.’ He was referring, not to the kind of feedback that dashboards can provide, but to the kind of feedback that characterises a good language teacher: corrective feedback (CF) – the way that teachers respond to learner utterances (typically those containing errors, but not necessarily restricted to these) in what Ellis and Shintani call ‘form-focused episodes’[1]. These responses may include a direct indication that there is an error, a reformulation, a request for repetition, a request for clarification, an echo with questioning intonation, etc. Basically, they are correction techniques.

These days, there isn’t really any debate about the value of CF. There is a clear research consensus that it can aid language acquisition. Discussing learning in more general terms, Hattie[2] claims that ‘the most powerful single influence enhancing achievement is feedback’. The debate now centres around the kind of feedback, and when it is given. Interestingly, evidence[3] has been found that CF is more effective in the learning of discrete items (e.g. some grammatical structures) than in communicative activities. Since it is precisely this kind of approach to language learning that we are more likely to find in adaptive learning programs, it is worth exploring further.

What do we know about CF in the learning of discrete items? First of all, it works better when it is explicit than when it is implicit (Li, 2010), although this needs to be nuanced. In immediate post-tests, explicit CF is better than implicit variations. But over a longer period of time, implicit CF provides better results. Secondly, formative feedback (as opposed to right / wrong testing-style feedback) strengthens retention of the learning items: this typically involves the learner repairing their error, rather than simply noticing that an error has been made. This is part of what cognitive scientists[4] sometimes describe as the ‘generation effect’. Whilst learners may benefit from formative feedback without repairing their errors, Ellis and Shintani (2014: 273) argue that the repair may result in ‘deeper processing’ and, therefore, assist learning. Thirdly, there is evidence that some delay in receiving feedback aids subsequent recall, especially over the longer term. Ellis and Shintani (2014: 276) suggest that immediate CF may ‘benefit the development of learners’ procedural knowledge’, while delayed CF is ‘perhaps more likely to foster metalinguistic understanding’. You can read a useful summary of a meta-analysis of feedback effects in online learning here, or you can buy the whole article here.

I have yet to see an online language learning program which can do CF well, but I think it’s a matter of time before things improve significantly. First of all, at the moment, feedback is usually immediate, or almost immediate. This is unlikely to change, for a number of reasons – foremost among them being the pride that ed tech takes in providing immediate feedback, and the fact that online learning is increasingly being conceptualised and consumed in bite-sized chunks, something you do on your phone between doing other things. What will change in better programs, however, is that feedback will become more formative. As things stand, tasks are usually of a very closed variety, with drag-and-drop being one of the most popular. Only one answer is possible and feedback is usually of the right / wrong-and-here’s-the-correct-answer kind. But tasks of this kind are limited in their value, and, at some point, tasks are needed where more than one answer is possible.

Here’s an example of a translation task from Duolingo, where a simple sentence could be translated into English in quite a large number of ways.

i_am_doing_a_basketDecontextualised as it is, the sentence could be translated in the way that I have done it, although it’s unlikely. The feedback, however, is of relatively little help to the learner, who would benefit from guidance of some sort. The simple reason that Duolingo doesn’t offer useful feedback is that the programme is static. It has been programmed to accept certain answers (e.g. in this case both the present simple and the present continuous are acceptable), but everything else will be rejected. Why? Because it would take too long and cost too much to anticipate and enter in all the possible answers. Why doesn’t it offer formative feedback? Because in order to do so, it would need to identify the kind of error that has been made. If we can identify the kind of error, we can make a reasonable guess about the cause of the error, and select appropriate CF … this is what good teachers do all the time.

Analysing the kind of error that has been made is the first step in providing appropriate CF, and it can be done, with increasing accuracy, by current technology, but it requires a lot of computing. Let’s take spelling as a simple place to start. If you enter ‘I am makeing a basket for my mother’ in the Duolingo translation above, the program tells you ‘Nice try … there’s a typo in your answer’. Given the configuration of keyboards, it is highly unlikely that this is a typo. It’s a simple spelling mistake and teachers recognise it as such because they see it so often. For software to achieve the same insight, it would need, as a start, to trawl a large English dictionary database and a large tagged database of learner English. The process is quite complicated, but it’s perfectably do-able, and learners could be provided with CF in the form of a ‘spelling hint’.i_am_makeing_a_basket

Rather more difficult is the error illustrated in my first screen shot. What’s the cause of this ‘error’? Teachers know immediately that this is probably a classic confusion of ‘do’ and ‘make’. They know that the French verb ‘faire’ can be translated into English as ‘make’ or ‘do’ (among other possibilities), and the error is a common language transfer problem. Software could do the same thing. It would need a large corpus (to establish that ‘make’ collocates with ‘a basket’ more often than ‘do’), a good bilingualised dictionary (plenty of these now exist), and a tagged database of learner English. Again, appropriate automated feedback could be provided in the form of some sort of indication that ‘faire’ is only sometimes translated as ‘make’.

These are both relatively simple examples, but it’s easy to think of others that are much more difficult to analyse automatically. Duolingo rejects ‘I am making one basket for my mother’: it’s not very plausible, but it’s not wrong. Teachers know why learners do this (again, it’s probably a transfer problem) and know how to respond (perhaps by saying something like ‘Only one?’). Duolingo also rejects ‘I making a basket for my mother’ (a common enough error), but is unable to provide any help beyond the correct answer. Automated CF could, however, be provided in both cases if more tools are brought into play. Multiple parsing machines (one is rarely accurate enough on its own) and semantic analysis will be needed. Both the range and the complexity of the available tools are increasing so rapidly (see here for the sort of research that Google is doing and here for an insight into current applications of this research in language learning) that Duolingo-style right / wrong feedback will very soon seem positively antediluvian.

One further development is worth mentioning here, and it concerns feedback and gamification. Teachers know from the way that most learners respond to written CF that they are usually much more interested in knowing what they got right or wrong, rather than the reasons for this. Most students are more likely to spend more time looking at the score at the bottom of a corrected piece of written work than at the laborious annotations of the teacher throughout the text. Getting students to pay close attention to the feedback we provide is not easy. Online language learning systems with gamification elements, like Duolingo, typically reward learners for getting things right, and getting things right in the fewest attempts possible. They encourage learners to look for the shortest or cheapest route to finding the correct answers: learning becomes a sexed-up form of test. If, however, the automated feedback is good, this sort of gamification encourages the wrong sort of learning behaviour. Gamification designers will need to shift their attention away from the current concern with right / wrong, and towards ways of motivating learners to look at and respond to feedback. It’s tricky, because you want to encourage learners to take more risks (and reward them for doing so), but it makes no sense to penalise them for getting things right. The probable solution is to have a dual points system: one set of points for getting things right, another for employing positive learning strategies.

The provision of automated ‘optimal feedback at the point of need’ may not be quite there yet, but it seems we’re on the way for some tasks in discrete-item learning. There will probably always be some teachers who can outperform computers in providing appropriate feedback, in the same way that a few top chess players can beat ‘Deep Blue’ and its scions. But the rest of us had better watch our backs: in the provision of some kinds of feedback, computers are catching up with us fast.

[1] Ellis, R. & N. Shintani (2014) Exploring Language Pedagogy through Second Language Acquisition Research. Abingdon: Routledge p. 249

[2] Hattie, K. (2009) Visible Learning. Abingdon: Routledge p.12

[3] Li, S. (2010) ‘The effectiveness of corrective feedback in SLA: a meta-analysis’ Language Learning 60 / 2: 309 -365

[4] Brown, P.C., Roediger, H.L. & McDaniel, M. A. Make It Stick (Cambridge, Mass.: Belknap Press, 2014)

In the words of its founder and CEO, self-declared ‘visionary’ Claudio Santori, Bliu Bliu is ‘the only company in the world that teaches languages we don’t even know’. This claim, which was made during a pitch  for funding in October 2014, tells us a lot about the Bliu Bliu approach. It assumes that there exists a system by which all languages can be learnt / taught, and the particular features of any given language are not of any great importance. It’s questionable, to say the least, and Santori fails to inspire confidence when he says, in the same pitch, ‘you join Bliu Bliu, you use it, we make something magical, and after a few weeks you can understand the language’.

The basic idea behind Bliu Bliu is that a language is learnt by using it (e.g. by reading or listening to texts), but that the texts need to be selected so that you know the great majority of words within them. The technological challenge, therefore, is to find (online) texts that contain the vocabulary that is appropriate for you. After that, Santori explains , ‘you progress, you input more words and you will get more text that you can understand. Hours and hours of conversations you can fully understand and listen. Not just stupid exercise from stupid grammar book. Real conversation. And in all of them you know 100% of the words. […] So basically you will have the same opportunity that a kid has when learning his native language. Listen hours and hours of native language being naturally spoken at you…at a level he/she can understand plus some challenge, everyday some more challenge, until he can pick up words very very fast’ (sic).

test4

On entering the site, you are invited to take a test. In this, you are shown a series of words and asked to say if you find them ‘easy’ or ‘difficult’. There were 12 words in total, and each time I clicked ‘easy’. The system then tells you how many words it thinks you know, and offers you one or more words to click on. Here are the words I was presented with and, to the right, the number of words that Bliu Blu thinks I know, after clicking ‘easy’ on the preceding word.

hello 4145
teenager 5960
soap, grape 7863
receipt, washing, skateboard 9638
motorway, tram, luggage, footballer, weekday 11061

test7

Finally, I was asked about my knowledge of other languages. I said that my French was advanced and that my Spanish and German were intermediate. On the basis of this answer, I was now told that Bliu Bliu thinks that I know 11,073 words.

Eight of the words in the test are starred in the Macmillan dictionaries, meaning they are within the most frequent 7,500 words in English. Of the other four, skateboard, footballer and tram are very international words. The last, weekday, is a readily understandable compound made up of two extremely high frequency words. How could Bliu Bliu know, with such uncanny precision, that I know 11,073 words from a test like this? I decided to try the test for French. Again, I clicked ‘easy’ for each of the twelve words that was offered. This time, I was offered a very different set of words, with low frequency items like polynôme, toponymie, diaspora, vectoriel (all of which are cognate with English words), along with the rather surprising vichy (which should have had a capital letter, as it is a proper noun). Despite finding all these words easy, I was mortified to be told that I only knew 6546 words in French.

I needn’t have bothered with the test, anyway. Irrespective of level, you are offered vocabulary sets of high frequency words. Examples of sets I was offered included [the, be, of, and, to], [way, state, say, world, two], [may, man, hear, said, call] and [life, down, any, show, t]. Bliu Bliu then gives you a series of short texts that include the target words. You can click on any word you don’t know and you are given either a definition or a translation (I opted for French translations). There is no task beyond simply reading these texts. Putting aside for the moment the question of why I was being offered these particular words when my level is advanced, how does the software perform?

The vast majority of the texts are short quotes from brainyquote.com, and here is the first problem. Quotes tend to be pithy and often play with words: their comprehensibility is not always a function of the frequency of the words they contain. For the word ‘say’, for example, the texts included the Shakespearean quote It will have blood, they say; blood will have blood. For the word ‘world’, I was offered this line from Alexander Pope: The world forgetting, by the world forgot. Not, perhaps, the best way of learning a couple of very simple, high-frequency words. But this was the least of the problems.

The system operates on a word level. It doesn’t recognise phrases or chunks, or even phrasal verbs. So, a word like ‘down’ (in one of the lists above) is presented without consideration of its multiple senses. The first set of sentences I was asked to read for ‘down’ included: I never regretted what I turned down, You get old, you slow down, I’m Creole, and I’m down to earth, I never fall down. I always fight, I like seeing girls throw down and I don’t take criticism lying down. Not exactly the best way of getting to grips with the word ‘down’ if you don’t know it!

bliubliu2You may have noticed the inclusion of the word ‘t’ in one of the lists above. Here are the example sentences for practising this word: (1) Knock the ‘t’ off the ‘can’t’, (2) Sometimes reality T.V. can be stressful, (3) Argentina Debt Swap Won’t Avoid Default, (4) OK, I just don’t understand Nethanyahu, (5) Venezuela: Hell on Earth by Walter T Molano and (6) Work will win when wishy washy wishing won t. I paid €7.99 for one month of this!

The translation function is equally awful. With high frequency words with multiple meanings, you get a long list of possible translations, but no indication of which one is appropriate for the context you are looking at. With other words, it is sometimes, simply, wrong. For example, in the sentence, Heaven lent you a soul, Earth will lend a grave, the translation for ‘grave’ was only for the homonymous adjective. In the sentence There’s a bright spot in every dark cloud, the translation for ‘spot’ was only for verbs. And the translation for ‘but’ in We love but once, for once only are we perfectly equipped for loving was ‘mais’ (not at all what it means here!). The translation tool couldn’t handle the first ‘for’ in this sentence, either.

Bliu Bliu’s claim that Bliu Bliu knows you very well, every single word you know or don’t know is manifest nonsense and reveals a serious lack of understanding about what it means to know a word. However, as you spend more time on the system, a picture of your vocabulary knowledge is certainly built up. The texts that are offered begin to move away from the one-liners from brainyquote.com. As reading (or listening to recorded texts) is the only learning task that is offered, the intrinsic interest of the texts is crucial. Here, again, I was disappointed. Texts that I was offered were sourced from IEEE Spectrum (The World’s Largest Professional Association for the Advancement of Technology), infowars.com (the home of the #1 Internet News Show in the World), Latin America News and Analysis, the Google official blog (Meet 15 Finalists and Science in Action Winner for the 2013 GoogleScience Fair) MLB Trade Rumors (a clearinghouse for relevant, legitimate baseball rumors), and a long text entitled Robert Waldmann: Policy-Relevant Macro Is All in Samuelson and Solow (1960) from a blog called Brad DeLong’s Grasping Reality……with the Neural Network of a Moderately-Intelligent Cephalopod.

There is more curated content (selected from a menu which includes sections entitled ‘18+’ and ‘Controversial Jokes’). In these texts, words that the system thinks you won’t know (most of the proper nouns for example) are highlighted. And there is a small library of novels, again, where predicted unknown words are highlighted in pink. These include Dostoyevsky, Kafka, Oscar Wilde, Gogol, Conan Doyle, Joseph Conrad, Oblomov, H.P. Lovecraft, Joyce, and Poe. You can also upload your own texts if you wish.

But, by this stage, I’d had enough and I clicked on the button to cancel my subscription. I shouldn’t have been surprised when the system crashed and a message popped up saying the system had encountered an error.

Like so many ‘language learning’ start-ups, Bliu Bliu seems to know a little, but not a lot about language learning. The Bliu Bliu blog has a video of Stephen Krashen talking about comprehensible input (it is misleadingly captioned ‘Stephen Krashen on Bliu Bliu’) in which he says that we all learn languages the same way, and that is when we get comprehensible input in a low anxiety environment. Influential though it has been, Krashen’s hypothesis remains a hypothesis, and it is generally accepted now that comprehensible input may be necessary, but it is not sufficient for language learning to take place.

The hypothesis hinges, anyway, on a definition of what is meant by ‘comprehensible’ and no one has come close to defining what precisely this means. Bliu Bliu has falsely assumed that comprehensibility can be determined by self-reporting of word knowledge, and this assumption is made even more problematic by the confusion of words (as sequences of letters) with lexical items. Bliu Bliu takes no account of lexical grammar or collocation (fundamental to any real word knowledge).

The name ‘Bliu Bliu’ was inspired by an episode from ‘Friends’ where Joey tries and fails to speak French. In the episode, according to the ‘Friends’ wiki, ‘Phoebe helps Joey prepare for an audition by teaching him how to speak French. Joey does not progress well and just speaks gibberish, thinking he’s doing a great job. Phoebe explains to the director in French that Joey is her mentally disabled younger brother so he’ll take pity on Joey.’ Bliu Bliu was an unfortunately apt choice of name.

friends

Adaptive learning is a product to be sold. How?

1 Individualised learning

In the vast majority of contexts, language teaching is tied to a ‘one-size-fits-all’ model. This is manifested in institutional and national syllabuses which provide lists of structures and / or competences that all students must master within a given period of time. It is usually actualized in the use of coursebooks, often designed for ‘global markets’. Reaction against this model has been common currency for some time, and has led to a range of suggestions for alternative approaches (such as DOGME), none of which have really caught on. The advocates of adaptive learning programs have tapped into this zeitgeist and promise ‘truly personalized learning’. Atomico, a venture capital company that focuses on consumer technologies, and a major investor in Knewton, describes the promise of adaptive learning in the following terms: ‘Imagine lessons that adapt on-the-fly to the way in which an individual learns, and powerful predictive analytics that help teachers differentiate instruction and understand what each student needs to work on and why[1].’

This is a seductive message and is often framed in such a way that disagreement seems impossible. A post on one well-respected blog, eltjam, which focuses on educational technology in language learning, argued the case for adaptive learning very strongly in July 2013: ‘Adaptive Learning is a methodology that is geared towards creating a learning experience that is unique to each individual learner through the intervention of computer software. Rather than viewing learners as a homogenous collective with more or less identical preferences, abilities, contexts and objectives who are shepherded through a glossy textbook with static activities/topics, AL attempts to tap into the rich meta-data that is constantly being generated by learners (and disregarded by educators) during the learning process. Rather than pushing a course book at a class full of learners and hoping that it will (somehow) miraculously appeal to them all in a compelling, salubrious way, AL demonstrates that the content of a particular course would be more beneficial if it were dynamic and interactive. When there are as many responses, ideas, personalities and abilities as there are learners in the room, why wouldn’t you ensure that the content was able to map itself to them, rather than the other way around?[2]

Indeed. But it all depends on what, precisely, the content is – a point I will return to in a later post. For the time being, it is worth noting the prominence that this message is given in the promotional discourse. It is a message that is primarily directed at teachers. It is more than a little disingenuous, however, because teachers are not the primary targets of the promotional discourse, for the simple reason that they are not the ones with purchasing power. The slogan on the homepage of the Knewton website shows clearly who the real audience is: ‘Every education leader needs an adaptive learning infrastructure’[3].

2 Learning outcomes and testing

Education leaders, who are more likely these days to come from the world of business and finance than the world of education, are currently very focused on two closely interrelated topics: the need for greater productivity and accountability, and the role of technology. They generally share the assumption of other leaders in the World Economic Forum that ICT is the key to the former and ‘the key to a better tomorrow’ (Spring, Education Networks, 2012, p.52). ‘We’re at an important transition point,’ said Arne Duncan, the U.S. Secretary of Education in 2010, ‘we’re getting ready to move from a predominantly print-based classroom to a digital learning environment’ (quoted by Spring, 2012, p.58). Later in the speech, which was delivered at the time as the release of the new National Education Technology Plan, Duncan said ‘just as technology has increased productivity in the business world, it is an essential tool to help boost educational productivity’. The plan outlines how this increased productivity could be achieved: we must start ‘with being clear about the learning outcomes we expect from the investments we make’ (Office of Educational Technology, Transforming American Education: Learning Powered by Technology, U.S. Department of Education, 2010). The greater part of the plan is devoted to discussion of learning outcomes and assessment of them.

Learning outcomes (and their assessment) are also at the heart of ‘Asking More: the Path to Efficacy’ (Barber and Rizvi (eds), Asking More: the Path to Efficacy Pearson, 2013), Pearson’s blueprint for the future of education. According to John Fallon, the CEO of Pearson, ‘our focus should unfalteringly be on honing and improving the learning outcomes we deliver’ (Barber and Rizvi, 2013, p.3). ‘High quality learning’ is associated with ‘a relentless focus on outcomes’ (ibid, p.3) and words like ‘measuring / measurable’, ‘data’ and ‘investment’ are almost as salient as ‘outcomes’. A ‘sister’ publication, edited by the same team, is entitled ‘The Incomplete Guide to Delivering Learning Outcomes’ (Barber and Rizvi (eds), Pearson, 2013) and explores further Pearson’s ambition to ‘become the world’s leading education company’ and to ‘deliver learning outcomes’.

It is no surprise that words like ‘outcomes’, ‘data’ and ‘measure’ feature equally prominently in the language of adaptive software companies like Knewton (see, for example, the quotation from Jose Ferreira, CEO of Knewton, in an earlier post). Adaptive software is premised on the establishment and measurement of clearly defined learning outcomes. If measurable learning outcomes are what you’re after, it’s hard to imagine a better path to follow than adaptive software. If your priorities include standards and assessment, it is again hard to imagine an easier path to follow than adaptive software, which was used in testing long before its introduction into instruction. As David Kuntz, VP of research at Knewton and, before that, a pioneer of algorithms in the design of tests, points out, ‘when a student takes a course powered by Knewton, we are continuously evaluating their performance, what others have done with that material before, and what [they] know’[4]. Knewton’s claim that every education leader needs an adaptive learning infrastructure has a powerful internal logic.

3 New business models

‘Adapt or die’ (a phrase originally coined by the last prime minister of apartheid South Africa) is a piece of advice that is often given these days to both educational institutions and publishers. British universities must adapt or die, according to Michael Barber, author of ‘An Avalanche is Coming[5]’ (a report commissioned by the British Institute for Public Policy Research), Chief Education Advisor to Pearson, and editor of the Pearson ‘Efficacy’ document (see above). ELT publishers ‘must change or die’, reported the eltjam blog[6], and it is a message that is frequently repeated elsewhere. The move towards adaptive learning is seen increasingly often as one of the necessary adaptations for both these sectors.

The problems facing universities in countries like the U.K. are acute. Basically, as the introduction to ‘An Avalanche is Coming’ puts it, ‘the traditional university is being unbundled’. There are a number of reasons for this including the rising cost of higher education provision, greater global competition for the same students, funding squeezes from central governments, and competition from new educational providers (such as MOOCs). Unsurprisingly, universities (supported by national governments) have turned to technology, especially online course delivery, as an answer to their problems. There are two main reasons for this. Firstly, universities have attempted to reduce operating costs by looking for increases in scale (through mergers, transnational partnerships, international branch campuses and so on). Mega-universities are growing, and there are thirty-three in Asia alone (Selwyn Education in a Digital World New York: Routledge 2013, p.6). Universities like the Turkish Anadolu University, with over one million students, are no longer exceptional in terms of scale. In this world, online educational provision is a key element. Secondly, and not to put too fine a point on it, online instruction is cheaper (Spring, Education Networks 2012, p.2).

All other things being equal, why would any language department of an institute of higher education not choose an online environment with an adaptive element? Adaptive learning, for the time being at any rate, may be seen as ‘the much needed key to the “Iron Triangle” that poses a conundrum to HE providers; cost, access and quality. Any attempt to improve any one of those conditions impacts negatively on the others. If you want to increase access to a course you run the risk of escalating costs and jeopardising quality, and so on.[7]

Meanwhile, ELT publishers have been hit by rampant pirating of their materials, spiraling development costs of their flagship products and the growth of open educational resources. An excellent blog post by David Wiley[8] explains why adaptive learning services are a heaven-sent opportunity for publishers to modify their business model. ‘While the broad availability of free content and open educational resources have trained internet users to expect content to be free, many people are still willing to pay for services. Adaptive learning systems exploit this willingness by deeply intermingling content and services so that you cannot access one with using the other. Naturally, because an adaptive learning service is comprised of content plus adaptive services, it will be more expensive than static content used to be. And because it is a service, you cannot simply purchase it like you used to buy a textbook. An adaptive learning service is something you subscribe to, like Netflix. […] In short, why is it in a content company’s interest to enable you to own anything? Put simply, it is not. When you own a copy, the publisher completely loses control over it. When you subscribe to content through a digital service (like an adaptive learning service), the publisher achieves complete and perfect control over you and your use of their content.’

Although the initial development costs of building a suitable learning platform with adaptive capabilities are high, publishers will subsequently be able to produce and modify content (i.e. learning materials) much more efficiently. Since content will be mashed up and delivered in many different ways, author royalties will be cut or eliminated. Production and distribution costs will be much lower, and sales and marketing efforts can be directed more efficiently towards the most significant customers. The days of ELT sales reps trying unsuccessfully to get an interview with the director of studies of a small language school or university department are becoming a thing of the past. As with the universities, scale will be everything.


[2]http://www.eltjam.com/adaptive-learning/ (last accessed 13 January 2014)

[3] http://www.knewton.com/ (last accessed 13 January 2014)

[4] MIT Technology Review, November 26, 2012 http://www.technologyreview.com/news/506366/questions-surround-software-that-adapts-to-students/ (last accessed 13 January 2014)

[7] Tim Gifford Taking it Personally: Adaptive Learning July 9, 2013 http://www.eltjam.com/adaptive-learning/ (last accessed January 13, 2014)

[8] David Wiley, Buying our Way into Bondage: the risks of adaptive learning services March 20,2013 http://opencontent.org/blog/archives/2754 (last accessed January 13, 2014)