Posts Tagged ‘chatbots’


At the beginning of March, I’ll be going to Cambridge to take part in a Digital Learning Colloquium (for more information about the event, see here ). One of the questions that will be explored is how research might contribute to the development of digital language learning. In this, the first of two posts on the subject, I’ll be taking a broad overview of the current state of play in edtech research.

I try my best to keep up to date with research. Of the main journals, there are Language Learning and Technology, which is open access; CALICO, which offers quite a lot of open access material; and reCALL, which is the most restricted in terms of access of the three. But there is something deeply frustrating about most of this research, and this is what I want to explore in these posts. More often than not, research articles end with a call for more research. And more often than not, I find myself saying ‘Please, no, not more research like this!’

First, though, I would like to turn to a more reader-friendly source of research findings. Systematic reviews are, basically literature reviews which can save people like me from having to plough through endless papers on similar subjects, all of which contain the same (or similar) literature review in the opening sections. If only there were more of them. Others agree with me: the conclusion of one systematic review of learning and teaching with technology in higher education (Lillejord et al., 2018) was that more systematic reviews were needed.

Last year saw the publication of a systematic review of research on artificial intelligence applications in higher education (Zawacki-Richter, et al., 2019) which caught my eye. The first thing that struck me about this review was that ‘out of 2656 initially identified publications for the period between 2007 and 2018, 146 articles were included for final synthesis’. In other words, only just over 5% of the research was considered worthy of inclusion.

The review did not paint a very pretty picture of the current state of AIEd research. As the second part of the title of this review (‘Where are the educators?’) makes clear, the research, taken as a whole, showed a ‘weak connection to theoretical pedagogical perspectives’. This is not entirely surprising. As Bates (2019) has noted: ‘since AI tends to be developed by computer scientists, they tend to use models of learning based on how computers or computer networks work (since of course it will be a computer that has to operate the AI). As a result, such AI applications tend to adopt a very behaviourist model of learning: present / test / feedback.’ More generally, it is clear that technology adoption (and research) is being driven by technology enthusiasts, with insufficient expertise in education. The danger is that edtech developers ‘will simply ‘discover’ new ways to teach poorly and perpetuate erroneous ideas about teaching and learning’ (Lynch, 2017).

This, then, is the first of my checklist of things that, collectively, researchers need to do to improve the value of their work. The rest of this list is drawn from observations mostly, but not exclusively, from the authors of systematic reviews, and mostly come from reviews of general edtech research. In the next blog post, I’ll look more closely at a recent collection of ELT edtech research (Mavridi & Saumell, 2020) to see how it measures up.

1 Make sure your research is adequately informed by educational research outside the field of edtech

Unproblematised behaviourist assumptions about the nature of learning are all too frequent. References to learning styles are still fairly common. The most frequently investigated skill that is considered in the context of edtech is critical thinking (Sosa Neira, et al., 2017), but this is rarely defined and almost never problematized, despite a broad literature that questions the construct.

2 Adopt a sceptical attitude from the outset

Know your history. Decades of technological innovation in education have shown precious little in the way of educational gains and, more than anything else, have taught us that we need to be sceptical from the outset. ‘Enthusiasm and praise that are directed towards ‘virtual education, ‘school 2.0’, ‘e-learning and the like’ (Selwyn, 2014: vii) are indications that the lessons of the past have not been sufficiently absorbed (Levy, 2016: 102). The phrase ‘exciting potential’, for example, should be banned from all edtech research. See, for example, a ‘state-of-the-art analysis of chatbots in education’ (Winkler & Söllner, 2018), which has nothing to conclude but ‘exciting potential’. Potential is fine (indeed, it is perhaps the only thing that research can unambiguously demonstrate – see section 3 below), but can we try to be a little more grown-up about things?

3 Know what you are measuring

Measuring learning outcomes is tricky, to say the least, but it’s understandable that researchers should try to focus on them. Unfortunately, ‘the vast array of literature involving learning technology evaluation makes it challenging to acquire an accurate sense of the different aspects of learning that are evaluated, and the possible approaches that can be used to evaluate them’ (Lai & Bower, 2019). Metrics such as student grades are hard to interpret, not least because of the large number of variables and the danger of many things being conflated in one score. Equally, or possibly even more, problematic, are self-reporting measures which are rarely robust. It seems that surveys are the most widely used instrument in qualitative research (Sosa Neira, et al., 2017), but these will tell us little or nothing when used for short-term interventions (see point 5 below).

4 Ensure that the sample size is big enough to mean something

In most of the research into digital technology in education that was analysed in a literature review carried out for the Scottish government (ICF Consulting Services Ltd, 2015), there were only ‘small numbers of learners or teachers or schools’.

5 Privilege longitudinal studies over short-term projects

The Scottish government literature review (ICF Consulting Services Ltd, 2015), also noted that ‘most studies that attempt to measure any outcomes focus on short and medium term outcomes’. The fact that the use of a particular technology has some sort of impact over the short or medium term tells us very little of value. Unless there is very good reason to suspect the contrary, we should assume that it is a novelty effect that has been captured (Levy, 2016: 102).

6 Don’t forget the content

The starting point of much edtech research is the technology, but most edtech, whether it’s a flashcard app or a full-blown Moodle course, has content. Research reports rarely give details of this content, assuming perhaps that it’s just fine, and all that’s needed is a little tech to ‘present learners with the ‘right’ content at the ‘right’ time’ (Lynch, 2017). It’s a foolish assumption. Take a random educational app from the Play Store, a random MOOC or whatever, and the chances are you’ll find it’s crap.

7 Avoid anecdotal accounts of technology use in quasi-experiments as the basis of a ‘research article’

Control (i.e technology-free) groups may not always be possible but without them, we’re unlikely to learn much from a single study. What would, however, be extremely useful would be a large, collated collection of such action-research projects, using the same or similar technology, in a variety of settings. There is a marked absence of this kind of work.

8 Enough already of higher education contexts

Researchers typically work in universities where they have captive students who they can carry out research on. But we have a problem here. The systematic review of Lundin et al (2018), for example, found that ‘studies on flipped classrooms are dominated by studies in the higher education sector’ (besides lacking anchors in learning theory or instructional design). With some urgency, primary and secondary contexts need to be investigated in more detail, not just regarding flipped learning.

9 Be critical

Very little edtech research considers the downsides of edtech adoption. Online safety, privacy and data security are hardly peripheral issues, especially with younger learners. Ignoring them won’t make them go away.

More research?

So do we need more research? For me, two things stand out. We might benefit more from, firstly, a different kind of research, and, secondly, more syntheses of the work that has already been done. Although I will probably continue to dip into the pot-pourri of articles published in the main CALL journals, I’m looking forward to a change at the CALICO journal. From September of this year, one issue a year will be thematic, with a lead article written by established researchers which will ‘first discuss in broad terms what has been accomplished in the relevant subfield of CALL. It should then outline which questions have been answered to our satisfaction and what evidence there is to support these conclusions. Finally, this article should pose a “soft” research agenda that can guide researchers interested in pursuing empirical work in this area’. This will be followed by two or three empirical pieces that ‘specifically reflect the research agenda, methodologies, and other suggestions laid out in the lead article’.

But I think I’ll still have a soft spot for some of the other journals that are coyer about their impact factor and that can be freely accessed. How else would I discover (it would be too mean to give the references here) that ‘the effective use of new technologies improves learners’ language learning skills’? Presumably, the ineffective use of new technologies has the opposite effect? Or that ‘the application of modern technology represents a significant advance in contemporary English language teaching methods’?


Bates, A. W. (2019). Teaching in a Digital Age Second Edition. Vancouver, B.C.: Tony Bates Associates Ltd. Retrieved from

ICF Consulting Services Ltd (2015). Literature Review on the Impact of Digital Technology on Learning and Teaching. Edinburgh: The Scottish Government.

Lai, J.W.M. & Bower, M. (2019). How is the use of technology in education evaluated? A systematic review. Computers & Education, 133(1), 27-42. Elsevier Ltd. Retrieved January 14, 2020 from

Levy, M. 2016. Researching in language learning and technology. In Farr, F. & Murray, L. (Eds.) The Routledge Handbook of Language Learning and Technology. Abingdon, Oxon.: Routledge. pp.101 – 114

Lillejord S., Børte K., Nesje K. & Ruud E. (2018). Learning and teaching with technology in higher education – a systematic review. Oslo: Knowledge Centre for Education

Lundin, M., Bergviken Rensfeldt, A., Hillman, T. et al. (2018). Higher education dominance and siloed knowledge: a systematic review of flipped classroom research. International Journal of Educational Technology in Higher Education 15, 20 (2018) doi:10.1186/s41239-018-0101-6

Lynch, J. (2017). How AI Will Destroy Education. Medium, November 13, 2017.

Mavridi, S. & Saumell, V. (Eds.) (2020). Digital Innovations and Research in Language Learning. Faversham, Kent: IATEFL

Selwyn, N. (2014). Distrusting Educational Technology. New York: Routledge

Sosa Neira, E. A., Salinas, J. and de Benito Crosetti, B. (2017). Emerging Technologies (ETs) in Education: A Systematic Review of the Literature Published between 2006 and 2016. International Journal of Emerging Technologies in Education, 12 (5).

Winkler, R. & Söllner, M. (2018): Unleashing the Potential of Chatbots in Education: A State-Of-The-Art Analysis. In: Academy of Management Annual Meeting (AOM). Chicago, USA.

Zawacki-Richter, O., Bond, M., Marin, V. I. And Gouverneur, F. (2019). Systematic review of research on artificial intelligence applications in higher education – where are the educators? International Journal of Educational Technology in Higher Education 2019

ltsigIt’s hype time again. Spurred on, no doubt, by the current spate of books and articles  about AIED (artificial intelligence in education), the IATEFL Learning Technologies SIG is organising an online event on the topic in November of this year. Currently, the most visible online references to AI in language learning are related to Glossika , basically a language learning system that uses spaced repetition, whose marketing department has realised that references to AI might help sell the product. GlossikaThey’re not alone – see, for example, Knowble which I reviewed earlier this year .

In the wider world of education, where AI has made greater inroads than in language teaching, every day brings more stuff: How artificial intelligence is changing teaching , 32 Ways AI is Improving Education , How artificial intelligence could help teachers do a better job , etc., etc. There’s a full-length book by Anthony Seldon, The Fourth Education Revolution: will artificial intelligence liberate or infantilise humanity? (2018, University of Buckingham Press) – one of the most poorly researched and badly edited books on education I’ve ever read, although that won’t stop it selling – and, no surprises here, there’s a Pearson commissioned report called Intelligence Unleashed: An argument for AI in Education (2016) which is available free.

Common to all these publications is the claim that AI will radically change education. When it comes to language teaching, a similar claim has been made by Donald Clark (described by Anthony Seldon as an education guru but perhaps best-known to many in ELT for his demolition of Sugata Mitra). In 2017, Clark wrote a blog post for Cambridge English (now unavailable) entitled How AI will reboot language learning, and a more recent version of this post, called AI has and will change language learning forever (sic) is available on Clark’s own blog. Given the history of the failure of education predictions, Clark is making bold claims. Thomas Edison (1922) believed that movies would revolutionize education. Radios were similarly hyped in the 1940s and in the 1960s it was the turn of TV. In the 1980s, Seymour Papert predicted the end of schools – ‘the computer will blow up the school’, he wrote. Twenty years later, we had the interactive possibilities of Web 2.0. As each technology failed to deliver on the hype, a new generation of enthusiasts found something else to make predictions about.

But is Donald Clark onto something? Developments in AI and computational linguistics have recently resulted in enormous progress in machine translation. Impressive advances in automatic speech recognition and generation, coupled with the power that can be packed into a handheld device, mean that we can expect some re-evaluation of the value of learning another language. Stephen Heppell, a specialist at Bournemouth University in the use of ICT in Education, has said: ‘Simultaneous translation is coming, making language teachers redundant. Modern languages teaching in future may be more about navigating cultural differences’ (quoted by Seldon, p.263). Well, maybe, but this is not Clark’s main interest.

Less a matter of opinion and much closer to the present day is the issue of assessment. AI is becoming ubiquitous in language testing. Cambridge, Pearson, TELC, Babbel and Duolingo are all using or exploring AI in their testing software, and we can expect to see this increase. Current, paper-based systems of testing subject knowledge are, according to Rosemary Luckin and Kristen Weatherby, outdated, ineffective, time-consuming, the cause of great anxiety and can easily be automated (Luckin, R. & Weatherby, K. 2018. ‘Learning analytics, artificial intelligence and the process of assessment’ in Luckin, R. (ed.) Enhancing Learning and Teaching with Technology, 2018. UCL Institute of Education Press, p.253). By capturing data of various kinds throughout a language learner’s course of study and by using AI to analyse learning development, continuous formative assessment becomes possible in ways that were previously unimaginable. ‘Assessment for Learning (AfL)’ or ‘Learning Oriented Assessment (LOA)’ are two terms used by Cambridge English to refer to the potential that AI offers which is described by Luckin (who is also one of the authors of the Pearson paper mentioned earlier). In practical terms, albeit in a still very limited way, this can be seen in the CUP course ‘Empower’, which combines CUP course content with validated LOA from Cambridge Assessment English.

Will this reboot or revolutionise language teaching? Probably not and here’s why. AIED systems need to operate with what is called a ‘domain knowledge model’. This specifies what is to be learnt and includes an analysis of the steps that must be taken to reach that learning goal. Some subjects (especially STEM subjects) ‘lend themselves much more readily to having their domains represented in ways that can be automatically reasoned about’ (du Boulay, D. et al., 2018. ‘Artificial intelligences and big data technologies to close the achievement gap’ in Luckin, R. (ed.) Enhancing Learning and Teaching with Technology, 2018. UCL Institute of Education Press, p.258). This is why most AIED systems have been built to teach these areas. Language are rather different. We simply do not have a domain knowledge model, except perhaps for the very lowest levels of language learning (and even that is highly questionable). Language learning is probably not, or not primarily, about acquiring subject knowledge. Debate still rages about the relationship between explicit language knowledge and language competence. AI-driven formative assessment will likely focus most on explicit language knowledge, as does most current language teaching. This will not reboot or revolutionise anything. It will more likely reinforce what is already happening: a model of language learning that assumes there is a strong interface between explicit knowledge and language competence. It is not a model that is shared by most SLA researchers.

So, one thing that AI can do (and is doing) for language learning is to improve the algorithms that determine the way that grammar and vocabulary are presented to individual learners in online programs. AI-optimised delivery of ‘English Grammar in Use’ may lead to some learning gains, but they are unlikely to be significant. It is not, in any case, what language learners need.

AI, Donald Clark suggests, can offer personalised learning. Precisely what kind of personalised learning this might be, and whether or not this is a good thing, remains unclear. A 2015 report funded by the Gates Foundation found that we currently lack evidence about the effectiveness of personalised learning. We do not know which aspects of personalised learning (learner autonomy, individualised learning pathways and instructional approaches, etc.) or which combinations of these will lead to gains in language learning. The complexity of the issues means that we may never have a satisfactory explanation. You can read my own exploration of the problems of personalised learning starting here .

What’s left? Clark suggests that chatbots are one area with ‘huge potential’. I beg to differ and I explained my reasons eighteen months ago . Chatbots work fine in very specific domains. As Clark says, they can be used for ‘controlled practice’, but ‘controlled practice’ means practice of specific language knowledge, the practice of limited conversational routines, for example. It could certainly be useful, but more than that? Taking things a stage further, Clark then suggests more holistic speaking and listening practice with Amazon Echo, Alexa or Google Home. If and when the day comes that we have general, as opposed to domain-specific, AI, chatting with one of these tools would open up vast new possibilities. Unfortunately, general AI does not exist, and until then Alexa and co will remain a poor substitute for human-human interaction (which is readily available online, anyway). Incidentally, AI could be used to form groups of online language learners to carry out communicative tasks – ‘the aim might be to design a grouping of students all at a similar cognitive level and of similar interests, or one where the participants bring different but complementary knowledge and skills’ (Luckin, R., Holmes, W., Griffiths, M. & Forceir, L.B. 2016. Intelligence Unleashed: An argument for AI in Education. London: Pearson, p.26).

Predictions about the impact of technology on education have a tendency to be made by people with a vested interest in the technologies. Edison was a businessman who had invested heavily in motion pictures. Donald Clark is an edtech entrepreneur whose company, Wildfire, uses AI in online learning programs. Stephen Heppell is executive chairman of LP+ who are currently developing a Chinese language learning community for 20 million Chinese school students. The reporting of AIED is almost invariably in websites that are paid for, in one way or another, by edtech companies. Predictions need, therefore, to be treated sceptically. Indeed, the safest prediction we can make about hyped educational technologies is that inflated expectations will be followed by disillusionment, before the technology finds a smaller niche.