Archive for December, 2022

Generative AI and ELT Materials

Posted: December 20, 2022 in Uncategorized

I must begin by apologizing for my last, flippant blog post where I spun a tale about using generative AI to produce ‘teacher development content’ and getting rid of teacher trainers. I had just watched a webinar by Giuseppe Tomasello of, ‘Harnessing Generative AI to Supercharge Language Education’, and it felt as if much of the script of this webinar had been generated by ChatGPT: ‘write a relentlessly positive product pitch for a language teaching platform in the style of a typical edtech vendor’.

In the webinar, Tomasello talked about using GPT-3 to generate texts for language learners. Yes, it’s doable, the results are factually and linguistically accurate, but dull in the extreme (see Steve Smith’s experiment with some texts for French learners). André Hedlund summarises: ‘limited by a rigid structure … no big differences in style or linguistic devices. They follow a recipe, a formula.’ Much like many coursebook / courseware texts, in other words.

More interesting texts can be generated with GPT-3 when the prompts are crafted in careful detail, but this requires creative human imagination. Crafting such prompts requires practice, trial and error, and it requires knowledge of the intended readers.

AI can be used to generate dull texts at a certain level (B1, B2, C1, etc.), but reliability is a problem. So many variables (starting with familiarity with the subject matter and the learner’s first language) impact on levels of reading difficulty that automation can never satisfactorily resolve this challenge.

However, the interesting question is not whether we can quickly generate courseware-like texts using GPT-3, but whether we should even want to. What is the point of using such texts? Giuseppe Tomasello’s demonstration made it clear that the way these texts should be exploited is by using (automatically generating) a series of comprehension questions and a list of key words which can also be (automatically) converted into a variety of practice tasks (flashcards, gapfills, etc.). Like courseware texts, then, these texts are language objects (TALOs), to be tested for comprehension and mined for items to be deliberately learnt. They are not, in any meaningful way, texts as vehicles of information (TAVIs) – see here for more about TALOs and TAVIs.

Comprehension testing is a standard feature of language teaching, but there are good reasons to believe that it will do little, if anything, to improve a learner’s reading skills (see, for example, Swan & Walter, 2017; Grabe, W. & Yamashita, J., 2022) or language competence. It has nothing to do with the kind of extensive reading (see, for example, Leather & Uden, 2021) that is known to be such a powerful force in language acquisition. This use of text is all about deliberate language study. We are talking here about a synthetic approach to language learning.

The problem, of course, is that deliberate language study is far from uncontested. Synthetic approaches wrongly assume that ‘the explicit teaching of declarative knowledge (knowledge about the L2) will lead to the procedural knowledge learners need in order to successfully use the L2 for communicative purpose’ (see Geoff Jordan on this topic). Even if it were true that some explicit teaching was of value, it’s widely agreed that explicit teaching should not form the core of a language learning syllabus. The product, however, is basically all about explicit teaching.

Judging from the presentation on YouTube, the platform offers a wide range of true / false, multiple choice questions, gapfills, dictations and so on, all of which can be gamified. The ‘methodology’ is called ‘Flip and Flop the Classroom’. In this approach, learners do some self-study (presumably of some pre-selected discrete language item(s), practise it in the synchronous lesson, and then have more self-study where this language is reviewed. In the ‘flop’ section, the learner’s spoken contribution to the live lesson is recorded, transcribed and analysed by the system which identifies aspects of the learner’s language which can be improved through personalized practice.

The focus is squarely on accuracy, and the importance of accuracy is underlined by the presenter’s observation that Communicative Language Teaching does not focus enough on accuracy. Accuracy is also to the fore in another task type, strangely called ‘chunking’ (see below). Apparently, this will lead to fluency: ‘The goal of this template is that they can repeat it a lot of times and reach fluency at the end’.

The YouTube presentation is called ‘How to structure your language course using popular pedagogical approaches’ and suggests that you can mix’n’match bits from ‘Grammar Translation’, ‘Direct Method’ and ‘CLT’. Sorry, Giuseppe, you can mix’n’match all you like, but you can’t be methodologically agnostic. This is a synthetic approach all the way down the line. As such, it’s not going to supercharge language education. It’s basically more of what we already have too much of.

Let’s assume, though, for a moment that what we really want is a kind of supercharged, personalizable, quickly generated combination of vocabulary flashcards and ‘English Grammar in Use’ loaded with TALOs. How does this product stand up? We’ll need to consider two related questions: (1) its reliability, and (2) how time-saving it actually is.

As far as I can judge from the YouTube presentation, reliability is not exactly optimal. There’s the automated key word identifier that identified ‘Denise’, ‘the’ and ‘we’ as ‘key words’ in an extract of learner-produced text (‘Hello, my name is Denise and I founded my makeup company in 2021. We produce skin care products using 100% natural ingredients.’). There’s the automated multiple choice / translation generator which tests your understanding of ‘Denise’ (see below), and there’s the voice recognition software which transcribed ‘it cost’ as ‘they cost’.

In the more recent ‘webinar’ (i.e commercial presentation) that has not yet been uploaded to YouTube, participants identified a number of other bloopers. In short, reliability is an issue. This shouldn’t surprise anyone. Automation of some of these tasks is extremely difficult (see my post about the automated generation of vocabulary learning materials). Perhaps impossible … but how much error is acceptable? does not sell content: they sell a platform for content-generation, course-creation and selling. Putative clients are institutions wanting to produce and sell learning content of the synthetic kind. The problem with a lack of reliability, any lack of reliability, is that you immediately need skilled staff to work with the platform, to check for error, to edit, to improve on the inevitable limitations of the AI (starting, perhaps, with the dull texts it has generated). It is disingenuous to suggest that anyone can do this without substantial training / supervision. Generative AI only offers a time-saving route, which does not sacrifice reliability, if a skilled and experienced writer is working with it. is a young start-up that raised $345K in first round funding in September of last year. The various technologies they are using are expensive, and a lot more funding will be needed to make the improvements and additions (such as spaced repetition) that are necessary. In both presentations, there was lots of talk that the platform will be able to do this and will be able to do that. For the moment, though, nothing has been proved, and my suspicion is that some of the problems they are trying to solve do not have technological solutions. First of all, they’ll need a better understanding of what these problems are, and, for that, there has to be a coherent and credible theory of second language acquisition. There are all sorts of good uses that GPT-3 / AI could be put to in language teaching. I doubt this is one of them.

To wrap up, here’s a little question. What are the chances that’s claims that the product will lead to ‘+50% student engagement’ and ‘5X Faster creating language courses’ were also generated by GPT-3?


Grabe, W. & Yamashita, J. (2022) Reading in a Second Language 2nd edition. New York: Cambridge University Press

Leather, S. & Uden, J. (2021) Extensive Reading. New York: Routledge

Swan, M. & Walter, C. (2017) Misunderstanding comprehension. ELT Journal, 71 (2): 228 – 236

Who can tell where a blog post might lead? Over six years ago I wrote about adaptive professional development for teachers. I imagined the possibility of bite-sized, personalized CPD material. Now my vision is becoming real.

For the last two years, I have been working with a start-up that has been using AI to generate text using GPT-3 large language models. GPT-3 has recently been in the news because of the phenomenal success of the newly released ChatGPT. The technology certainly has a wow factor, but it has been around for a while now. ChatGPT can generate texts of various genres on any topic (with a few exceptions like current affairs) and the results are impressive. Imagine, then, how much more impressive the results can be when the kind of text is limited by genre and topic, allowing the software to be trained much more reliably.

This is what we have been working on. We took as our training corpus a huge collection of English language teaching teacher development texts that we could access online: blogs from all the major publishers, personal blogs, transcriptions from recorded conference presentations and webinars, magazine articles directed at teachers, along with books from publishers such as DELTA and Pavilion ELT, etc. We identified topics that seemed to be of current interest and asked our AI to generate blog posts. Later, we were able to get suggestions of topics from the software itself.

We then contacted a number of teachers and trainers who contribute to the publishers’ blogs and contracted them, first, to act as human trainers for the software, and, second, to agree to their names being used as the ‘authors’ of the blog posts we generated. In one or two cases, the authors thought that they had actually written the posts themselves! Next we submitted these posts to the marketing departments of the publishers (who run the blogs). Over twenty were submitted in this way, including:

  • What do teachers need to know about teaching 21st century skills in the English classroom?
  • 5 top ways of improving the well-being of English teachers
  • Teaching leadership skills in the primary English classroom
  • How can we promote eco-literacy in the English classroom?
  • My 10 favourite apps for English language learners

We couldn’t, of course, tell the companies that AI had been used to write the copy, but once we were sure that nobody had ever spotted the true authorship of this material, we were ready to move to the next stage of the project. We approached the marketing executives of two publishers and showed how we could generate teacher development material at a fraction of the current cost and in a fraction of the time. Partnerships were quickly signed.

Blog posts were just the beginning. We knew that we could use the same technology to produce webinar scripts, using learning design insights to optimise the webinars. The challenge we faced was that webinars need a presenter. We experimented with using animations, but feedback indicated that participants like to see a face. This is eminently doable, using our contracted authors and deep fake technology, but costs are still prohibitive. It remains cheaper and easier to use our authors delivering the scripts we have generated. This will no doubt change before too long.

The next obvious step was to personalize the development material. Large publishers collect huge amounts of data about visitors to their sites using embedded pixels. It is also relatively cheap and easy to triangulate this data with information from the customer databases and from activity on social media (especially Facebook). We know what kinds of classes people teach, and we know which aspects of teacher development they are interested in.

Publishers have long been interested in personalizing marketing material, and the possibility of extending this to the delivery of real development content is clearly exciting. (See below an email I received this week from the good folks at OUP marketing.)

Earlier this year one of our publishing partners began sending links to personalized materials of the kind we were able to produce with AI. The experiment was such a success that we have already taken it one stage further.

One of the most important clients of our main publishing partner employs hundreds of teachers to deliver online English classes using courseware that has been tailored to the needs of the institution. With so many freelance teachers working for them, along with high turnover of staff, there is inevitably a pressing need for teacher training to ensure optimal delivery. Since the classes are all online, it is possible to capture precisely what is going on. Using an AI-driven tool that was inspired by the Visible Classroom app (informed by the work of John Hattie), we can identify the developmental needs of the teachers. What kinds of activities are they using? How well do they exploit the functionalities of the platform? What can be said about the quality of their teacher talk? We combine this data with everything else and our proprietary algorithms determine what kinds of training materials each teacher receives. It doesn’t stop there. We can also now evaluate the effectiveness of these materials by analysing the learning outcomes of the students.

Teaching efficacy can by massively increased, whilst the training budget of the institution can be slashed. If all goes well, there will be no further need for teacher trainers at all. We won’t be stopping there. If results such as these can be achieved in teacher training, there’s no reason why the same technology cannot be leveraged for the teaching itself. Most of our partner’s teaching and testing materials are now quickly and very cheaply generated using GPT-3.5. If you want to see how this is done, check out the work of edugo.AI (a free trial is available) which can generate gapfills and comprehension test questions in a flash. As for replacing the teachers, we’re getting there. For the time being, though, it’s more cost-effective to use freelancers and to train them up.