Archive for the ‘analytics’ Category

It’s hype time again. Spurred on, no doubt, by the current spate of books and articles about AIED (artificial intelligence in education), the IATEFL Learning Technologies SIG is organising an online event on the topic in November of this year. Currently, the most visible online references to AI in language learning are related to Glossika, basically a language learning system that uses spaced repetition, whose marketing department has realised that references to AI might help sell the product. They’re not alone – see, for example, Knowble, which I reviewed earlier this year.

In the wider world of education, where AI has made greater inroads than in language teaching, every day brings more stuff: How artificial intelligence is changing teaching, 32 Ways AI is Improving Education, How artificial intelligence could help teachers do a better job, etc., etc. There’s a full-length book by Anthony Seldon, The Fourth Education Revolution: will artificial intelligence liberate or infantilise humanity? (2018, University of Buckingham Press) – one of the most poorly researched and badly edited books on education I’ve ever read, although that won’t stop it selling – and, no surprises here, there’s a Pearson-commissioned report called Intelligence Unleashed: An argument for AI in Education (2016) which is available free.

Common to all these publications is the claim that AI will radically change education. When it comes to language teaching, a similar claim has been made by Donald Clark (described by Anthony Seldon as an education guru but perhaps best-known to many in ELT for his demolition of Sugata Mitra). In 2017, Clark wrote a blog post for Cambridge English (now unavailable) entitled How AI will reboot language learning, and a more recent version of this post, called AI has and will change language learning forever (sic) is available on Clark’s own blog. Given the history of the failure of education predictions, Clark is making bold claims. Thomas Edison (1922) believed that movies would revolutionize education. Radios were similarly hyped in the 1940s and in the 1960s it was the turn of TV. In the 1980s, Seymour Papert predicted the end of schools – ‘the computer will blow up the school’, he wrote. Twenty years later, we had the interactive possibilities of Web 2.0. As each technology failed to deliver on the hype, a new generation of enthusiasts found something else to make predictions about.

But is Donald Clark onto something? Developments in AI and computational linguistics have recently resulted in enormous progress in machine translation. Impressive advances in automatic speech recognition and generation, coupled with the power that can be packed into a handheld device, mean that we can expect some re-evaluation of the value of learning another language. Stephen Heppell, a specialist at Bournemouth University in the use of ICT in Education, has said: ‘Simultaneous translation is coming, making language teachers redundant. Modern languages teaching in future may be more about navigating cultural differences’ (quoted by Seldon, p.263). Well, maybe, but this is not Clark’s main interest.

Less a matter of opinion and much closer to the present day is the issue of assessment. AI is becoming ubiquitous in language testing. Cambridge, Pearson, TELC, Babbel and Duolingo are all using or exploring AI in their testing software, and we can expect to see this increase. Current, paper-based systems of testing subject knowledge are, according to Rosemary Luckin and Kristen Weatherby, outdated, ineffective, time-consuming, the cause of great anxiety and can easily be automated (Luckin, R. & Weatherby, K. 2018. ‘Learning analytics, artificial intelligence and the process of assessment’ in Luckin, R. (ed.) Enhancing Learning and Teaching with Technology, 2018. UCL Institute of Education Press, p.253). By capturing data of various kinds throughout a language learner’s course of study and by using AI to analyse learning development, continuous formative assessment becomes possible in ways that were previously unimaginable. ‘Assessment for Learning (AfL)’ or ‘Learning Oriented Assessment (LOA)’ are two terms used by Cambridge English to refer to the potential that AI offers which is described by Luckin (who is also one of the authors of the Pearson paper mentioned earlier). In practical terms, albeit in a still very limited way, this can be seen in the CUP course ‘Empower’, which combines CUP course content with validated LOA from Cambridge Assessment English.

Will this reboot or revolutionise language teaching? Probably not and here’s why. AIED systems need to operate with what is called a ‘domain knowledge model’. This specifies what is to be learnt and includes an analysis of the steps that must be taken to reach that learning goal. Some subjects (especially STEM subjects) ‘lend themselves much more readily to having their domains represented in ways that can be automatically reasoned about’ (du Boulay, D. et al., 2018. ‘Artificial intelligences and big data technologies to close the achievement gap’ in Luckin, R. (ed.) Enhancing Learning and Teaching with Technology, 2018. UCL Institute of Education Press, p.258). This is why most AIED systems have been built to teach these areas. Languages are rather different. We simply do not have a domain knowledge model, except perhaps for the very lowest levels of language learning (and even that is highly questionable). Language learning is probably not, or not primarily, about acquiring subject knowledge. Debate still rages about the relationship between explicit language knowledge and language competence. AI-driven formative assessment will likely focus most on explicit language knowledge, as does most current language teaching. This will not reboot or revolutionise anything. It will more likely reinforce what is already happening: a model of language learning that assumes there is a strong interface between explicit knowledge and language competence. It is not a model that is shared by most SLA researchers.

So, one thing that AI can do (and is doing) for language learning is to improve the algorithms that determine the way that grammar and vocabulary are presented to individual learners in online programs. AI-optimised delivery of ‘English Grammar in Use’ may lead to some learning gains, but they are unlikely to be significant. It is not, in any case, what language learners need.

AI, Donald Clark suggests, can offer personalised learning. Precisely what kind of personalised learning this might be, and whether or not this is a good thing, remains unclear. A 2015 report funded by the Gates Foundation found that we currently lack evidence about the effectiveness of personalised learning. We do not know which aspects of personalised learning (learner autonomy, individualised learning pathways and instructional approaches, etc.) or which combinations of these will lead to gains in language learning. The complexity of the issues means that we may never have a satisfactory explanation. You can read my own exploration of the problems of personalised learning starting here.

What’s left? Clark suggests that chatbots are one area with ‘huge potential’. I beg to differ and I explained my reasons eighteen months ago. Chatbots work fine in very specific domains. As Clark says, they can be used for ‘controlled practice’, but ‘controlled practice’ means practice of specific language knowledge, the practice of limited conversational routines, for example. It could certainly be useful, but more than that? Taking things a stage further, Clark then suggests more holistic speaking and listening practice with Amazon Echo, Alexa or Google Home. If and when the day comes that we have general, as opposed to domain-specific, AI, chatting with one of these tools would open up vast new possibilities. Unfortunately, general AI does not exist, and until then Alexa and co will remain a poor substitute for human-human interaction (which is readily available online, anyway). Incidentally, AI could be used to form groups of online language learners to carry out communicative tasks – ‘the aim might be to design a grouping of students all at a similar cognitive level and of similar interests, or one where the participants bring different but complementary knowledge and skills’ (Luckin, R., Holmes, W., Griffiths, M. & Forcier, L.B. 2016. Intelligence Unleashed: An argument for AI in Education. London: Pearson, p.26).

Predictions about the impact of technology on education have a tendency to be made by people with a vested interest in the technologies. Edison was a businessman who had invested heavily in motion pictures. Donald Clark is an edtech entrepreneur whose company, Wildfire, uses AI in online learning programs. Stephen Heppell is executive chairman of LP+ who are currently developing a Chinese language learning community for 20 million Chinese school students. The reporting of AIED is almost invariably in websites that are paid for, in one way or another, by edtech companies. Predictions need, therefore, to be treated sceptically. Indeed, the safest prediction we can make about hyped educational technologies is that inflated expectations will be followed by disillusionment, before the technology finds a smaller niche.

 


Back in December 2013, in an interview with eltjam, David Liu, COO of the adaptive learning company, Knewton, described how his company’s data analysis could help ELT publishers ‘create more effective learning materials’. He focused on what he calls ‘content efficacy[i]’ (he uses the word ‘efficacy’ five times in the interview), a term which he explains below:

A good example is when we look at the knowledge graph of our partners, which is a map of how concepts relate to other concepts and prerequisites within their product. There may be two or three prerequisites identified in a knowledge graph that a student needs to learn in order to understand a next concept. And when we have hundreds of thousands of students progressing through a course, we begin to understand the efficacy of those said prerequisites, which quite frankly were made by an author or set of authors. In most cases they’re quite good because these authors are actually good in what they do. But in a lot of cases we may find that one of those prerequisites actually is not necessary, and not proven to be useful in achieving true learning or understanding of the current concept that you’re trying to learn. This is interesting information that can be brought back to the publisher as they do revisions, as they actually begin to look at the content as a whole.

One commenter on the post, Tom Ewens, found the idea interesting. It could, potentially, he wrote, give us new insights into how languages are learned much in the same way as how corpora have given us new insights into how language is used. Did Knewton have any plans to disseminate the information publicly, he asked. His question remains unanswered.

At the time, Knewton had just raised $51 million (bringing their total venture capital funding to over $105 million). Now, 16 months later, Knewton have launched their new product, which they are calling Knewton Content Insights. They describe it as the world’s first and only web-based engine to automatically extract statistics comparing the relative quality of content items — enabling us to infer more information about student proficiency and content performance than ever before possible.

The software analyses particular exercises within the learning content (and particular items within them). It measures the relative difficulty of individual items by, for example, analysing how often a question is answered incorrectly and how many tries it takes each student to answer correctly. It also looks at what they call ‘exhaustion’ – how much content students are using in a particular area – and whether they run out of content. The software can correlate difficulty with exhaustion. Lastly, it analyses what they call ‘assessment quality’ – how well individual questions assess a student’s understanding of a topic.
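For readers who want to see what this kind of item-level number-crunching amounts to, here is a minimal sketch in Python. It is not Knewton’s method, which has not been published: the log format, the function name `item_difficulty` and the two statistics are my own assumptions, chosen only to mirror the description above (how often an item is answered incorrectly, and how many tries students need before getting it right).

```python
from collections import defaultdict


def item_difficulty(attempts):
    """Summarise per-item difficulty from a chronological attempt log.

    `attempts` is a hypothetical log: an iterable of
    (student_id, item_id, is_correct) tuples in the order they happened.
    """
    first_result = {}          # (student, item) -> was the first try correct?
    tries = defaultdict(int)   # (student, item) -> tries made so far
    solved_after = {}          # (student, item) -> tries needed to get it right

    for student, item, correct in attempts:
        key = (student, item)
        if key in solved_after:
            continue  # ignore repeats after the student has already succeeded
        tries[key] += 1
        if key not in first_result:
            first_result[key] = correct
        if correct:
            solved_after[key] = tries[key]

    stats = {}
    for item in {item for (_, item) in first_result}:
        firsts = [ok for (_, i), ok in first_result.items() if i == item]
        solved = [n for (_, i), n in solved_after.items() if i == item]
        stats[item] = {
            # share of students who got the item wrong on their first try
            "first_try_error_rate": 1 - sum(firsts) / len(firsts),
            # average tries among students who eventually answered correctly
            "mean_tries_to_correct": sum(solved) / len(solved) if solved else None,
        }
    return stats


# Invented example log: two students, two items.
log = [
    ("ana", "present_perfect_q1", False),
    ("ana", "present_perfect_q1", True),
    ("ben", "present_perfect_q1", True),
    ("ana", "present_simple_q1", True),
    ("ben", "present_simple_q1", True),
]
result = item_difficulty(log)
```

Nothing in such a calculation knows anything about language or learning. It simply counts right and wrong answers to discrete items.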

Knewton’s approach is premised on the idea that learning (in this case language learning) can be broken down into knowledge graphs, in which the information that needs to be learned can be arranged and presented hierarchically. The ‘granular’ concepts are then ‘delivered’ to the learner, and Knewton’s software can optimise the delivery. The first problem, as I explored in a previous post, is that language is a messy, complex system: it doesn’t lend itself terribly well to granularisation. The second problem is that language learning does not proceed in a linear, hierarchical way: it is also messy and complex. The third is that ‘language learning content’ cannot simply be delivered: a process of mediation is unavoidable. Are the people at Knewton unaware of the extensive literature devoted to the differences between synthetic and analytic syllabuses, of the differences between product-oriented and process-oriented approaches? It would seem so.

Knewton’s ‘Content Insights’ can only, at best, provide some sort of insight into the ‘language knowledge’ part of any learning content. It can say nothing about the work that learners do to practise language skills, since these are not susceptible to granularisation: you simply can’t take a piece of material that focuses on reading or listening and analyse its ‘content efficacy at the concept level’. Because of this, I predicted (in the post about Knowledge Graphs) that the likely focus of Knewton’s analytics would be discrete item, sentence-level grammar (typically tenses). It turns out that I was right.

Knewton illustrate their new product with screen shots such as those below.

[Screenshots: Content Insights assessment view; Content Insights exhaustion view]

They give a specific example of the sort of questions their software can answer. It is: do students generally find the present simple tense easier to understand than the present perfect tense? Doh!

Knewton Content Insights might optimise the presentation of this kind of grammar, but optimisation of this presentation and practice is highly unlikely to have any impact on the rate of language acquisition. Students are typically required to study the present perfect at every level from ‘elementary’ upwards. They have to do this, but not because the presentation in, say, Headway is not optimised. What they need is to spend a significantly greater proportion of their time on ‘language use’ and less on ‘language knowledge’. This is not just my personal view: it has been extensively researched, and I am unaware of any dissenting voices.

The number-crunching in Knewton Content Insights is unlikely, therefore, to lead to any actionable insights. It is, however, very likely to lead (as writer colleagues at Pearson and other publishers are finding out) to an obsession with measuring the ‘efficacy’ of material which, quite simply, cannot meaningfully be measured in this way. It is likely to distract from much more pressing issues, notably the question of how we can move further and faster away from peddling sentence-level, discrete-item grammar.

In the long run, it is reasonable to predict that the attempt to optimise the delivery of language knowledge will come to be seen as an attempt to tackle the wrong question. It will make no significant difference to language learners and language learning. In the short term, how much time and money will be wasted?

[i] ‘Efficacy’ is the buzzword around which Pearson has built its materials creation strategy, a strategy which was launched around the same time as this interview. Pearson is a major investor in Knewton.

Jose Ferreira, the fast-talking sales rep-in-chief of Knewton, likes to dazzle with numbers. In a 2012 talk hosted by the US Department of Education, Ferreira rattles off the stats: So Knewton students today, we have about 125,000, 180,000 right now, by December it’ll be 650,000, early next year it’ll be in the millions, and next year it’ll be close to 10 million. And that’s just through our Pearson partnership. For each of these students, Knewton gathers millions of data points every day. That, brags Ferreira, is five orders of magnitude more data about you than Google has. … We literally have more data about our students than any company has about anybody else about anything, and it’s not even close. With just a touch of breathless exaggeration, Ferreira goes on: We literally know everything about what you know and how you learn best, everything.

The data is mined to find correlations between learning outcomes and learning behaviours, and, once correlations have been established, learning programmes can be tailored to individual students. Ferreira explains: We take the combined data problem all hundred million to figure out exactly how to teach every concept to each kid. So the 100 million first shows up to learn the rules of exponents, great let’s go find a group of people who are psychometrically equivalent to that kid. They learn the same ways, they have the same learning style, they know the same stuff, because Knewton can figure out things like you learn math best in the morning between 8:40 and 9:13 am. You learn science best in 42 minute bite sizes the 44 minute mark you click right, you start missing questions you would normally get right.

The basic premise here is that the more data you have, the more accurately you can predict what will work best for any individual learner. But how accurate is it? In the absence of any decent, independent research (or, for that matter, any verifiable claims from Knewton), how should we respond to Ferreira’s contribution to the White House Education Datapalooza?

A new book by Stephen Finlay, Predictive Analytics, Data Mining and Big Data (Palgrave Macmillan, 2014) suggests that predictive analytics are typically about 20–30% more accurate than humans attempting to make the same judgements. That’s pretty impressive and perhaps Knewton does better than that, but the key thing to remember is that, however much data Knewton is playing with, and however good their algorithms are, we are still talking about predictions and not certainties. If an adaptive system could predict with 90% accuracy (and the actual figure is typically much lower than that) what learning content and what learning approach would be effective for an individual learner, it would still mean that it was wrong 10% of the time. When this is scaled up to the numbers of students that use Knewton software, it means that millions of students are getting faulty recommendations. Beyond a certain point, further expansion of the data that is mined is unlikely to make any difference to the accuracy of predictions.
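The arithmetic behind this is simple enough to sketch. The snippet below is only an illustration. It assumes Ferreira’s projected figure of close to 10 million students and treats every student as receiving one recommendation. The accuracy levels other than 90% are hypothetical, and the function name is my own.

```python
def faulty_recommendations(students: int, accuracy_pct: int) -> int:
    """Expected number of students who get a wrong recommendation.

    Integer percentages are used to avoid floating-point rounding.
    """
    return students * (100 - accuracy_pct) // 100


# Assumed user base: Ferreira's projection of close to 10 million students.
STUDENTS = 10_000_000

# 90% is the generous figure used above; 80% and 70% are hypothetical.
for accuracy_pct in (90, 80, 70):
    wrong = faulty_recommendations(STUDENTS, accuracy_pct)
    print(f"{accuracy_pct}% accurate: {wrong:,} students misdirected")
```

Even on the generous 90% assumption, a million students are pointed in the wrong direction, and the number climbs by a further million for every ten percentage points of accuracy lost.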

A further problem identified by Stephen Finlay is the tendency of people in predictive analytics to confuse correlation and causation. Certain students may have learnt maths best between 8.40 and 9.13, but it does not follow that they learnt it best because they studied at that time. If strong correlations do not involve causality, then actionable insights (such as individualised course design) can be no more than an informed gamble.

Knewton’s claim that they know how every student learns best is marketing hyperbole and should set alarm bells ringing. When it comes to language learning, we simply do not know how students learn (we do not have any generally accepted theory of second language acquisition), let alone how they learn best. More data won’t help our theories of learning! Ferreira’s claim that, with Knewton, every kid gets a perfectly optimized textbook, except it’s also video and other rich media dynamically generated in real time is equally preposterous, not least since the content of the textbook will be at least as significant as the way in which it is ‘optimized’. And, as we all know, textbooks have their faults.

Cui bono? Perhaps huge data and predictive analytics will benefit students; perhaps not. We will need to wait and find out. But Stephen Finlay reminds us that in gold rushes (and internet booms and the exciting world of Big Data) the people who sell the tools make a lot of money. Far more strike it rich selling picks and shovels to prospectors than do the prospectors. Likewise, there is a lot of money to be made selling Big Data solutions. Whether the buyer actually gets any benefit from them is not the primary concern of the sales people. (p.16/17) Which is, perhaps, one of the reasons that some sales people talk so fast.