Big data, learning analytics and language learning / teaching

Posted: March 14, 2019 in analytics, big data, Online learning, Personalization, research

The use of big data and analytics in education continues to grow.

A vast apparatus of measurement is being developed to underpin national education systems, institutions and the actions of the individuals who occupy them. […] The presence of digital data and software in education is being amplified through massive financial and political investment in educational technologies, as well as huge growth in data collection and analysis in policymaking practices, extension of performance measurement technologies in the management of educational institutions, and rapid expansion of digital methodologies in educational research. To a significant extent, many of the ways in which classrooms function, educational policy departments and leaders make decisions, and researchers make sense of data, simply would not happen as currently intended without the presence of software code and the digital data processing programs it enacts. (Williamson, 2017: 4)

The most common and successful use of this technology so far has been in the identification of students at risk of dropping out of their courses (Jørno & Gynther, 2018: 204). The kind of analytics used in this context may be called ‘academic analytics’ and focuses on educational processes at the institutional level or higher (Gelan et al, 2018: 3). However, ‘learning analytics’, the capture and analysis of learner and learning data in order to personalize learning ‘(1) through real-time feedback on online courses and e-textbooks that can ‘learn’ from how they are used and ‘talk back’ to the teacher, and (2) individualization and personalization of the educational experience through adaptive learning systems that enable materials to be tailored to each student’s individual needs through automated real-time analysis’ (Mayer-Schönberger & Cukier, 2014) has become ‘the main keyword of data-driven education’ (Williamson, 2017: 10). See my earlier posts on this topic here and here and here.

Near the start of Mayer-Schönberger and Cukier’s enthusiastic sales pitch (Learning with Big Data: The Future of Education) for the use of big data in education, there is a discussion of Duolingo. They quote Luis von Ahn, the founder of Duolingo, as saying ‘there has been little empirical work on what is the best way to teach a foreign language’. This is so far from the truth as to be laughable. Von Ahn’s comment, along with the Duolingo product itself, is merely indicative of a lack of awareness of the enormous amount of research that has been carried out. But what could the data gleaned from the interactions of millions of users with Duolingo tell us of value? The example that is given is the following. Apparently, ‘in the case of Spanish speakers learning English, it’s common to teach pronouns early on: words like “he,” “she,” and “it”.’ But, Duolingo discovered, ‘the term “it” tends to confuse and create anxiety for Spanish speakers, since the word doesn’t easily translate into their language […] Delaying the introduction of “it” until a few weeks later dramatically improves the number of people who stick with learning English rather than drop out.’ Was von Ahn unaware of the decades of research into language transfer effects? Did von Ahn (who grew up speaking Spanish in Guatemala) need all this data to tell him that English personal pronouns can cause problems for Spanish learners of English? Was von Ahn unaware of the debates concerning the value of teaching isolated words (especially grammar words!)?

The area where little empirical research has been done is not in different ways of learning another language: it is in the use of big data and learning analytics to assist language learning. Claims about the value of these technologies in language learning are almost always speculative – they are based on comparison to other school subjects (especially, mathematics). Gelan et al (2018: 2), who note this lack of research, suggest that ‘understanding language learner behaviour could provide valuable insights into task design for instructors and materials designers, as well as help students with effective learning strategies and personalised learning pathways’ (my italics). Reinders (2018: 81) writes ‘that analysis of prior experiences with certain groups or certain courses may help to identify key moments at which students need to receive more or different support. Analysis of student engagement and performance throughout a course may help with early identification of learning problems and may prompt early intervention’ (italics added). But there is some research out there, and it’s worth having a look at. Most studies that have collected learner-tracking data concern glossary use for reading comprehension and vocabulary retention (Gelan et al, 2018: 5), but a few have attempted to go further in scope.

Volk et al (2015) looked at the behaviour of the 20,000 students per day using the platform which accompanies ‘More!’ (Gerngross et al. 2008) to do their English homework for Austrian lower secondary schools. They discovered that

  • the exercises used least frequently were those that are located further back in the course book
  • usage is highest from Monday to Wednesday, declining from Thursday, with a rise again on Sunday
  • most interaction took place between 3:00 and 5:00 pm
  • repetition of exercises led to a strong improvement in success rate
  • students performed better on multiple choice and matching exercises than they did where they had to produce some language

The authors of this paper conclude by saying that ‘the results of this study suggest a number of new avenues for research. In general, the authors plan to extend their analysis of exercise results and applied exercises to the population of all schools using the online learning platform more-online.at. This step enables a deeper insight into student’s learning behaviour and allows making more generalizing statements.’ When I shared these research findings with the Austrian lower secondary teachers that I work with, their reaction was one of utter disbelief. People get paid to do this research? Why not just ask us?

More useful, more actionable insights may yet come from other sources. For example, Gu Yueguo, Pro-Vice-Chancellor of the Beijing Foreign Studies University, has announced the intention to set up a national Big Data research center, specializing in big data-related research topics in foreign language education (Yu, 2015). Meanwhile, I’m aware of only one big research project that has published its results. The EC Erasmus+ VITAL project (Visualisation Tools and Analytics to monitor Online Language Learning & Teaching) was carried out between 2015 and 2017 and looked at the learning trails of students from universities in Belgium, Britain and the Netherlands. It was discovered (Gelan et al, 2018) that:

  • students who did online exercises when they were supposed to do them were slightly more successful than those who were late carrying out the tasks
  • successful students logged on more often, spent more time online, attempted and completed more tasks, revisited both exercises and theory pages more frequently, did the work in the order in which it was supposed to be done and did more work in the holidays
  • most students preferred to go straight into the assessed exercises and only used the theory pages when they felt they needed to; successful students referred back to the theory pages more often than unsuccessful students
  • students made little use of the voice recording functionality
  • most online activity took place the day before a class and the day of the class itself

EU funding for this VITAL project amounted to 274,840 Euros[1]. The technology for capturing the data has been around for a long time. In my opinion, nothing of value, or at least nothing new, has been learnt. Publishers like Pearson and Cambridge University Press who have large numbers of learners using their platforms have been capturing learning data for many years. They do not publish their findings and, intriguingly, do not even claim that they have learnt anything useful / actionable from the data they have collected. Sure, an exercise here or there may need to be amended. Both teachers and students may need more support in using the more open-ended functionalities of the platforms (e.g. discussion forums). But are they getting ‘unprecedented insights into what works and what doesn’t’ (Mayer-Schönberger & Cukier, 2014)? Are they any closer to building better pedagogies? On the basis of what we know so far, you wouldn’t want to bet on it.

It may be the case that all the learning / learner data that is captured could be used in some way that has nothing to do with language learning. Show me a language-learning app developer who does not dream of monetizing the ‘behavioural surplus’ (Zuboff, 2019) that they collect! But, for the data and analytics to be of any value in guiding language learning, they must lead to actionable insights. Unfortunately, as Jørno & Gynther (2018: 198) point out, there is very little clarity about what is meant by ‘actionable insights’. There is a danger that data and analytics ‘simply gravitates towards insights that confirm longstanding good practice and insights, such as “students tend to ignore optional learning activities … [and] focus on activities that are assessed”’ (Jørno & Gynther, 2018: 211). While this is happening, the focus on data inevitably shapes the way we look at the object of study (i.e. language learning), ‘thereby systematically excluding other perspectives’ (Mau, 2019: 15; see also Beer, 2019). The belief that tech is always the solution, that all we need is more data and better analytics, remains very powerful: it’s called techno-chauvinism (Broussard, 2018: 7-8).

References

Beer, D. 2019. The Data Gaze. London: Sage

Broussard, M. 2018. Artificial Unintelligence. Cambridge, Mass.: MIT Press

Gelan, A., Fastre, G., Verjans, M., Martin, N., Jansenswillen, G., Creemers, M., Lieben, J., Depaire, B. & Thomas, M. 2018. ‘Affordances and limitations of learning analytics for computer-assisted language learning: a case study of the VITAL project’. Computer Assisted Language Learning. pp. 1-26. http://clok.uclan.ac.uk/21289/

Gerngross, G., Puchta, H., Holzmann, C., Stranks, J., Lewis-Jones, P. & Finnie, R. 2008. More! 1 Cyber Homework. Innsbruck, Austria: Helbling

Jørno, R. L. & Gynther, K. 2018. ‘What Constitutes an “Actionable Insight” in Learning Analytics?’ Journal of Learning Analytics 5 (3): 198 – 221

Mau, S. 2019. The Metric Society. Cambridge: Polity Press

Mayer-Schönberger, V. & Cukier, K. 2014. Learning with Big Data: The Future of Education. New York: Houghton Mifflin Harcourt

Reinders, H. 2018. ‘Learning analytics for language learning and teaching’. JALT CALL Journal 14 / 1: 77 – 86 https://files.eric.ed.gov/fulltext/EJ1177327.pdf

Volk, H., Kellner, K. & Wohlhart, D. 2015. ‘Learning Analytics for English Language Teaching.’ Journal of Universal Computer Science, Vol. 21 / 1: 156-174 http://www.jucs.org/jucs_21_1/learning_analytics_for_english/jucs_21_01_0156_0174_volk.pdf

Williamson, B. 2017. Big Data in Education. London: Sage

Yu, Q. 2015. ‘Learning Analytics: The next frontier for computer assisted language learning in big data age’ SHS Web of Conferences, 17 https://www.shs-conferences.org/articles/shsconf/pdf/2015/04/shsconf_icmetm2015_02013.pdf

Zuboff, S. 2019. The Age of Surveillance Capitalism. London: Profile Books

 

[1] See https://ec.europa.eu/programmes/erasmus-plus/sites/erasmusplus2/files/ka2-2015-he_en.pdf
