In order to understand more complex models of adaptive learning, it is necessary to step briefly away from the world of language learning. Businesses have long used analytics – the analysis of data to find meaningful patterns – in fields such as insurance, banking and marketing. With the exponential growth in computer processing power and memory capacity, businesses now have access to volumes of data of almost unimaginable size. This is known as ‘big data’ and has been described as ‘a revolution that will transform how we live, work and think’ (Mayer-Schönberger & Cukier, ‘Big Data’, 2013). Frequently cited examples of the potential of big data are Amazon’s success in analyzing and predicting buying patterns and the use of big data analysis in Barack Obama’s 2012 presidential re-election campaign. Business commentators are all singing the same song on the subject, which will be looked at again in later posts. For the time being, it is enough to be aware of the main message. ‘The high-performing organisation of the future will be one that places great value on data and analytical exploration’ (The Economist Intelligence Unit, ‘In Search of Insight and Foresight: Getting more out of big data’, 2013, p. 15). ‘Almost no sphere of business activity will remain untouched by this movement’ (McAfee & Brynjolfsson, ‘Big Data: The Management Revolution’, Harvard Business Review (October 2012), p. 65).
With the growing bonds between business and education (another topic which will be explored later), it is unsurprising that language learning / teaching materials are rapidly going down the big data route. In comparison to what is now being developed for ELT, the data that is analyzed in the adaptive learning models I have described in an earlier post is very limited, and the algorithms used to shape the content are very simple.
The volume and variety of data and the speed of processing are now of an altogether different order. Jose Ferreira, CEO of Knewton, one of the biggest players in adaptive learning in ELT, spells out the kind of data that can be tapped:
At Knewton, we divide educational data into five types: one pertaining to student identity and onboarding, and four student activity-based data sets that have the potential to improve learning outcomes. They’re listed below in order of how difficult they are to attain:
1) Identity Data: Who are you? Are you allowed to use this application? What admin rights do you have? What district are you in? How about demographic info?
2) User Interaction Data: User interaction data includes engagement metrics, click rate, page views, bounce rate, etc. These metrics have long been the cornerstone of internet optimization for consumer web companies, which use them to improve user experience and retention. This is the easiest to collect of the data sets that affect student outcomes. Everyone who creates an online app can and should get this for themselves.
3) Inferred Content Data: How well does a piece of content “perform” across a group, or for any one subgroup, of students? What measurable student proficiency gains result when a certain type of student interacts with a certain piece of content? How well does a question actually assess what it intends to? Efficacy data on instructional materials isn’t easy to generate — it requires algorithmically normed assessment items. However it’s possible now for even small companies to “norm” small quantities of items. (Years ago, before we developed more sophisticated methods of norming items at scale, Knewton did so using Amazon’s “Mechanical Turk” service.)
4) System-Wide Data: Rosters, grades, disciplinary records, and attendance information are all examples of system-wide data. Assuming you have permission (e.g. you’re a teacher or principal), this information is easy to acquire locally for a class or school. But it isn’t very helpful at small scale because there is so little of it on a per-student basis. At very large scale it becomes more useful, and inferences that may help inform system-wide recommendations can be teased out.
5) Inferred Student Data: Exactly what concepts does a student know, at exactly what percentile of proficiency? Was an incorrect answer due to a lack of proficiency, or forgetfulness, or distraction, or a poorly worded question, or something else altogether? What is the probability that a student will pass next week’s quiz, and what can she do right this moment to increase it?
Software of this kind keeps complex personal profiles, with millions of variables per student, on as many students as necessary. The more student profiles (and therefore students) that can be compared, the more useful the data is. Big players in this field, such as Knewton, are aiming for student numbers in the tens to hundreds of millions. Once data volume of this order is achieved, the ‘analytics’, or the algorithms that convert data into ‘actionable insights’ (J. Spring, ‘Education Networks’ (New York: Routledge, 2012), p. 55), become much more reliable.
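The link between volume and reliability can be illustrated with a very simple piece of arithmetic. The sketch below is my own illustration, not a description of how Knewton or any other company actually works. It assumes that we only want to estimate the proportion of students who answer one question correctly, that each student's answer is independent of the others, and that a standard 95% margin of error is a fair measure of 'reliability'. The function names and the figure of 60% correct are hypothetical.

```python
import math


def standard_error(p_correct: float, n_students: int) -> float:
    """Standard error of an observed proportion correct across n_students.

    Assumes independent answers (a simple binomial model), which is an
    illustrative simplification, not any vendor's actual method.
    """
    return math.sqrt(p_correct * (1 - p_correct) / n_students)


def margin_of_error(p_correct: float, n_students: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error (z = 1.96) for the proportion."""
    return z * standard_error(p_correct, n_students)


if __name__ == "__main__":
    # Hypothetical item that 60% of students answer correctly.
    for n in (100, 10_000, 1_000_000):
        print(f"{n:>9} students: +/- {margin_of_error(0.6, n):.4f}")
```

Under these assumptions, the uncertainty is roughly ten percentage points either way with 100 students and roughly one percentage point with 10,000. Every hundredfold increase in student numbers cuts the margin of error by a factor of ten, which is one reason why companies in this field want profiles in the tens of millions.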
 http://www.knewton.com/blog/knewton/from-jose/2013/07/18/big-data-in-education/ (last accessed 9 December 2013)