Archive for September, 2014

Jose Ferreira, the fast-talking sales rep-in-chief of Knewton, likes to dazzle with numbers. In a 2012 talk hosted by the US Department of Education, Ferreira rattles off the stats: “So Knewton students today, we have about 125,000, 180,000 right now, by December it’ll be 650,000, early next year it’ll be in the millions, and next year it’ll be close to 10 million. And that’s just through our Pearson partnership.” For each of these students, Knewton gathers millions of data points every day. That, brags Ferreira, is “five orders of magnitude more data about you than Google has. … We literally have more data about our students than any company has about anybody else about anything, and it’s not even close.” With just a touch of breathless exaggeration, Ferreira goes on: “We literally know everything about what you know and how you learn best, everything.”

The data is mined to find correlations between learning outcomes and learning behaviours, and, once correlations have been established, learning programmes can be tailored to individual students. Ferreira explains: “We take the combined data problem all hundred million to figure out exactly how to teach every concept to each kid. So the 100 million first shows up to learn the rules of exponents, great let’s go find a group of people who are psychometrically equivalent to that kid. They learn the same ways, they have the same learning style, they know the same stuff, because Knewton can figure out things like you learn math best in the morning between 8:40 and 9:13 am. You learn science best in 42 minute bite sizes the 44 minute mark you click right, you start missing questions you would normally get right.”

The basic premise here is that the more data you have, the more accurately you can predict what will work best for any individual learner. But how accurate is it? In the absence of any decent, independent research (or, for that matter, any verifiable claims from Knewton), how should we respond to Ferreira’s contribution to the White House Education Datapalooza?

A new book by Stephen Finlay, Predictive Analytics, Data Mining and Big Data (Palgrave Macmillan, 2014) suggests that predictive analytics are typically about 20–30% more accurate than humans attempting to make the same judgements. That’s pretty impressive and perhaps Knewton does better than that, but the key thing to remember is that, however much data Knewton is playing with, and however good their algorithms are, we are still talking about predictions and not certainties. If an adaptive system could predict with 90% accuracy (and the actual figure is typically much lower than that) what learning content and what learning approach would be effective for an individual learner, it would still mean that it was wrong 10% of the time. When this is scaled up to the numbers of students that use Knewton software, it means that millions of students are getting faulty recommendations. Beyond a certain point, further expansion of the data that is mined is unlikely to make any difference to the accuracy of predictions.
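The arithmetic behind that scaling can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the student numbers are the ones Ferreira quotes, the 90% figure is the generous hypothetical used above, and the function name `expected_wrong` and the simplification of one recommendation per student are assumptions made for the example.

```python
def expected_wrong(students: int, accuracy: float) -> int:
    """Expected number of students who get a faulty recommendation,
    assuming (for illustration) one recommendation per student."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(students * (1.0 - accuracy))


if __name__ == "__main__":
    # Student numbers taken from Ferreira's talk; 0.9 is the hypothetical accuracy.
    for students in (180_000, 650_000, 10_000_000):
        wrong = expected_wrong(students, 0.9)
        print(f"{students:>10,} students -> {wrong:>9,} faulty recommendations")
```

Since each student receives many recommendations rather than one, and since real accuracy is likely to be lower than 90%, the true number of faulty recommendations would be larger than this sketch suggests.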

A further problem identified by Stephen Finlay is the tendency of people in predictive analytics to confuse correlation and causation. Certain students may have learnt maths best between 8.40 and 9.13, but it does not follow that they learnt it best because they studied at that time. If strong correlations do not involve causality, then actionable insights (such as individualised course design) can be no more than an informed gamble.

Knewton’s claim that they know how every student learns best is marketing hyperbole and should set alarm bells ringing. When it comes to language learning, we simply do not know how students learn (we do not have any generally accepted theory of second language acquisition), let alone how they learn best. More data won’t help our theories of learning! Ferreira’s claim that, with Knewton, “every kid gets a perfectly optimized textbook, except it’s also video and other rich media dynamically generated in real time” is equally preposterous, not least since the content of the textbook will be at least as significant as the way in which it is ‘optimized’. And, as we all know, textbooks have their faults.

Cui bono? Perhaps huge data and predictive analytics will benefit students; perhaps not. We will need to wait and find out. But Stephen Finlay reminds us that “in gold rushes (and internet booms and the exciting world of Big Data) the people who sell the tools make a lot of money. Far more strike it rich selling picks and shovels to prospectors than do the prospectors. Likewise, there is a lot of money to be made selling Big Data solutions. Whether the buyer actually gets any benefit from them is not the primary concern of the sales people.” (p.16/17) Which is, perhaps, one of the reasons that some sales people talk so fast.


Duolingo testing

Posted: September 6, 2014 in testing

After a break of two years, I recently returned to Duolingo in an attempt to build my German vocabulary. The attempt lasted a week. A few small things had changed, but the essentials had not, and my amusement at translating sentences like ‘The duck eats oranges’, ‘A red dog wears white clothes’ or ‘The fly is important’ soon turned to boredom and irritation. There are better, free ways of building vocabulary in another language.

Whilst little is new in the learning experience of Duolingo, there are significant developments at the company. The first of these is a new funding round in which they raised a further $20 million, bringing total investment to close to $40 million. Duolingo now has more than 25 million users, half of whom are described as ‘active’, and, according to Luis von Ahn, the company’s founder, their ambition is to dominate the language learning market. Approaching their third anniversary, though, Duolingo will need, before long, to turn a profit or, at least, to break even. The original plan, to use the language data generated by users of the site to power a paying translation service, is beginning to bear fruit, with contracts with CNN and BuzzFeed. But Duolingo is going to need other income streams. This may well be part of the reason behind their decision to develop and launch their own test.

Duolingo’s marketing people, however, are trying to get another message across: “Every year, over 30 million job seekers and students around the world are forced to take a test to prove that they know English in order to apply for a job or school. For some, these tests can cost their family an entire month’s salary. And not only that, taking them typically requires traveling to distant examination facilities and waiting weeks for the results. We believe there should be a better way. This is why today I’m proud to announce the beta release of the Duolingo Test Center, which was created to give everyone equal access to jobs and educational opportunities. Now anyone can conveniently certify their English skills from home, on their mobile device, and for only $20. That’s 1/10th the cost of existing tests.” Talking the creative disruption talk, Duolingo wants to break into the “archaic” industry of language proficiency tests. Basically, then, they want to make the world a better place. I seem to have heard this kind of thing before.

The tests will cost $20. Gina Gotthilf, Duolingo’s head of marketing, explains the pricing strategy: “We came up with the smallest value that works for us and that a lot of people can pay.” Duolingo’s main markets are now the BRICS countries. In China, for example, 1.5 million people signed up with Duolingo in just one week in April of this year, according to @TECHINASIA. Besides China, Duolingo has expanded into India, Japan, Korea, Taiwan, Hong Kong, Vietnam and Indonesia this year. (Brazil already has 2.4 million users, and there are 1.5 million in Mexico.) That’s a lot of potential customers.

So, what do you get for your twenty bucks? Not a lot, is the short answer. The test lasts about 18 minutes. There are four sections, and adaptive software analyses the testee’s responses to determine the level of difficulty of subsequent questions. The first section requires users to select real English words from a list which includes invented words. The second is a short dictation, the third is a gapfill, and the fourth is a read-aloud task which is recorded and compared to a native-speaker norm. That’s it.
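For readers unfamiliar with adaptive testing, the general idea can be sketched as a simple ‘staircase’: a correct answer makes the next item harder, a wrong answer makes it easier. Duolingo has not published how its own software works, so everything here (the function names `next_difficulty` and `run_session`, the 1 to 10 difficulty scale, the one-step rule) is a hypothetical illustration of the principle, not a description of the Test Center.

```python
from typing import Iterable, List


def next_difficulty(current: int, was_correct: bool,
                    lowest: int = 1, highest: int = 10) -> int:
    """Move one step up after a correct answer, one step down after a
    wrong one, staying within the (assumed) difficulty scale."""
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_session(answers: Iterable[bool], start: int = 5) -> List[int]:
    """Return the difficulty level before each item and after the last,
    given a sequence of correct (True) / wrong (False) answers."""
    level = start
    history = [level]
    for was_correct in answers:
        level = next_difficulty(level, was_correct)
        history.append(level)
    return history


if __name__ == "__main__":
    # Two correct answers, one wrong, one correct.
    print(run_session([True, True, False, True]))
```

Real adaptive tests usually rely on statistical models of item difficulty rather than a fixed step, which is one reason why building a valid one takes testing expertise.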

Duolingo claims that the test scores correlate very well with TOEFL, but the claim is based on a single study by a University of Pittsburgh professor that was sponsored by Duolingo. Will further studies replicate the findings? I, for one, wouldn’t bet on it, but I won’t insult your intelligence by explaining my reasons. Test validity and reliability, then, remain to be proved, but even John Lehoczky, interim executive vice president of Carnegie Mellon University (Duolingo was developed by researchers from Carnegie Mellon’s computer science department), acknowledges that “at this point [the test] is not a fit vehicle for undergraduate admissions”.

Even more of a problem than validity and reliability, however, is the question of security. The test is delivered via the web or smartphone apps (Android and iOS). Testees have to provide photo ID and a photo taken on the device they are using. There are various rules (they must be alone, no headphones, etc.) and a human proctor reviews the test after it has been completed. This is unlikely to impress bodies like the British immigration authorities, which recently refused to recognise online TOEFL and TOEIC qualifications, after a BBC documentary revealed ‘systematic fraud’ in the taking of these tests.

There will always be a market of sorts for valueless qualifications (think, for example, of all the cheap TEFL courses that can be taken online), but to break into the monopoly of TOEFL and IELTS (and soon perhaps Pearson), Duolingo will need to deal with the issues of validity, reliability and security. If they don’t, few – if any – institutions of higher education will recognise the test. But if they do, they’ll need to spend more money: a team of applied linguists with expertise in testing would be a good start, and serious proctoring doesn’t come cheap. Will they be able to do this and keep the price down to $20?