NB This is an edited version of the original review.

Words & Monsters is a new vocabulary app that has caught my attention. There are three reasons for this. Firstly, because it’s free. Secondly, because I was led to believe (falsely, as it turns out) that two of the people behind it are Charles Browne and Brent Culligan, eminently respectable linguists, who were also behind the development of the New General Service List (NGSL), based on data from the Cambridge English Corpus. And thirdly, because a lot of thought, effort and investment have clearly gone into the gamification of Words & Monsters (WAM). It’s to the last of these that I’ll turn my attention first.

WAM teaches vocabulary in the context of a battle between a player’s avatar and a variety of monsters. If users can correctly match a set of target items to definitions or translations in the available time, they ‘defeat’ the monster and accumulate points. The more points you have, the higher you advance through a series of levels and ranks. There are bonuses for meeting daily and weekly goals, there are leaderboards, and trophies and medals can be won. In addition to points, players also win ‘crystals’ after successful battles, and these crystals can be used to buy accessories which change the appearance of the avatar and give the player added ‘powers’. I was never able to fully understand precisely how these ‘powers’ affected the number of points I could win in battle. It remained as baffling to me as the whole system of values with Pokemon cards, which is presumably a large part of the inspiration here. Perhaps others, more used to games like Pokemon, would find it all much more transparent.

The system of rewards is all rather complicated, but perhaps this doesn’t matter too much. In fact, it might be the case that working out how reward systems work is part of what motivates people to play games. But there is another aspect to this: the app’s developers refer in their bumf to research by Howard-Jones and Jay (2016), which suggests that when rewards are uncertain, more dopamine is released in the mid-brain and this may lead to reinforcement of learning, and, possibly, enhancement of declarative memory function. Possibly … but Howard-Jones and Jay point out that ‘the science required to inform the manipulation of reward schedules for educational benefit is very incomplete.’ So, WAM’s developers may be jumping the gun a little and overstating the applicability of the neuroscientific research, but they’re not alone in that!

If you don’t understand a reward system, it’s certain that the rewards are uncertain. But WAM takes this further in at least two ways. Firstly, when you win a ‘battle’, you have to click on a plain treasure bag to collect your crystals, and you don’t know whether you’ll get one, two, three or zero crystals. You are given a semblance of agency, but, essentially, the whole thing is random. Secondly, when you want to convert your crystals into accessories for your avatar, random selection determines which accessory you receive, even though, again, there is a semblance of agency. Different accessories have different power values. This extended use of what the developers call ‘the thrill of uncertain rewards’ is certainly interesting, but its effectiveness is another matter. My own reaction, after quite some time spent ‘studying’, to getting no crystals or an avatar accessory that I didn’t want was primarily frustration, rather than motivation to carry on. I have no idea how typical my reaction (more ‘treadmill’ than ‘thrill’) might be.

Unsurprisingly, for an app that has so obviously thought carefully about gamification, players are encouraged to interact with each other. As part of the early promotion, WAM is running, from 15 November to 19 December, a free ‘team challenge tournament’, allowing teams of up to 8 players to compete against each other. Ingeniously, it would appear to allow teams and players of varying levels of English to play together, with the app’s algorithms determining each individual’s level of lexical knowledge and therefore the items that will be presented / tested. Social interaction is known to be an important component of successful games (Dehghanzadeh et al., 2019), but for vocabulary apps there’s a huge challenge. In order to learn vocabulary from an app, learners need to put in time – on a regular basis. Team challenge tournaments may help with initial on-boarding of players, but, in the end, learning from a vocabulary app is inevitably and largely a solitary pursuit. Over time, social interaction is unlikely to be maintained, and it is, in any case, of a very limited nature. The other features of successful games – playful freedom and intrinsically motivating tasks (Driver, 2012) – are also absent from vocabulary apps. Playful freedom is mostly incompatible with points, badges and leaderboards. And flashcard tasks, however intrinsically motivating they may be at the outset, will always become repetitive after a while. In the end, what’s left, for those users who hang around long enough, is the reward system.

It’s also worth noting that this free challenge is of limited duration: it is a marketing device attempting to push you towards the non-free use of the app, once the initial promotion is over.

Gamified motivation tools are only of value, of course, if they motivate learners to spend their time doing things that are of clear learning value. To evaluate the learning potential of WAM, then, we need to look at the content (the ‘learning objects’) and the learning tasks that supposedly lead to acquisition of these items.

When you first use WAM, you need to play for about 20 minutes, at which point algorithms determine ‘how many words [you] know and [you can] see scores for English tests such as; TOEFL, TOEIC, IELTS, EIKEN, Kyotsu Shiken, CEFR, SAT and GRE’. The developers claim that these scores correlate pretty highly with actual test scores: ‘they are about as accurate as the tests themselves’, they say. If Browne and Culligan had been behind the app, I would have been tempted to accept the claim – with reservations: after all, it still allows for one item out of 5 to be wrongly identified. But, what is this CEFR test score that is referred to? There is no CEFR test, although many tests are correlated with CEFR. The two tools that I am most familiar with which allocate CEFR levels to individual words – Cambridge’s English Vocabulary Profile and Pearson’s Global Scale of English – often conflict in their results. I suspect that ‘CEFR’ was just thrown into the list of tests as an attempt to broaden the app’s appeal.

English target words are presented and practised with their translation ‘equivalents’ in Japanese. For the moment, Japanese is the only language available, which means the app is of little use to learners who don’t know any Japanese. It’s now well-known that bilingual pairings are more effective in deliberate language learning than using definitions in the same language as the target items. This becomes immediately apparent when, for example, a word like ‘something’ is defined (by WAM) as ‘a thing not known or specified’ and ‘anything’ as ‘a thing of whatever kind’. But although I’m in no position to judge the Japanese translations, there are reasons why I would want to check the spreadsheet before recommending the app. ‘Lady’ is defined as ‘polite word for a woman’; ‘missus’ is defined as ‘wife’; and ‘aye’ is defined as ‘yes’. All of these definitions are, at best, problematic; at worst, they are misleading. Are the Japanese translations more helpful? I wonder … Perhaps these are simply words that do not lend themselves to flashcard treatment?

Because I tested into the app at C1 level, I was not able to evaluate the selection of words at lower levels. A pity. Instead, I was presented with words like ‘ablution’, ‘abrade’, ‘anode’, and ‘auspice’. The app claims to be suitable ‘for both second-language learners and native speakers’. For lower levels of the former, this may be true (but without looking at the lexical spreadsheets, I can’t tell). But for higher levels, however much fun this may be for some people, it seems unlikely that you’ll learn very much of any value. Outside of words in, say, the top 8000 frequency band, it is practically impossible to differentiate the ‘surrender value’ of words in any meaningful way. Deliberate learning of vocabulary only makes sense with high frequency words that you have a chance of encountering elsewhere. You’d be better off reading, extensively, rather than learning random words from an app: words which (for reasons I’ll come on to) you probably won’t actually learn anyway.

With very few exceptions, the learning objects in WAM are single words, rather than phrases, even when the item is of little or no value outside its use in a phrase. ‘Betide’ is defined as ‘to happen to; befall’ but this doesn’t tell a learner much that is useful. It’s practically only ever used following ‘woe’ (but what does ‘woe’ mean?!). Learning items can be checked in the ‘study guide’, which will show that ‘betide’ typically follows ‘woe’, but unless you choose to refer to the study guide (and there’s no reason, in a case like this, that you would know that you need to check things out more fully), you’ll be none the wiser. In other words, checking the study guide is unlikely to betide you. ‘Wee’, as another example, is treated as two items: (1) meaning ‘very small’ as in ‘wee baby’, and (2) meaning ‘very early in the morning’ as in ‘in the wee hours’. For the latter, ‘wee’ can only collocate with ‘in the’ and ‘hours’, so it makes little sense to present it as a single word. This is also an example of how, in some cases, different meanings of particular words are treated as separate learning objects, even when the two meanings are very close and, in my view, are hardly worth learning separately. Examples include ‘czar’ and ‘assonance’. Sometimes, cognates are treated as separate learning objects (e.g. ‘adulterate’ and ‘adulteration’ or ‘dolor’ and ‘dolorous’); with other words (e.g. ‘effulgence’), only one grammatical form appears to be given. I could not begin to figure out any rationale behind any of this.

All in all, then, there are reasons to be a little skeptical about some of the content. Up to level B2 – which, in my view, is the highest level at which it makes sense to use vocabulary flashcards – it may be of value, so long as your first language is Japanese. But given the claim that it can help you prepare for the ‘CEFR test’, I have to wonder …

The learning tasks require players to match target items to translations / definitions (in both directions), with the target item sometimes in written form, sometimes spoken. Users do not, as far as I can tell, ever have to produce the target item: they only have to select. The learning relies on spaced repetition, but there is no generative effect (known to enhance memorisation). When I was experimenting, there were a few words that I did not know, but I was usually able to get the correct answer by eliminating the distractors (a choice of one from three gives players a reasonable chance of guessing correctly). WAM does not teach users how to produce words; its focus is on receptive knowledge (of a limited kind). I learn, for example, what a word like ‘aye’ or ‘missus’ kind of means, but I learn nothing about how to use it appropriately. Contrary to the claims in WAM’s bumf (that ‘all senses and dimensions of each word are fully acquired’), reading and listening comprehension speeds may be improved, but appropriate and accurate use of these words in speaking and writing is much less likely to follow. Does WAM really ‘strengthen and expand the foundation levels of cognition that support all higher level thinking’, as is claimed?
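To put a rough number on that ‘reasonable chance of guessing correctly’, here is a minimal sketch of the arithmetic. It assumes that every item is a one-from-three choice and that a player who doesn’t know the word picks at random among the options that are left; the function names and the 30-item session are my own inventions for illustration, not anything documented by WAM.

```python
from fractions import Fraction


def guess_probability(options: int, eliminated: int = 0) -> Fraction:
    """Chance of a correct random pick once some distractors are ruled out."""
    remaining = options - eliminated
    if eliminated < 0 or remaining < 1:
        raise ValueError("eliminated must be between 0 and options - 1")
    return Fraction(1, remaining)


def expected_correct_by_guessing(items: int, options: int, eliminated: int = 0) -> Fraction:
    """Expected number of correct answers if every item is answered by guessing."""
    return items * guess_probability(options, eliminated)


# Blind guess among three options.
print(guess_probability(3))                   # 1/3
# One distractor eliminated, two options left.
print(guess_probability(3, eliminated=1))     # 1/2
# A hypothetical 30-item session answered entirely by blind guessing.
print(expected_correct_by_guessing(30, 3))    # 10
```

On these assumptions, a player who knows nothing at all would still get about a third of the items right, and half of them if one distractor can be ruled out each time, which is worth bearing in mind when a selection task is taken as evidence that a word has been learnt.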

Perhaps it’s unfair to mention some of the more dubious claims of WAM’s promotional material, but here is a small selection, anyway: ‘WAM unleashes the full potential of natural motivation’. ‘WAM promotes Flow by carefully managing the ratio of unknown words. Your mind moves freely in the channel below frustration and above boredom’.

WAM is certainly an interesting project, but, like all the vocabulary apps I have ever looked at, there have to be trade-offs between optimal task design and what will fit on a mobile screen, between freedoms and flexibility for the user and the requirements of gamified points systems, between the amount of linguistic information that is desirable and the amount that spaced repetition can deal with, between attempting to make the app suitable for the greatest number of potential users and making it especially appropriate for particular kinds of users. Design considerations are always a mix of the pedagogical and the practical / commercial. And, of course, the financial. And, like most edtech products, the claims for its efficacy need to be treated with a bucket of salt.

References

Dehghanzadeh, H., Fardanesh, H., Hatami, J., Talaee, E. & Noroozi, O. (2019) Using gamification to support learning English as a second language: a systematic review, Computer Assisted Language Learning, DOI: 10.1080/09588221.2019.1648298

Driver, P. (2012) The Irony of Gamification. In English Digital Magazine 3, British Council Portugal, pp. 21 – 24 http://digitaldebris.info/digital-debris/2011/12/31/the-irony-of-gamification-written-for-ied-magazine.html

Howard-Jones, P. & Jay, T. (2016) Reward, learning and games. Current Opinion in Behavioral Sciences, 10: 65 – 72

There are a number of reasons why we sometimes need to describe a person’s language competence using a single number. Most of these are connected to the need for a shorthand to differentiate people, in summative testing or in job selection, for example. Numerical (or grade) allocation of this kind is so common (and especially in times when accountability is greatly valued) that it is easy to believe that this number is an objective description of a concrete entity, rather than a shorthand description of an abstract concept. In the process, the abstract concept (language competence) becomes reified and there is a tendency to stop thinking about what it actually is.

Language is messy. It’s a complex, adaptive system of communication which has a fundamentally social function. As Diane Larsen-Freeman and others have argued, ‘patterns of use strongly affect how language is acquired, is used, and changes. These processes are not independent of one another but are facets of the same complex adaptive system. […] The system consists of multiple agents (the speakers in the speech community) interacting with one another [and] the structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms.’

As such, competence in language use is difficult to measure. There are ways of capturing some of it: think of the pages and pages of competency statements in the Common European Framework. But there has always been something deeply unsatisfactory about documents of this kind. How, for example, are we supposed to differentiate, exactly and objectively, between, say, ‘can participate fully in an interview’ (C1) and ‘can carry out an effective, fluent interview’ (B2)? The short answer is that we can’t. There are too many of these descriptors anyway and, even if we did attempt to use such a detailed tool to describe language competence, we would still be left with a very incomplete picture. There is at least one whole book devoted to attempts to test the untestable in language education (edited by Amos Paran and Lies Sercu, Multilingual Matters, 2010).

So, here is another reason why we are tempted to use shorthand numerical descriptors (such as A1, A2, B1, etc.) to describe something which is very complex and abstract (‘overall language competence’) and to reify this abstraction in the process. From there, it is a very short step to making things even more numerical, more scientific-sounding. Number-creep in recent years has brought us the Pearson Global Scale of English which can place you at a precise point on a scale from 10 to 90. Not to be outdone, Cambridge English Language Assessment now has a scale that runs from 80 points to 230, although Cambridge does, at least, allocate individual scores for four language skills.

As the title of this post suggests (in its reference to Stephen Jay Gould’s The Mismeasure of Man), I am suggesting that there are parallels between attempts to measure language competence and the sad history of attempts to measure ‘general intelligence’. Both are guilty of the twin fallacies of reification and ranking – the ordering of complex information as a gradual ascending scale. These conceptual fallacies then lead us, through the way that they push us to think about language, into making further conceptual errors about language learning. We start to confuse language testing with the ways that language learning can be structured.

We begin to granularise language. We move inexorably away from difficult-to-measure hazy notions of language skills towards what, on the surface at least, seem more readily measurable entities: words and structures. We allocate to them numerical values on our testing scales, so that an individual word can be deemed to be higher or lower on the scale than another word. And then we have a syllabus, a synthetic syllabus, that lends itself to digital delivery and adaptive manipulation. We find ourselves in a situation where materials writers for Pearson, writing for a particular ‘level’, are only allowed to use vocabulary items and grammatical structures that correspond to that ‘level’. We find ourselves, in short, in a situation where the acquisition of a complex and messy system is described as a linear, additive process. Here’s an example from the Pearson website: ‘If you score 29 on the scale, you should be able to identify and order common food and drink from a menu; at 62, you should be able to write a structured review of a film, book or play. And because the GSE is so granular in nature, you can conquer smaller steps more often; and you are more likely to stay motivated as you work towards your goal.’ It’s a nonsense, a nonsense that is dictated by the needs of testing and adaptive software, but the sciency-sounding numbers help to hide the conceptual fallacies that lie beneath.

Perhaps, though, this doesn’t matter too much for most language learners. In the early stages of language learning (where most language learners are to be found), there are countless millions of people who don’t seem to mind the granularised programmes of Duolingo or Rosetta Stone, or the Grammar McNuggets of coursebooks. In these early stages, anything seems to be better than nothing, and the testing is relatively low-stakes. But as a learner’s interlanguage becomes more complex, and as the language she needs to acquire becomes more complex, attempts to granularise it and to present it in a linearly additive way become more problematic. It is for this reason, I suspect, that the appeal of granularised syllabuses declines so rapidly the more progress a learner makes. It comes as no surprise that, the further up the scale you get, the more that both teachers and learners want to get away from pre-determined syllabuses in coursebooks and software.

Adaptive language learning software is continuing to gain traction in the early stages of learning, in the initial acquisition of basic vocabulary and structures and in coming to grips with a new phonological system. It will almost certainly gain even more. But the challenge for the developers and publishers will be to find ways of making adaptive learning work for more advanced learners. Can it be done? Or will the mismeasure of language make it impossible?

(This post was originally published at eltjam.)

We now have young learners and very young learners, learner differences and learner profiles, learning styles, learner training, learner independence and autonomy, learning technologies, life-long learning, learning management systems, virtual learning environments, learning outcomes, learning analytics and adaptive learning. Much, but not perhaps all, of this is to the good, but it’s easy to forget that it wasn’t always like this.

The rise in the use of the terms ‘learner’ and ‘learning’ can be seen in policy documents, educational research and everyday speech, and it really got going in the mid 1980s[1]. Duncan Hunter and Richard Smith[2] have identified a similar trend in ELT after analysing a corpus of articles from the English Language Teaching Journal. They found that ‘learner’ had risen to near the top of the key-word pile in the mid 1980s, but had been practically invisible 15 years previously. Accompanying this rise has been a relative decline of words like ‘teacher’, ‘teaching’, ‘pupil’ and, even, ‘education’. Gert Biesta has described this shift in discourse as a ‘new language of learning’ and the ‘learnification of education’.

It’s not hard to see the positive side of this change in focus towards the ‘learner’ and away from the syllabus, the teachers and the institution in which the ‘learning’ takes place. We can, perhaps, be proud of our preference for learner-centred approaches over teacher-centred ones. We can see something liberating (for our students) in the change of language that we use. But, as Bingham and Biesta[3] have pointed out, this gain is also a loss.

The language of ‘learners’ and ‘learning’ focusses our attention on process – how something is learnt. This was a much-needed corrective after an uninterrupted history of focussing on end-products, but the corollary is that it has become very easy to forget not only about the content of language learning, but also its purposes and the social relationships through which it takes place.

There has been some recent debate about the content of language learning, most notably in the work of the English as a Lingua Franca scholars. But there has been much more attention paid to the measurement of the learners’ acquisition of that content (through the use of tools like the Pearson Global Scale of English). There is a growing focus on ‘granularized’ content – lists of words and structures, and to a lesser extent language skills, that can be easily measured. It looks as though other things that we might want our students to be learning – critical thinking skills and intercultural competence, for example – are being sidelined.

More significant is the neglect of the purposes of language learning. The discourse of ELT is massively dominated by the paying sector of private language schools and semi-privatised universities. In these contexts, questions of purpose are not, perhaps, terribly important, as the whole point of the enterprise can be assumed to be primarily instrumental. But the vast majority of English language learners around the world are studying in state-funded institutions as part of a broader educational programme, which is as much social and political as it is to do with ‘learning’. The ultimate point of English lessons in these contexts is usually stated in much broader terms. The Council of Europe’s Common European Framework of Reference, for example, states that the ultimate point of the document is to facilitate better intercultural understanding. It is very easy to forget this when we are caught up in the business of levels and scales and measuring learning outcomes.

Lastly, a focus on ‘learners’ and ‘learning’ distracts attention away from the social roles that are enacted in classrooms. 25 years ago, Henry Widdowson[4] pointed out that there are two quite different kinds of role. The first of these is concerned with occupation (student / pupil vs teacher / master / mistress) and is identifying. The second (the learning role) is actually incidental and cannot be guaranteed. He reminds us that the success of the language learning / teaching enterprise depends on ‘recognizing and resolving the difficulties inherent in the dual functioning of roles in the classroom encounter’[5]. Again, this may not matter too much in the private sector, but, elsewhere, any attempt to tackle the learning / teaching conundrum through an exclusive focus on learning processes is unlikely to succeed.

The ‘learnification’ of education has been accompanied by two related developments: the casting of language learners as consumers of a ‘learning experience’ and the rise of digital technologies in education. For reasons of space, I will limit myself to commenting on the second of these[6]. Research by Geir Haugsbakk and Yngve Nordkvelle[7] has documented a clear and critical link between the new ‘language of learning’ and the rhetoric of edtech advocacy. These researchers suggest that these discourses are mutually reinforcing, that both contribute to the casting of the ‘learner’ as a consumer, and that the coupling of learning and digital tools is often purely rhetorical.

One of the net results of ‘learnification’ is the transformation of education into a technical or technological problem to be solved. It suggests, wrongly, that approaches to education can be derived purely from theories of learning. By adopting an ahistorical and apolitical standpoint, it hides ‘the complex nexus of political and economic power and resources that lies behind a considerable amount of curriculum organization and selection’[8]. The very real danger, as Biesta[9] has observed, is that ‘if we fail to engage with the question of good education head-on – there is a real risk that data, statistics and league tables will do the decision-making for us’.

[1] 2004 Biesta, G.J.J. ‘Against learning. Reclaiming a language for education in an age of learning’ Nordisk Pedagogik 24 (1), 70-82 & 2010 Biesta, G.J.J. Good Education in an Age of Measurement (Boulder, Colorado: Paradigm Publishers)

[2] 2012 Hunter, D. & R. Smith ‘Unpackaging the past: ‘CLT’ through ELTJ keywords’ ELTJ 66/4 430-439

[3] 2010 Bingham, C. & Biesta, G.J.J. Jacques Rancière: Education, Truth, Emancipation (London: Continuum) 134

[4] 1990 Widdowson, H.G. Aspects of Language Teaching (Oxford: OUP) 182 ff

[5] 1987 Widdowson, H.G. ‘The roles of teacher and learner’ ELTJ 41/2

[6] A compelling account of the way that students have become ‘consumers’ can be found in 2013 Williams, J. Consuming Higher Education (London: Bloomsbury)

[7] 2007 Haugsbakk, G. & Nordkvelle, Y. ‘The Rhetoric of ICT and the New Language of Learning: a critical analysis of the use of ICT in the curricular field’ European Educational Research Journal 6/1 1 – 12

[8] 2004 Apple, M. W. Ideology and Curriculum 3rd edition (New York: Routledge) 28

[9] 2010 Biesta, G.J.J. Good Education in an Age of Measurement (Boulder, Colorado: Paradigm Publishers) 27


Pearson’s ‘Efficacy’ initiative is a series of ‘commitments designed to measure and increase the company’s impact on learning outcomes around the world’. The company’s dedicated website offers two glossy brochures with a wide range of interesting articles, a good questionnaire tool that can be used by anyone to measure the efficacy of their own educational products or services, as well as an excellent selection of links to other articles, some of which are critical of the initiative. These include Michael Feldstein’s long blog post ‘Can Pearson Solve the Rubric’s Cube?’, which should be a first port of call for anyone wanting to understand better what is going on.

What does it all boil down to? The preface to Pearson’s ‘Asking More: the Path to Efficacy’ by CEO John Fallon provides a succinct introduction. Efficacy in education, says Fallon, is ‘making a measurable impact on someone’s life through learning’. ‘Measurable’ is the key word, because, as Fallon continues, ‘it is increasingly possible to determine what works and what doesn’t in education, just as in healthcare.’ We need ‘a relentless focus’ on ‘the learning outcomes we deliver’ because it is these outcomes that can be measured in ‘a systematic, evidence-based fashion’. Measurement, of course, is all the easier when education is delivered online, ‘real-time learner data’ can be captured, and the power of analytics can be deployed.

Pearson are very clearly aligning themselves with recent moves towards a more evidence-based education. In the US, Obama’s Race to the Top is one manifestation of this shift. Britain (with, for example, the Education Endowment Foundation) and France (with its Fonds d’Expérimentation pour la Jeunesse) are both going in the same direction. Efficacy is all about evidence-based practice.

Both the terms ‘efficacy’ and ‘evidence-based practice’ come originally from healthcare. Fallon references this connection in the quote two paragraphs above. In the UK last year, Ben Goldacre (medical doctor, author of ‘Bad Science’ and a relentless campaigner against pseudo-science) was commissioned by the UK government to write a paper entitled ‘Building Evidence into Education’. In this, he argued for the need to introduce randomized controlled trials into education in a similar way to their use in medicine.

As Fallon observed in the preface to the Pearson ‘Efficacy’ brochure, this all sounds like ‘common sense’. But, as Ben Goldacre discovered, things are not so straightforward in education. An excellent article in The Guardian outlined some of the problems in Goldacre’s paper.

With regard to ELT, Pearson’s ‘Efficacy’ initiative will stand or fall with the validity of their Global Scale of English, discussed in my March post ‘Knowledge Graphs’. However, there are a number of other considerations that make the whole evidence-based / efficacy business rather less common-sensical than might appear at first glance.

  • The purpose of English language teaching and learning (at least, in compulsory education) is rather more than simply the mastery of grammatical and lexical systems, or the development of particular language skills. Some of these other purposes (e.g. the development of intercultural competence or the acquisition of certain 21st century skills, such as creativity) continue to be debated. There is very little consensus about the details of what these purposes (or outcomes) might be, or how they can be defined. Without consensus about these purposes / outcomes, it is not possible to measure them.
  • Even if we were able to reach a clear consensus, many of these outcomes do not easily lend themselves to measurement, and even less to low-cost measurement.
  • Although we clearly need to know what ‘works’ and what ‘doesn’t work’ in language teaching, there is a problem in assigning numerical values. As the EduThink blog observes, ‘the assignation of numerical values is contestable, problematic and complex. As teachers and researchers we should be engaging with the complexity [of education] rather than the reductive simplicities of [assigning numerical values]’.
  • Evidence-based medicine has resulted in unquestionable progress, but it is not without its fierce critics. A short summary of the criticisms can be found here. It would be extremely risky to assume that a contested research procedure from one discipline can be uncritically applied to another.
  • Kathleen Graves, in her plenary at IATEFL 2014, ‘The Efficiency of Inefficiency’, explicitly linked health care and language teaching. She described a hospital where patient care was as much about human relationships as it was about medical treatment, an aspect of the hospital that went unnoticed by efficiency experts, since this could not be measured. See this blog for a summary of her talk.

These issues need to be discussed much further before we get swept away by the evidence-based bandwagon. If they are not, the real danger is that, as John Fallon cautions, we end up counting things that don’t really count, and we don’t count the things that really do count. Somehow, I doubt that an instrument like the Global Scale of English will do the trick.

In a recent interesting post on eltjam, Cleve Miller wrote the following:

Knewton asks its publishing partners to organize their courses into a “knowledge graph” where content is mapped to an analyzable form that consists of the smallest meaningful chunks (called “concepts”), organized as prerequisites to specific learning goals. You can see here the influence of general learning theory and not SLA/ELT, but let’s not concern ourselves with nomenclature and just call their “knowledge graph” an “acquisition graph”, and call “concepts” anything else at all, say…“items”. Basically our acquisition graph could be something like the CEFR, and the items are the specifications in a completed English Profile project that detail the grammar, lexis, and functions necessary for each of the can-do’s in the CEFR. Now, even though this is a somewhat plausible scenario, it opens Knewton up to several objections, foremost the degree of granularity and linearity.

In this post, Cleve acknowledges that, for the time being, adaptive learning may be best suited to ‘certain self-study material, some online homework, and exam prep – anywhere the language is fairly defined and the content more amenable to algorithmic micro-adaptation.’ I would agree, but its value / usefulness will depend on getting the knowledge graph right.
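For readers who find it easier to picture a knowledge graph as a data structure, here is a minimal sketch in Python of what Cleve describes: small ‘items’ organized as prerequisites to a learning goal. The item names and the prerequisite links are invented purely for illustration. They come from neither Knewton nor the CEFR nor English Profile, and nothing here should be taken as how Knewton actually does it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical items: each one maps to the set of items assumed to come before it.
# Both the names and the links are assumptions made up for this sketch.
prerequisites = {
    "present simple: be": set(),
    "present simple: other verbs": {"present simple: be"},
    "past simple": {"present simple: other verbs"},
    "present perfect": {"past simple"},
    "can-do: describe past experiences": {"past simple", "present perfect"},
}


def study_order(graph):
    """Return one ordering of the items in which prerequisites always come first."""
    return list(TopologicalSorter(graph).static_order())


def next_items(graph, mastered):
    """Return the items not yet mastered whose prerequisites have all been mastered."""
    return sorted(
        item
        for item, prereqs in graph.items()
        if item not in mastered and prereqs <= mastered
    )


if __name__ == "__main__":
    print(study_order(prerequisites))
    print(next_items(prerequisites, {"present simple: be"}))
```

Even a toy like this only works if every item has a clear boundary and a clear set of prerequisites, which is exactly where the objections about granularity and linearity come in.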

Which knowledge graph, then? Cleve suggests that it could be something like the CEFR, but it couldn’t be the CEFR itself because it is, quite simply, too vague. This was recognized by Pearson when they developed their Global Scale of English (GSE), an instrument which, they claim, can provide ‘for more granular and detailed measurements of learners’ levels than is possible with the CEFR itself, with its limited number of wide levels’. This Global Scale of English will serve as ‘the metric underlying all Pearson English learning, teaching and assessment products’, including, therefore, the adaptive products under development.

‘As part of the GSE project, Pearson is creating an associated set of Pearson Syllabuses […]. These will help to link instructional content with assessments and to create a reference for authoring, instruction and testing.’ These syllabuses will contain grammar and vocabulary inventories which ‘will be expressed in the form of can-do statements with suggested sample exponents rather than as the prescriptive lists found in more traditional syllabuses.’ I haven’t been able to get my hands on one of these syllabuses yet: perhaps someone could help me out?

Informal feedback from writer colleagues working for Pearson suggests that, in practice, these inventories are much more prescriptive than Pearson claim, but this is hardly surprising, as the value of an inventory is precisely its more-or-less finite nature.

Until I see more, I will have to limit my observations to two documents in the public domain which are the closest we have to what might become knowledge graphs. The first of these is the British Council / EAQUALS Core Inventory for General English. Scott Thornbury, back in 2011, very clearly set out the problems with this document and, to my knowledge, the reservations he expressed have not yet been adequately answered. To be fair, this inventory was never meant to be used as a knowledge graph: ‘It is a description, not a prescription’, wrote the author (North, 2010). But presumably a knowledge graph would look much like this, and it would have the same problems. The second place where we can find what a knowledge graph might look like is English Profile and this is mentioned by Cleve. Would English Profile work any better? Possibly not. Michael Swan’s critique of English Profile (ELTJ 68/1 January 2014 pp.89-96) asks some big questions that have yet, to my knowledge, to be answered.

Knewton’s Sally Searby has said that, for ELT, knowledge graphing needs to be ‘much more nuanced’. Her comment suggests a belief that knowledge graphing can be much more nuanced, but this is open to debate. Michael Swan quotes Prodeau, Lopez and Véronique (2012): ‘the sum of pragmatic and linguistic skills needed to achieve communicative success at each level makes it difficult, if not impossible, to find lexical and grammatical means that would characterize only one level’. He observes that ‘the problem may, in fact, simply not be soluble’.

So, what kind of knowledge graph are we likely to see? My best bet is that it would look a bit like a Headway syllabus.