Posts Tagged ‘behaviourism’

From time to time, I have mentioned Programmed Learning (or Programmed Instruction) in this blog (here and here, for example). It felt like time to go into a little more detail about what Programmed Instruction was (and is) and why I think it’s important to know about it.

A brief description

The basic idea behind Programmed Instruction was that subject matter could be broken down into very small parts, which could be organised into an optimal path for presentation to students. Students worked, at their own speed, through a series of micro-tasks, building their mastery of each nugget of learning that was presented, not progressing from one to the next until they had demonstrated they could respond accurately to the previous task.

There were two main types of Programmed Instruction: linear programming and branching programming. In the former, every student would follow the same path, the same sequence of frames. This could be used in classrooms for whole-class instruction and I tracked down a book (illustrated below) called ‘Programmed English Course Student’s Book 1’ (Hill, 1966), which was an attempt to transfer the ideas behind Programmed Instruction to a zero-tech, class environment. This is very similar in approach to the material I had to use when working at an Inlingua school in the 1980s.

Programmed English Course

Comparatives strip

An example of how self-paced programming worked is illustrated here, with a section on comparatives.

With branching programming, ‘extra frames (or branches) are provided for students who do not get the correct answer’ (Kay et al., 1968: 19). This was only suitable for self-study, but it was clearly preferable, as it allowed for self-pacing and some personalization. The material could be presented in books (which meant that students had to flick back and forth in their books) or with special ‘teaching machines’, but the latter were preferred.

In the words of an early enthusiast, Programmed Instruction was essentially ‘a device to control a student’s behaviour and help him to learn without the supervision of a teacher’ (Kay et al.,1968: 58). The approach was inspired by the work of Skinner and it was first used as part of a university course in behavioural psychology taught by Skinner at Harvard University in 1957. It moved into secondary schools for teaching mathematics in 1959 (Saettler, 2004: 297).

Enthusiasm and uptake

The parallels between current enthusiasm for the power of digital technology to transform education and the excitement about Programmed Instruction and teaching machines in the 1960s are very striking (McDonald et al., 2005: 90). In 1967, it was reported that ‘we are today on the verge of what promises to be a revolution in education’ (Goodman, 1967: 3) and that ‘tremors of excitement ran through professional journals and conferences and department meetings from coast to coast’ (Kennedy, 1967: 871). The following year, another commentator referred to the way that the field of education had been stirred ‘with an almost Messianic promise of a breakthrough’ (Ornstein, 1968: 401). Programmed instruction was also seen as an exciting business opportunity: ‘an entire industry is just coming into being and significant sales and profits should not be too long in coming’, wrote one hopeful financial analyst as early as 1961 (Kozlowski, 1967: 47).

The new technology seemed to offer a solution to the ‘problems of education’. Media reports in 1963 in Germany, for example, discussed a shortage of teachers, large classes and inadequate learning progress … ‘an ‘urgent pedagogical emergency’ that traditional teaching methods could not resolve’ (Hof, 2018). Individualised learning, through Programmed Instruction, would equalise educational opportunity and if you weren’t part of it, you would be left behind. In the US, two billion dollars were spent on educational technology by the government in the decade following the passing of the National Defense Education Act, and this was added to by grants from private foundations. As a result, ‘the production of teaching machines began to flourish, accompanied by the marketing of numerous ‘teaching units’ stamped into punch cards as well as less expensive didactic programme books and index cards. The market grew dramatically in a short time’ (Hof, 2018).

In the field of language learning, however, enthusiasm was more muted. In the year in which he completed his doctoral studies[1], the eminent linguist, Bernard Spolsky noted that ‘little use is actually being made of the new technique’ (Spolsky, 1966). A year later, a survey of over 600 foreign language teachers at US colleges and universities reported that only about 10% of them had programmed materials in their departments (Valdman, 1968: 1). In most of these cases, the materials ‘were being tried out on an experimental basis under the direction of their developers’. And two years after that, it was reported that ‘programming has not yet been used to any very great extent in language teaching, so there is no substantial body of experience from which to draw detailed, water-tight conclusions’ (Howatt, 1969: 164).

By the early 1970s, Programmed Instruction was already beginning to seem like yesterday’s technology, even though the principles behind it are still very much alive today (Thornbury (2017) refers to Duolingo as ‘Programmed Instruction’). It would be nice to think that language teachers of the day were more sceptical than, for example, their counterparts teaching mathematics. It would be nice to think that, like Spolsky, they had taken on board Chomsky’s (1959) demolition of Skinner. But the widespread popularity of Audiolingual methods suggests otherwise. Audiolingualism, based essentially on the same Skinnerian principles as Programmed Instruction, needed less outlay on technology. The machines (a slide projector and a record or tape player) were cheaper than the teaching machines, could be used for other purposes and did not become obsolete so quickly. The method also lent itself more readily to established school systems (i.e. whole-class teaching) and the skills sets of teachers of the day. Significantly, too, there was relatively little investment in Programmed Instruction for language teaching (compared to, say, mathematics), since this was a smallish and more localized market. There was no global market for English language learning as there is today.

Lessons to be learned

1 Shaping attitudes

It was not hard to persuade some educational authorities of the value of Programmed Instruction. As discussed above, it offered a solution to the problem of ‘the chronic shortage of adequately trained and competent teachers at all levels in our schools, colleges and universities’, wrote Goodman (1967: 3), who added, there is growing realisation of the need to give special individual attention to handicapped children and to those apparently or actually retarded’. The new teaching machines ‘could simulate the human teacher and carry out at least some of his functions quite efficiently’ (Goodman, 1967: 4). This wasn’t quite the same thing as saying that the machines could replace teachers, although some might have hoped for this. The official line was more often that the machines could ‘be used as devices, actively co-operating with the human teacher as adaptive systems and not just merely as aids’ (Goodman, 1967: 37). But this more nuanced message did not always get through, and ‘the Press soon stated that robots would replace teachers and conjured up pictures of classrooms of students with little iron men in front of them’ (Kay et al., 1968: 161).

For teachers, though, it was one thing to be told that the machines would free their time to perform more meaningful tasks, but harder to believe when this was accompanied by a ‘rhetoric of the instructional inadequacies of the teacher’ (McDonald, et al., 2005: 88). Many teachers felt threatened. They ‘reacted against the ‘unfeeling machine’ as a poor substitute for the warm, responsive environment provided by a real, live teacher. Others have seemed to take it more personally, viewing the advent of programmed instruction as the end of their professional career as teachers. To these, even the mention of programmed instruction produces a momentary look of panic followed by the appearance of determination to stave off the ominous onslaught somehow’ (Tucker, 1972: 63).

Some of those who were pushing for Programmed Instruction had a bigger agenda, with their sights set firmly on broader school reform made possible through technology (Hof, 2018). Individualised learning and Programmed Instruction were not just ends in themselves: they were ways of facilitating bigger changes. The trouble was that teachers were necessary for Programmed Instruction to work. On the practical level, it became apparent that a blend of teaching machines and classroom teaching was more effective than the machines alone (Saettler, 2004: 299). But the teachers’ attitudes were crucial: a research study involving over 6000 students of Spanish showed that ‘the more enthusiastic the teacher was about programmed instruction, the better the work the students did, even though they worked independently’ (Saettler, 2004: 299). In other researched cases, too, ‘teacher attitudes proved to be a critical factor in the success of programmed instruction’ (Saettler, 2004: 301).

2 Returns on investment

Pricing a hyped edtech product is a delicate matter. Vendors need to see a relatively quick return on their investment, before a newer technology knocks them out of the market. Developments in computing were fast in the late 1960s, and the first commercially successful personal computer, the Altair 8800, appeared in 1974. But too high a price carried obvious risks. In 1967, the cheapest teaching machine in the UK, the Tutorpack (from Packham Research Ltd), cost £7 12s (equivalent to about £126 today), but machines like these were disparagingly referred to as ‘page-turners’ (Higgins, 1983: 4). A higher-end linear programming machine cost twice this amount. Branching programme machines cost a lot more. The Mark II AutoTutor (from USI Great Britain Limited), for example, cost £31 per month (equivalent to £558), with eight reels of programmes thrown in (Goodman, 1967: 26). A lower-end branching machine, the Grundytutor, could be bought for £ 230 (worth about £4140 today).

Teaching machines (from Goodman)AutoTutor Mk II (from Goodman)

This was serious money, and any institution splashing out on teaching machines needed to be confident that they would be well used for a long period of time (Nordberg, 1965). The programmes (the software) were specific to individual machines and the content could not be updated easily. At the same time, other technological developments (cine projectors, tape recorders, record players) were arriving in classrooms, and schools found themselves having to pay for technical assistance and maintenance. The average teacher was ‘unable to avail himself fully of existing aids because, to put it bluntly, he is expected to teach for too many hours a day and simply has not the time, with all the administrative chores he is expected to perform, either to maintain equipment, to experiment with it, let alone keeping up with developments in his own and wider fields. The advent of teaching machines which can free the teacher to fulfil his role as an educator will intensify and not diminish the problem’ (Goodman, 1967: 44). Teaching machines, in short, were ‘oversold and underused’ (Cuban, 2001).

3 Research and theory

Looking back twenty years later, B. F. Skinner conceded that ‘the machines were crude, [and] the programs were untested’ (Skinner, 1986: 105). The documentary record suggests that the second part of this statement is not entirely true. Herrick (1966: 695) reported that ‘an overwhelming amount of research time has been invested in attempts to determine the relative merits of programmed instruction when compared to ‘traditional’ or ‘conventional’ methods of instruction. The results have been almost equally overwhelming in showing no significant differences’. In 1968, Kay et al (1968: 96) noted that ‘there has been a definite effort to examine programmed instruction’. A later meta-analysis of research in secondary education (Kulik et al.: 1982) confirmed that ‘Programmed Instruction did not typically raise student achievement […] nor did it make students feel more positively about the subjects they were studying’.

It was not, therefore, the case that research was not being done. It was that many people were preferring not to look at it. The same holds true for theoretical critiques. In relation to language learning, Spolsky (1966) referred to Chomsky’s (1959) rebuttal of Skinner’s arguments, adding that ‘there should be no need to rehearse these inadequacies, but as some psychologists and even applied linguists appear to ignore their existence it might be as well to remind readers of a few’. Programmed Instruction might have had a limited role to play in language learning, but vendors’ claims went further than that and some people believed them: ‘Rather than addressing themselves to limited and carefully specified FL tasks – for example the teaching of spelling, the teaching of grammatical concepts, training in pronunciation, the acquisition of limited proficiency within a restricted number of vocabulary items and grammatical features – most programmers aimed at self-sufficient courses designed to lead to near-native speaking proficiency’ (Valdman, 1968: 2).

4 Content

When learning is conceptualised as purely the acquisition of knowledge, technological optimists tend to believe that machines can convey it more effectively and more efficiently than teachers (Hof, 2018). The corollary of this is the belief that, if you get the materials right (plus the order in which they are presented and appropriate feedback), you can ‘to a great extent control and engineer the quality and quantity of learning’ (Post, 1972: 14). Learning, in other words, becomes an engineering problem, and technology is its solution.

One of the problems was that technology vendors were, first and foremost, technology specialists. Content was almost an afterthought. Materials writers needed to be familiar with the technology and, if not, they were unlikely to be employed. Writers needed to believe in the potential of the technology, so those familiar with current theory and research would clearly not fit in. The result was unsurprising. Kennedy (1967: 872) reported that ‘there are hundreds of programs now available. Many more will be published in the next few years. Watch for them. Examine them critically. They are not all of high quality’. He was being polite.

5 Motivation

As is usually the case with new technologies, there was a positive novelty effect with Programmed Instruction. And, as is always the case, the novelty effect wears off: ‘students quickly tired of, and eventually came to dislike, programmed instruction’ (McDonald et al.: 89). It could not really have been otherwise: ‘human learning and intrinsic motivation are optimized when persons experience a sense of autonomy, competence, and relatedness in their activity. Self-determination theorists have also studied factors that tend to occlude healthy functioning and motivation, including, among others, controlling environments, rewards contingent on task performance, the lack of secure connection and care by teachers, and situations that do not promote curiosity and challenge’ (McDonald et al.: 93). The demotivating experience of using these machines was particularly acute with younger and ‘less able’ students, as was noted at the time (Valdman, 1968: 9).

The unlearned lessons

I hope that you’ll now understand why I think the history of Programmed Instruction is so relevant to us today. In the words of my favourite Yogi-ism, it’s like deja vu all over again. I have quoted repeatedly from the article by McDonald et al (2005) and I would highly recommend it – available here. Hopefully, too, Audrey Watters’ forthcoming book, ‘Teaching Machines’, will appear before too long, and she will, no doubt, have much more of interest to say on this topic.

References

Chomsky N. 1959. ‘Review of Skinner’s Verbal Behavior’. Language, 35:26–58.

Cuban, L. 2001. Oversold & Underused: Computers in the Classroom. (Cambridge, MA: Harvard University Press)

Goodman, R. 1967. Programmed Learning and Teaching Machines 3rd edition. (London: English Universities Press)

Herrick, M. 1966. ‘Programmed Instruction: A critical appraisal’ The American Biology Teacher, 28 (9), 695 -698

Higgins, J. 1983. ‘Can computers teach?’ CALICO Journal, 1 (2)

Hill, L. A. 1966. Programmed English Course Student’s Book 1. (Oxford: Oxford University Press)

Hof, B. 2018. ‘From Harvard via Moscow to West Berlin: educational technology, programmed instruction and the commercialisation of learning after 1957’ History of Education, 47:4, 445-465

Howatt, A. P. R. 1969. Programmed Learning and the Language Teacher. (London: Longmans)

Kay, H., Dodd, B. & Sime, M. 1968. Teaching Machines and Programmed Instruction. (Harmondsworth: Penguin)

Kennedy, R.H. 1967. ‘Before using Programmed Instruction’ The English Journal, 56 (6), 871 – 873

Kozlowski, T. 1961. ‘Programmed Teaching’ Financial Analysts Journal, 17 / 6, 47 – 54

Kulik, C.-L., Schwalb, B. & Kulik, J. 1982. ‘Programmed Instruction in Secondary Education: A Meta-analysis of Evaluation Findings’ Journal of Educational Research, 75: 133 – 138

McDonald, J. K., Yanchar, S. C. & Osguthorpe, R.T. 2005. ‘Learning from Programmed Instruction: Examining Implications for Modern Instructional Technology’ Educational Technology Research and Development, 53 / 2, 84 – 98

Nordberg, R. B. 1965. Teaching machines-six dangers and one advantage. In J. S. Roucek (Ed.), Programmed teaching: A symposium on automation in education (pp. 1–8). (New York: Philosophical Library)

Ornstein, J. 1968. ‘Programmed Instruction and Educational Technology in the Language Field: Boon or Failure?’ The Modern Language Journal, 52 / 7, 401 – 410

Post, D. 1972. ‘Up the programmer: How to stop PI from boring learners and strangling results’. Educational Technology, 12(8), 14–1

Saettler, P. 2004. The Evolution of American Educational Technology. (Greenwich, Conn.: Information Age Publishing)

Skinner, B. F. 1986. ‘Programmed Instruction Revisited’ The Phi Delta Kappan, 68 (2), 103 – 110

Spolsky, B. 1966. ‘A psycholinguistic critique of programmed foreign language instruction’ International Review of Applied Linguistics in Language Teaching, Volume 4, Issue 1-4: 119–130

Thornbury, S. 2017. Scott Thornbury’s 30 Language Teaching Methods. (Cambridge: Cambridge University Press)

Tucker, C. 1972. ‘Programmed Dictation: An Example of the P.I. Process in the Classroom’. TESOL Quarterly, 6(1), 61-70

Valdman, A. 1968. ‘Programmed Instruction versus Guided Learning in Foreign Language Acquisition’ Die Unterrichtspraxis / Teaching German, 1 (2), 1 – 14

 

 

 

[1] Spolsky’ doctoral thesis for the University of Montreal was entitled ‘The psycholinguistic basis of programmed foreign language instruction’.

 

 

 

 

 

Colloquium

At the beginning of March, I’ll be going to Cambridge to take part in a Digital Learning Colloquium (for more information about the event, see here ). One of the questions that will be explored is how research might contribute to the development of digital language learning. In this, the first of two posts on the subject, I’ll be taking a broad overview of the current state of play in edtech research.

I try my best to keep up to date with research. Of the main journals, there are Language Learning and Technology, which is open access; CALICO, which offers quite a lot of open access material; and reCALL, which is the most restricted in terms of access of the three. But there is something deeply frustrating about most of this research, and this is what I want to explore in these posts. More often than not, research articles end with a call for more research. And more often than not, I find myself saying ‘Please, no, not more research like this!’

First, though, I would like to turn to a more reader-friendly source of research findings. Systematic reviews are, basically literature reviews which can save people like me from having to plough through endless papers on similar subjects, all of which contain the same (or similar) literature review in the opening sections. If only there were more of them. Others agree with me: the conclusion of one systematic review of learning and teaching with technology in higher education (Lillejord et al., 2018) was that more systematic reviews were needed.

Last year saw the publication of a systematic review of research on artificial intelligence applications in higher education (Zawacki-Richter, et al., 2019) which caught my eye. The first thing that struck me about this review was that ‘out of 2656 initially identified publications for the period between 2007 and 2018, 146 articles were included for final synthesis’. In other words, only just over 5% of the research was considered worthy of inclusion.

The review did not paint a very pretty picture of the current state of AIEd research. As the second part of the title of this review (‘Where are the educators?’) makes clear, the research, taken as a whole, showed a ‘weak connection to theoretical pedagogical perspectives’. This is not entirely surprising. As Bates (2019) has noted: ‘since AI tends to be developed by computer scientists, they tend to use models of learning based on how computers or computer networks work (since of course it will be a computer that has to operate the AI). As a result, such AI applications tend to adopt a very behaviourist model of learning: present / test / feedback.’ More generally, it is clear that technology adoption (and research) is being driven by technology enthusiasts, with insufficient expertise in education. The danger is that edtech developers ‘will simply ‘discover’ new ways to teach poorly and perpetuate erroneous ideas about teaching and learning’ (Lynch, 2017).

This, then, is the first of my checklist of things that, collectively, researchers need to do to improve the value of their work. The rest of this list is drawn from observations mostly, but not exclusively, from the authors of systematic reviews, and mostly come from reviews of general edtech research. In the next blog post, I’ll look more closely at a recent collection of ELT edtech research (Mavridi & Saumell, 2020) to see how it measures up.

1 Make sure your research is adequately informed by educational research outside the field of edtech

Unproblematised behaviourist assumptions about the nature of learning are all too frequent. References to learning styles are still fairly common. The most frequently investigated skill that is considered in the context of edtech is critical thinking (Sosa Neira, et al., 2017), but this is rarely defined and almost never problematized, despite a broad literature that questions the construct.

2 Adopt a sceptical attitude from the outset

Know your history. Decades of technological innovation in education have shown precious little in the way of educational gains and, more than anything else, have taught us that we need to be sceptical from the outset. ‘Enthusiasm and praise that are directed towards ‘virtual education, ‘school 2.0’, ‘e-learning and the like’ (Selwyn, 2014: vii) are indications that the lessons of the past have not been sufficiently absorbed (Levy, 2016: 102). The phrase ‘exciting potential’, for example, should be banned from all edtech research. See, for example, a ‘state-of-the-art analysis of chatbots in education’ (Winkler & Söllner, 2018), which has nothing to conclude but ‘exciting potential’. Potential is fine (indeed, it is perhaps the only thing that research can unambiguously demonstrate – see section 3 below), but can we try to be a little more grown-up about things?

3 Know what you are measuring

Measuring learning outcomes is tricky, to say the least, but it’s understandable that researchers should try to focus on them. Unfortunately, ‘the vast array of literature involving learning technology evaluation makes it challenging to acquire an accurate sense of the different aspects of learning that are evaluated, and the possible approaches that can be used to evaluate them’ (Lai & Bower, 2019). Metrics such as student grades are hard to interpret, not least because of the large number of variables and the danger of many things being conflated in one score. Equally, or possibly even more, problematic, are self-reporting measures which are rarely robust. It seems that surveys are the most widely used instrument in qualitative research (Sosa Neira, et al., 2017), but these will tell us little or nothing when used for short-term interventions (see point 5 below).

4 Ensure that the sample size is big enough to mean something

In most of the research into digital technology in education that was analysed in a literature review carried out for the Scottish government (ICF Consulting Services Ltd, 2015), there were only ‘small numbers of learners or teachers or schools’.

5 Privilege longitudinal studies over short-term projects

The Scottish government literature review (ICF Consulting Services Ltd, 2015), also noted that ‘most studies that attempt to measure any outcomes focus on short and medium term outcomes’. The fact that the use of a particular technology has some sort of impact over the short or medium term tells us very little of value. Unless there is very good reason to suspect the contrary, we should assume that it is a novelty effect that has been captured (Levy, 2016: 102).

6 Don’t forget the content

The starting point of much edtech research is the technology, but most edtech, whether it’s a flashcard app or a full-blown Moodle course, has content. Research reports rarely give details of this content, assuming perhaps that it’s just fine, and all that’s needed is a little tech to ‘present learners with the ‘right’ content at the ‘right’ time’ (Lynch, 2017). It’s a foolish assumption. Take a random educational app from the Play Store, a random MOOC or whatever, and the chances are you’ll find it’s crap.

7 Avoid anecdotal accounts of technology use in quasi-experiments as the basis of a ‘research article’

Control (i.e technology-free) groups may not always be possible but without them, we’re unlikely to learn much from a single study. What would, however, be extremely useful would be a large, collated collection of such action-research projects, using the same or similar technology, in a variety of settings. There is a marked absence of this kind of work.

8 Enough already of higher education contexts

Researchers typically work in universities where they have captive students who they can carry out research on. But we have a problem here. The systematic review of Lundin et al (2018), for example, found that ‘studies on flipped classrooms are dominated by studies in the higher education sector’ (besides lacking anchors in learning theory or instructional design). With some urgency, primary and secondary contexts need to be investigated in more detail, not just regarding flipped learning.

9 Be critical

Very little edtech research considers the downsides of edtech adoption. Online safety, privacy and data security are hardly peripheral issues, especially with younger learners. Ignoring them won’t make them go away.

More research?

So do we need more research? For me, two things stand out. We might benefit more from, firstly, a different kind of research, and, secondly, more syntheses of the work that has already been done. Although I will probably continue to dip into the pot-pourri of articles published in the main CALL journals, I’m looking forward to a change at the CALICO journal. From September of this year, one issue a year will be thematic, with a lead article written by established researchers which will ‘first discuss in broad terms what has been accomplished in the relevant subfield of CALL. It should then outline which questions have been answered to our satisfaction and what evidence there is to support these conclusions. Finally, this article should pose a “soft” research agenda that can guide researchers interested in pursuing empirical work in this area’. This will be followed by two or three empirical pieces that ‘specifically reflect the research agenda, methodologies, and other suggestions laid out in the lead article’.

But I think I’ll still have a soft spot for some of the other journals that are coyer about their impact factor and that can be freely accessed. How else would I discover (it would be too mean to give the references here) that ‘the effective use of new technologies improves learners’ language learning skills’? Presumably, the ineffective use of new technologies has the opposite effect? Or that ‘the application of modern technology represents a significant advance in contemporary English language teaching methods’?

References

Bates, A. W. (2019). Teaching in a Digital Age Second Edition. Vancouver, B.C.: Tony Bates Associates Ltd. Retrieved from https://pressbooks.bccampus.ca/teachinginadigitalagev2/

ICF Consulting Services Ltd (2015). Literature Review on the Impact of Digital Technology on Learning and Teaching. Edinburgh: The Scottish Government. https://dera.ioe.ac.uk/24843/1/00489224.pdf

Lai, J.W.M. & Bower, M. (2019). How is the use of technology in education evaluated? A systematic review. Computers & Education, 133(1), 27-42. Elsevier Ltd. Retrieved January 14, 2020 from https://www.learntechlib.org/p/207137/

Levy, M. 2016. Researching in language learning and technology. In Farr, F. & Murray, L. (Eds.) The Routledge Handbook of Language Learning and Technology. Abingdon, Oxon.: Routledge. pp.101 – 114

Lillejord S., Børte K., Nesje K. & Ruud E. (2018). Learning and teaching with technology in higher education – a systematic review. Oslo: Knowledge Centre for Education https://www.forskningsradet.no/siteassets/publikasjoner/1254035532334.pdf

Lundin, M., Bergviken Rensfeldt, A., Hillman, T. et al. (2018). Higher education dominance and siloed knowledge: a systematic review of flipped classroom research. International Journal of Educational Technology in Higher Education 15, 20 (2018) doi:10.1186/s41239-018-0101-6

Lynch, J. (2017). How AI Will Destroy Education. Medium, November 13, 2017. https://buzzrobot.com/how-ai-will-destroy-education-20053b7b88a6

Mavridi, S. & Saumell, V. (Eds.) (2020). Digital Innovations and Research in Language Learning. Faversham, Kent: IATEFL

Selwyn, N. (2014). Distrusting Educational Technology. New York: Routledge

Sosa Neira, E. A., Salinas, J. and de Benito Crosetti, B. (2017). Emerging Technologies (ETs) in Education: A Systematic Review of the Literature Published between 2006 and 2016. International Journal of Emerging Technologies in Education, 12 (5). https://online-journals.org/index.php/i-jet/article/view/6939

Winkler, R. & Söllner, M. (2018): Unleashing the Potential of Chatbots in Education: A State-Of-The-Art Analysis. In: Academy of Management Annual Meeting (AOM). Chicago, USA. https://www.alexandria.unisg.ch/254848/1/JML_699.pdf

Zawacki-Richter, O., Bond, M., Marin, V. I. And Gouverneur, F. (2019). Systematic review of research on artificial intelligence applications in higher education – where are the educators? International Journal of Educational Technology in Higher Education 2019

Introduction

Allowing learners to determine the amount of time they spend studying, and, therefore (in theory at least) the speed of their progress is a key feature of most personalized learning programs. In cases where learners follow a linear path of pre-determined learning items, it is often the only element of personalization that the programs offer. In the Duolingo program that I am using, there are basically only two things that can be personalized: the amount of time I spend studying each day, and the possibility of jumping a number of learning items by ‘testing out’.

Self-regulated learning or self-pacing, as this is commonly referred to, has enormous intuitive appeal. It is clear that different people learn different things at different rates. We’ve known for a long time that ‘the developmental stages of child growth and the individual differences among learners make it impossible to impose a single and ‘correct’ sequence on all curricula’ (Stern, 1983: 439). It therefore follows that it makes even less sense for a group of students (typically determined by age) to be obliged to follow the same curriculum at the same pace in a one-size-fits-all approach. We have probably all experienced, as students, the frustration of being behind, or ahead of, the rest of our colleagues in a class. One student who suffered from the lockstep approach was Sal Khan, founder of the Khan Academy. He has described how he was fed up with having to follow an educational path dictated by his age and how, as a result, individual pacing became an important element in his educational approach (Ferster, 2014: 132-133). As teachers, we have all experienced the challenges of teaching a piece of material that is too hard or too easy for many of the students in the class.

Historical attempts to facilitate self-paced learning

Charles_W__Eliot_cph_3a02149An interest in self-paced learning can be traced back to the growth of mass schooling and age-graded classes in the 19th century. In fact, the ‘factory model’ of education has never existed without critics who saw the inherent problems of imposing uniformity on groups of individuals. These critics were not marginal characters. Charles Eliot (president of Harvard from 1869 – 1909), for example, described uniformity as ‘the curse of American schools’ and argued that ‘the process of instructing students in large groups is a quite sufficient school evil without clinging to its twin evil, an inflexible program of studies’ (Grittner, 1975: 324).

Attempts to develop practical solutions were not uncommon and these are reasonably well-documented. One of the earliest, which ran from 1884 to 1894, was launched in Pueblo, Colorado and was ‘a self-paced plan that required each student to complete a sequence of lessons on an individual basis’ (Januszewski, 2001: 58-59). More ambitious was the Burk Plan (at its peak between 1912 and 1915), named after Frederick Burk of the San Francisco State Normal School, which aimed to allow students to progress through materials (including language instruction materials) at their own pace with only a limited amount of teacher presentations (Januszewski, ibid.). Then, there was the Winnetka Plan (1920s), developed by Carlton Washburne, an associate of Frederick Burk and the superintendent of public schools in Winnetka, Illinois, which also ‘allowed learners to proceed at different rates, but also recognised that learners proceed at different rates in different subjects’ (Saettler, 1990: 65). The Winnetka Plan is especially interesting in the way it presaged contemporary attempts to facilitate individualized, self-paced learning. It was described by its developers in the following terms:

A general technique [consisting] of (a) breaking up the common essentials curriculum into very definite units of achievement, (b) using complete diagnostic tests to determine whether a child has mastered each of these units, and, if not, just where his difficulties lie and, (c) the full use of self-instructive, self corrective practice materials. (Washburne, C., Vogel, M. & W.S. Gray. 1926. A Survey of the Winnetka Public Schools. Bloomington, IL: Public School Press)

Not dissimilar was the Dalton (Massachusetts) Plan in the 1920s which also used a self-paced program to accommodate the different ability levels of the children and deployed contractual agreements between students and teachers (something that remains common educational practice around the world). There were many others, both in the U.S. and other parts of the world.

The personalization of learning through self-pacing was not, therefore, a minor interest. Between 1910 and 1924, nearly 500 articles can be documented on the subject of individualization (Grittner, 1975: 328). In just three years (1929 – 1932) of one publication, The Education Digest, there were fifty-one articles dealing with individual instruction and sixty-three entries treating individual differences (Chastain, 1975: 334). Foreign language teaching did not feature significantly in these early attempts to facilitate self-pacing, but see the Burk Plan described above. Only a handful of references to language learning and self-pacing appeared in articles between 1916 and 1924 (Grittner, 1975: 328).

Disappointingly, none of these initiatives lasted long. Both costs and management issues had been significantly underestimated. Plans such as those described above were seen as progress, but not the hoped-for solution. Problems included the fact that the materials themselves were not individualized and instructional methods were too rigid (Pendleton, 1930: 199). However, concomitant with the interest in individualization (mostly, self-pacing), came the advent of educational technology.

Sidney L. Pressey, the inventor of what was arguably the first teaching machine, was inspired by his experiences with schoolchildren in rural Indiana in the 1920s where he ‘was struck by the tremendous variation in their academic abilities and how they were forced to progress together at a slow, lockstep pace that did not serve all students well’ (Ferster, 2014: 52). Although Pressey failed in his attempts to promote his teaching machines, he laid the foundation stones in the synthesizing of individualization and technology.Pressey machine

Pressey may be seen as the direct precursor of programmed instruction, now closely associated with B. F. Skinner (see my post on Behaviourism and Adaptive Learning). It is a quintessentially self-paced approach and is described by John Hattie as follows:

Programmed instruction is a teaching method of presenting new subject matter to students in graded sequence of controlled steps. A book version, for example, presents a problem or issue, then, depending on the student’s answer to a question about the material, the student chooses from optional answers which refers them to particular pages of the book to find out why they were correct or incorrect – and then proceed to the next part of the problem or issue. (Hattie, 2009: 231)

Programmed instruction was mostly used for the teaching of mathematics, but it is estimated that 4% of programmed instruction programs were for foreign languages (Saettler, 1990: 297). It flourished in the 1960s and 1970s, but even by 1968 foreign language instructors were sceptical (Valdman, 1968). A survey carried out by the Center for Applied Linguistics revealed then that only about 10% of foreign language teachers at college and university reported the use of programmed materials in their departments. (Valdman, 1968: 1).grolier min max

Research studies had failed to demonstrate the effectiveness of programmed instruction (Saettler, 1990: 303). Teachers were often resistant and students were often bored, finding ‘ingenious ways to circumvent the program, including the destruction of their teaching machines!’ (Saettler, ibid.).

In the case of language learning, there were other problems. For programmed instruction to have any chance of working, it was necessary to specify rigorously the initial and terminal behaviours of the learner so that the intermediate steps leading from the former to the latter could be programmed. As Valdman (1968: 4) pointed out, this is highly problematic when it comes to languages (a point that I have made repeatedly in this blog). In addition, students missed the personal interaction that conventional instruction offered, got bored and lacked motivation (Valdman, 1968: 10).

Programmed instruction worked best when teachers were very enthusiastic, but perhaps the most significant lesson to be learned from the experiments was that it was ‘a difficult, time-consuming task to introduce programmed instruction’ (Saettler, 1990: 299). It entailed changes to well-established practices and attitudes, and for such changes to succeed there must be consideration of the social, political, and economic contexts. As Saettler (1990: 306), notes, ‘without the support of the community and the entire teaching staff, sustained innovation is unlikely’. In this light, Hattie’s research finding that ‘when comparisons are made between many methods, programmed instruction often comes near the bottom’ (Hattie, 2009: 231) comes as no great surprise.

Just as programmed instruction was in its death throes, the world of language teaching discovered individualization. Launched as a deliberate movement in the early 1970s at the Stanford Conference (Altman & Politzer, 1971), it was a ‘systematic attempt to allow for individual differences in language learning’ (Stern, 1983: 387). Inspired, in part, by the work of Carl Rogers, this ‘humanistic turn’ was a recognition that ‘each learner is unique in personality, abilities, and needs. Education must be personalized to fit the individual; the individual must not be dehumanized in order to meet the needs of an impersonal school system’ (Disick, 1975:38). In ELT, this movement found many adherents and remains extremely influential to this day.

In language teaching more generally, the movement lost impetus after a few years, ‘probably because its advocates had underestimated the magnitude of the task they had set themselves in trying to match individual learner characteristics with appropriate teaching techniques’ (Stern, 1983: 387). What precisely was meant by individualization was never adequately defined or agreed (a problem that remains to the present time). What was left was self-pacing. In 1975, it was reported that ‘to date the majority of the programs in second-language education have been characterized by a self-pacing format […]. Practice seems to indicate that ‘individualized’ instruction is being defined in the class room as students studying individually’ (Chastain, 1975: 344).

Lessons to be learned

This brief account shows that historical attempts to facilitate self-pacing have largely been characterised by failure. The starting point of all these attempts remains as valid as ever, but it is clear that practical solutions are less than simple. To avoid the insanity of doing the same thing over and over again and expecting different results, we should perhaps try to learn from the past.

One of the greatest challenges that teachers face is dealing with different levels of ability in their classes. In any blended scenario where the online component has an element of self-pacing, the challenge will be magnified as ability differentials are likely to grow rather than decrease as a result of the self-pacing. Bart Simpson hit the nail on the head in a memorable line: ‘Let me get this straight. We’re behind the rest of the class and we’re going to catch up to them by going slower than they are? Coo coo!’ Self-pacing runs into immediate difficulties when it comes up against standardised tests and national or state curriculum requirements. As Ferster observes, ‘the notion of individual pacing [remains] antithetical to […] a graded classroom system, which has been the model of schools for the past century. Schools are just not equipped to deal with students who do not learn in age-processed groups, even if this system is clearly one that consistently fails its students (Ferster, 2014: 90-91).bart_simpson

Ability differences are less problematic if the teacher focusses primarily on communicative tasks in F2F time (as opposed to more teaching of language items), but this is a big ‘if’. Many teachers are unsure of how to move towards a more communicative style of teaching, not least in large classes in compulsory schooling. Since there are strong arguments that students would benefit from a more communicative, less transmission-oriented approach anyway, it makes sense to focus institutional resources on equipping teachers with the necessary skills, as well as providing support, before a shift to a blended, more self-paced approach is implemented.

Such issues are less important in private institutions, which are not age-graded, and in self-study contexts. However, even here there may be reasons to proceed cautiously before buying into self-paced approaches. Self-pacing is closely tied to autonomous goal-setting (which I will look at in more detail in another post). Both require a degree of self-awareness at a cognitive and emotional level (McMahon & Oliver, 2001), but not all students have such self-awareness (Magill, 2008). If students do not have the appropriate self-regulatory strategies and are simply left to pace themselves, there is a chance that they will ‘misregulate their learning, exerting control in a misguided or counterproductive fashion and not achieving the desired result’ (Kirschner & van Merriënboer, 2013: 177). Before launching students on a path of self-paced language study, ‘thought needs to be given to the process involved in users becoming aware of themselves and their own understandings’ (McMahon & Oliver, 2001: 1304). Without training and support provided both before and during the self-paced study, the chances of dropping out are high (as we see from the very high attrition rate in language apps).

However well-intentioned, many past attempts to facilitate self-pacing have also suffered from the poor quality of the learning materials. The focus was more on the technology of delivery, and this remains the case today, as many posts on this blog illustrate. Contemporary companies offering language learning programmes show relatively little interest in the content of the learning (take Duolingo as an example). Few app developers show signs of investing in experienced curriculum specialists or materials writers. Glossy photos, contemporary videos, good UX and clever gamification, all of which become dull and repetitive after a while, do not compensate for poorly designed materials.

Over forty years ago, a review of self-paced learning concluded that the evidence on its benefits was inconclusive (Allison, 1975: 5). Nothing has changed since. For some people, in some contexts, for some of the time, self-paced learning may work. Claims that go beyond that cannot be substantiated.

References

Allison, E. 1975. ‘Self-Paced Instruction: A Review’ The Journal of Economic Education 7 / 1: 5 – 12

Altman, H.B. & Politzer, R.L. (eds.) 1971. Individualizing Foreign Language Instruction: Proceedings of the Stanford Conference, May 6 – 8, 1971. Washington, D.C.: Office of Education, U.S. Department of Health, Education, and Welfare

Chastain, K. 1975. ‘An Examination of the Basic Assumptions of “Individualized” Instruction’ The Modern Language Journal 59 / 7: 334 – 344

Disick, R.S. 1975 Individualizing Language Instruction: Strategies and Methods. New York: Harcourt Brace Jovanovich

Ferster, B. 2014. Teaching Machines. Baltimore: John Hopkins University Press

Grittner, F. M. 1975. ‘Individualized Instruction: An Historical Perspective’ The Modern Language Journal 59 / 7: 323 – 333

Hattie, J. 2009. Visible Learning. Abingdon, Oxon.: Routledge

Januszewski, A. 2001. Educational Technology: The Development of a Concept. Englewood, Colorado: Libraries Unlimited

Kirschner, P. A. & van Merriënboer, J. J. G. 2013. ‘Do Learners Really Know Best? Urban Legends in Education’ Educational Psychologist, 48:3, 169-183

Magill, D. S. 2008. ‘What Part of Self-Paced Don’t You Understand?’ University of Wisconsin 24th Annual Conference on Distance Teaching & Learning Conference Proceedings.

McMahon, M. & Oliver, R. 2001. ‘Promoting self-regulated learning in an on-line environment’ in C. Montgomerie & J. Viteli (eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2001 (pp. 1299-1305). Chesapeake, VA: AACE

Pendleton, C. S. 1930. ‘Personalizing English Teaching’ Peabody Journal of Education 7 / 4: 195 – 200

Saettler, P. 1990. The Evolution of American Educational Technology. Denver: Libraries Unlimited

Stern, H.H. 1983. Fundamental Concepts of Language Teaching. Oxford: Oxford University Press

Valdman, A. 1968. ‘Programmed Instruction versus Guided Learning in Foreign Language Acquisition’ Die Unterrichtspraxis / Teaching German 1 / 2: 1 – 14

 

In ELT circles, ‘behaviourism’ is a boo word. In the standard history of approaches to language teaching (characterised as a ‘procession of methods’ by Hunter & Smith 2012: 432[1]), there were the bad old days of behaviourism until Chomsky came along, savaged the theory in his review of Skinner’s ‘Verbal Behavior’, and we were all able to see the light. In reality, of course, things weren’t quite like that. The debate between Chomsky and the behaviourists is far from over, behaviourism was not the driving force behind the development of audiolingual approaches to language teaching, and audiolingualism is far from dead. For an entertaining and eye-opening account of something much closer to reality, I would thoroughly recommend a post on Russ Mayne’s Evidence Based ELT blog, along with the discussion which follows it. For anyone who would like to understand what behaviourism is, was, and is not (before they throw the term around as an insult), I’d recommend John A. Mills’ ‘Control: A History of Behavioral Psychology’ (New York University Press, 1998) and John Staddon’s ‘The New Behaviorism 2nd edition’ (Psychology Press, 2014).

There is a close connection between behaviourism and adaptive learning. Audrey Watters, no fan of adaptive technology, suggests that ‘any company touting adaptive learning software’ has been influenced by Skinner. In a more extended piece, ‘Education Technology and Skinner’s Box, Watters explores further her problems with Skinner and the educational technology that has been inspired by behaviourism. But writers much more sympathetic to adaptive learning, also see close connections to behaviourism. ‘The development of adaptive learning systems can be considered as a transformation of teaching machines,’ write Kara & Sevim[2] (2013: 114 – 117), although they go on to point out the differences between the two. Vendors of adaptive learning products, like DreamBox Learning©, are not shy of associating themselves with behaviourism: ‘Adaptive learning has been with us for a while, with its history of adaptive learning rooted in cognitive psychology, beginning with the work of behaviorist B.F. Skinner in the 1950s, and continuing through the artificial intelligence movement of the 1970s.’

That there is a strong connection between adaptive learning and behaviourism is indisputable, but I am not interested in attempting to establish the strength of that connection. This would, in any case, be an impossible task without some reductionist definition of both terms. Instead, my interest here is to explore some of the parallels between the two, and, in the spirit of the topic, I’d like to do this by comparing the behaviours of behaviourists and adaptive learning scientists.

Data and theory

Both behaviourism and adaptive learning (in its big data form) are centrally concerned with behaviour – capturing and measuring it in an objective manner. In both, experimental observation and the collection of ‘facts’ (physical, measurable, behavioural occurrences) precede any formulation of theory. John Mills’ description of behaviourists could apply equally well to adaptive learning scientists: theory construction was a seesaw process whereby one began with crude outgrowths from observations and slowly created one’s theory in such a way that one could make more and more precise observations, building those observations into the theory at each stage. No behaviourist ever considered the possibility of taking existing comprehensive theories of mind and testing or refining them.[3]

Positivism and the panopticon

Both behaviourism and adaptive learning are pragmatically positivist, believing that truth can be established by the study of facts. J. B. Watson, the founding father of behaviourism whose article ‘Psychology as the Behaviorist Views Itset the behaviourist ball rolling, believed that experimental observation could ‘reveal everything that can be known about human beings’[4]. Jose Ferreira of Knewton has made similar claims: We get five orders of magnitude more data per user than Google does. We get more data about people than any other data company gets about people, about anything — and it’s not even close. We’re looking at what you know, what you don’t know, how you learn best. […] We know everything about what you know and how you learn best because we get so much data. Digital data analytics offer something that Watson couldn’t have imagined in his wildest dreams, but he would have approved.

happiness industryThe revolutionary science

Big data (and the adaptive learning which is a part of it) is presented as a game-changer: The era of big data challenges the way we live and interact with the world. […] Society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what. This overturns centuries of established practices and challenges our most basic understanding of how to make decisions and comprehend reality[5]. But the reverence for technology and the ability to reach understandings of human beings by capturing huge amounts of behavioural data was adumbrated by Watson a century before big data became a widely used term. Watson’s 1913 lecture at Columbia University was ‘a clear pitch’[6] for the supremacy of behaviourism, and its potential as a revolutionary science.

Prediction and controlnudge

The fundamental point of both behaviourism and adaptive learning is the same. The research practices and the theorizing of American behaviourists until the mid-1950s, writes Mills[7] were driven by the intellectual imperative to create theories that could be used to make socially useful predictions. Predictions are only useful to the extent that they can be used to manipulate behaviour. Watson states this very baldly: the theoretical goal of psychology is the prediction and control of behaviour[8]. Contemporary iterations of behaviourism, such as behavioural economics or nudge theory (see, for example, Thaler & Sunstein’s best-selling ‘Nudge’, Penguin Books, 2008), or the British government’s Behavioural Insights Unit, share the same desire to divert individual activity towards goals (selected by those with power), ‘without either naked coercion or democratic deliberation’[9]. Jose Ferreira of Knewton has an identical approach: We can predict failure in advance, which means we can pre-remediate it in advance. We can say, “Oh, she’ll struggle with this, let’s go find the concept from last year’s materials that will help her not struggle with it.” Like the behaviourists, Ferreira makes grand claims about the social usefulness of his predict-and-control technology: The end is a really simple mission. Only 22% of the world finishes high school, and only 55% finish sixth grade. Those are just appalling numbers. As a species, we’re wasting almost four-fifths of the talent we produce. […] I want to solve the access problem for the human race once and for all.

Ethics

Because they rely on capturing large amounts of personal data, both behaviourism and adaptive learning quickly run into ethical problems. Even where informed consent is used, the subjects must remain partly ignorant of exactly what is being tested, or else there is the fear that they might adjust their behaviour accordingly. The goal is to minimise conscious understanding of what is going on[10]. For adaptive learning, the ethical problem is much greater because of the impossibility of ensuring the security of this data. Everything is hackable.

Marketing

Behaviourism was seen as a god-send by the world of advertising. J. B. Watson, after a front-page scandal about his affair with a student, and losing his job at John Hopkins University, quickly found employment on Madison Avenue. ‘Scientific advertising’, as practised by the Mad Men from the 1920s onwards, was based on behaviourism. The use of data analytics by Google, Amazon, et al is a direct descendant of scientific advertising, so it is richly appropriate that adaptive learning is the child of data analytics.

[1] Hunter, D. and Smith, R. (2012) ‘Unpacking the past: “CLT” through ELTJ keywords’. ELT Journal, 66/4: 430-439.

[2] Kara, N. & Sevim, N. 2013. ‘Adaptive learning systems: beyond teaching machines’, Contemporary Educational Technology, 4(2), 108-120

[3] Mills, J. A. (1998) Control: A History of Behavioral Psychology. New York: New York University Press, p.5

[4] Davies, W. (2015) The Happiness Industry. London: Verso. p.91

[5] Mayer-Schönberger, V. & Cukier, K. (2013) Big Data. London: John Murray, p.7

[6] Davies, W. (2015) The Happiness Industry. London: Verso. p.87

[7] Mills, J. A. (1998) Control: A History of Behavioral Psychology. New York: New York University Press, p.2

[8] Watson, J. B. (1913) ‘Behaviorism as the Psychologist Views it’ Psychological Review 20: 158

[9] Davies, W. (2015) The Happiness Industry. London: Verso. p.88

[10] Davies, W. (2015) The Happiness Industry. London: Verso. p.92