Posts Tagged ‘video’

Last month, I wrote a post about the automated generation of vocabulary learning materials. Yesterday, I got an email from Mike Elchik, inviting me to take a look at the product that his company, WeSpeke, has developed in partnership with CNN. Called en.news, it’s a wide and very regularly updated selection of video clips and texts from CNN, which are then used to ‘automatically create a pedagogically structured, leveled and game-ified English lesson’. It’s free, and is available at the App Store and Google Play, as well as in a desktop version. Revenues will presumably be generated through advertising and later sales to corporate clients.

With 6.2 million dollars in funding so far, WeSpeke can leverage some state-of-the-art NLP and AI tools. Co-founder and chief technical adviser of the company is Jaime Carbonell, Director of the Language Technologies Institute at Carnegie Mellon University, described in Wikipedia as one of the gurus of machine learning. I decided to have a closer look.

[Image: home page]

Users are presented with a menu of CNN content (there were 38 items from yesterday alone). The items are tagged with broad categories (Politics, Opinions, Money, Technology, Entertainment, etc.) and given a level, ranging from 1 to 5, although the vast majority of the material is at the two highest levels.

[Image: menu]

I picked two lessons: a reading text about Mark Zuckerberg’s Congressional hearing (level 5) and a 9-minute news programme of mixed items (level 2 – illustrated above). In both cases, the lesson begins with the text. With the reading, you can click on words to bring up dictionary entries from the Collins dictionary. With the video, you can activate captions and again click on words for definitions. You can also slow down the speed. So far, so good.

There then follows a series of exercises which focus primarily on a set of words that have been automatically selected. This is where the problems began.

Level

It’s far from clear what the levels (1 – 5) refer to. The Zuckerberg text is 930 words long and is rated as B2 by one readability tool. But, using the English Profile Text Inspector, there are 19 types at C1 level, 14 at C2, and 98 which are unlisted. That suggests something substantially higher than B2. The CNN10 video is delivered at breakneck speed (as is often the case with US news shows). Yes, it can be slowed down, but that still won’t help with some passages, such as the one below:

A squirrel recently fell out of a tree in Western New York. Why would that make news? Because she bwoke her widdle leg and needed a widdle cast! Yes, there are casts for squirrels, as you can see in this video from the Orphaned Wildlife Center. A windstorm knocked the animal’s nest out of a tree, and when a woman saw that the baby squirrel was injured, she took her to a local vet. Doctors say she’s going to be just fine in a couple of weeks. Well, why ‘rodent’ she be? She’s been ‘whiskered’ away and cast in both a video and a plaster. And as long as she doesn’t get too ‘squirrelly’ before she heals, she’ll have quite a ‘tail’ to tell.

It’s hard to understand how a text like this got through the algorithms. But, as materials writers know, it is extremely hard to find authentic text that lends itself to language learning at anything below C1. On the evidence here, there is still some way to go before the process of selection can be automated. It may well be the case that CNN simply isn’t a particularly appropriate source.

Target learning items

The primary focus of these lessons is vocabulary learning, and it’s vocabulary learning of a very deliberate kind. Applied linguists are in general agreement that it makes sense for learners to approach the building of their L2 lexicon in a deliberate way (i.e. by studying individual words) for high-frequency items or items that can be identified as having a high surrender value (e.g. items from the AWL for students studying in an EMI context). Once you get to items that are less frequent than, say, the top 8,000 most frequent words, the effort expended in studying new words needs to be offset against their usefulness. Why spend a lot of time studying low frequency words when you’re unlikely to come across them again for some time … and will probably forget them before you do? Vocabulary development at higher levels is better served by extensive reading (and listening), possibly accompanied by glosses.

The target items in the Zuckerberg text were: advocacy, grilled, handicapping, sparked, diagnose, testified, hefty, imminent, deliberative and hesitant. One of these, ‘grilled’, is listed as A2 by English Vocabulary Profile, but that is with its literal, not metaphorical, meaning. Four of them are listed as C2 and the remaining five are off-list. In the CNN10 video, the target items were: strive, humble (verb), amplify, trafficked, enslaved, enacted, algae, trafficking, ink and squirrels. Of these, one is B1, two are C2 and the rest are unlisted. What is the point of studying these essentially random words? Why spend time going through a series of exercises that practise these items? Wouldn’t your time be better spent just doing some more reading? I have no idea how the automated selection of these items takes place, but it’s clear that it’s not working very well.

Practice exercises

There is plenty of variety of task-type but there are, I think, two reasons to query the claim that these lessons are ‘pedagogically structured’. The first is the nature of the practice exercises; the second is the sequencing of the exercises. I’ll restrict my observations to a selection of the tasks.

1. Users are presented with a dictionary definition and an anagrammed target item which they must unscramble. For example:

existing for the purpose of discussing or planning something     VLREDBETEIIA

If you can’t solve the problem, you can always scroll through the text to find the answer. But the problem is in the task design. Dictionary definitions have been written to help language users decode a word. They simply don’t work very well when they are used for another purpose (as prompts for encoding).
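For the record, the scrambled string in the example really is a rearrangement of the target item ‘deliberative’. A minimal Python sketch compares the letter counts of the two strings. The function name `is_anagram` is my own, and nothing here is meant to reflect how en.news actually generates its anagrams.

```python
from collections import Counter

def is_anagram(scrambled: str, target: str) -> bool:
    """Return True if the two strings use exactly the same letters."""
    return Counter(scrambled.lower()) == Counter(target.lower())

# The anagram and target item from the exercise quoted above.
print(is_anagram("VLREDBETEIIA", "deliberative"))  # True
```

Checking the answer is trivial for a machine. The hard part for the learner is producing the word from a definition that was written for decoding.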

2. Users are presented with a dictionary definition for which they must choose one of four words. There are many potential problems here, not the least of which is that definitions are often more complex than the word they are defining, or they present other challenges. One example is ‘cause to be unpretentious’ as the definition of ‘to humble’. On top of that, lexicographers often need or choose to embed the target item in the definition. For example:

a hefty amount of something, especially money, is very large

an event that is imminent, especially an unpleasant one, will happen very soon

When this is the case, it makes no sense to present these definitions and ask learners to find the target item from a list of four.
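Definitions of this kind would be easy to screen out automatically. The rough sketch below flags any definition that contains its own target item as a whole word. The function name `definition_leaks_target` is hypothetical, and I am not suggesting that en.news does (or could simply bolt on) anything like this.

```python
import re

def definition_leaks_target(definition: str, target: str) -> bool:
    """Return True if the target word appears as a whole word in its own definition."""
    pattern = r"\b" + re.escape(target) + r"\b"
    return re.search(pattern, definition, flags=re.IGNORECASE) is not None

# Definitions quoted in this post.
definitions = {
    "hefty": "a hefty amount of something, especially money, is very large",
    "imminent": "an event that is imminent, especially an unpleasant one, will happen very soon",
    "deliberative": "existing for the purpose of discussing or planning something",
}

for word, definition in definitions.items():
    print(word, definition_leaks_target(definition, word))
```

The first two definitions are flagged and the third is not. A check this simple would miss inflected forms of the target item, so it is only a starting point.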

The two key pieces of content in this product – the CNN texts and the Collins dictionaries – are both less than ideal for their purposes.

3. Users are presented with a box of jumbled words which they must unscramble to form sentences that appeared in the text.

[Image: rearrange words to make sentences]

The sentences are usually long and hard to reconstruct. You can scroll through the text to find the answer, but I’m unclear what the point of this would be. The example above contains a mistake (vie instead of vice), but this was one of only two glitches I encountered.

4. Users are asked to select the word that they hear on an audio recording. For example:

squirreling     squirrel     squirreled     squirrels

Given the high level of challenge of both the text and the target items, this was a rather strange exercise to kick off the practice. The meaning has not yet been presented (in a matching / definition task), so what exactly is the point of this exercise?

5. Users are presented with gapped sentences from the text and asked to choose the correct grammatical form of the missing word. Some of these were hard (e.g. adjective order), others were very easy (e.g. some vs any). The example below struck me as plain weird for a lesson at this level.

________ have zero expectation that this Congress is going to make adequate changes. (I or Me ?)

6. At the end of both lessons, there were a small number of questions that tested your memory of the text. If, like me, you couldn’t remember all that much about the text after twenty minutes of vocabulary activities, you can scroll through the text to find the answers. This is not a task type that will develop reading skills: I am unclear what it could possibly develop.

Overall?

Using the lessons on offer here wouldn’t do a learner (as long as they already had a high level of proficiency) any harm, but it wouldn’t be the most productive use of their time, either. If a learner is motivated to read the text about Zuckerberg, rather than do lots of ‘busy’ work on a very odd set of words with gap-fills and matching tasks, they’d be better advised just to read the text again once or twice. They could use a look-up for words they want to understand and import them into a flashcard system with spaced repetition (en.news does have flashcards, but there’s no sign of spaced practice yet). What’s more, they could check out another news website and read / watch other articles on the same subject (perhaps choosing websites with a different slant to CNN) and get valuable narrow-reading practice in this way.
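For anyone unfamiliar with spaced repetition, the idea is that a word you get right comes back after a longer gap, and a word you get wrong comes back soon. Here is a very simplified Leitner-style sketch. The intervals and the names `Card`, `review` and `INTERVALS_DAYS` are illustrative assumptions of mine, not a description of any particular flashcard app.

```python
from dataclasses import dataclass

# Days to wait before the next review, by box number (illustrative values only).
INTERVALS_DAYS = [1, 2, 4, 8, 16]

@dataclass
class Card:
    word: str
    box: int = 0          # index into INTERVALS_DAYS
    due_in_days: int = 0  # days until the card should be shown again

def review(card: Card, correct: bool) -> Card:
    """Move the card up one box if answered correctly, back to box 0 if not."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        card.box = 0
    card.due_in_days = INTERVALS_DAYS[card.box]
    return card

card = Card("imminent")
review(card, correct=True)   # moves to box 1, due in 2 days
review(card, correct=True)   # moves to box 2, due in 4 days
review(card, correct=False)  # back to box 0, due in 1 day
print(card)
```

Real systems use more sophisticated scheduling, but the principle is the same: the gap between reviews grows with each success.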

My guess is that the technology has driven the product here, but without answering the fundamental questions about which words it’s appropriate for individual learners to study in a deliberate way and how this is best tackled, it doesn’t take learners very far.



FluentU, busuu, Bliu Bliu … what is it with all the ‘u’s? Hong Kong-based FluentU used to be called FluentFlix, but they changed their name a while back. The service for English learners is relatively new. Before that, they focused on Chinese, where the competition is much less fierce.

At the core of FluentU is a collection of short YouTube videos, which are sorted into 6 levels and grouped into 7 topic categories. The videos are accompanied by transcriptions. As learners watch a video, they can click on any word in the transcript. This will temporarily freeze the video and show a pop-up which offers a definition of the word, information about part of speech, a couple of examples of this word in other sentences, and more example sentences of the word from other videos that are linked on FluentU. These can, in turn, be clicked on to bring up a video collage of these sentences. Learners can click on an ‘Add to Vocab’ button, which will add the word to personalised vocabulary lists. These are later studied through spaced repetition.

FluentU describes its approach in the following terms: ‘FluentU selects the best authentic video content from the web, and provides the scaffolding and support necessary to bring that authentic content within reach for your students.’ It seems appropriate, therefore, to look first at the nature of that content. At the moment, there appear to be just under 1,000 clips which are allocated to levels as follows:

Newbie: 123
Elementary: 138
Intermediate: 294
Upper Int: 274
Advanced: 111
Native: 40

It has to be assumed that the amount of content will continue to grow, but, for the time being, it’s not unreasonable to say that there isn’t a lot there. I looked at the Upper Intermediate level, where the shortest clip was 32 seconds long and the longest 4 minutes 34 seconds, but most were between 1 and 2 minutes. That means that there is the equivalent of about 400 minutes (say, 7 hours) for this level.
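The estimate is easy to check. The sketch below takes the 274 Upper Int clips from the level breakdown above and assumes an average length of 1.5 minutes. That average is my own guess at the midpoint of the ‘between 1 and 2 minutes’ range, not a measured figure.

```python
clips = 274          # Upper Int clips listed by FluentU
avg_minutes = 1.5    # assumed average clip length (midpoint of 1 to 2 minutes)

total_minutes = clips * avg_minutes
total_hours = total_minutes / 60

print(f"{total_minutes:.0f} minutes, or {total_hours:.2f} hours")
```

That comes to roughly 411 minutes, or a little under 7 hours.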

The actual amount that anyone would want to watch / study can be seen to be significantly less when the topics are considered. These break down as follows:

Arts & entertainment: 105
Business: 34
Culture: 29
Everyday life: 60
Health & lifestyle: 28
Politics & society: 6
Science & tech: 17

The screenshots below give an idea of the videos on offer:

[Images: menu 1, menu 2]

I may be a little difficult, but there wasn’t much here that appealed. Forget the movie trailers for crap movies, for a start. Forget the low-level business stuff, too. ‘The History of New Year’s Resolutions’ looked promising, but turned out to be a Wikipedia-style piece. FluentU certainly doesn’t have the eye for interesting, original video content of someone like Jamie Keddie or Kieran Donaghy.

But, perhaps, the underwhelming content is of less importance than what you do with it. After all, if you’re really interested in content, you can just go to YouTube and struggle through the transcriptions on your own. The transcripts can be downloaded as pdfs, which, strangely, are marked with a FluentU copyright notice. FluentU doesn’t need to own the copyright of the videos, because they just provide links, but claiming copyright for someone else’s script seemed questionable to me. Anyway, the only real reason to be on this site is to learn some vocabulary. How well does it perform?

[Image: copyright]

[Image: fluentu1]

Level is self-selected. It wasn’t entirely clear how videos had been allocated to level, but I didn’t find any major discrepancies between FluentU’s allocation and my own, intuitive grading of the content. Clicking on words in the transcript, I found that the look-up / dictionary function wasn’t too bad, compared to some competing products I have looked at. The system could deal with some chunks and phrases (e.g. ‘at your service’, ‘figure out’) and the definitions were appropriate to the way these had been used in context. The accuracy was far from consistent, though. Some definitions were harder than the word they were explaining (e.g. ‘telephone’ = ‘an instrument used to call someone’) and some were plain silly (e.g. the definition of ‘I’ is ‘me’).

[Image: ‘have been’ definition]

Some chunks were not recognised, so definitions were amusingly wonky. ‘Come out’, ‘get through’ and ‘have been’ were all wrong. For the phrase ‘talk her into it’, the program didn’t recognise the phrasal verb, and offered me ‘communicate using speech’ for ‘talk’, and ‘to the condition, state or form of’ for ‘into’.

For many words, there are pictures to help you with the meaning, but you wonder about some of them, e.g. the picture of someone clutching a suitcase to illustrate the meaning of ‘of’, or a woman holding up a finger and thumb to illustrate the meaning of ‘what’ (as a pronoun).

[Image: ‘what’ definition]

The example sentences don’t seem to be graded in any way and are not always useful. The example sentences for ‘of’, for instance, are ‘The pages of the book are ripped’, ‘the lemurs of Madagascar’ and ‘what time of day are you free’. Since the definition is given as ‘belonging to’, there seems to be a problem with, at least, the last of these examples!

The example sentences that link you to other video examples of the word being used took a long time to load … and they really weren’t worth waiting for.

After a catalogue of problems like this, you might wonder how I can say that this function wasn’t too bad, but I’ve seen a lot worse. It was, at least, mostly accurate.

Moving away from the ‘Watch’ options, I explored the ‘Learn’ section. Bearing in mind that I had described myself as ‘Upper Intermediate’, I was surprised to be offered the following words for study: Good morning, may, help, think, so. This then took me to the following screen:

[Image: great job]

I was getting increasingly confused. After watching another video, I could practise some of the words I had highlighted, but, again, I wasn’t sure quite what was going on. There was a task that asked me to ‘pick the correct translation’, but this was, in fact, a multiple-choice dictation task.

[Image: translation task]

Next, I was asked to study the meaning of the word ‘in’, followed by an unhelpful gap-fill task:

[Image: gap fill]

Confused? I was. I decided to look for something a little more straightforward, and clicked on a menu of vocabulary flash cards that I could import. These included sets based on copyright material from both CUP and OUP, and I wondered what these publishers might think of their property being used in this way.

[Image: flashcards]

FluentU claims that it is based on the following principles:

  1. Individualized scaffolding: FluentU makes language learning easy by teaching new words with vocabulary students already know.
  2. Mastery Learning: FluentU sets students up for success by making sure they master the basics before moving on to more advanced topics.
  3. Gamification: FluentU incorporates the latest game design mechanics to make learning fun and engaging.
  4. Personalization: Each student’s FluentU experience is unlike anyone else’s. Video clips, examples, and quizzes are picked to match their vocabulary and interests.

The ‘individualized scaffolding’ is no more than common sense, dressed up in sciency-sounding language. The reference to ‘Mastery Learning’ is opaque, to say the least, with some confusion between language features and topic. The gamification is rudimentary, and the personalization is pretty limited. It doesn’t come cheap, either.

[Image: price table]