MED Magazine - Issue 54 - August 2009

Feature
The future has arrived: a new era in electronic dictionaries

by Michael Rundell

The talk among dictionary publishers these days is about ‘monetizing our assets’. Before you all head straight to the Ridiculous Business Jargon Dictionary, I should explain that this isn’t some clever ruse to extract more money from the people who buy our dictionaries. It’s more of a survival strategy for the brave new world of reference publishing – a way of ensuring that we can still generate the revenue we need in order to keep on improving our dictionaries.

Once upon a time, people bought dictionaries – big printed books – in very large numbers, just as families once bought prestigious multi-volume encyclopedias. As far as encyclopedias go, those days are already past: who’s going to pay hundreds of pounds for a reference book which will be out of date by the time you get it back from the bookshop? Especially when you can find all the encyclopedic information you need online, without paying a penny. Things haven’t reached this stage with dictionaries (yet), but the writing is on the virtual wall – and in some parts of the world (Japan and Korea, for example), electronic dictionaries have already more or less replaced printed ones. This creates problems for publishers: if dictionaries are free (as encyclopedias already are), how can we fund the next generation of new and better reference resources?

Leaving aside the financial implications for a moment, the shift to a digital medium – in which the paper dictionary is complementary to a larger and richer electronic resource – is something that dictionary-makers welcome. It offers fantastic opportunities for us to provide our users with even more of the lexical information they need. As anyone working with corpora knows, there is still plenty more we can do to improve and expand our description of language. We now have at our disposal vast corpus resources (including general English, ESP corpora, and learner English) as well as increasingly clever corpus-querying software. Dictionary publishers are hard at work turning this research data into useful (and user-friendly) information about how the English language works. But this is skilled and labour-intensive work, and if in the future fewer learners and teachers buy physical dictionaries, we’re left with the question of how to fund the developments we all want to see. This is where the ‘monetizing’ bit comes in. For dictionaries, the old revenue model (bookshop sells book to customer and passes a percentage of the price back to publisher) is in decline, and a new model is beginning to emerge.

Electronic dictionaries: the story so far

First, though, a bit of background. Until fairly recently, there were two kinds of electronic dictionary: small handheld devices with one or more dictionaries loaded on them, and optical disks (CD-ROMs and latterly DVD-ROMs) sold alongside the big (paper) learner’s dictionaries like the Oxford Advanced Learner’s and the Macmillan English Dictionary (MED). Handheld dictionaries – small devices about the size of a BlackBerry – have been around for many years: the Speak & Spell machine that ET cannibalized in order to phone home was an early and primitive example. This is the format of choice in Japan, and current models may include up to a hundred different dictionaries – monolingual, bilingual, general, specialised, you name it. Almost three million of these are sold annually in Japan alone, so it’s a huge market. All the well-known learner’s dictionaries appear on one or other of these devices, but the publishers make very little money from this kind of licensing – a fraction of what they would earn from selling a physical book. One can’t help feeling this is a transitional technology (albeit one that has shown remarkable staying power): it’s very hard to use these dictionaries effectively because their contents are so diverse, and minimally integrated. In any case, this ‘pile-it-high’ model isn’t well-adapted to the needs of language learners: a dictionary with two million terms on it may sound impressive, but who really needs it?

Longman’s Interactive English Dictionary was the first learner’s dictionary to appear in CD-ROM form, back in 1993. Early versions of CD-ROM dictionaries were partly a sales gimmick (‘Look how cutting-edge we are!’) and partly a genuine effort to engage with the new technology and see how it could improve access to the information in the dictionary. The new medium provided far more powerful search functions than the basic alphabetical order that conventional dictionaries rely on. Throw in audio pronunciations, and a few games and exercises, and that was the basic package for several years. Looking back, what is striking about those early electronic dictionaries is that the print medium was assumed to be the ‘primary’ one, with the electronic a sort of afterthought: the layout of the CD-ROM screens more or less replicated what you would find on the pages of the printed book, and publishers were slow to grasp the implications of the new medium. For example, dictionaries have traditionally handled idioms by explaining them at one entry, and using cross-references to redirect the user from other possible locations: thus, kick the bucket might be defined at the headword kick, and if you looked it up at bucket you would be referred to the ‘right’ entry. This was, simply, a space-saving strategy: paper dictionaries have to pack a lot of information into a limited space, so you can’t afford to have two (or more) entries for the same idiom. There is no need to do this in an electronic dictionary, of course – but old habits die hard.
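To make the point about idioms concrete, here is a minimal Python sketch of how an electronic dictionary could file an idiom under every content word it contains, so that no cross-reference is needed. This is an assumption-laden illustration, not a description of any publisher's software: the function name, the small stopword list, and the one-line gloss are all hypothetical.

```python
from collections import defaultdict

# Hypothetical list of function words that should not act as lookup keys.
STOPWORDS = {"the", "a", "an", "of", "to"}


def build_idiom_index(idioms):
    """Map every content word of each idiom to the full idiom and its gloss."""
    index = defaultdict(list)
    for idiom, gloss in idioms.items():
        for word in idiom.split():
            if word not in STOPWORDS:
                index[word].append((idiom, gloss))
    return index


# Illustrative data only; the gloss is not quoted from any dictionary.
IDIOMS = {"kick the bucket": "to die"}

index = build_idiom_index(IDIOMS)
at_kick = index["kick"]      # the idiom is found at 'kick' ...
at_bucket = index["bucket"]  # ... and equally at 'bucket'
```

The gloss is stored once and merely indexed twice, so the space pressure that forced paper dictionaries to choose a single ‘right’ entry does not arise.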

Gradually, these products improved as they began to exploit the opportunities of the medium more intelligently. The CD-ROM for the Macmillan English Dictionary, for example, includes an ‘advanced search’ function that allows you to perform complex searches with minimum difficulty, by combining any number of features like register, frequency, and grammatical behaviour, in a Boolean search. So if you want a list of all the high-frequency transitive verbs which are never – or almost always – used in the passive, this is easily done. Or you might be interested in all the words and phrases marked both ‘British’ and ‘humorous’. Or a list of every entry that has the subject-label Cinema. These and other features – notably a thesaurus which provides near-synonyms for every word, phrase, and meaning in the dictionary – mean that the CD-ROM is not just an easily-searchable version of its paper counterpart, but a store of ‘new’ information which simply wouldn’t fit in a printed dictionary.
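As a rough illustration of how a Boolean search of this kind works, here is a minimal Python sketch. It is not Macmillan's implementation: the Entry structure, the helper functions, and the handful of sample entries (including the labels attached to them) are all hypothetical, invented only to show how independent features can be combined with AND, OR and NOT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A hypothetical, much-simplified dictionary entry."""
    headword: str
    pos: str
    labels: frozenset = frozenset()    # register/region labels, e.g. 'British'
    subjects: frozenset = frozenset()  # subject labels, e.g. 'Cinema'
    high_frequency: bool = False


# Invented sample data; the labels are for illustration only.
ENTRIES = [
    Entry("bloke", "noun", labels=frozenset({"British", "informal"})),
    Entry("chinwag", "noun", labels=frozenset({"British", "humorous"})),
    Entry("close-up", "noun", subjects=frozenset({"Cinema"})),
    Entry("benefit", "noun", high_frequency=True),
]


def has_label(label):
    """Test: the entry carries the given register or region label."""
    return lambda entry: label in entry.labels


def has_subject(subject):
    """Test: the entry carries the given subject label."""
    return lambda entry: subject in entry.subjects


def all_of(*tests):
    """Boolean AND: every test must hold."""
    return lambda entry: all(test(entry) for test in tests)


def any_of(*tests):
    """Boolean OR: at least one test must hold."""
    return lambda entry: any(test(entry) for test in tests)


def negate(test):
    """Boolean NOT: the test must fail."""
    return lambda entry: not test(entry)


def search(entries, test):
    """Return the headwords of all entries that satisfy the test."""
    return [entry.headword for entry in entries if test(entry)]


british_and_humorous = search(
    ENTRIES, all_of(has_label("British"), has_label("humorous"))
)
cinema_terms = search(ENTRIES, has_subject("Cinema"))
```

Because each feature is a separate test, a new criterion (frequency band, transitivity, passive use) can be added without touching the search function itself.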

But this, too, is a transitional model: CD-ROMs are an ageing technology and many newer computers, especially the smaller netbooks, have dispensed with optical drives altogether – a trend that is set to continue until these go the same way as the floppy disk drive. Significantly, the most recent entrant to the ELT dictionary market – the Merriam-Webster Learner’s Dictionary – was launched in 2008 without a CD-ROM: it exists only in paper and online editions. The newest Longman dictionary – the 2009 edition of the Dictionary of Contemporary English – comes with a DVD-ROM, but this may well be the last gasp for a technology which has been overtaken by the Web.

Dictionaries on the Web

And so to the rapidly-approaching future. As long ago as 1990, Dwight Bolinger predicted the end of the paper dictionary, and although rumours of its imminent death have been exaggerated, we are now on the cusp of a revolution in dictionary publishing. Macmillan has just launched an online version of its flagship dictionary (MED), at www.macmillandictionary.com. ‘MED Online’ (or MEDO, as we call it) is not a stripped-down or ‘basic’ version: it has most of the content and functionality of the ‘paid-for’ dictionary – yet it is completely free to the user. The screenshots here give an idea of how it looks. The first shows what you find if you search for the word benefit as a noun: an outline of the full entry, with a definition and an example for each of the word’s four meanings (and for the phrase give someone the benefit of the doubt).

[Screenshot: outline of the MEDO entry for the noun benefit]

The ‘Show More’ button brings up a fuller version of the entry, with IPA symbols, a ‘Collocation Box’, and a comprehensive account of the syntactic behaviour of the word.

[Screenshot: fuller MEDO entry for the noun benefit, shown after clicking ‘Show More’]

Finally, the third screenshot shows what happens when you hit the red ‘T’ button that appears next to every meaning. This activates the thesaurus (one of the trademark features of MED’s CD-ROM version); in this case, we see a thesaurus entry for the fourth meaning of benefit, a charitable event.

[Screenshot: MEDO thesaurus entry for the fourth meaning of benefit]

The other big advantage of the online mode is that we can engage with the people who use our dictionaries. MEDO includes a whole area (called mPulse) which has articles and blogs about language issues, and an Open Dictionary, to which users can contribute their own entries, helping us keep track of changes in the language.

This is a rich resource, and in a different class from the standard offerings you find at the numerous free dictionary sites on the Web. Try looking up benefit at www.thefreedictionary.com, for example, and you will find, first, a confusing and horribly cluttered homepage, then an entry for benefit with only a single example sentence, definitions written in clunky ‘dictionary-speak’, and not a word about syntax or collocations. There is simply no comparison.

But how can Macmillan offer such high-quality resources for nothing? To understand the commercial logic, a good place to start is an article by technology guru Chris Anderson, the Editor of WIRED magazine and the person who came up with the idea of ‘the long tail’ – the business strategy of pursuing many little fish (rather than a few big fish). In a fascinating article called Free! Why $0.00 is the future of business, Anderson explains how ‘you can make money by giving something away’. In fact, of course, this model pre-dates the internet by many years: we don’t pay anything to watch commercial TV, for example, but the people who produce the programmes have to get their funding from somewhere – and that somewhere is advertising. Google’s vast fortune is based precisely on this approach: the public can use Google all day long if they like, without paying a penny to the company. It’s the adverts in the sidebar that generate Google’s revenue. And so it is with MEDO: as the screenshots show, there is generally a ‘banner’ advert running across the top of the screen, then several smaller ‘classified’ ads running down the side: the trick is to ensure these are visible but not intrusive.

Getting this right involves all sorts of interesting calculations which – as a career lexicographer – I never expected to be involved in. If, for example, we show too many adverts, the user will just switch off. (Google generally gets this balance right: you rarely feel you’re being swamped with annoying advertising.) Or again: how much of our dictionary do we show in the online version? If the free version is too ‘basic’, people will use one of the alternatives instead – but if we give away too much valuable data, won’t that devalue the (paid-for) book-plus-CD-ROM combo? To a certain extent, then, we are feeling our way here, and we expect this resource to improve and develop over time. But the great thing about an online resource is that users can tell us what they like (and what they don’t) and the whole thing can be continuously upgraded. From my point of view, it was worth learning about ‘monetizing assets’ in order to be involved in this ground-breaking venture. And for the dictionary user, we believe this new online resource – which also looks good on devices like the iPhone and other mobiles – takes dictionary-publishing into an exciting new future.


Article first published in IATEFL’s CALL Review, the newsletter of the Learning Technologies Special Interest Group, Spring 2009. We would like to thank CALL Review for permission to reprint this article.

Copyright © 2009 Macmillan Publishers Limited
This webzine is brought to you by Macmillan Education