Advanced hindsight

Mini-column ‘Alison in Wonderland’, published in the Observant, Maastricht

My favourite academic this week is Dan Ariely, an Israeli-American professor of Behavioural Economics. He gave people a series of simple maths problems and said he would pay a dollar per correct answer. The result? People cheat more if they see someone else cheating whom they identify with. And they cheat twice as much when rewarded with a token that they can later trade for cash. Consider this in view of the financial crisis: people act most immorally when members of their in-group are doing so, and when they are removed from the actual money; say, when a dollar is called a derivative. Not surprisingly, Ariely is also one of the founding members of the parody-like ‘Center for Advanced Hindsight’ at Massachusetts Institute of Technology.


A mixed bag of Dunglish

Corpus work can be either data driven or theory driven. Theory driven (top-down) means that we want to analyse a corpus with a particular hypothesis in mind (‘Dutch people always seem to say x funny; let’s see if this is true’). In this case, you search the corpus in a directed way, looking specifically for x. In contrast, a data-driven (bottom-up) approach is more exploratory: you rummage around in the corpus and see what you can find, without anything in particular in mind; in other words, you let the data speak for itself.
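To make the contrast concrete, here is a minimal Python sketch over a made-up three-sentence corpus. The sentences, the `since_pattern` regex and the function names are all assumptions invented for illustration; none of this is our actual corpus or tooling. The theory-driven search looks for one hypothesised construction, while the data-driven one simply counts adjacent word pairs and leaves it to us to spot anything interesting.

```python
import re
from collections import Counter

# Toy stand-in for a corpus; these sentences are invented for illustration.
corpus = [
    "I live here since three years.",
    "We work here since two years and we like it.",
    "She has lived here for three years.",
]

# Theory driven: we suspect 'since + number + years' and look only for that.
since_pattern = re.compile(r"\bsince \w+ years\b", re.IGNORECASE)

def directed_search(texts):
    """Return every match of the hypothesised pattern."""
    return [m.group(0) for t in texts for m in since_pattern.finditer(t)]

def exploratory_counts(texts):
    """Data driven: count all adjacent word pairs, with no hypothesis in mind."""
    pairs = Counter()
    for t in texts:
        words = re.findall(r"[a-z']+", t.lower())
        pairs.update(zip(words, words[1:]))
    return pairs

hits = directed_search(corpus)
pair_counts = exploratory_counts(corpus)
```

In this toy example the directed search finds exactly the two ‘since … years’ sentences, whereas the pair counts merely show that ‘here since’ recurs; whether that is worth following up is a judgement the researcher still has to make.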

Some people might argue that when it comes to Dutch English, the approach can only be data driven. This is because in a theory-driven approach, we would face a certain paradox. Corpus work shows us what the features of a language variety are: what grammar features are used, what words, how and where. Without corpus work, we can’t know these things (we can suspect, but we can’t know, systematically and empirically); therefore, we can’t formulate a hypothesis. In other words, you can’t form a theory about particular features of Dutch English to investigate in a corpus if you haven’t already done corpus work to figure out what the particular features of Dutch English are. Still with me?

Of course, the reality lies somewhere in the middle. It’s easier to search a corpus if you’re looking for something, even if you just have a few clues or hunches to begin with. Other things will then emerge during the process, and if you have an open mind (and an open research agenda) you can then pursue them if you want. So at the very least, we can direct the search to some extent based on hunches we might have about Dutch English. And those hunches, interestingly, come directly from work that is diametrically opposed to ours, at least in terms of intention. Anything that has been labelled a ‘common error’ that Dutch people make in English now becomes, for us, a potential dialectal ‘feature’.

Think of anything that’s ever struck you as ‘typically’ Dunglish. Now we can see if these anecdotal, impressionistic and casual observations really are characteristic of Dutch English – i.e. used systematically by lots of people – or if they just stick in our minds because they are salient and funny, and therefore seem more predominant than they really are.
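One rough way to tell ‘systematic’ from ‘merely memorable’ is to count not only how often a pattern occurs, but also how many different writers produce it. The sketch below assumes a hypothetical corpus stored as (author, text) pairs; the data, the example pattern and the `dispersion` helper are invented for illustration and say nothing about what our corpus actually contains.

```python
import re

# Hypothetical mini-corpus of (author id, text) pairs. All data is invented.
texts = [
    ("A", "Our models contain also clinical variables."),
    ("A", "The results contain also some surprises. They show also a trend."),
    ("B", "The models also contain clinical variables."),
    ("C", "We offer also an Outlook training."),
]

def dispersion(pattern, corpus):
    """Return (total hits, number of distinct authors with at least one hit)."""
    regex = re.compile(pattern, re.IGNORECASE)
    total = 0
    authors = set()
    for author, text in corpus:
        n = len(regex.findall(text))
        if n:
            total += n
            authors.add(author)
    return total, len(authors)

# Verb followed directly by 'also': a crude, hand-picked pattern for this toy data.
total_hits, n_authors = dispersion(r"\b(?:contain|show|offer) also\b", texts)
```

Here most of the hits come from a single writer, which is exactly the situation in which a salient example can seem more widespread than it really is.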

Over the years, I’ve compiled a list of all these casual observations. You can find any number of them in books like ‘I always get my sin’ or ‘Righting English that’s gone Dutch’, and I’ve collected my own examples from the gazillion Dutch-authored texts I’ve read and edited over the years. You can find a random – but by no means exhaustive – selection of these below. Apologies in advance, on several fronts: I’ve not replaced all the jargon here and have categorised these hurriedly, without adding further explanation; and I’ve only included things vaguely related to grammar and vocabulary, ignoring, for the time being, orthographic (spelling-related) things as well as discourse/pragmatics. Those are for another day. In the coming weeks, I intend to say more about which two or three aspects we’ve decided to investigate in painfully minute detail…

Grammatical features

Use of adjective instead of adverb

  • The aim is to organise the services as efficient as possible.
  • All answers are treated fully confidential.
  • More concrete this means we offer three types of programmes.

Sentence fragments

  • Whereas research shows that exercise has no added value.
  • Very interesting to study how they relate to other asset classes.
  • Goals that have given input to the company strategy.

Lack of perfect aspect

  • Since twenty years there is political stagnation.
  • In this organisation the tool is used for years.
  • Almost every laboratory makes such interventions traditionally.

Extension of perfect aspect

  • A study in 2004 has demonstrated that the intervention worked.
  • Yesterday you have received a formal invitation.
  • He has been the founding father of the institute.

Lack of progressive aspect

  • At this moment, we negotiate with other possible sponsors.
  • More and more, publishers allow open access to articles.
  • For now, he enjoys being in the Netherlands.

Extension of progressive aspect

  • Society isn’t working like that.
  • First students register for a course and subsequently tutors are being assigned.
  • As long as you will be receiving unemployment benefits, you will also be entitled to collective health care insurance.

Nonstandard use of that-clause

  • This is an initiative to achieve that terminology is translated unambiguously.
  • Students have the luxury that they can access two libraries.
  • To avoid that variables were selected by chance…


  • Looking forward to hear from you.
  • Sometimes I have problems now to find the right word.
  • Good education is worth to be given.

Auxiliary usage

  • Do you simply haven’t got a clue what your career prospects are?
  • They did not only work on that in the lab, but also at home.
  • Under no circumstances these men do want to lose their power.


  • But not only in Holland women had to deal with this traditional gender construction.
  • Only then, differences can be turned into vehicles for learning.
  • Not until the end of term students can go home.


  • We learned some lessons that we like to share with you now.
  • I give you an example.
  • There you find information such FAQs and contact details.

Of-structure with animates

  • This programme will lead to a healthier lifestyle of diabetes patients.
  • That is the car of my dentist.
  • The book of Feynmann was fantastic.

Word order

  • teacher English
  • course mathematics
  • opening hours library

Frontal overload

  • Especially for our external clients this could be an interesting offer.
  • Doctors can diagnose correctly at a very early stage thanks to this method.
  • The basic assumption that foreign students will not enter the Dutch labour market, but will practice at home, is the fundamental argument for this setup.

Countable use of mass nouns

  • In 2006 she published a research in Science.
  • We also offer an Outlook training.
  • On this website, you can find advices about …

Placement of phrasal modifier

  • The by critics highly praised movie …
  • Privileged or otherwise from disclosure protected information may be included in this message.
  • The in 2007 deceased co-founder of the company was responsible for the accounts.

Adverbial placement

  • Our models contain also clinical variables.
  • I have still a week to choose a proper outfit.
  • Later she made accidentally acquaintance with a famous artist.


Multiple titles

  • Prof. dr.
  • Dr. ir.
  • Mw. prof. ing.

Dutch titles

  • Drs.
  • mr.
  • heer

Dutch/nonstandard abbreviations

  • nr.
  • a.o.
  • f.e.

Nonstandard use of ‘in case’

  • In case your personal situation changes, you have to inform the Tax Office.
  • In case you don’t have one, please request one at the council.
  • One credit point is awarded, in case the course is completed successfully.

Dummy subject

  • It is not allowed to copy software to the system.
  • It will be advocated to pay special attention to methodology.

Lexical shift/borrowing/false friends etc.

  • accent for emphasis
  • actual for topical/current
  • agenda for diary/calendar
  • beamer for projector
  • college for lecture
  • consequent for consistent
  • diverse for various
  • eventual for possible
  • mail for email
  • miss for lack
  • paragraph for section
  • public for audience
  • sporter for athlete
  • stage for internship
  • technique for technology

Nonstandard prepositions/phrasal verbs

  • We hope to see you on one of the events!
  • She will hold a lecture on an international conference.
  • Welcome at Schiphol.

Redundant prepositions

  • discuss about
  • emphasise on
  • attend on

Bad physics jokes

Mini-column ‘Alison in Wonderland’, published in the Observant, Maastricht

An atom walks into a bar and says to the barman, “I’ve lost my electron.” “Are you sure?” says the barman. The atom replies: “Yes, I’m positive.” It’s bad physics jokes like this that keep me going when my PhD feels like it’s sunk into, er, a black hole. And they get worse: Schrödinger’s cat walks into a bar. And doesn’t. Said to be true (though very likely not) is the joke about the German theoretical physicist Werner Heisenberg, who developed the uncertainty principle. This states that we can accurately measure the position of something, or its momentum, but not both at once. Stopped in his car by a police officer, he was asked, “Do you know how fast you were going?” The response: “No, but I know where I am” [insert laughter here].

We have a corpus.

Last time I wrote about my PhD, an age ago, back in November, data collection had just been completed and I was about to start building the corpus proper. Four months and a little bit of my soul later, I’m pleased to report that the corpus is now built. This is exciting, because it’s the first corpus of Dutch English that there is (that we’re aware of, anyway).

Creating the corpus, as I mentioned in my last post, consisted not just of converting many, many Word documents into XML files, but also of enriching those files with what we call textual markup or annotation. There are several reasons for doing this:

  1. To reinstate the formatting of the original text. An XML file is similar to a .txt file in that any text you copy into it is stripped of all its formatting. So to make sure you are producing a faithful representation of the original text, it’s important to reinstate this formatting using XML tags. For example, if a word in the original text was in italics, it is marked in the XML file like this: <it>word</it>. We do the same thing to indicate bold font and underlining, the start and end of paragraphs, headings, hyperlinks, footnotes, superscript and subscript, and changes of typeface. We also use special tags for quotes; if an academic text in the corpus quotes, say, a British scholar, then we mark the quote as ‘extra-corpus’ (<X>quote</X>) to make sure that it is not included in the analyses. And finally, we have various tags for untranscribed text, so for example if an original academic text had lots of mathematical formulae or tables, which are time-consuming to retype and irrelevant for the linguistic analysis anyway, they are simply marked as <untranscribed type="formula"/>, <untranscribed type="table"/>, etc.
  2. To ‘enrich’ the corpus for the purposes of linguistic research. By this we simply mean that tags are also used to provide useful information other than that related to formatting. As an example, Dutch words are marked as such: “The party was like oh my god totally <dutch>gezellig</dutch>.” This is because, in the future, someone, somewhere, may decide they want to research the phenomenon of code-switching, for example (where people switch back and forth between languages). Marking every instance of a Dutch word in these English texts means that you would then only need to search the corpus for the tag <dutch> rather than searching individually for, well, every Dutch word there is.
  3. To ensure anonymity for the contributors. All those lovely people who bravely handed over their texts to some random researcher via email need to be guaranteed their privacy. So emails will now read “Dear Ms <anonymisation type="family-name"/>” or “Didn’t <anonymisation type="first-name"/> look horrific the other day?” Naturally, this applies to all texts and all identifying items, like company names, addresses, phone numbers, bank account numbers and even pets’ names.
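For the curious, here is a rough idea of how this markup pays off once you start querying the files. It is only a sketch using Python’s standard `xml.etree.ElementTree` module on an invented fragment: the tag names come from the list above, but the `<text>` and `<p>` wrapper elements, the sample sentences and the `analysable_text` helper are assumptions made up for this example, not our actual file format or tools.

```python
import xml.etree.ElementTree as ET

# Invented fragment using the tags described above; the <text> and <p>
# wrapper elements are assumptions made for this example.
sample = """<text>
<p>The party was like oh my god totally <dutch>gezellig</dutch>.</p>
<p>As one scholar puts it, <X>some quoted words</X>, which is <it>very</it> true.</p>
<p>Dear Ms <anonymisation type="family-name"/>, see <untranscribed type="table"/> below.</p>
</text>"""

root = ET.fromstring(sample)

# Code-switching search: one query for the tag instead of every Dutch word.
dutch_words = [el.text for el in root.iter("dutch")]

def analysable_text(element):
    """Collect text for analysis, skipping anything inside extra-corpus <X> tags."""
    parts = []
    if element.text:
        parts.append(element.text)
    for child in element:
        if child.tag != "X":
            parts.append(analysable_text(child))
        # Text after a closing tag belongs to the parent, so keep it either way.
        if child.tail:
            parts.append(child.tail)
    return "".join(parts)

clean = analysable_text(root)
```

The extra-corpus quotation drops out of `clean`, while the italicised word and the Dutch word stay in, and the anonymised and untranscribed items simply contribute nothing.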

Not having umpteen undergraduates to do the leg work for me, this meant reading through every line of every text myself and inserting the appropriate markup. But all’s well that ends well: with all the XML files now complete, we can begin analyses! By which, of course, I mean we can begin the process of deciding what to analyse in excruciating detail. Tenses, as in “I am working here since five years”? Lexical items, like “Prof. dr. So-and-so”? Fabulously exciting word orders, like “The by critics highly praised movie was rubbish”? This means the coming months will be filled with a return to the literature: what sorts of analyses have been conducted using similar corpora, and what sorts of analyses seem as though they will be feasible, and interesting, in this corpus? Exciting times …

The confidence of the stupid

Mini-column ‘Alison in Wonderland’, published in the Observant, Maastricht

After ‘Super Tuesday’ in the US last week, it looks likely that Mitt Romney will bring home the Republican nomination. It hasn’t been an easy race. Critics were quick to jump on Romney’s casual mention of the ‘couple of Cadillacs’ his wife owns; not something that will endear you to a country still deep in economic crisis. Still, all he had to do was seem less stupid than his rivals. We saw Rick Perry forgetting the name of a government agency he would cut if elected, and Herman Cain forgetting what Libya is. But no-one can top the claim by former vice-presidential nominee Sarah Palin that ‘we’ve got to stand with our North Korean allies’. The philosopher Bertrand Russell said it best: ‘The trouble with the world is that the stupid are so confident, while the intelligent are full of doubt.’

Noble ambition

Mini-column ‘Alison in Wonderland’, published in the Observant, Maastricht

So Maastricht wants to be named European Capital of Culture 2018. A noble ambition, which puts us in the league of such esteemed former winners as Turku, the official ‘Christmas city’ of Finland, and Bruges, whose claim to fame remains a fleeting visit by the spunky Irish actor Colin Farrell. Not to mention Linz, once home to the mathematician Johannes Kepler, who took the scientific world by storm with his laws of planetary motion, and of course Adolf Hitler, who, er, took the world by stormtroopers and an unhealthy disregard for laws. Of course, the Capitals of Culture also include the all-important birthplace of jeans (Genoa, Italy) and of Skype (Tallinn, Estonia). It remains to be seen what world-changing invention will spring from Maastricht. Or whether Colin Farrell will drop by.

“Good English is proper English”, and other fallacies

Last week I presented a paper called ‘Good English is proper English’, and other fallacies at the King’s College Cambridge lunchtime seminars, for a non-specialist audience. Thanks go to all those who laughed in the appropriate places. Here’s the abstract:

With the spread of English around the globe, the native speaker is said to be dead. Other people ‘own’ English, and can do what they like with it. In countries where English is an official language, legitimate varieties of the language now exist: Indian, Singaporean, Nigerian English and so on. But what about in northern Europe, where English is not an official language, but English competence is almost universal and it is pervasive in the media, commerce and education? Could it develop into full-fledged varieties there too, with e.g. Dutch or German English standing alongside British or Australian English? Will it one day be okay to say “I live here since three years”? This talk will look at why ‘proper’ English can be bad in some contexts, and why ‘bad’ English can sometimes be right and proper.

Golden showers

Mini-column ‘Alison in Wonderland’, published in the Observant, Maastricht

I’m looking for the Yeti. Well, not literally. I’m looking, in my PhD, for a ‘Dutch English’, a legitimate variety of English like, say, Australian or Indian English. Like the Yeti, it’s something we’ve all heard of, but no-one’s ever seen. While it’s easy to make a theoretical case for Dutch English, actually pinpointing this mythical dialect is much harder. Consider the English sentence: The movie, which was highly praised by critics, was rubbish. From a Dutch speaker, you might hear: The by critics highly praised movie was rubbish. Cute, sure. But if we allow that, must we consider all blunders and mistranslations as ‘legitimate’ Dutch English? I’m thinking here of the unfortunate phrasing of those like former VVD leader Frits Bolkestein, who used to refer to economic prospects as ‘golden showers’, ignorant (or not?) of the sexual connotation …