Tuesday, December 10, 2019

How Easy is it to Read Chaucer?

Geoffrey Chaucer's Canterbury Tales is often used as an example of how much the English language has changed in 600 years. If anyone knows any Middle English, it's probably the opening of the Tales, the first 18 lines of the General Prologue: "Whan that aprill with hise shoures soote...". English has changed a great deal since Chaucer's medieval tour bus pulled out of Southwark on its way to Canterbury, but how much has it changed? How difficult (or easy) is it to read Chaucer? 

The answer is subjective: how easy or difficult a text is to read depends on the reader's knowledge of its language. Even so, I became interested in how Chaucer's English, dating from the last quarter of the 14th century, compares to 21st-century English. I hypothesized that, turns of phrase, cultural references, and pronunciation differences aside, the closer a word is to its modern spelling and meaning, the "easier" it would be for someone with no experience in Middle English to read. To test this, I took a simple section of the Canterbury Tales, the first 18 lines of the General Prologue, and compared it to modern English. While it is not representative of the Tales as a whole, it is the best-known part and the one most commonly used as an example of what I wanted to investigate: how different Chaucer's English is from the English spoken today. I split words into three categories: 1) words that have the same meaning and spelling in Middle English as they do in modern English, 2) words that have Middle English spellings but are recognizable from modern English, and 3) words that are not recognizable with just a knowledge of modern English. This is the result of my study.

Part 1 - The Middle English

To get the Middle English, I decided to use my own transcription rather than one published by someone else, because published editions often change the spelling to some degree, even when they stay in Middle English. I wanted to be sure that the Middle English I was comparing was as close to the manuscript as possible. To do this, I used the "Ellesmere Chaucer" manuscript, Huntington MssEL 26 C 9, as my source. I transcribed the lines using a program called Transkribus.

Transcribing San Marino, Huntington MssEL 26 C 9, f.1r on Transkribus
My completed transcription, with a few abbreviations filled out:

Whan that aprill with hise shoures soote
The droghte of march hath perced to the roote
And bathed euery veyne in swich licour
Of which vertu engendred is the flour
Whan zephirus eek with his sweete breeth
Inspired hath in euery holt and heeth
The tendre croppes and the younge sonne
hath in the Ram his half cours yronne
And smale foweles maken melodye
That slepen al the nyght with open eye
So priketh hem nature in hir corages
Thanne longen folk to goon on pilgrimages
And palmeres for to seken straunge strondes
To ferne halwes kouthe in soundry londes
And specially from euery shires ende
Of engelond to caunterbury they wende
The hooly blissful martyr for to seke
That hem hath holpen whan þat they were seeke

Part 2 - Asking someone who knows nothing about Middle English

To get an idea of what could be understood by someone with no experience in Middle English, I showed the transcription to my mother, who is not familiar with Chaucer or his language. She marked what she could more or less understand and what she didn't understand at all.

Yellow highlights mark what someone with no experience in Middle English understood; blue highlights mark what wasn't understood.
Only two things were marked: what was understood and what was not. Surprisingly, most of the language was understood to some degree. I didn't count whether the reader grasped what a phrase like "hath in the Ram his half cours yronne" (GP.l.8) actually means in context (halfway through the sign of Aries), just whether she could understand the words in general. The things that weren't understood were extreme variations in spelling (like "shoures," "seken," and "holpen") and words that have no modern English equivalent (e.g., "yronne," "ferne," and "kouthe").

This only answered part of my question. I had identified three kinds of words and this part only gave me two of them-- things understood and things not understood. What about the in-between of words that were similar to modern English but spelled differently? That led me to:

Part 3 - Breaking things down further

I took the transcription and looked back at my original three criteria-- words that were the same in Middle and modern English, words that had spelling variations but same or similar meanings, and words that were not recognizable from modern English at all. Applying these three criteria, I marked the section according to my own understanding of modern English.
Lines 1-18 with three categories of words

Words marked in yellow are spelled the same as their modern English counterparts. Green marks Middle English words that are similar to a modern English word but spelled differently. Blue marks words that are either spelled so differently from modern English that they are unrecognizable (like "hem" for "them," "þat" for "that," and "seeke" for "sick") or have no modern English equivalent (like "halwes" and "eek").

There are 128 words in total in the first 18 lines of the General Prologue, and 90 unique words (counting spelling variations, like "his" and "hise," as separate words). Of these 90 unique words, 33 were the same as modern English, 42 were similar but with variations in spelling, and 15 were unrecognizable.
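For anyone who wants to check the totals, here is a minimal Python sketch. It assumes words are split on whitespace and compared without regard to capitalization, so that spelling variants stay distinct. The names PROLOGUE and count_words are only illustrative, and this is not necessarily how the count above was originally made.

```python
# Lines 1-18 of the General Prologue, copied from the transcription above.
PROLOGUE = """\
Whan that aprill with hise shoures soote
The droghte of march hath perced to the roote
And bathed euery veyne in swich licour
Of which vertu engendred is the flour
Whan zephirus eek with his sweete breeth
Inspired hath in euery holt and heeth
The tendre croppes and the younge sonne
hath in the Ram his half cours yronne
And smale foweles maken melodye
That slepen al the nyght with open eye
So priketh hem nature in hir corages
Thanne longen folk to goon on pilgrimages
And palmeres for to seken straunge strondes
To ferne halwes kouthe in soundry londes
And specially from euery shires ende
Of engelond to caunterbury they wende
The hooly blissful martyr for to seke
That hem hath holpen whan þat they were seeke
"""


def count_words(text):
    """Return (total words, unique words), ignoring capitalization."""
    # Assumption: a "word" is any run of non-whitespace characters.
    tokens = text.lower().split()
    return len(tokens), len(set(tokens))


total, unique = count_words(PROLOGUE)
print(total, unique)  # prints: 128 90
```

Because the comparison is by exact spelling, "his" and "hise" (or "that" and "þat") are counted as two different words, which matches the way the unique words are counted here.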

A chart representing the distribution of unique words in lines 1-18 of the General Prologue
I was surprised to find that the largest group of unique words (42 of 90) was similar to modern English, just with spelling variations (like "Caunterbury" for "Canterbury" and "droghte" for "drought"). I was also surprised to find that only a small portion of this text was completely unrecognizable.

What has changed the most is minor spelling and pronunciation. The vocabulary in the first 18 lines of the Tales is still largely understandable in the modern day: only 15 unique words (16 occurrences in total) lack a clear modern equivalent, out of a total of 128 words.
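To put those counts in proportion, here is a small Python sketch that turns them into percentages. The counts are the ones reported in this post; the category labels and variable names are just illustrative.

```python
# Category counts for the 90 unique words, as reported in this post.
CATEGORY_COUNTS = {"same": 33, "similar": 42, "unrecognizable": 15}


def shares(counts):
    """Convert raw counts to percentages of their total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


unique_shares = shares(CATEGORY_COUNTS)

# 16 unrecognizable occurrences out of 128 words in the passage.
unrecognizable_token_share = 100 * 16 / 128

print(unique_shares)
print(unrecognizable_token_share)
```

By these figures, roughly 37% of the unique words are identical to modern English, about 47% differ only in spelling, and about 17% are unrecognizable; counted by occurrence, the unrecognizable words make up 12.5% of the passage.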