The sunset of natural language

By Dr. Jedrzej Osinski
18 Oct 2021

Many would argue that the driving purpose behind technical innovation is to make people’s lives easier. This extends to the many applications that support automated translation. Today it takes seconds to translate pages of text across dozens of languages. Although English is still perceived as the modern Latin that connects the world, mobile applications such as voice-to-voice translators allow people to communicate effectively and more spontaneously without the effort of learning a foreign language. Without a doubt, this innovation creates major value for both travellers and businesses. Companies no longer need to create multiple versions of documents or websites to present and sell their products and services to people across the globe. A company can quickly translate its website automatically or even leave the job entirely to an Internet browser. Communication has never been easier. However, this way of working and living raises considerations that are worth discussing.

The first thing to highlight is that the output of automated translation is never of the same quality as the original text. The result may be grammatically correct, and the semantics (meaning) of the text may be perfectly transferred too. However, these solutions are not technically capable of analysing the wider context (pragmatics), including the cultural one. Let’s look at an example. Have you heard the anecdote of the interpreter who worked with Polish president Lech Walesa? While visiting Japan, Walesa said that communists in Poland are like radishes: red on the outside but white on the inside. In Japan, where he was speaking, radishes are completely white. The interpreter, knowing the context well, used a prawn in the simile instead. Adapting in real time to reach for the right replacement requires excellent reflexes and a deep understanding of the cultural context in which the information lands. Currently, no automated translation is a match for an interpreter like this! Similarly, text translated automatically would never keep the wordplay or the unique melody captured within a language.

To avoid issues in automated translation, content editors and the like tend to simplify English to make it easier to understand and to translate using an automated tool. The review of the final output is also often minimised to reduce costs. We can see this on some multi-language websites: as long as the text is grammatically correct and delivers the expected information, no extra effort is made. Little to no investment is made in making the text richer or more sophisticated. The text becomes flattened and schematic. Only the most familiar words and sequences are used, and any unobvious comparisons or rarely used verbs are avoided. As a result, simplified English has become the key language on the Internet, because it is what makes automated translation succeed. For many, the Internet is the key medium where content is discovered, English is learnt and new words and phrases are picked up. The result? Our own language becomes simplified and patterned, and our mental dictionaries shrink. Just like a muscle, language skills become weaker and less responsive when they are not stimulated and challenged on a regular basis.

Automation is not the only reason English is becoming ever more simplified. Another reason is the wish to make it more inclusive. English speakers often feel obliged to simplify their language and slow down while talking to foreigners. As the number of people who speak English as a second or third language increases in global companies, simplified English becomes the dominant international language of business. However, although the transfer of information seems easier and less ambiguous, losing spontaneity makes communication more formal and, contrary to the intention, may increase the distance between interlocutors. Intentional simplification may also remove individual features (specific words, grammar structures, accent etc.) that are unique to each person. Removing these linguistic fingerprints makes communication flatter and less emotional, and the dialogue between people is less likely to be sustained; our conversations with one another become shorter and more sanitised. English is not my first language and I have been learning it for at least 25 years. I still encounter situations where I miss a single phrase or word, but I accept it and this motivates me to keep learning. The truth is, I really enjoy listening to people who speak English freely, being themselves and without trying to simplify the complexity so that I can easily follow. Only then can you hear the real beauty of English and its unique melody.

The simplification of a language is also performed intentionally to make text more accessible. This is especially important in removing barriers and helping people who have lower reading comprehension or live with various disabilities to use common products, services and information sources. Readability levels required for documents in specific domains (e.g., banking, insurance, healthcare) are already often regulated by law. There are various formulas for measuring readability, based on the number of words, sentences, syllables or letters in the given text. One widely used metric is the Flesch Reading Ease, which scores a text on a scale of roughly 0 to 100, where low scores mean extremely difficult text and scores near 100 mean very simple text; text scoring 90-100 is easily understood by 11-year-olds, while scores of 30 and below are typical of specialist scientific papers. Although these kinds of regulations seem valuable, there are many tools and editors that suggest text simplifications by default, regardless of the context of the text being prepared. Here’s an example:

I believe that whenever we use automated tools while writing, we need to be careful. It is important to focus on the output we are looking to achieve and to keep a balance between text that is readable and text that is appropriate to its context.

Here’s another example. I copied and pasted this masterpiece of a paragraph (by Hemingway) into one of the online services:

“He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish. In the first forty days a boy had been with him. But after forty days without a fish the boy’s parents had told him that the old man was now definitely and finally salao, which is the worst form of unlucky, and the boy had gone at their orders in another boat which caught three good fish the first week.”

The tool gave me the following suggestion: This sentence is very long. Consider rewriting it to be shorter or splitting it into smaller sentences.
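
To make the earlier point about readability formulas concrete, here is a minimal sketch of the Flesch Reading Ease calculation in Python. The formula itself (206.835 minus 1.015 times the average sentence length in words, minus 84.6 times the average number of syllables per word) is the standard published one. The function names and the vowel-group syllable counter are assumptions made for illustration, since real tools usually rely on pronunciation dictionaries, so the scores this sketch produces are only approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (illustrative heuristic, not a dictionary lookup)."""
    word = word.lower()
    # Each run of consecutive vowels is treated as one syllable.
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("alone"), but keep "-le" endings ("simple").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean text that is easier to read."""
    # Sentences are approximated by splitting on ., ! and ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words may contain internal hyphens or apostrophes ("eighty-four", "boy’s").
    words = re.findall(r"[A-Za-z]+(?:['’-][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    print(round(flesch_reading_ease("The boy left. The man stayed."), 2))
    print(round(flesch_reading_ease("The boy left and the man stayed."), 2))
```

Because the formula only sees counts of words, sentences and syllables, joining two short sentences into one lowers the score even when the wording is just as clear. A metric of this kind cannot tell a long sentence that is clumsy from one that is deliberately crafted.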

Very often, the need to simplify English comes with good reasons: inclusiveness, accessibility and building bridges between communities. However, we need to be aware of what we lose when we add technology. We should identify the situations in which this kind of mechanism should be applied, rather than defaulting to the tech solutions on offer. The complexity of a language is what stimulates our brains on a regular basis. It helps us deal with emotions when we can name them more precisely. Building language constructions increases our creativity. Grappling with the messiness of language often motivates us to learn and explore. Language is a unique form of art that we all take part in creating and shaping. It is often said that language is our civilization’s greatest invention. Let’s not replace it with a schematic algorithm.

I am often asked whether machines will ever speak the way humans do. If we continue on this path, it will be us who first learn how to speak like machines.


