From Babble to Brilliance: Can Language Models Influence Human Language?

GabiBuchner · ‎11-13-2023

Until recently, we thought of language as an exclusively human skill, one that sets us apart from all other species. Just a few years ago, the idea of machines that could understand and generate language was the stuff of science-fiction movies. Artificial intelligence was not a topic of broad public interest for much of its history, but this changed dramatically with the emergence of LLMs (large language models) like ChatGPT. A milestone in AI development, these language models can produce human-like text that is both grammatically correct and meaningful. These two aspects, known as syntax and semantics, are essential to understanding how language works.

Following Chomsky’s famous “colorless green ideas sleep furiously” example, I asked ChatGPT to analyze the following sentence: "A lot of butter doesn't yet give a sky". The model’s answer: “The sentence appears to be grammatically well-formed in terms of its syntax and structure. However, semantically, it is nonsensical and does not convey a clear or coherent meaning.” Asked to create a similar text itself, the model responded with: "Butterflies wear hats made of algebraic equations during lunchtime." This shows that the model has learned and understood the concepts of syntax and semantics, blurring the line between human and machine.

Given the power, performance, and pervasiveness of language models, they will most probably influence the way we write and speak, both positively and negatively.

Vocabulary

Language usually evolves gradually and incidentally. New words, or neologisms, often originate from changes or innovations in science, technology, or culture. Language models are, if not creative, at least “generative” and can contribute to the richness and variety of our language by inventing new words, idioms, or phrases, eventually resulting in shifts in language usage. I asked ChatGPT to come up with some novel words and phrases:

ChromaGlide Serenisphere: A description of a vivid, colorful dream world where one's imagination knows no bounds.
Pour Starlight into a Teacup: Expressing the idea of trying to contain something vast or boundless within a limited space.
Viralizeation: Combining "viral" and "realization," it could describe the moment when something goes viral on the internet, leading to widespread recognition.
Count the Petals of the Midnight Rose: Meaning to engage in a futile or endless task.

These creations sound natural, meaningful, and valid and could be adopted by the English-speaking community. While the average speaker of English masters about 20,000 words, LLMs have a much larger vocabulary of roughly 15,000,000 words. Using this power can help you make your writing more varied and expressive by having the model suggest synonyms or more precise words, depending on the text type you are writing.

Text Quality

LLMs can ensure that your grammar and spelling are correct and consistent. This contributes to the overall quality of your writing. Sloppily written text with typos is often seen as an indicator of sloppily researched content and might negatively affect your standing. LLMs also do a great job of simplifying content and checking for unnecessary jargon. For example, if you are giving a talk on how LLMs work and your audience is very diverse, you need to think about how technical your talk will be and find a good balance. LLMs can explain even complicated concepts in a simple way. This makes information accessible to a broader audience and ensures straightforward and clear communication. You can also ask a language model to check your text for consistency, filler words, or awkward sentence structures.

Style

You may have noticed that unless you include detailed style instructions in your prompt, language models tend to produce the same sort of writing: a uniform, straightforward, and flat type of prose that can be understood easily. This writing style can be okay for technical documents, but literature and poetry need an identity and a personality. I asked ChatGPT to write a poem about a robot in the style of Hamlet’s “To be, or not to be”. This is the first verse: “To bot or not to bot, that is the query, whether 'tis smarter in the code to suffer, the glitches and crashes of outrageous servers, or to take up arms against a sea of bugs, and, by debugging, end them?” The poem is nice, sort of, but not genuine or original. To misquote Sherlock Holmes: “I know what is good when I read it.”

Heavy style prompting can help you tweak the model output. But even after I gave ChatGPT samples of my own writing as examples in the prompt, the output didn’t sound like a text of mine. A Swiss study found that text generated by ChatGPT shows several features that can help you distinguish it from human-written text: models are overly polite and never aggressive or rude. They don’t use personal pronouns. They tend to use atypical words or exaggerated language. Unless you instruct them to do so, models won’t include metaphors, sarcasm, irony, or humor in their responses, whereas humans will. Models also seem to be unaware of the “Show, don’t tell” principle of narration. As in that study, I had ChatGPT write a negative restaurant review: “The food was the most disappointing aspect of the experience. The sushi and sashimi were far from fresh, and the presentation was lackluster. It was evident that the quality of the ingredients was subpar. The rice was overcooked, and the fish had a distinct off-putting taste.” Negative, but still moderate, polite, and unfortunately lifeless.
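One of these features, the missing personal pronouns, is easy to check for yourself. Here is a minimal Python sketch, not the method used in the study: the function name, the short pronoun list, and the "human" sample sentence are my own assumptions for illustration. It simply counts what share of the words in a text are first- or second-person pronouns.

```python
import re

# Assumption: a deliberately short, hypothetical pronoun list for illustration.
# A serious analysis would need a fuller list and proper tokenization.
PERSONAL_PRONOUNS = {
    "i", "me", "my", "mine",
    "we", "us", "our", "ours",
    "you", "your", "yours",
}


def personal_pronoun_rate(text: str) -> float:
    """Return the share of words in `text` that appear in PERSONAL_PRONOUNS."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in PERSONAL_PRONOUNS)
    return hits / len(words)


# Excerpt from the ChatGPT restaurant review quoted above.
model_review = (
    "The food was the most disappointing aspect of the experience. "
    "The rice was overcooked."
)
# Invented sentence standing in for a human-written review.
human_review = "I waited an hour for my sushi, and we left hungry."

print(personal_pronoun_rate(model_review))
print(personal_pronoun_rate(human_review))
```

A single number like this proves nothing on its own, of course. It only makes one of the listed features measurable, and a human writer can avoid pronouns just as a model can be prompted to use them.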

To mitigate the risk of homogenizing and flattening our language and style, use LLMs responsibly. Use them as a tool that assists you. Integrate them into your writing and content creation process, but don’t let them take the lead. Bring your personality, creativity, and uniqueness to work, guiding the style and tone to create engaging texts. Don’t adopt everything the model suggests, and don’t adapt your own style to the model’s recommendations. This is important to avoid shifts in writing norms and trends that don’t come naturally but are provoked by the massive (ab)use and spread of model-generated content. As with so many things in life: be authentic. This will help protect our style from becoming dull, disengaging, and predictable.

Variety of Languages

Language models may also contribute to the decline of smaller languages and dialects. The main reason lies in the training data. Most LLMs are primarily trained on high-resource languages (HRLs). These are languages for which a huge amount of digitized data and other linguistic resources are available, such as English, German, Chinese, or Japanese. Since research and development focus on these languages, they are well supported by use cases like machine translation, speech recognition, or sentiment analysis. English is by far the best-resourced language.

I asked ChatGPT which language it’s best in: “My proficiency may be highest in English, as it is one of the most widely represented languages in the training data and is often used as a source language for translation and understanding.” Most LLMs indeed work best in English. ChatGPT also works quite well in German, although you may find some grammar mistakes now and then. But most LLMs don’t work well for low-resource languages (LRLs) such as Thai or Hindi. This might further reinforce the primacy of English. For example, ChatGPT told me that “Kindergarten” is a German word borrowed from English, when in fact it is the other way round: “Das deutsche Wort "Kindergarten" ist ein Beispiel für ein aus dem Englischen übernommenes Wort.” (“The German word "Kindergarten" is an example of a word borrowed from English.”)

HRL-dominant models can wipe out smaller languages or LRLs because these can’t compete, lacking data. But it’s not only about wiping out languages. It’s also about wiping out variety, diversity, and ways of thinking. The language that we speak shapes our perception of the world and our thought processes, an idea known as linguistic relativity. For example, Inuit is often said to have over 50 words for snow, and Korean six different words for blue. If speakers shift towards using dominant languages online and in digital communication, this could speed up the decline of smaller languages and cultures.

Dialects are endangered too. Since LLMs are primarily trained on standard language content such as books, journals, newspapers, and websites, they prioritize this type of language in their own output. This can reinforce the common misconception that dialects are inferior, less prestigious, and even less credible, discouraging people from using them.

However, it’s up to us to turn the power of language models into an opportunity to help save and protect endangered languages. Half of the roughly 7,000 languages spoken worldwide will die sooner or later. We can use AI and large language models in several ways to fight this language death. It’s important to digitally preserve the spoken or written material in affected languages. LLMs can help document, transcribe, and translate this material to make it accessible to more people. Models can also generate language learning tools and materials, such as flashcards, quizzes, or sample sentences, so people can learn the endangered languages. The Duolingo app teaches Hawaiian, for example, which is one of the most endangered languages of the world. Again, this effort is a collaboration of humans and models. Since we might not have enough language data to train the models, we need to involve native speakers and experts to assist in this process. Once training is complete, models can create new written content and even speech in the endangered language, which can be useful for creating audio materials or pronunciation guides, helping to keep the language alive.

Bias

OpenAI admits that its flagship GPT-4 model was trained with a “US-centric point of view”. This Americanization is a type of intentional bias. The “kindergarten” example shows that GPT is very English-centric. Favoring content from larger languages and cultures can lead to a loss of diversity, but also to the spread of poor model content in underrepresented languages on which the model was not trained well. Speaking of bias, LLMs can reinforce language biases because they are trained on human-created text, which can contain biases and stereotypes. Models might produce output that reflects gender, race, or other forms of bias. It’s therefore important that you check the generated model content and don’t use and share it as is.

Linguistic Skills

Large language models provide a unique opportunity for improving language skills, if you use them reasonably and responsibly. Both children in the phase of language acquisition and learners of other languages, no matter their age, can benefit from using LLMs in their learning process. Thanks to their extensive training, models can offer a broad range of vocabulary, sentence structures, and language contexts. Learners can practice interactively with models by prompting them for quizzes or exercises, receiving real-time feedback and evaluation. Language models also provide immediate translations into many languages, which is another plus for learners.

There are, however, limitations that we need to consider. If we rely too much on LLMs and their generative ability, our own thinking and linguistic skills may deteriorate because we miss out on the training and learning. We might even become dependent on models, using them for every language-related task we have. I see this risk in particular with children and language learners. If they use models excessively, they might not engage enough in real conversations with other people. Direct contact with speakers and their culture is essential for learning a language, whether it’s your native language or a target language. It’s important that learners use language models as an additional resource and not as the only one. And as I said, models make mistakes, and if you are a language learner, you may not spot them. ChatGPT admits: “Yes, I can make mistakes in languages other than English. While I have been trained on a diverse range of texts in multiple languages, my accuracy and proficiency can vary depending on the language and the complexity of the task. […] In languages with less training data, I may be less accurate and capable of making errors in grammar, vocabulary, and cultural nuances.”

Conclusion

Large language models are great tools as long as you use them responsibly. As the saying goes, with great power comes great responsibility. Make sure to use models as your assistants. Don’t over-rely on them. Don’t trust them blindly. Validate every output to prevent spreading misinformation, biased text, fake news, mistakes, and formulaic language.

Disclaimer: Text samples created with GPT-3.5, images created with DALL-E 3 and Leonardo Diffusion XL.
