Do you often use the word “tapestry” in English? How about “delve” or “testament”? Many of us would say no. These words aren’t among the “Top 500 spoken words,” a list of the most common English words based on the Cambridge English Corpus. Realistically, they’re not in the top 1,000 either.
However, these words are becoming more common in some written content. A study in March 2024 by Australian researcher Dr. Jeremy Nguyen, for example, found that five percent of all articles published on the research site PubMed used the word “delve.” In 2022, it was less than one percent.
Why the sudden rise in the use of this word? Since ChatGPT was released to the public, in November 2022, we’ve seen that artificial intelligence (AI) favors certain words, and ChatGPT’s favorites include (you guessed it) “tapestry,” “delve” and “testament.” This has been a warning sign for language enthusiasts around the world.
Why does ChatGPT seem to prefer certain words? And, more importantly, what kind of influence might AI have on language in the future? The answer is actually very human.
AI in everyday life
While AI seems to have appeared overnight, the technology has been around in subtle ways for years. The facial recognition that unlocks our phones is a form of AI. When a streaming service recommends a new series to watch, that’s also AI.
However, these are narrowly focused systems designed to do a specific thing. Generative AI, which seems to think for itself, has appeared only recently. Of course, what AI really does is analyze a very large amount of human-generated information, and use the data to spit out answers to our queries. It doesn’t actually understand what it’s telling us.
The people behind AI’s vocabulary
The process of rating AI responses, as part of training the software, is called “reinforcement learning from human feedback” (RLHF), and this provides the first clue about why ChatGPT favors certain words.
It’s not cheap to employ a lot of testers to rate thousands of responses from AI. So, as they often do, big companies outsource these jobs to lower-income countries, often places where English is not the first language. It’s quite common for non-native speakers of any language to speak more formally than native speakers usually do. After all, slang isn’t typically taught in language courses.
Certain regions also prefer certain words and expressions. Consider that the same room is called the “washroom” in Canada, the “bathroom” in the U.S. and the “loo” in Britain. Together, these factors lead each nation to develop its own list of the most common words.
One country that offers cheap labor is Nigeria, and in Nigerian English, “delve” is a relatively common word in professional settings. So, when Nigerian testers tell AI which responses sound genuine, their own linguistic preferences come through.
Natural evolution: AI or not
As AI-generated content becomes more common, will it also affect the way we speak to each other? This has to do with the natural evolution of language itself. In some ways, AI has already left its mark: expressions such as “large language model” (LLM), “generative AI” (GenAI) and even “generative pre-trained transformer” (GPT), for example, have become familiar.
Of course, language is a living thing, often influenced by current events (“social distancing” wasn’t a common expression before 2020) and by new technology. In the age of streaming, we don’t “tape something” that’s on TV anymore. On the other hand, we still “hang up” at the end of a phone call even though the physical hook disappeared long ago. And, of course, each generation has its own slang.
The point is that language is always updating itself, and how we use language is influenced by who we talk to and by the media we consume. As AI-generated media spreads, could that influence the evolution of our own vocabulary? It’s possible; even likely. However, while all languages evolve, when it comes to AI, not all languages are the same.
AI and smaller languages
English content dominates the raw data that AI is trained on, so AI works best in English. However, less than five percent of the world’s population speaks English as a first language. If the world depends on AI more and more, and AI favors English, what will that mean for other, particularly smaller, languages?
Icelandic linguist Dr. Linda Heimisdóttir says the rise of AI comes with the risk of the digital death of smaller languages such as her own, which has just 350,000 speakers. One reason is ease of use. Apple’s Siri, for example, doesn’t understand Icelandic, so it’s easier to use English commands. A Google search in Icelandic won’t produce many results. If you’re an Icelandic teenager and want to make the most of AI, you will use English a lot more than your own language. This is how the extinction of a language begins.
However, this doesn’t necessarily mean the end of linguistic diversity. Heimisdóttir is optimistic that AI developers can correct this bias if they integrate smaller languages early on. She has shown that this is possible through her own partnership with OpenAI. If such efforts succeed, Heimisdóttir believes, “The future of linguistic diversity is bright.”
Balancing AI and human vocabularies
Anyone who wants to keep AI-preferred words out of their everyday speech can learn to identify such phrases and actively avoid using them. This kind of awareness may also help people spot AI-written content more easily. And yet, languages are always changing, influenced by a lot more than just AI. So, since English is the world’s lingua franca, maybe a little Nigerian influence isn’t so bad?
This article originally appeared on https://getfreewrite.com/