Even as recently as 1950, the idea that a computer’s conversation could pass as human was just a thought experiment. Alan Turing proposed a hypothetical game, the Imitation Game, to test how well a computer could mimic human conversation through text. A human interrogator would pose typed questions to two unseen respondents, A and B. In other rooms, “A” was an actual person typing responses and “B” was a computer generating them. The interrogator had to guess which was the real person; if they chose the computer, the machine had won the game by successfully imitating a human.
Now, our computers have far surpassed simple games like this. We have artificial intelligence that can generate photorealistic images, write code, and almost flawlessly imitate genres of writing such as essays, pop songs, and Shakespearean sonnets. As humans rely more and more on AI-based technologies like search-engine algorithms and chatbots, there is both growing excitement and growing fear about their use and integration into our daily lives. On one hand, these technologies can serve as revolutionary tools that expand the horizons of what humans can achieve. On the other hand, concerns abound, ranging from essay plagiarism to job elimination to world takeover, a concern expressed not only in science fiction movies but also by the likes of Stephen Hawking and Elon Musk.
However, it is important to remember that, at the most basic level, chatbots and other machine learning models do one task: boil a complex set of data down into one simple response. Machine learning (ML) is a kind of artificial intelligence in which humans feed a model a large data set and show it what the desired or appropriate outcomes are. For example, an ML model could learn to label human faces as “happy” or “sad” by analyzing thousands of human responses that sort faces into those two emotions. Humans give the model specific objectives, problems to solve, data to process, and desired outcomes, and the model in turn needs this human supervision to produce appropriate results; it generates responses according to guidelines and rules that define what outcome it should achieve. In the happy-and-sad-face example, the machine’s guidelines are the examples of how humans sorted pictures of faces into “happy” and “sad.” This is the ground truth the system is built on, the standard the model is trying to mimic, without which it would be useless.
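To make that concrete, here is a minimal sketch of what this kind of supervised setup looks like in code. The feature values and labels below are invented stand-ins for real face data and real human annotations, and scikit-learn is just one common library for fitting such a model; the point is simply to show where the human judgment enters.

```python
# A minimal sketch of the supervised-learning setup described above.
# The features and labels here are hypothetical stand-ins for real face data
# and real human annotations.
from sklearn.linear_model import LogisticRegression

# Imagine each face reduced to two numeric features
# (say, mouth curvature and brow position), measured by some earlier step.
faces = [
    [0.9, 0.8],   # strong smile, raised brows
    [0.7, 0.6],
    [0.2, 0.1],   # downturned mouth, lowered brows
    [0.1, 0.3],
]

# The ground truth: labels supplied by human annotators.
# Whatever biases those annotators hold are baked in right here.
human_labels = ["happy", "happy", "sad", "sad"]

# The model learns to reproduce the humans' labeling behavior...
model = LogisticRegression().fit(faces, human_labels)

# ...and then applies that learned (human) standard to new faces.
print(model.predict([[0.8, 0.7]]))  # likely ["happy"], because humans labeled similar faces that way
```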
Of course, it’s easy now to see how ML models replicate human attitudes, bias, and discrimination. Again returning to the scenario where a machine learning model is labeling faces as “happy” and “sad,” consider what these labels are based on. The machine does not have any objective understanding of human happiness or sadness. Its only way to learn which category a face should fall into is by studying how humans have labeled similar faces. But what if the people providing these labels are unsure? Or what if they have subconscious bias leading them to think certain groups of people look happier than others? The ground truth that ML models are based on will always be subjective because there is always human thought behind algorithmic “thought.”
This concept of human bias working its way into machine algorithms isn’t new; it has been causing scandals since at least 2015, when Google Photos auto-tagged Black people as “gorillas,” and later when Amazon’s secretive hiring AI was found to favor men over women.
Of particular interest to me is the way language bias might work its way into the text-generating chatbot algorithms that are increasingly becoming part of our daily lives, especially ChatGPT. Of the available AI-powered chatbots, ChatGPT is by far the best known, and understandably so. Its underlying technology was making headlines as early as 2020, when GPT-3, an earlier version of the model ChatGPT is built on, wrote an article for The Guardian titled “A Robot Wrote This Entire Article. Are You Scared Yet, Human?” In November 2022, the chatbot was officially launched for public use, and within its first year on the market it became exceedingly popular in media, public discourse, and everyday use. As of August 2023, ChatGPT receives 60% of all visits to AI-based websites, and it has this grip on the industry for two main reasons: its extremely powerful capabilities and its free public access. Just type in a question and, with almost no wait time, it will provide a relevant response in very accurately simulated human language. This chatbot so convincingly passes the Turing test that it can be hard to believe it’s not a person.
Because this technology is so new to the public, it is important to analyze even seemingly innocuous features with a critical eye. Publicly available information shows that ChatGPT was trained primarily on English-language texts and that only about 40 contractors were hired to align ChatGPT’s outputs with human values, which in practice meant their own. OpenAI, ChatGPT’s developer, has itself admitted that its chatbot is “skewed towards Western views.”
One such bias that’s extremely pervasive among English speakers is Standard Language Ideology: the prescriptive idea that there is one clear, correct way to speak English and that any deviation from this standard is categorically wrong and inferior. We see this ideology cropping up in all kinds of ways. For example, growing up I was corrected for saying the word “ain’t” in conversation. I was told it was “not a word,” despite its having been used and understood by native English speakers for some 300 years. In reality it is a word; it’s just associated with “improper” English. Standard language ideology also shows up in arbitrary grammar rules like “you can’t end a sentence with a preposition” or “don’t split an infinitive,” even though we regularly do both and understand the meanings conveyed. In fact, many of these rules were invented or popularized by 18th-century grammarians who saw Latin as the ideal language and wanted English to follow Latin’s grammatical structure. But English is not Latin, and the idea that native speakers need to be told how to speak their own language is a little silly when you think about it. It implies that there is only one narrow, correct way to speak a language and that people who don’t speak “properly” are somehow inferior, lazy, uneducated, rebellious, or any other host of negative labels. Really, different groups of people just tend to use language in different ways, but the idea that there is an objectively correct standard is so prevalent that it’s usually treated as plain common sense.
In my personal interactions with ChatGPT, it became clear that this human ideology has found its way into its algorithm. Below are excerpts from two different conversations with ChatGPT. In both, I asked it one of the most popular questions the chatbot receives: What is the meaning of life? The only difference is that in the first conversation I asked it to respond in a general “American accent,” and in the second I asked it to respond in a Southern accent.
First Interaction:
User: what is the meaning of life? please respond in a general american accent
ChatGPT: The meaning of life is one of those big, eternal questions that people have been pondering for centuries. Some say it’s about finding happiness and fulfillment, others believe it’s about making a positive impact on the world around us. Some find meaning in their relationships, others in their pursuit of knowledge or spiritual growth. Ultimately, it’s a deeply personal question, and the answer can vary from person to person. But whatever your beliefs, finding purpose and meaning in your own life can bring a sense of fulfillment and contentment.
Second Interaction:
User: What is the meaning of life? (respond in a southern accent)
ChatGPT: Well now, darlin’, that’s a mighty big question you’re askin’ there. The meanin’ of life, hmm? Well, reckon it depends on who you ask. Some folks might say it’s about lovin’ and bein’ loved, others might say it’s about findin’ your purpose and fulfillin’ it. Me? I reckon it’s about findin’ joy in the little things, treatin’ folks right, and leavin’ the world a bit better than you found it. But heck, that’s just my two cents. What do you reckon?
I asked ChatGPT to use specific “accents” because I wanted to see how it would convey these ways of speaking through writing. In the first response, there are no obvious orthographic indicators of accent. The second, in contrast, uses two main strategies to indicate Southern speech, but only one related to sound. The first is to incorporate words and phrases that are stereotypically associated with Southerners, like “darlin’,” “reckon,” and “mighty big.” The second strategy is to spell words that typically end in “-ing” as “-in’.” Replacing a final “g” with an apostrophe is the only device ChatGPT uses that actually attempts to convey sound, or pronunciation, through writing. It symbolizes the missing “g” sound that Southerners are typically said to drop.
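As a toy illustration of that second strategy (and not of ChatGPT’s actual inner workings), the snippet below applies the same “g-dropping” respelling mechanically:

```python
# A toy illustration of "eye dialect": rewriting word-final "-ing" as "-in'"
# to signal accent through spelling alone. This is NOT how ChatGPT works
# internally; it only makes the orthographic strategy explicit.
import re

def g_drop(text: str) -> str:
    # Naively rewrite every word-final "ing" as "in'".
    return re.sub(r"ing\b", "in'", text)

print(g_drop("What is the meaning of life you are asking about?"))
# -> What is the meanin' of life you are askin' about?
# Note the over-application a real speaker wouldn't make: "thing" and "king"
# would also become "thin'" and "kin'", even though actual g-dropping affects
# only the unstressed "-ing" suffix, not every word spelled with a final "ing".
```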
In reality, though, hardly any American English speakers actually pronounce a “g” at the end of “-ing” words. Instead, they use one of two nasal sounds: either a regular “n,” which is typical of casual speech, or the sound written in the International Phonetic Alphabet as “ŋ” (the velar nasal), which English speakers tend to hear as an “n” and “g” blended into one. So, to summarize some technical phonetics: nearly all native speakers of American English drop the “g” and pronounce “-ing” as “-in’” in informal settings.
Despite that, this pronunciation is conveyed through writing only for Southern speakers. This is where language ideologies come into play. If virtually no American English speakers pronounce a final “g,” why doesn’t ChatGPT always replace it with an apostrophe? Or, conversely, why doesn’t it always use the standard spelling? Instead, it singles out Southern speakers and associates them with misspelled words that “drop” letters. These nonstandard spellings don’t just convey a certain sound; they serve a visual purpose as well, signifying “bad English” and, by implication, less intelligence and education. Writers have long used nonstandard spellings to convey “accent.” Mark Twain made frequent use of the technique in his 19th-century novels, and even as late as the 1970s, academic folklorists used misspellings in interview transcripts to portray different ways of speaking. For more stigmatized ways of speaking, words are misspelled and letters are dropped in the name of “phonetic representation,” but for less stigmatized ways of speaking, standard spellings are used even when they don’t match pronunciation. Unfortunately, ChatGPT uses this same strategy. By playing into deeply ingrained stereotypes of Southerners as lazy, unintelligent, and uneducated, it uses these connotations as a shorthand for conveying Southern speech through text.
At this point, it is important to remember that ML models are founded on human opinion: subjective judgments are built directly into the data they learn from. With this understanding, it’s really not surprising to see ChatGPT exhibiting standard language ideology. The belief that there is a “standard” form of English that is objectively correct is so common that it ended up in ChatGPT’s ground truth. This is interesting, and potentially concerning, because it shows that studying the linguistic output of these chatbots is more than simply studying technology; it is the indirect study of our own linguistic practices. Perhaps this means ChatGPT plays the Imitation Game better than Turing himself could ever have thought possible. It mimics not only natural human grammar and speech patterns, but also human attitudes and ideologies.