ChatGPT threatens language diversity. More needs to be done to protect our differences in the age of AI

The Conversation, 6 Mar 2023

The buzz around artificial intelligence (AI) technologies like ChatGPT is palpable. People are both optimistic about and frightened by the possibilities of these tools. Clearly, these technologies will change how people write. But in terms of what people write, these technologies seem to be embracing the status quo.

In fact, the way these tools are currently built appears to homogenise writing – making everything sound the same. And writing that sounds the same is not just boring; it also perpetuates inequity.

When writing tools prioritise one way of writing over another, they reinforce existing hierarchies that unfairly position Standard American English (SAE) and the Queen’s English over other languages and ways of writing.

How does ChatGPT work?

Technologies like ChatGPT are called large language models (LLMs). LLMs provide textual responses to human commands, by using machine learning to study patterns of words in a massive archive of texts.

Crucially, however, ChatGPT does not know the meaning of words. ChatGPT generates definitions by sorting through a mountain of definitions and then collating those into a single response that suits the context of a query.

In other words, without meaning as its guide, ChatGPT responds to queries by relying on context clues, stylistic structures, writing forms, linguistic patterns and word frequency.

This functionality means that, by default, ChatGPT perpetuates dominant modes of writing and language use while sidelining less common ones.
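The frequency effect described above can be illustrated with a deliberately tiny sketch. This is not how ChatGPT is implemented: real LLMs use neural networks trained on vast text collections, whereas this toy model only counts which word follows which. The corpus, the function names `train_bigrams` and `most_likely_next`, and the spelling example are all hypothetical, chosen only to show how a more frequent form can become the default.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, how often each following word appears."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus: one spelling appears three times, the other once.
corpus = [
    "we analyze the color",
    "they analyze the color",
    "you analyze the color",
    "we analyse the colour",
]

counts = train_bigrams(corpus)

# With no further instruction, the model picks the majority form.
default_choice = most_likely_next(counts, "the")
print(default_choice)

# The minority form is still in the data; it just never wins by default.
print(counts["the"]["colour"])
```

In this sketch the less common spelling is present in the counts but is never chosen unless the selection rule is changed, which loosely mirrors the point that a less common way of writing needs a special request.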

Some New York City high schoolers reportedly don’t like ChatGPT-created lessons, calling them ‘biased’ and ‘very bland’ https://t.co/gwZ64yVoE2

— Insider Business (@BusinessInsider) February 8, 2023

Erasing diversity

Dominant modes of writing don’t become dominant by accident. They become dominant because one social group wants to assert power over another social group.

There is not, for example, one kind of English. There are many Englishes.

The decision to prioritise Standard American English in many US classrooms, for example, means that speakers of Black English – a language with its own grammar, lexicon and remarkable history of resistance – are penalised and shamed for writing as they speak.

Similarly, in Aotearoa New Zealand, the Queen’s English became dominant not because it’s intrinsically better than te reo Māori. Rather, European colonisers wanted to stamp out Māori culture, and writing in the Queen’s English became a key tool for furthering that objective. In the 20th century, students were regularly beaten for speaking Māori in schools.

Going against the default

Supporters of ChatGPT will be quick to note that ChatGPT can read, analyse and generate content in many languages, including in Black English and te reo Māori.

But the concern is not about what ChatGPT can do.

It’s about what its default settings are. It’s about how ChatGPT is configured to treat some forms of writing as normal, typical and expected. And it’s about how ChatGPT requires a special request to generate non-normative forms of writing.

This problematic default behaviour also occurs in ChatGPT’s sister programme, Dall-E 2. This image-generating AI was asked to create an image for this article based on this prompt: “close up photo of hands typing on a laptop.” The programme created four images. All had white masculine hands.

The programme needed a more specific prompt to generate an image that included a person of colour because even the ways that AI visualises writing are dominated by white men.

AI-created image depicting a close-up of someone writing on a keyboard. Initial efforts to create this image returned images of white male hands. Collin Bjork, Author provided.

Ultimately, this kind of algorithmic bias continues to make white English-speaking men the standard of writing culture, while ushering everyone else to the margins.

How did it get like this?

It’s no surprise that ChatGPT’s default functionality seems to prioritise forms of English writing developed by white people. White English-speaking men have long dominated many writing-intensive sectors, including journalism, law, politics, medicine, computer science and academia.

These white English-speaking men have collectively written billions of words, many times more than their colleagues of colour. The sheer volume of words these authors have written means that their writing likely makes up the majority of the text ChatGPT has learned from, even though ChatGPT’s parent company, OpenAI, doesn’t publicly reveal its source material.

So when users ask ChatGPT to generate content in any of these disciplines, the default output is written in the voice, style and language of those same white English-speaking men.

Challenging the norm

Some people will say that we need defaults and standards in writing. They argue that we need to teach people to write in the Queen’s English or SAE so that people don’t miss out on jobs and promotions because they write in a different way.

But that line of thinking just means capitulating to workplace prejudice and reinforcing an unjust system through our participation in it. Instead, other scholars say we need to challenge those unfair writing standards and encourage writers to embrace the rich rhetorical possibilities in their linguistic diversity.

Good talk with journalist Shanti Mathias about ChatGPT and power. One thing we discussed was AI’s tendency to make writing sound like it’s written by white (English-speaking) dudes. https://t.co/WVWof3jaCQ

— Collin Bjork (@collin_bjork) February 9, 2023

Educators who want to embrace linguistic diversity might be tempted to ban text-generating AI from their schools and universities.

But it’s worth remembering that writing itself is a technology that has been, and still is, used to further inequality. Literary scholar Alice Te Punga Somerville calls this “the inextricability of writing from historical and ongoing violence.”

In response to this threat, however, Professor Somerville does not advocate abandoning writing altogether. Rather, she insists on using the tool of writing critically and creatively to resist oppression.

Taking her lead, educators might instead encourage students to develop new ways of deploying these tools to compose a more equitable future. Doing so means, as Professor Vershawn Young says in Black English,

that good writin gone look and sound a bit different than some may now expect. And another real, real good result is we gone help reduce prejudice.

Collin Bjork, Senior Lecturer, Massey University

Article source: This article is republished from The Conversation under a Creative Commons license. Read the original article.

Header image source: Collin Bjork, Author provided.
