ChatGPT has been hailed as the future of AI text generators — and the end of high school writing assignments — for its ability to craft believable, oftentimes stylized human speech from prompts. The text AI works by drawing from examples of human language cataloged online. That includes the worst of people’s language, too, like hate speech, violence, or sexual abuse. Robots turning out racist and sexist happens so consistently it’s practically a punchline, and we only have humanity’s own biases and language to blame.
Despite the public impression that programs like ChatGPT work magically on their own, culling the internet for everything they need to know, human labor is still needed to keep AIs from generating inappropriate content by teaching them to recognize bad stuff. According to an investigation by Time, the bot’s parent company, OpenAI, paid data labelers in Kenya less than $2 per hour to review dangerous content to help ChatGPT recognize violence and hate speech. OpenAI confirmed to Time that Sama employees in Kenya contributed to a tool it was building to identify toxic content, which was eventually built into ChatGPT. They also said some workers earn more than $2 hourly and that “classifying and filtering harmful [text and images] is a necessary step in minimizing the amount of violent and sexual content included in training data” for AI systems.
It’s the latest example of the human cost of content moderation and the unseen, oftentimes exploitative working conditions that power multi-billion-dollar companies. A 2017 investigation by Rolling Stone detailed the mental toll on workers for Facebook and Microsoft who were tasked with looking at beheadings, suicides, and other violent content for hours a day. Earlier reporting by Time also revealed that Facebook content moderators in Africa were paid as little as $1.50 per hour.
Four employees described being “mentally scarred” by the work of data labeling. One Sama worker who read text for OpenAI during that time said he was traumatized by reading a graphic description of a man having sex with a dog in front of a young child. “That was torture,” he said. “You will read a number of statements like that all through the week. By the time it gets to Friday, you are disturbed from thinking through that picture.”
Documents revealed another Sama employee had read an erotica scene of a villain raping Batman’s sidekick Robin. The scene began nonconsensually, but eventually Robin reciprocated, leaving the worker confused about whether to categorize it as sexual violence or not. The employee asked OpenAI’s researchers for advice. A reply, if it came, was not logged in the document reviewed by Time.
Employees told Time they were entitled to counseling, but that the sessions were unhelpful and rare because of work productivity demands. Two said they were only offered the chance to attend group sessions, and one said Sama denied their requests to see a counselor one-on-one. Sama denied that only group sessions were available, saying, “We take the mental health of our employees and those of our contractors very seriously.”
In order to label offensive content for ChatGPT, OpenAI sent tens of thousands of text samples to an outsourcing firm in Kenya called Sama, starting in November 2021, Time reported. Junior data labelers, who comprised the majority of the workers hired for the contract, were paid $170 monthly, with $70 monthly bonuses for viewing content that was explicit in nature. After taxes, workers took home between $1.32 and $1.44 per hour, according to contracts. More senior labelers could make up to $2 per hour if they met all their targets. A Sama spokesperson told Time workers earned between $1.46 and $3.74 per hour after taxes.
In February 2022, Sama began working on a new, image-based project for OpenAI, seemingly unrelated to the text-only ChatGPT. Sama was collecting and labeling sexual and violent images, some of which were illegal in the U.S. Later that month, a billing document showed Sama sent OpenAI a batch of 1,400 images. These included depictions of child sexual abuse, violent sex acts, and graphic violence. Sama canceled its contract weeks after that. The company said in a statement to Time that the request for illegal images had come from OpenAI after Sama had already begun the project. “The East Africa team raised concerns to our executives right away. Sama immediately ended the image classification pilot and gave notice that we would cancel all remaining [projects] with OpenAI,” a Sama spokesperson said.