ChatGPT Went Rogue

13-Aug-2024

Last week, OpenAI published the GPT-4o "system card," a report that details "key areas of risk" for the company's latest large language model and how the company hopes to mitigate them. In one terrifying instance, OpenAI found that the model's Advanced Voice Mode, which allows users to speak with ChatGPT, unexpectedly imitated users' voices without their permission, Ars Technica reports.

"Voice generation can also occur in non-adversarial situations, such as our use of that ability to generate voices for ChatGPT's advanced voice mode," OpenAI wrote in its documentation. "During testing, we also observed rare instances where the model would unintentionally generate an output emulating the user's voice."

An appended clip demonstrates the phenomenon, with ChatGPT suddenly switching to an almost uncanny rendition of the user's voice after shouting "No!" for no discernible reason. It's a wild breach of consent that feels like it was yanked straight out of a sci-fi horror movie.

"OpenAI just leaked the plot of Black Mirror's next season," BuzzFeed data scientist Max Woolf tweeted.

Voice Clone

In its "system card," OpenAI describes its AI model's ability to create "audio with a human-sounding synthetic voice." That ability could "facilitate harms such as an increase in fraud due to impersonation and may be harnessed to spread false information," the company noted.

OpenAI's GPT-4o has the unsettling ability to imitate not only voices but also "nonverbal vocalizations" like sound effects and music.

When it picks up noise in the user's audio input, ChatGPT may decide that the user's voice is relevant to the ongoing conversation and be tricked into cloning it, not unlike how prompt injection attacks work.

Fortunately, OpenAI found that the risk of unintentional voice replication remains "minimal." The company has also locked down unintended voice generation by limiting the user to the voices OpenAI created in collaboration with voice actors.

"My reading of the system card is that it’s not going to be possible to trick it into using an unapproved voice because they have a really robust brute force protection in place against that," AI researcher Simon Willison told Ars.

"Imagine how much fun we could have with the unfiltered model," he added. "I’m annoyed that it’s restricted from singing—I was looking forward to getting it to sing stupid songs to my dog."