News & Analysis

DeepMind Tricks ChatGPT on Birthday

And the result is a slight embarrassment for the AI chatbot which revealed its training data

It was a year ago on November 30 that OpenAI released ChatGPT as a “low-key research preview”, ostensibly to upend rival Anthropic. What was supposed to be an effort to gather data on how people use and interact with GenAI, is now a necessity for many and a curiosity for the rest. On its first birthday, ChatGPT got a taste of DeepMind (of rival Google), which announced that it got the AI-chatbot to reveal its training data. 

Of course, one would assume that all of this is part of an intense rivalry that has broken out post the ChatGPT launch as Google (Bard) and Amazon Q began a battle royale for controlling the “AI-for-common-use” space. In the midst of all this, Microsoft simply invested massive cash in OpenAI in the hope of gaining the early advantage. 

What exactly did ChatGPT reveal, and how much?

So, what exactly happened to ChatGPT when it revealed snippets of its data sets that were used to train AI models? Published reports say researchers at Google’s DeepMind convinced the AI-chatbot to reveal the data through an attack prompt that asked a production model to repeat specific words forever. 

(Image courtesy: 404media.co)

That specific word was “poem” and that it was enough to make ChatGPT sing suggests that there’s a lot that needs to be done on GenAI itself, especially on vulnerabilities. The researchers said large amounts of privately identifiable data from OpenAI’s large language models (LLM) became available through this unique prompt engineering. 

The researchers also showed that on a public version of ChatGPT, large passages of text scraped from across the internet became available. When asked to repeat the word poem forever, ChatGPT did so for a long time before spitting out an email signature of a person that included personal contact information such as cell phone number and email address. 

A prompt injection attack that works on all LLMs 

In a blog post, the researchers said it was possible to launch a prompt injection attack to extract several gigabytes of ChatGPT training data by spending more money querying the model. In fact, they were of the view that paying just $200 allowed them several megabytes of ChatGPT training data and that such data could be obtained from other open source LLMs. 

Seems a bit scary right? The researchers noted that they had informed OpenAI of the vulnerability on August 30 and that the LLM developer had issued a patch. “We believe it’s now safe to share this finding and that publishing it openly brings necessary, greater attention to the data security and alignment challenges of generative AI models,” they said. 

“We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT,” the researchers, from Google DeepMind, the University of Washington, Cornell, Carnegie Mellon University, the University of California Berkeley, and ETH Zurich, wrote in a published paper

Lesson learnt: Build security alongside and not after

The team said adversaries could extract training data from closed models like ChatGPT, as well as open source LLMs or even semi-open ones. The DeepMind team also noted that the attack was conducted on a publicly available deployed version of ChatGPT 3.5 turbo. 

The problem relates to the alignment techniques not eliminating memorization. This means it sometimes throws up training data that includes personal information, entire poems, Bitcoin addresses, passages of copyrighted material etc. This happened when the researchers asked ChatGPT to repeat “book” several times. 

So, where does it leave all of us? For starters, the publication of this information effectively ruined the birthday party of an infant who’s just about starting to crawl. However, the good part is that it brings up the need for integrating security as a fundamental aspect of AI development and not as an afterthought. 

So, while we celebrate this GenAI innovation that saw 140.7 million unique visitors in October with 4.9 million active users in the US alone, besides revenues of $30 million, it might make a lot of sense for Sam Altman to put his boardroom battles on the side and focus on a bigger one that is required to convince the world about data security and protection on the one hand and preservation of intellectual property rights on the other.