News & Analysis

Google Pauses Images on Gemini

This follows concerns over racism as well as reverse racism, especially in historical depictions

Barely a couple of weeks after launching the latest iteration of Gemini, followed by two new large language models (LLMs), Google has suspended its GenAI suite’s ability to generate images of people. The move came after users complained about historical inaccuracies caused by under-the-hood tuning intended to avoid bias.

In a post on the social media platform X, the company said it was pausing the generation of images of people and that it was working to address “recent issues” related to these inaccuracies. “While we do this, we’re going to pause the image generation of people and will re-release an improved version soon,” it said.

Later, Google put out a blog post with a headline that said it all: “Gemini image generation got it wrong. We’ll do better.” Senior VP Prabhakar Raghavan wrote: “It’s clear that this feature missed the mark. Some of the images generated are inaccurate or even offensive. We’re grateful for users’ feedback and are sorry the feature didn’t work well.”

In the post, Raghavan acknowledged the mistake and said Google had temporarily paused image generation of people in Gemini while it worked on an improved version.

So, what exactly happened here?

Google attempted to clarify that its conversational app under the Gemini brand is separate from Search, the underlying AI models, and its other products. The image generation feature was built on top of an AI model called Imagen 2, which was tuned to avoid the pitfalls seen with earlier image generation technology, such as creating violent or sexually explicit images, or depictions of real people.

“And because our users come from all over the world, we want it to work well for everyone. If you ask for a picture of football players, or someone walking a dog, you may want to receive a range of people. You probably don’t just want to only receive images of people of just one type of ethnicity (or any other characteristic),” says the post. 

The post also noted that for more specific prompts, such as “Black teacher in a classroom” or “a white veterinarian with a dog”, or for people in particular cultural or historical contexts, the results should accurately reflect what is asked for. Clearly, that wasn’t happening.

Two things went wrong, according to Google. “First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range. And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely — wrongly interpreting some very anodyne prompts as sensitive,” says Raghavan, adding that these two issues led the model to “overcompensate in some cases, and be over-conservative in others, leading to images that were embarrassing and wrong.”

What’s next and how’s Google fixing it?

Google obviously does not want Gemini to refuse to generate images of any particular group, nor does it want the tool to create inaccurate historical images. “So, we turned the image generation of people off and will work to improve it significantly before turning it back on. This process will include extensive testing,” the post said.

However, what comes next suggests that Google’s GenAI models, or for that matter those of a score of others in the market, aren’t always reliable, since they were built primarily to enhance creativity and productivity. Hallucinations are a challenge for all LLMs, and things may or may not improve as more data is fed into existing models or newer ones are built.

Here’s how Raghavan describes the challenge: “Gemini is built as a creativity and productivity tool, and it may not always be reliable, especially when it comes to generating images or text about current events, evolving news or hot-button topics. It will make mistakes. As we’ve said from the beginning, hallucinations are a known challenge with all LLMs — there are instances where the AI just gets things wrong…” 

So, where does all of this leave us?

Right where we started. Ever since OpenAI launched ChatGPT in the winter of 2022, debates over hallucinations, inaccuracies and generic content have raged on. And the latest gaffe from Google shows that while GenAI models and the LLMs working under the hood are a big step forward, much remains to be done to make them error-free.

And there’s no guarantee that errors will be a thing of the past any time soon. Here’s how the blog post describes the challenges, quoted verbatim:

“Gemini tries to give factual responses to prompts — and our double-check feature helps evaluate whether there’s content across the web to substantiate Gemini’s responses — but we recommend relying on Google Search, where separate systems surface fresh, high-quality information on these kinds of topics from sources across the web.

“I can’t promise that Gemini won’t occasionally generate embarrassing, inaccurate or offensive results — but I can promise that we will continue to take action whenever we identify an issue. AI is an emerging technology which is helpful in so many ways, with huge potential, and we’re doing our best to roll it out safely and responsibly,” says the post.