Did Google’s Gemini Pull off a Fake?
Barely a day after Gemini launched to a mixed response, Google finds itself in the eye of a storm
When OpenAI launched ChatGPT in November 2022, the world sat up and took notice. That the same cannot be said for every successive launch of GenAI chatbots is understandable as the results would at best be incremental. And Google’s latest Gemini AI model also received a lukewarm reception. However, what followed thereafter could be a disaster in the making.
A report published by TechCrunch says users could become less confident about Gemini, and even question Google’s integrity, because a hands-on video of the AI chatbot that went viral with over a million views might have been faked. To bring readers abreast of what went on, take a look at this YouTube video.
The video and the questions around it
Having watched the video, we must say there’s a lot to like about it. It highlights some cool interactions with Gemini, including one where a drawing is correctly described and another where, based on certain cues, the chatbot correctly identifies a country. The video depicts how the multimodal AI model understands and mixes language and visual cues.
However, the problem that could leave Google with egg on its face is that the video could be a fake. Tech columnist Parmy Olson, who has written for The Wall Street Journal and Forbes, says on X (formerly Twitter), “In its YouTube description Google also admits the video is edited for latency – which makes it look like the model is responding more quickly than it is.”
She also provides a link to her article for Bloomberg (now throwing up a 404 error). However, TechCrunch quotes extensively from it to suggest that while Gemini could actually do some of the things that the video suggests, it may not have done them the way the video implies. Hence the need for the note in the video description, where Google itself cautions us about the latency edits.
Is Gemini smart or are those behind it smarter?
“In actuality, it was a series of carefully tuned text prompts with still images, clearly selected and shortened to misrepresent what the interaction is actually like. You can see some of the actual prompts and responses in a related blog post — which, to be fair, is linked in the video description, albeit below the “…more”,” the article in TechCrunch says.
The article specifically deals with a scene in the video where a hand silently makes gestures and Gemini responds with “I know what you’re doing! You’re playing Rock, Paper, Scissors!” However, the doubters believe that the chatbot doesn’t reason from the individual gestures, and gets there only when shown all three at once and prompted with “What do you think I am doing? Hint: It’s a game.”
Personally, we do not think this is all that bad coming from a chatbot. But Olson and those at TechCrunch note that these interactions do not feel real. In fact, the articles point to a few more examples of the same kind.
“They feel like fundamentally different interactions, one an intuitive, wordless evaluation that captures an abstract idea on the fly, another an engineered and hinted interaction that demonstrates limitations as much as capabilities. Gemini did the latter, not the former. The “interaction” shown in the video didn’t happen,” is how the article describes it.
Google has responded to the criticism – but is it too late?
Incidentally, Google has responded to this criticism. Oriol Vinyals, VP of Research and Deep Learning Lead at Google DeepMind and Gemini co-lead, took to X (formerly Twitter) to suggest that there was no dearth of transparency on Google’s part. To prove this point, he even shared a blog post that described the making of the video.
“The video illustrates what the multimodal user experiences built with Gemini could look like. We made it to inspire developers. We gave Gemini sequences of different modalities — image and text in this case — and had it respond by predicting what might come next. Devs can try similar things when access to Pro opens on 12/13. The knitting demo used Ultra,” he said.
Of course, it may take some time before the truth eventually comes out, as it must. Google has promised to release the AI Studio with Gemini next week. However, the company could have saved itself some embarrassment had it released the blog post alongside the video as a form of ready reckoner.
Meanwhile, it would be worthwhile to find out whether Bloomberg, in its own wisdom, took down the post by Olson.