Google admits AI viral video was edited to look better

Monitoring Desk

CALIFORNIA: A video showcasing the capabilities of Google’s artificial intelligence (AI) model, which seemed too good to be true, might be just that.

The Gemini demo, which has 1.6m views on YouTube, shows a remarkable back-and-forth where an AI responds in real time to spoken-word prompts and video.

In the video’s description, Google said all was not as it seemed – it had sped up responses for the sake of the demo.

But it has also admitted the AI was not responding to voice or video at all.

In a blog post published at the same time as the demo, Google revealed how the video was actually made.

Subsequently, as first reported by Bloomberg Opinion, Google confirmed to the BBC that the demo was in fact made by “using still image frames from the footage, and prompting via text”.

“Our Hands on with Gemini demo video shows real prompts and outputs from Gemini,” said a Google spokesperson.

“We made it to showcase the range of Gemini’s capabilities and to inspire developers.”

In the video, a person asks a series of questions to Google’s AI while showing objects on the screen.

For example, at one point the demonstrator holds up a rubber duck and asks Gemini if it will float.

Initially, the AI is unsure what material the duck is made of, but after the person squeezes it – and remarks that this causes a squeaking sound – the AI correctly identifies the object.

However, what appears to happen in the video at first glance is very different from how the responses were actually generated.

The AI was actually shown a still image of the duck, and asked what material it was made of. It was then fed a text prompt explaining that the duck makes a squeaking noise when squeezed, resulting in the correct identification.

In another impressive moment, the person performs a cups and balls routine – a magic trick in which a ball is hidden underneath one of three moving cups – and the AI is able to determine where the ball has moved.

But again, as the AI was not responding to a video, this was actually achieved by showing it a series of still images.

In its blog post, Google explained that it in fact told the AI where the ball was underneath the three cups, and showed it images representing the cups being swapped.

Google clarified that the demo was created by capturing footage from the video, in order to “test Gemini’s capabilities on a wide range of challenges”.