DeepMind CEO Demis Hassabis says Google will eventually combine its Gemini and Veo AI models

In a recent appearance on Possible, a podcast co-hosted by LinkedIn co-founder Reid Hoffman, Google DeepMind CEO Demis Hassabis said that Google plans to eventually combine its Gemini AI models with its Veo video-generating models to improve the former's understanding of the physical world.

“We have always built Gemini, our foundation model, to be multimodal from the start,” said Hassabis, “and the reason we did that [is because] we have a vision for this idea of a universal digital assistant, an assistant that actually helps you in the real world.”

The AI industry is gradually moving toward “omni” models, if you will – models that can understand and synthesize many forms of media. Google’s newest Gemini models can generate audio as well as images and text, while OpenAI’s default model in ChatGPT can natively create images, including Studio Ghibli-style art. Amazon has also announced plans to launch an “any-to-any” model later this year.

These omni models require a lot of training data – images, videos, audio, text, and so on. Hassabis implied that Veo’s video training data comes largely from YouTube, a platform Google owns.

“In short, by watching YouTube videos – a lot of YouTube videos – [Veo 2] can, you know, sort out the physics of the world,” said Hassabis.

Google previously told WAN that its models were “possibly” trained on “some” YouTube content, in accordance with its agreement with YouTube creators. Google reportedly broadened its terms of service last year, in part so the company could tap more data to train its AI models.
