Gemini 2.0 - A Model for "Everything"
Google unveiled Gemini 2.0, an experimental AI model heralded as a transformative step toward a "universal assistant."
Capable of autonomously navigating websites, the model aims to empower users to develop advanced AI agents.
CEO Sundar Pichai described it as Google's most capable creation yet, designed for the "agentic era."
We’re kicking off the start of our Gemini 2.0 era with Gemini 2.0 Flash, which outperforms 1.5 Pro on key benchmarks at 2X speed (see chart below). I’m especially excited to see the fast progress on coding, with more to come.
Developers can try an experimental version in AI… pic.twitter.com/iEAV8dzkaW
— Sundar Pichai (@sundarpichai) December 11, 2024
This launch underscores Google's commitment to leading the AI race amidst fierce competition from industry giants like Meta and Microsoft.
The Model Will Be Rolled Out Across Products
Pichai announced that Gemini 2.0, featuring advanced multimodal capabilities including native image and audio output, will soon be integrated across Google's suite of products.
We're excited to introduce Gemini 2.0 - our most capable AI model yet - with 2.0 Flash Experimental.
Starting today, all Gemini users can now try out a chat-optimized version of Gemini 2.0 Flash Experimental, with enhanced performance on a number of key benchmarks and speed.… pic.twitter.com/HTIn1dDg7J
— Google Gemini App (@GeminiApp) December 11, 2024
This follows the December 2023 release of Gemini 1.0, touted as the first "natively multimodal" model capable of processing and responding to text, video, images, audio, and code queries.
The latest version reflects Google's drive to remain at the forefront of the competitive AI landscape.
Pichai noted:
“If Gemini 1.0 was about organizing and understanding information, Gemini 2.0 is about making it much more useful.”
Gemini 2.0, which debuts nearly 10 months after the intermediate 1.5 model, remains in experimental preview.
Currently, only the smaller, cost-effective 2.0 Flash variant is available, primarily to developers and testers.
Demis Hassabis, CEO of Google DeepMind, described the launch as a significant milestone for the company, despite its limited initial release.
Hassabis explained:
“It's as good as the current Pro model is. So you can think of it as one whole tier better, for the same cost efficiency and performance efficiency and speed. We're really happy with that.”
Other Gemini users still have access to 1.5 Flash, recognised for its speed and efficiency.
While our experimental models are safety-tuned, in alignment with our approach & guidelines, they are an early preview and might not work as expected. Additionally, some Gemini features won't be compatible with these models in their experimental state.
— Google Gemini App (@GeminiApp) December 11, 2024
Beyond Gemini 2.0: Google Announces a Plethora of Features
Google has outlined ambitious plans for its latest AI model, Gemini 2.0, which Pichai says will enhance the AI Overviews feature already available to one billion users.
Pichai noted that AI Overviews is rapidly becoming one of Google's most popular search tools.
With the integration of Gemini 2.0, the feature will be capable of handling complex, multi-step queries, such as solving mathematical equations and addressing multimodal questions.
Limited testing for the model began this week, but broader access to its reasoning capabilities is slated for early next year.
The model operates on Google's 6th-generation AI chip, Trillium, which debuted alongside the announcement.
According to the company, Trillium offers four times the performance and is 67% more energy-efficient than its predecessor.
Google Cloud customers now have access to this cutting-edge hardware.
Among the new features powered by Gemini 2.0 is "Deep Research," an advanced research assistant available within Gemini Advanced.
This tool leverages reasoning and long-context capabilities to compile detailed research reports.
We are investing in the frontiers of agentic capabilities with a few early prototypes. Project Mariner is built with Gemini 2.0 and is able to understand and reason across information - pixels, text, code, images + forms - on your browser screen, and then uses that info to… pic.twitter.com/zM1SKahg86
— Sundar Pichai (@sundarpichai) December 11, 2024
Hassabis remarked that these advancements set the stage for a transformative 2025:
“We really see 2025 as the true start of the agent-based era.”
Google also unveiled Project Mariner, an experimental Chrome extension capable of autonomously navigating web browsers, and introduced Jules, an AI agent designed to help developers identify and fix coding errors.
Another Gemini-powered feature, described as an "Easter egg" by Hassabis, is a gaming assistant capable of analysing a user's screen and offering suggestions to improve gameplay—a testament to the model's multimodal capabilities.
ICYMI: We're in our Gemini 2.0 era 🧵↓ https://t.co/w2pHRWutgJ
— Google Gemini App (@GeminiApp) December 12, 2024