Technology company Google announced the launch of Gemini 2.0, the latest generation of AI models in its Gemini family, starting with an experimental release called Gemini 2.0 Flash.
Building on the success of Gemini 1.5 Flash, which became a favorite among developers, Gemini 2.0 Flash delivers improved performance while maintaining fast response times. Notably, the new model surpasses Gemini 1.5 Pro on key benchmarks at twice the speed. Additionally, Gemini 2.0 Flash introduces expanded capabilities, including support for multimodal inputs such as images, videos, and audio, as well as multimodal outputs like text paired with AI-generated images and steerable multilingual text-to-speech (TTS) audio. The model can also natively call tools such as Google Search, execute code, and invoke user-defined third-party functions.
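For developers, that native tool use is configured per request through the Gemini API. The sketch below, a minimal example based on the google-genai Python SDK released alongside Gemini 2.0, shows how a request might enable Google Search as a built-in tool; the API key placeholder and prompt are illustrative, and the field names follow launch-era documentation rather than a guaranteed stable interface.

```python
# Minimal sketch: enabling Google Search as a native tool via the
# google-genai Python SDK that shipped alongside Gemini 2.0.
# The API key placeholder is illustrative; field names follow the
# launch-era SDK and may change in later releases.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # key from Google AI Studio

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",  # experimental Gemini 2.0 Flash model ID
    contents="What were this week's biggest AI announcements?",
    config=types.GenerateContentConfig(
        # Granting the Google Search tool lets the model ground its
        # answer in live search results instead of training data alone.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```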
Currently available to developers through the Gemini API in Google AI Studio and Vertex AI, the experimental version of 2.0 Flash supports multimodal input with text output. Advanced features like text-to-speech and native image generation are accessible to early-access partners, with broader availability expected in January alongside additional model sizes.
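To illustrate what that access looks like in practice, here is a hedged sketch of a multimodal-input, text-output call using the same google-genai SDK; the image file name and prompt are hypothetical.

```python
# Sketch of a multimodal-input, text-output request to the experimental
# 2.0 Flash model; the image file and prompt are hypothetical.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Read a local image so it can be sent inline alongside a text prompt.
with open("chart.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Summarize what this chart shows.",
    ],
)
print(response.text)  # the experimental release returns text output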
To further support developers in creating dynamic, interactive applications, Google is also introducing a new Multimodal Live Application Programming Interface (API). This API enables real-time streaming of audio and video input and allows multiple tools to be combined within a single session.
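As a rough sketch of how this session-based API is used, the example below opens a live connection and streams text back. The method names follow the launch-era google-genai SDK documentation and should be treated as assumptions, since the API was brand new at the time of writing.

```python
# Rough sketch of a Multimodal Live API session using the google-genai
# SDK's async client; method names follow launch-era documentation and
# should be treated as assumptions.
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

async def main():
    # A live session is a stateful, bidirectional (WebSocket-backed)
    # connection; the config selects which modalities the model returns.
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        await session.send(input="Hello, Gemini!", end_of_turn=True)
        # Output arrives incrementally as the model generates it,
        # rather than in a single blocking response.
        async for message in session.receive():
            if message.text:
                print(message.text, end="")

asyncio.run(main())
```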
Starting today, users worldwide can try an experimental chat-optimized version of Gemini 2.0 Flash by selecting it from the model drop-down on desktop and mobile web platforms. The model will also be available on the Gemini mobile application in the near future.
Google Explores Gemini 2.0 Flash’s Capabilities Through Research Projects
Gemini 2.0 Flash introduces advanced capabilities that enhance user interactions, including multimodal reasoning, long-context understanding, complex instruction handling, planning, compositional function-calling, and seamless integration with native tools. Together with reduced latency, these features lay the foundation for a new generation of autonomous AI experiences.
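As one concrete illustration of the function-calling capability, the sketch below passes an ordinary Python function to the model as a tool; the google-genai SDK can infer a tool schema from the signature and docstring and execute the calls the model requests. The get_order_status function is hypothetical.

```python
# Hedged sketch of user-defined function calling with the google-genai
# SDK; get_order_status is a hypothetical application function.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

def get_order_status(order_id: str) -> dict:
    """Look up an order in a (hypothetical) order-tracking system."""
    return {"order_id": order_id, "status": "shipped"}

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Where is order 8123?",
    config=types.GenerateContentConfig(
        # Passing a plain Python callable lets the SDK derive the tool
        # schema and run the call the model requests before it composes
        # its final answer.
        tools=[get_order_status],
    ),
)
print(response.text)
```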
Google is presently researching how AI agents can assist people with real-world tasks through prototypes designed to enhance productivity and streamline workflows. Examples include the updated Project Astra, a research initiative exploring the capabilities of a universal AI assistant; the new Project Mariner, which reimagines human-agent interaction, starting with browser-based experiences; and Jules, an AI-driven coding assistant built to support developers in their work. Building these prototypes on Gemini 2.0 Flash allowed Google to evaluate the model's capabilities in practice and improve its results, highlighting the new model's potential.