According to Cointelegraph, OpenAI has introduced GPT-4V, a vision-capable model, along with multimodal conversational modes for ChatGPT. With these upgrades, users can speak with the chatbot: the models powering ChatGPT, GPT-3.5 and GPT-4, can now understand spoken natural-language queries and respond in one of five different voices.

A blog post from OpenAI explains that the new multimodal interface will let users interact with ChatGPT in novel ways, such as snapping a picture of a landmark and having a live conversation about it, or photographing the contents of their fridge and pantry to figure out what to cook for dinner. The upgraded version of ChatGPT will roll out to Plus and Enterprise users on mobile platforms within the next two weeks, with access for developers and other users to follow soon after.

This multimodal upgrade comes shortly after the launch of DALL-E 3, OpenAI's most advanced image generation system. DALL-E 3 also integrates natural language processing: users can talk to the model to fine-tune results and enlist ChatGPT to help craft image prompts. In related AI news, OpenAI competitor Anthropic announced a partnership with Amazon on September 25, under which Amazon will invest up to $4 billion in a deal that includes cloud services and hardware access. In return, Anthropic will provide enhanced support for Amazon Bedrock, Amazon's foundation-model service, along with secure model customization and fine-tuning for businesses.