Artificial intelligence developer OpenAI entered October with several updates aimed at helping its AI models hold better conversations and recognize images more accurately.
On Oct. 1, OpenAI unveiled four updates that introduce new tools designed to make it easier for developers to build on its AI models.
It speaks!
One major update is the Realtime API, which allows developers to create AI-generated voice applications using a single prompt.
The tool, available for testing, supports low-latency, multimodal experiences by streaming audio inputs and outputs, enabling natural conversations similar to ChatGPT’s Advanced Voice Mode.
Previously, developers had to “stitch together” multiple models to create these experiences. Audio input would typically need to be fully uploaded and processed before receiving a response, which meant higher latency for real-time applications like speech-to-speech conversations.
With the Realtime API’s streaming capability, developers can now enable immediate, natural interactions, much like voice assistants. The API runs on GPT-4o, released in May 2024, which can reason across audio, vision and text in real time.
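The latency difference between the two approaches can be illustrated with a toy sketch. It does not call the Realtime API or any OpenAI library; the function names (microphone, stitched_reply, streaming_reply) and the "reply to" strings are hypothetical stand-ins, and the only point is how much audio must arrive before the first response can be produced.

```python
from typing import Iterable, Iterator, List


def microphone(chunks: List[str], captured: List[str]) -> Iterator[str]:
    """Hypothetical audio source: yields chunks and logs each one as it is captured."""
    for chunk in chunks:
        captured.append(chunk)
        yield chunk


def stitched_reply(audio: Iterable[str]) -> List[str]:
    """Older pattern: wait for the full recording, then respond."""
    recording = list(audio)  # consumes every chunk before any reply exists
    return [f"reply to {chunk}" for chunk in recording]


def streaming_reply(audio: Iterable[str]) -> Iterator[str]:
    """Streaming pattern: respond to each chunk as soon as it arrives."""
    for chunk in audio:
        yield f"reply to {chunk}"


words = ["hi", "there", "friend"]

captured_stitched: List[str] = []
first_stitched = stitched_reply(microphone(words, captured_stitched))[0]

captured_streaming: List[str] = []
first_streaming = next(streaming_reply(microphone(words, captured_streaming)))

# Chunks that had to be captured before the first reply was available
print(len(captured_stitched), len(captured_streaming))
```

In this sketch the stitched version needs all three chunks before it can answer, while the streaming version answers after the first one, which is the kind of gap that matters for speech-to-speech conversations.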
AI can see clearly now
Another update is a fine-tuning tool for developers, allowing them to improve AI responses generated from image and text inputs.
Fine-tuning with images gives the model a better understanding of images, which in turn enhances visual search and object detection capabilities, according to the developer. The process includes feedback from humans who provide examples of good and bad responses.
In addition to its voice and vision updates, OpenAI also rolled out “model distillation” and “prompt caching.” Model distillation allows smaller models to learn from larger ones, while prompt caching reduces development costs and time by reusing already processed text.
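The idea behind reusing already processed text can be sketched in a few lines. This is a simplified illustration, not OpenAI's implementation: the PrefixCache class is hypothetical, a whitespace split stands in for a real tokenizer, and the prompts are made up. It only shows that when two requests share the same opening text, the shared part does not need to be processed again.

```python
from typing import Set, Tuple


class PrefixCache:
    """Toy cache that remembers prompt prefixes it has already processed."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, ...]] = set()

    def process(self, prompt: str) -> Tuple[int, int]:
        """Return (tokens served from cache, tokens newly processed)."""
        tokens = tuple(prompt.split())  # assumption: whitespace split, not a real tokenizer
        cached = 0
        # Find the longest prefix of this prompt that was seen before
        for n in range(len(tokens), 0, -1):
            if tokens[:n] in self._seen:
                cached = n
                break
        # Remember every prefix of this prompt for future requests
        for n in range(1, len(tokens) + 1):
            self._seen.add(tokens[:n])
        return cached, len(tokens) - cached


cache = PrefixCache()
instructions = "You are a helpful support agent for Acme"  # hypothetical shared prefix
first = cache.process(instructions + " How do I reset my password")
second = cache.process(instructions + " Where is my order")
print(first, second)
```

Here the second request reuses the eight tokens of shared instructions and only the four new tokens need processing, which is where the cost and time savings would come from.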
The advanced capabilities of its models are a key selling point, as a major chunk of revenue for OpenAI comes from businesses building their own applications on top of OpenAI’s technology.
According to Reuters, OpenAI projects its revenue will rise to $11.6 billion in 2025, more than triple the estimated $3.7 billion for 2024.