OpenAI’s Latest Upgrade Essentially Lets Users Livestream With ChatGPT

Cointelegraph · 2024-05-14T00:33:07.000Z

ChatGPT creator OpenAI has announced its latest AI model, GPT-4o, a chattier, more humanlike chatbot that can interpret a user’s audio and video and respond in real time.

A series of demos released by the firm shows GPT-4o (short for GPT-4 Omni) helping users with tasks such as preparing for an interview by checking that they look presentable, calling a customer service agent to get a replacement iPhone, and translating a bilingual conversation in real time.

The demos also show it sharing dad jokes, judging a rock-paper-scissors match between two users, and responding with sarcasm when asked. One demo even shows ChatGPT being introduced to a user’s puppy for the first time.

“Well hello, Bowser! Aren’t you just the most adorable little thing?” the chatbot exclaimed.

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
- OpenAI (@OpenAI) May 13, 2024

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” said the firm’s CEO, Sam Altman, in a May 13 blog post.

“Getting to human-level response times and expressiveness turns out to be a big change.”

A version limited to text and image input launched on May 13, with the full version set to roll out in the coming weeks, OpenAI said in an X post.

GPT-4o will be available to both paid and free ChatGPT users and will also be accessible through OpenAI’s API.

OpenAI said the “o” in GPT-4o stands for “omni,” a name meant to mark a step toward more natural human-computer interaction.
Introducing GPT-4o, our new model which can reason across text, audio, and video in real time. It's extremely versatile, fun to play with, and is a step towards a much more natural form of human-computer interaction (and even human-computer-computer interaction): pic.twitter.com/VLG7TJ1JQx
- Greg Brockman (@gdb) May 13, 2024

GPT-4o’s ability to process any combination of text, audio and image input at the same time is a considerable advance over OpenAI’s earlier AI tools, such as the GPT-4-based voice mode in ChatGPT, which OpenAI says “loses a lot of information” because it passes audio through a pipeline of separate models.

OpenAI said “GPT-4o is especially better at vision and audio understanding compared to existing models,” which even includes picking up on a user’s emotions and breathing patterns.

It is also “much faster” and “50% cheaper” than GPT-4 Turbo in OpenAI’s API.

The new model can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, OpenAI claims, which it says is similar to human response times in an ordinary conversation.

