According to TechCrunch, OpenAI CEO Sam Altman revealed in a Reddit AMA that the company is facing significant compute capacity limitations, which are slowing the pace of its product releases. Altman explained that the complexity of the models and the need to make tough decisions about compute allocation are major factors in the delays. Reports indicate that OpenAI has been struggling to secure sufficient compute infrastructure to run and train its generative models. Reuters recently reported that OpenAI has been collaborating with Broadcom to develop an AI chip, which is expected to be available by 2026.
Because of these capacity constraints, OpenAI's Advanced Voice Mode for ChatGPT will not be receiving the vision capabilities first demonstrated in April. At that press event, OpenAI showed the ChatGPT app responding to visual cues through a smartphone camera. However, Fortune later reported that the demo was rushed to divert attention from Google's I/O developer conference, and that many within OpenAI believed GPT-4o was not ready for release. Consequently, the voice-only version of Advanced Voice Mode was delayed for months.
In the AMA, Altman said that there is no set timeline for the next major release of OpenAI's image generator, DALL-E. Additionally, Sora, OpenAI's video-generating tool, has been delayed by the need to perfect the model, address safety considerations, and scale compute. Kevin Weil, OpenAI's chief product officer, noted that Sora has faced technical challenges that make it less competitive than rival systems from Luma and Runway. The original system, unveiled in February, required more than 10 minutes to process a 1-minute video clip. In October, Tim Brooks, one of Sora's co-leads, left for Google.
Altman also discussed the possibility of allowing