OpenAI co-founder Ilya Sutskever said at the NeurIPS conference in Vancouver, Canada, on December 15 that AI development has reached a critical turning point: pre-training is running into a bottleneck, and the field will move toward Artificial Super Intelligence (ASI).
AI pre-training data has hit a 'ceiling'; transformation is imperative.
Sutskever stated at the conference that the era of AI pre-training is coming to an end. He believes the amount of data available online is nearing its limit, and new techniques will be needed to push AI to the next stage and, ultimately, toward ASI.
Sutskever noted that as software, hardware, and algorithms improve, AI's computing power has grown significantly, but the data used to train AI cannot expand indefinitely. He likened data to AI's 'fossil fuel': 'Data will not grow endlessly because there is only one internet. Data is like fossil fuel for AI, and it is nearly burned out. In the future, we must find ways to make full use of existing data.'
(Note: Pre-trained models refer to models that do not need to start training from scratch, as they have already learned basic knowledge.)
Three key technologies advancing AI development.
Although Sutskever pointed out the current problems facing AI at the conference, he also proposed three key technologies that could influence the evolution of AI into Artificial Super Intelligence (ASI):
Agentic AI: AI capable of making decisions and executing tasks independently, without human intervention, adjusting its behavior dynamically according to its goals and environment. This differs from today's AI agents, which act largely passively or according to fixed logic and still require substantial human intervention.
Synthetic Data: Using AI to generate high-quality synthetic data to make up for insufficient real data. For example, if we want to train a model to recognize vehicles on the road but lack real-world traffic data, we can synthetically 'generate' many simulated vehicles and scenarios to fill the gap.
Inference-Time Computing: Devoting more computation while a model is producing an answer, so it can work through complex problems more effectively.
Sutskever believes that these three major technologies can advance current AI technology towards 'Artificial Super Intelligence' (ASI).
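Sutskever's description of agentic AI — deciding and acting toward a goal without human intervention — can be pictured as a perceive-decide-act loop. The toy environment and all names below are illustrative assumptions, not any real system:

```python
class CounterEnv:
    """Toy environment: a single number the agent can push up or down."""
    def __init__(self, value=0):
        self.value = value

    def observe(self):
        return self.value

    def apply(self, action):
        self.value += action


def decide(observation, goal):
    """Choose an action that moves the state toward the goal."""
    return 1 if observation < goal else -1


def run_agent(goal, env, max_steps=20):
    """Perceive-decide-act loop: the agent stops on its own once the goal is met."""
    for step in range(max_steps):
        if env.observe() == goal:
            return step  # goal reached after `step` actions
        env.apply(decide(env.observe(), goal))
    return None  # gave up within the step budget
```

The key property is that the loop, not a human, decides when to act and when to stop.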
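The synthetic-data idea from the vehicle example above — simulating records when real traffic data is scarce — might look like this minimal sketch; the field names and value ranges are invented for illustration:

```python
import random

VEHICLE_TYPES = ["car", "truck", "bus", "motorcycle"]

def generate_synthetic_vehicles(n, seed=0):
    """Produce n simulated vehicle records to pad out a scarce real dataset."""
    rng = random.Random(seed)  # fixed seed so the synthetic set is reproducible
    return [
        {
            "type": rng.choice(VEHICLE_TYPES),
            "speed_kmh": round(rng.uniform(0.0, 120.0), 1),  # plausible speed range
            "lane": rng.randint(1, 4),
        }
        for _ in range(n)
    ]

synthetic_set = generate_synthetic_vehicles(1000)
```

Real pipelines would generate images or sensor traces rather than records, but the principle is the same: manufacture plausible training examples instead of collecting them.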
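One common form of inference-time computing is to spend extra compute sampling several candidate answers and keeping the majority vote. The `noisy_model` below is a hypothetical stand-in for a real model that is right most of the time:

```python
import random
from collections import Counter

def noisy_model(numbers, rng):
    """Toy 'model': answers a sum correctly 80% of the time, else off by one."""
    correct = sum(numbers)
    return correct if rng.random() < 0.8 else correct + 1

def majority_vote_answer(numbers, samples=51, seed=0):
    """Spend more compute at inference: sample many answers, keep the mode."""
    rng = random.Random(seed)
    votes = Counter(noisy_model(numbers, rng) for _ in range(samples))
    return votes.most_common(1)[0][0]
```

A single call to `noisy_model` is wrong one time in five; the 51-sample vote is almost never wrong, which is the trade Sutskever describes: more computation per question in exchange for a better answer.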
The AI boom sweeps through the blockchain and LLM markets.
The concept of AI agents is not only gaining attention in the tech field, but many meme coins and large language models (LLMs) are also beginning to integrate AI technology. For instance, the AI agent Truth Terminal is promoting the meme coin GOAT on social media, which has surged in market value to $600 million, and even prominent venture capitalist a16z founder Marc Andreessen has expressed his amazement at Truth Terminal.
The best-known recent pairing of AI agents with large language models is the Gemini 2.0 model launched by Google DeepMind. According to Google, Gemini 2.0 can generate images and text directly, convert text to speech with voices adapted to different languages, and also use Google Search, execute code, and call user-defined third-party tools.
How autonomous AI could solve the 'AI hallucination' problem.
Sutskever pointed out that autonomous AI and real-time inference computing can help address 'hallucinations' in AI. AI hallucinations refer to erroneous or fabricated information that models may produce due to insufficient training data. Because next-generation AI models still rely on data generated by older models, this problem will only worsen.
Sutskever stated that to address the 'hallucination' issue, autonomous AI can use stronger reasoning and real-time computation to verify the authenticity of data, improving AI's reliability and performance.
Sutskever's approach to the 'hallucination' problem that comes with training data reaching its limits differs from Jensen Huang's. Huang raised the same issue in an earlier interview and proposed three key stages for reducing 'hallucinations':
Pre-training:
This is AI's foundational phase, in which the model absorbs large amounts of real-world data to 'learn' and 'discover' knowledge; it is only a beginning, however, and not yet deep enough.
Post-training:
This is the phase that strengthens the AI through feedback: human feedback such as human-assigned scores, the model's own feedback, and synthetic data that simulates additional scenarios. Techniques such as reinforcement learning and multi-path learning are incorporated at this stage to help the AI focus on improving specific skills and better understand how to solve problems.
Test Time Scaling:
This phase can be understood as the AI starting to 'think.' Faced with a complex problem, the AI breaks it down step by step, repeatedly simulates different solutions, and keeps adjusting until it finds the best answer. Jensen Huang believes that giving AI more 'thinking time' can yield more accurate, higher-quality answers.
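The post-training stage's human-feedback scoring can be sketched as keeping the candidate answer a rater scores highest. The scoring rule below is a deliberately simple, invented stand-in for real human preference data:

```python
def human_score(answer):
    """Hypothetical stand-in for a human rater: prefers brief, finished answers."""
    score = 10 - len(answer.split())  # bonus for brevity
    if answer.endswith("."):
        score += 1                    # small reward for a complete sentence
    return score

def pick_reinforced(candidates):
    """Keep the candidate the rater scores highest, to reinforce in training."""
    return max(candidates, key=human_score)

best = pick_reinforced([
    "The answer is 42.",
    "Well, it is complicated but probably the answer is 42",
])
```

Real post-training (e.g. reinforcement learning from human feedback) trains a reward model on many such comparisons rather than hand-coding a score, but the loop — generate, score, reinforce the winner — is the same shape.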
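Huang's 'more thinking time' point can be illustrated with any iterative refinement, where each extra step of work tightens the answer. Here Newton's method for the square root of 2 stands in for a model re-examining its solution:

```python
def refine_sqrt(x, steps):
    """Newton's method: each extra step improves the estimate of sqrt(x)."""
    estimate = x  # crude first guess
    for _ in range(steps):
        estimate = 0.5 * (estimate + x / estimate)  # one more round of 'thinking'
    return estimate

rough = refine_sqrt(2, 2)    # little thinking time: roughly 1.4167
careful = refine_sqrt(2, 8)  # more thinking time: accurate to machine precision
```

The analogy is loose — test-time scaling in LLMs samples and evaluates reasoning paths rather than iterating a formula — but it makes the cost-accuracy trade concrete: more steps, better answer.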
This article, 'The end of AI pre-training! OpenAI co-founder: Autonomous AI and synthetic data accelerate the arrival of the era of Artificial Super Intelligence,' first appeared in Blockchain News ABMedia.