OpenAI has unveiled its new reasoning model, o3, which scored a record 75.7% on the ARC-AGI benchmark and introduces a 'deliberative alignment' training method to improve model safety. The model is currently open to applications for safety testing and is expected to be officially released in early 2025.

OpenAI, the developer behind ChatGPT, concluded its 12-day series of product launches yesterday (December 20) with a grand finale: the new reasoning models 'o3' and 'o3-mini'. The models offer stronger reasoning capabilities and are designed to tackle complex tasks that require step-by-step logical reasoning.

"Today, we shared evals for an early version of the next model in our o-model reasoning series: OpenAI o3" pic.twitter.com/e4dQWdLbAD
(OpenAI, @OpenAI, December 20, 2024)

Model Features

1) Reasoning capabilities achieve SoTA performance.
OpenAI stated that o3 performed exceptionally well on multiple benchmarks covering complex programming, mathematics, and science problems, demonstrating strong logical reasoning. On 'ARC-AGI' (the Abstraction and Reasoning Corpus for Artificial General Intelligence), a benchmark administered by ARC Prize for testing the AGI capabilities of AI systems, o3 scored 75.7% on the Semi-Private Evaluation set, setting a new state of the art (SoTA). A high-compute configuration of o3 scored an even higher 87.5% on the same set, but it is not eligible for ARC-AGI-Pub (the publicly verifiable ARC-AGI results) because its resource requirements exceed the standard.

ARC Prize announced the result in a post:
New verified ARC-AGI-Pub SoTA! @OpenAI o3 has scored a breakthrough 75.7% on the ARC-AGI Semi-Private Evaluation.
And a high-compute o3 configuration (not eligible for ARC-AGI-Pub) scored 87.5% on the Semi-Private Eval. 1/4 pic.twitter.com/uQA47JWkl6
(ARC Prize, @arcprize, December 20, 2024)

2) Multiple version choices.
OpenAI is offering two versions, o3 and o3-mini. o3-mini is expected to launch at the end of January 2025, with the full version of o3 to follow (exact timing not yet announced). The new model uses OpenAI's recently released Adaptive Thinking Time API, which offers three reasoning modes: low, medium, and high. Users can choose how long the model 'thinks' before answering, according to their needs. As shown in the image below, o3-mini can match the reasoning results of the current o1 model while significantly reducing computational costs.

3) Enhanced safety.
OpenAI has adopted a new 'deliberative alignment' training method that directly teaches large language models (LLMs) human-written, interpretable safety guidelines and has them reason over those guidelines before answering, so that their responses comply. OpenAI said it used this approach to tune its o-series models so that they use 'Chain-of-Thought' (CoT) reasoning to reflect on user prompts, identify the relevant guidelines in OpenAI's internal policies, and generate safer responses.

Naming Origin.
Notably, OpenAI skipped the name 'o2' and went straight to 'o3'. CEO Sam Altman explained that this was to avoid confusion with the British telecommunications provider O2, and to show off OpenAI's distinctive sense of humor. He said during the live broadcast: 'Out of respect for Telefónica (the parent company of O2) and to continue OpenAI's excellent tradition of being extremely bad at naming, we named it o3.'

Invitation for researchers to participate in safety testing.
o3 and o3-mini are currently in internal safety testing, and OpenAI has opened applications for external researchers to take part. Applications close on January 10, 2025.

Regarding the launch, Sam Altman confidently stated that it marks AI development's official entry into the 'next phase'. According to Bloomberg's earlier report on OpenAI's AI grading system, the phase that follows chatbots and reasoning models is Agents: advanced AI systems capable of taking action on behalf of users. Agents are currently a key area of exploration and development in both the cryptocurrency market and the Web2 sector.

OpenAI's AI grading system classification. Source: Bloomberg