OpenAI is reportedly working on a project named “Strawberry.” According to a source familiar with the matter and internal documentation reviewed by Reuters, this project aims to significantly advance the reasoning capabilities of AI models, potentially bringing them closer to human-level intelligence.
Strawberry represents a novel approach to AI, focusing on enabling the models to plan ahead, navigate the internet autonomously, and perform deep research.
The details of Strawberry have not been previously reported, highlighting the secretive nature of the project. Teams within OpenAI have been working on Strawberry for some time, as revealed by a recent internal document seen by Reuters in May.
The exact date of the document remains unclear, but it outlines how OpenAI plans to leverage Strawberry for research purposes. Described as a work in progress, the project’s timeline for public availability is yet to be determined.
The secrecy surrounding Strawberry is intense, with details tightly guarded even inside the company. Previously known as Q* (pronounced “Q-star”), the project has been considered a breakthrough within OpenAI.
Demos of Q* shown to some OpenAI staff earlier this year showed the models answering complex science and math questions that current commercially available models cannot tackle. These capabilities are seen as a significant step towards achieving AI with advanced reasoning skills.
OpenAI has a clear vision for Strawberry. The project aims to enhance the AI’s ability to generate answers to queries, plan ahead, and navigate the internet autonomously. This involves a specialized post-training process where the AI models are further refined after being initially trained on vast datasets.
This post-training phase is akin to a method developed at Stanford University called the “Self-Taught Reasoner” (STaR), which allows AI models to iteratively create their own training data, potentially bootstrapping themselves to higher levels of capability.
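The core STaR idea can be sketched in a few lines: sample a rationale and answer for each problem, keep only the rationales that led to correct answers, and fine-tune the model on that self-generated data. The toy below illustrates the loop only; the “model,” the guessing step, and the fine-tune step are all invented stand-ins, not OpenAI’s or Stanford’s actual implementation.

```python
import random

# Toy illustration of a STaR-style self-training loop. Every component
# here is a stand-in: the "model" is a dict of problems it has learned,
# and "fine-tuning" is just absorbing correct self-generated examples.
random.seed(0)

def generate_rationale(model, problem):
    """Stand-in for sampling a chain of thought plus an answer.
    Known problems are recalled; unknown ones get a guessed answer."""
    if problem in model:
        return f"recalled solution to {problem}", model[problem]
    return f"guessed attempt at {problem}", random.choice([3, 4, 9])

def star_round(model, dataset):
    """One STaR iteration: filter self-generated examples by answer
    correctness, then absorb the correct ones (the 'fine-tune' step)."""
    kept = []
    for problem, gold in dataset:
        rationale, answer = generate_rationale(model, problem)
        if answer == gold:               # keep only correct rationales
            kept.append((problem, rationale, answer))
    for problem, _rationale, answer in kept:
        model[problem] = answer          # stand-in for gradient updates
    return model

dataset = [("2+2", 4), ("3*3", 9), ("10-7", 3)]
model = {}
for _ in range(50):                      # the model bootstraps itself
    model = star_round(model, dataset)
```

The essential property is the filter: the model only ever trains on rationales that actually produced correct answers, so each round can improve on the last.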
The goal of Strawberry is to perform long-horizon tasks, which require an AI model to plan ahead and execute a series of actions over an extended period. OpenAI is creating and training these models on what it calls a “deep-research” dataset, although the specifics of this dataset remain undisclosed.
One of the envisioned applications for Strawberry is enabling the AI to conduct research autonomously by browsing the web with the assistance of a “computer-using agent” (CUA). This agent would take actions based on the AI’s findings, pushing the boundaries of what AI can achieve in terms of independent research capabilities.
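An agent of this kind is usually described as an observe-decide-act loop: the model inspects the latest observation, chooses an action (search, open a page, stop), and the environment returns a new observation. The sketch below is purely hypothetical; OpenAI has not published how its CUA works, and the action space, policy, and environment here are invented for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a "computer-using agent" loop. The policy and
# environment are invented stand-ins, not OpenAI's actual design.
@dataclass
class ResearchAgent:
    goal: str
    notes: list = field(default_factory=list)

    def policy(self, observation):
        """Stand-in for the model choosing the next action."""
        if observation is None:
            return ("search", self.goal)
        if len(self.notes) < 3:
            return ("open", f"result-{len(self.notes)}")
        return ("stop", None)

    def execute(self, action, argument):
        """Stand-in for the environment (browser, tools)."""
        if action == "search":
            return f"search results for {argument!r}"
        if action == "open":
            self.notes.append(f"summary of {argument}")
            return f"page {argument}"
        return None

    def run(self):
        observation = None
        while True:                      # observe -> decide -> act
            action, argument = self.policy(observation)
            if action == "stop":
                return self.notes
            observation = self.execute(action, argument)

notes = ResearchAgent(goal="long-horizon planning").run()
```

The long-horizon difficulty is visible even in this toy: the agent must carry state (its notes) across many steps and decide for itself when the task is done.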
OpenAI has been signaling to developers and other stakeholders that it is on the verge of releasing technology with significantly more advanced reasoning capabilities.
This message aligns with the company’s broader strategy to improve AI models’ reasoning abilities, a key area of focus highlighted by OpenAI CEO Sam Altman earlier this year. Altman emphasized that progress in AI reasoning is crucial for achieving human-level or superhuman intelligence.
Despite the excitement surrounding Strawberry, it is clear that OpenAI faces challenges. The current limitations of large language models include issues with common sense reasoning and solving intuitive problems.
Researchers agree that enhancing reasoning in AI models is essential for enabling them to perform complex tasks reliably. This involves building a world model that allows the AI to plan ahead, understand the physical world, and work through difficult multi-step problems.
Other tech giants like Google, Meta, and Microsoft are also exploring techniques to improve AI reasoning, highlighting the competitive landscape in AI research. However, there is debate among researchers about the capability of LLMs to incorporate long-term planning and reasoning.
Yann LeCun from Meta has expressed skepticism about LLMs’ ability to achieve humanlike reasoning.
Strawberry is a critical component of OpenAI’s strategy to address these challenges. The project involves post-training techniques that adapt and enhance the base AI models’ performance in specific ways. This approach includes fine-tuning, in which humans provide feedback on the model’s responses and feed it examples of good and bad answers.
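Training on good and bad answers is often framed as a pairwise preference objective: the model should score the good answer above the bad one, as in RLHF-style reward modeling. The sketch below shows that objective with a deliberately trivial word-weight scorer and a crude update rule; none of it reflects OpenAI’s implementation.

```python
import math

# Toy pairwise-preference trainer: push the score of human-preferred
# answers above rejected ones. Scorer and data are invented stand-ins.

def score(weights, answer):
    """Stand-in scorer: sum of learned per-word weights."""
    return sum(weights.get(word, 0.0) for word in answer.split())

def train(pairs, lr=0.5, steps=200):
    """Minimize -log(sigmoid(score(good) - score(bad))) per pair."""
    weights = {}
    for _ in range(steps):
        for good, bad in pairs:
            margin = score(weights, good) - score(weights, bad)
            grad = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for word in good.split():    # raise preferred-answer words
                weights[word] = weights.get(word, 0.0) + lr * grad
            for word in bad.split():     # lower rejected-answer words
                weights[word] = weights.get(word, 0.0) - lr * grad
    return weights

pairs = [("cite sources carefully", "just guess"),
         ("show the steps", "answer only")]
weights = train(pairs)
```

After training, the scorer ranks each good answer above its bad counterpart, which is the behavior the feedback examples are meant to instill.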
The potential impact of Strawberry is substantial. If successful, it could enable AI models to perform complex research tasks, make major scientific discoveries, and develop new software applications.