OpenAI Aims for Fully Automated AI Researcher by 2028
By Alexander Cole
OpenAI plans to build a fully autonomous AI researcher that can chase big questions without human prompts.
The company has outlined a multi-year “north star” focused on agent-based research automation. By September, it intends to deliver an autonomous AI research intern capable of tackling a small set of research problems on its own, a precursor to a fully automated multi-agent system slated to debut in 2028. The plan, disclosed in an exclusive interview with chief scientist Jakub Pachocki, signals a shift from prompting ever-smarter models to architecting systems that can design, run, and interpret experiments with limited human intervention.
If successful, the move could compress the time from hypothesis to insight by orders of magnitude. OpenAI envisions a stack of coordinated agents: planners, experiment runners, data analyzers, and risk monitors that can operate in concert across domains—from physics to biology to economics. In practice, that means less hand-holding of every step and more autonomous iteration: the system conceives experiments, executes simulations or live tests where safe, analyzes results, and surfaces next steps. It’s the sort of leap that would push AI research from “assisted by humans” to “led by machines,” at least for defined problem spaces.
But the ambition carries hefty implications for compute, data, and governance. Agent-based systems demand robust orchestration across tools, environments, and validation channels. They require careful design to prevent runaway experimentation, misinterpretation of results, or biased data loops from seeping into conclusions. The challenge here isn't just smarter APIs; it's building a reliable, auditable, and safety-conscious research engine that can justify its conclusions to human reviewers.
The timing also intersects with a broader cautionary note the same week: even as AI accelerates research, some scientific frontiers remain stubbornly difficult to study with current methods. MIT Technology Review highlights that psychedelic drugs—psilocybin and related compounds—are experiencing a surge of interest across depression, PTSD, addiction, and obesity. Yet two studies published recently emphasize how hard it is to draw clean conclusions in this domain, underscoring that technology alone isn’t a silver bullet for every research bottleneck. In other words, the automation stack can speed exploration, but it won’t replace the need for rigorous study design, careful data interpretation, and domain-specific safeguards—especially in areas with high clinical stakes.
For practitioners, the launch offers both promise and peril as OpenAI's initiative unfolds.
An analogy helps: this is like handing a research team a self-driving science notebook, one that can draft plans, run simulations or controlled experiments, and pull in results, but still needs a supervising driver, a map of safety rules, and a trusted checklist to avoid veering into misinterpretation or unsafe territory. If the guardrails hold, the lab could scale its exploratory velocity dramatically; if not, it risks amplifying erroneous conclusions just as fast as it accelerates discovery.
The big practical question for product teams this quarter isn’t “Will this replace researchers?” but “Where will automation add reliable value first, and how will we govern it?” Expect OpenAI to roll out cautious pilots, with emphasis on evaluation, safety protocols, and governance frameworks before any broader operational use in real-world research programs.