On Tuesday, OpenAI released new tools designed to help developers and enterprises build AI agents – automated systems that can independently accomplish tasks – using the company's own models and frameworks.
The tools are part of OpenAI's new Responses API, which lets businesses build custom AI agents that can perform web searches, scan through company files, and navigate websites, much like OpenAI's Operator product. The Responses API effectively replaces OpenAI's Assistants API, which the company plans to sunset in the first half of 2026.
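In practice, the switch mostly changes the shape of the request developers send. As a rough, illustrative sketch (assuming the official openai Python SDK; the model choice and prompt are placeholders, not OpenAI's recommendations), a basic Responses API call might look like this:

```python
from openai import OpenAI  # official openai Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A minimal Responses API request: one model, one input, no tools attached yet.
response = client.responses.create(
    model="gpt-4o",  # illustrative model choice
    input="Draft a three-bullet summary of our onboarding checklist.",
)

# output_text concatenates the text portions of the response's output items.
print(response.output_text)
```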
The hype around AI agents has grown considerably in recent years, even as the tech industry has struggled to show people what the technology can do – or even to define what "AI agents" really are. In the most recent example of agent hype outpacing reality, Chinese startup Butterfly Effect went viral this week for a new AI agent platform called Manus, which users quickly discovered didn't live up to many of the company's promises.
In other words, the stakes are high for OpenAI to get agents right.
"It's pretty easy to demo your agent," Olivier Godement, OpenAI's API product lead, told TechCrunch in an interview. "Scaling an agent – and getting people to use it often – is quite hard."
Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and Deep Research, which compiles research reports for you. Both tools offered a glimpse of what agentic technology can achieve, but left something to be desired in the "autonomy" department.
Now, with the Responses API, OpenAI wants to sell access to the components that power AI agents, letting developers build their own Operator- and Deep Research-style applications. OpenAI hopes developers can use its agent technology to create apps that feel more autonomous than what's available today.
Using the Responses API, developers can tap the same AI models (in preview) that power OpenAI's ChatGPT Search web search tool: GPT-4o search and GPT-4o mini search. The models can browse the web for answers to questions, citing sources as they generate responses.
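As a hedged sketch of what that looks like from a developer's side (the `web_search_preview` tool type and the example query are assumptions based on OpenAI's preview naming and could change), a web-search-enabled request might resemble:

```python
from openai import OpenAI

client = OpenAI()

# Ask a question that needs fresh information pulled from the web.
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],  # hosted web search tool (preview naming assumed)
    input="What did the major stock indexes do today, and why?",
)

# The generated answer; source citations arrive as annotations on the
# message items in response.output rather than as plain text.
print(response.output_text)
```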
OpenAI claims GPT-4o search and GPT-4o mini search are highly accurate. On the company's SimpleQA benchmark, which measures models' ability to answer short, fact-seeking questions, GPT-4o search scores 90% and GPT-4o mini search scores 88% (higher is better). For comparison, GPT-4.5 – OpenAI's much larger, recently released model – scores only 63%.
The fact that AI-powered search tools are more accurate than vanilla AI models isn't necessarily surprising – in theory, GPT-4o search can simply look up the right answer. But searching the web doesn't make hallucinations a solved problem; GPT-4o search still gets 10% of fact-seeking questions wrong. Beyond factual accuracy, AI search tools also tend to struggle with short, navigational queries (like "Lakers score today"), and recent reporting suggests that ChatGPT's citations aren't always reliable.
The Responses API also includes a file search utility that can quickly scan files in a company's databases to retrieve information. (OpenAI says it won't train models on those files.) In addition, developers using the Responses API can tap OpenAI's computer-using agent (CUA) model, which powers Operator. The model generates mouse and keyboard actions, letting developers automate computer-use tasks such as data entry and app workflows.
Companies will eventually be able to run the CUA model, which is launching in research preview, locally on their own systems, OpenAI said. The consumer version of CUA available in Operator can only take actions on the web.
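Both capabilities are surfaced the same way: as tools attached to a Responses API request. The sketch below is a rough illustration only; the vector store ID is a placeholder, and the `file_search` and `computer_use_preview` tool names, the `computer-use-preview` model name, and the `truncation` setting are assumptions about the preview API that may not match the final interface.

```python
from openai import OpenAI

client = OpenAI()

# File search: answer a question grounded in previously uploaded company files.
# "vs_example123" is a placeholder for a vector store created ahead of time.
file_answer = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "file_search", "vector_store_ids": ["vs_example123"]}],
    input="What does our travel policy say about booking international flights?",
)
print(file_answer.output_text)

# Computer use: the CUA model returns proposed mouse/keyboard actions instead
# of prose; your own automation layer executes them and reports the result back.
cua_step = client.responses.create(
    model="computer-use-preview",          # assumed preview model name
    tools=[{
        "type": "computer_use_preview",    # assumed preview tool name
        "display_width": 1280,
        "display_height": 800,
        "environment": "browser",
    }],
    input="Open the expenses dashboard and export last month's report.",
    truncation="auto",
)

for item in cua_step.output:
    if item.type == "computer_call":       # proposed action (click, type, scroll, ...)
        print(item.action)
```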
To be clear, the Responses API won't solve all the technical problems plaguing AI agents today.
In a blog post provided to TechCrunch, OpenAI said the CUA model is "not yet highly reliable for automating tasks on operating systems" and that it is prone to making mistakes.
Still, OpenAI says these are early iterations of its agent tools, and that it's continually working to improve them.
In addition to the Responses API, OpenAI is releasing an open source toolkit called the Agents SDK, which gives developers free tools to integrate models with their internal systems, implement safeguards, and monitor AI agent activity for debugging and optimization. The Agents SDK is something of a follow-up to OpenAI's Swarm, a multi-agent orchestration framework the company released late last year.
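For orientation, here is a minimal, hedged sketch of what building on the Agents SDK can look like, assuming the open source `openai-agents` Python package (imported as `agents`); the agent name, instructions, and the toy `lookup_order` tool are illustrative inventions, not part of the SDK.

```python
# pip install openai-agents
from agents import Agent, Runner, function_tool


@function_tool
def lookup_order(order_id: str) -> str:
    """Toy stand-in for an internal system the agent is allowed to call."""
    return f"Order {order_id}: shipped, arriving Thursday."


# An agent bundles a model, instructions, and the tools it may call.
support_agent = Agent(
    name="Support agent",
    instructions="Answer order questions. Use lookup_order for order status.",
    tools=[lookup_order],
)

# Runner drives the loop: model call -> tool call -> model call -> final answer.
result = Runner.run_sync(support_agent, "Where is order 4512?")
print(result.final_output)
```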
Godement said he hopes OpenAI can bridge the gap between AI agent demos and products this year, and that, in his view, "agents are the most impactful application of AI that's going to happen." That echoes a proclamation OpenAI CEO Sam Altman made in January: that 2025 will be the year AI agents enter the workforce.
Whether or not 2025 truly becomes "the year of the AI agent," OpenAI's latest releases show the company wants to move past flashy agent demos toward genuinely useful tools.