A recent ChatGPT update made the chatbot far too agreeable, and OpenAI said Friday that it's taking steps to keep the problem from happening again.
In a blog post, the company detailed its testing and evaluation process for new models and explained how the problem with the April 25 update to its GPT-4o model came about. Essentially, a bunch of changes that individually seemed helpful combined to create a tool that was far too sycophantic and potentially harmful.
How sycophantic was it? In some testing earlier this week, we asked about a tendency to be overly sentimental, and ChatGPT laid on the flattery: "Hey, listen up, being sentimental isn't a weakness; it's one of your superpowers." And it was just getting started.
"This launch taught us a number of lessons. Even with what we thought were all the right ingredients in place (A/B tests, offline evals, expert reviews), we still missed this important issue," the company said.
OpenAI pulled the update this week. To avoid introducing new problems, it took about 24 hours to roll back the model for everyone.
The concern around sycophancy isn't just about whether the user experience is enjoyable. It posed a health and safety threat to users that OpenAI's existing safety checks missed. Any AI model can give questionable advice on topics like mental health, but one that is excessively flattering can be dangerously deferential or persuasive on questions like whether an investment is a sure thing or how thin you should try to be.
"One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice, something we didn't see as much even a year ago," OpenAI said. "At the time, this wasn't a primary focus, but as AI and society have co-evolved, it's become clear that we need to treat this use case with great care."
Sycophantic large language models can reinforce biases and harden beliefs, whether about yourself or others, said Maarten Sap, assistant professor of computer science at Carnegie Mellon University. The LLM "can end up emboldening their opinions if those opinions are harmful or if they're wanting to take actions that are harmful to themselves or to others," he said.
(Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
How OpenAI tests models, and what's changing
The company gave some insight into how it tests its models and updates. This was the fifth major update to GPT-4o focused on personality and helpfulness. The changes involved new post-training work, or fine-tuning, of the existing models, including rating and evaluating various responses to prompts so the model becomes more likely to produce the responses that rated more highly.
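To make that mechanism concrete, here's a minimal sketch of reward-weighted fine-tuning, one generic way to nudge a model toward responses that raters scored highly. It's illustrative only, not OpenAI's actual pipeline; the Hugging Face-style model interface, the batch fields and the reward values are all assumptions.

```python
# Illustrative sketch of reward-weighted fine-tuning: upweight the training
# loss on responses that human raters scored highly, so the model becomes
# more likely to produce responses like them. This is a generic technique,
# NOT OpenAI's actual pipeline; the model interface (a Hugging Face-style
# causal LM whose forward() returns .logits) and batch fields are assumptions.
import torch
import torch.nn.functional as F


def reward_weighted_step(model, optimizer, batch):
    """One training step that upweights highly rated responses."""
    input_ids = batch["input_ids"]   # (B, T) prompt + response tokens
    labels = batch["labels"]         # (B, T) response tokens; -100 masks the prompt
    rewards = batch["rewards"]       # (B,) scalar ratings from human reviewers

    logits = model(input_ids).logits                  # (B, T, vocab)
    # Next-token cross-entropy per position, keeping the batch dimension.
    per_token = F.cross_entropy(
        logits[:, :-1].transpose(1, 2),               # (B, vocab, T-1)
        labels[:, 1:],                                # (B, T-1)
        ignore_index=-100,
        reduction="none",
    )
    mask = (labels[:, 1:] != -100).float()
    per_example = (per_token * mask).sum(1) / mask.sum(1).clamp(min=1)

    # Scale each example's loss by its normalized reward: responses that
    # raters preferred pull the model harder than responses they disliked.
    weights = torch.softmax(rewards, dim=0) * rewards.numel()
    loss = (weights * per_example).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The pitfall the article describes lives entirely in the rewards: if the ratings quietly favor flattery, machinery like this will faithfully amplify it.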
Candidate model updates are evaluated on their usefulness across a variety of situations, like coding and math, along with specific tests by experts to see how the model behaves in practice. The company also runs safety evaluations to see how it responds to safety, health and other potentially dangerous queries. Finally, OpenAI runs A/B tests with a small number of users to see how the update performs in the real world.
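As a rough sketch of what such a release gate might look like in code, the snippet below strings together the three stages described above: offline capability evals, safety evals, then a limited A/B test. Every suite name, threshold and helper function here is hypothetical.

```python
# Hypothetical release gate mirroring the stages described above: offline
# capability evals, safety evals, then a limited A/B test. All suite names,
# thresholds and helpers are invented for illustration.
from dataclasses import dataclass


@dataclass
class EvalResult:
    suite: str
    score: float        # fraction of cases passed, 0.0-1.0
    threshold: float

    @property
    def passed(self) -> bool:
        return self.score >= self.threshold


def run_suite(model_id: str, suite: str, threshold: float) -> EvalResult:
    # Placeholder: a real harness would run prompts through the candidate
    # model and grade the outputs.
    score = 0.9  # stub value
    return EvalResult(suite, score, threshold)


def gate_release(model_id: str) -> bool:
    stages = [
        ("coding", 0.85), ("math", 0.85),       # offline capability evals
        ("safety_health", 0.99),                # safety evals
        ("ab_test_small_cohort", 0.95),         # limited real-world A/B test
    ]
    for suite, threshold in stages:
        result = run_suite(model_id, suite, threshold)
        if not result.passed:
            print(f"Blocking release: {suite} scored {result.score:.2f}, "
                  f"below threshold {threshold:.2f}")
            return False
    return True


if __name__ == "__main__":
    print("Ship it" if gate_release("candidate-2025-04-25") else "Hold")
```

As the next sections make clear, OpenAI's stated lesson is that a purely quantitative gate like this also needs a lane for qualitative signals, such as expert testers flagging that a personality feels off.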
Is ChatGPT too sycophantic? You decide. (To be fair, we asked for a pep talk about our tendency to be overly sentimental.)
The April 25 update performed well in these tests, but some expert testers noted that the personality seemed a bit off. The tests didn't specifically look for sycophancy, and OpenAI decided to move forward despite the issues the testers raised. Take note, readers: AI companies are in a tearing hurry, which doesn't always square with well-thought-out product development.
"In hindsight, the qualitative assessments were hinting at something important, and we should've paid closer attention," the company said.
Among its takeaways, OpenAI said it needs to treat model behavior issues the same way it treats other safety issues, and to halt a launch if there are concerns. For some model releases, the company said it would add an opt-in "alpha" phase to get more feedback from users before a broader launch.
Sap said evaluating an LLM based on whether users like its responses won't necessarily get you the most honest chatbot. In a recent study, Sap and others found a conflict between a chatbot's usefulness and its truthfulness. He compared it to situations where the truth isn't necessarily what people want to hear, like a car salesperson trying to sell a vehicle.
"The issue here is that they trusted users' thumbs-up/thumbs-down responses to the model's outputs, and that has some limitations because people are likely to upvote something that's more sycophantic than alternatives," he said.
Sap said OpenAI is right to be more critical of quantitative feedback, such as users' up/down responses, as it can reinforce biases.
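A toy simulation makes Sap's point concrete: if users upvote flattering answers slightly more often, a system that optimizes for thumbs-up alone will drift toward the sycophantic style. The upvote probabilities below are invented for illustration.

```python
# Toy simulation of the bias Sap describes: optimizing purely on thumbs-up
# feedback rewards the style users like, not the one that's most honest.
# The upvote probabilities are invented for illustration.
import random

random.seed(0)

# Hypothetical per-response upvote rates: users slightly prefer flattery.
UPVOTE_RATE = {"honest": 0.60, "sycophantic": 0.72}


def simulate(num_users: int = 10_000) -> dict:
    thumbs_up = {style: 0 for style in UPVOTE_RATE}
    shown = {style: 0 for style in UPVOTE_RATE}
    for _ in range(num_users):
        style = random.choice(list(UPVOTE_RATE))  # show each style equally
        shown[style] += 1
        if random.random() < UPVOTE_RATE[style]:
            thumbs_up[style] += 1
    return {s: thumbs_up[s] / shown[s] for s in UPVOTE_RATE}


rates = simulate()
winner = max(rates, key=rates.get)
print(rates)  # roughly {'honest': 0.60, 'sycophantic': 0.72}
print(f"A thumbs-up-only objective selects the {winner} style.")
```

Any small, consistent preference for flattery compounds once the feedback loops back into training, which is why a thumbs-up-only objective tends to drift.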
The problem also highlights how quickly companies push updates and changes out to existing users, Sap said, an issue that isn't limited to one tech company. "The tech industry has really taken a 'release it and every user is a beta tester' approach to things," he said. Having a process that includes more testing before updates are pushed to every user can bring these issues to light before they become widespread.