ChatGPT has gained remarkable momentum since its debut in November 2022, and in the months since it has been on everyone's lips. Some people use the AI chatbot for fun, while many firms use it to make money. But a recent study has raised concerns that ChatGPT's performance is degrading over time. What's more, the newer version of the chatbot has begun showing odd effects of AI model drift.
ChatGPT performance is showing a downward trend
A study by researchers from Stanford University and the University of California, Berkeley, has revealed a surprising trend. Compared to GPT-3.5, ChatGPT's premium version, GPT-4, showed huge swings in response accuracy across a range of tasks.
The study assessed ChatGPT's performance on four distinct tasks: solving mathematical problems, responding to sensitive queries, writing software code, and visual reasoning. The results were startling. GPT-4's accuracy on the math task fell from 97.6% in March to 2.4% in June. GPT-3.5, on the other hand, improved dramatically on the same task, from 7.4% in March to 86.8% in June.
The researchers also saw unusual behavior when they asked both versions of ChatGPT to explain their answers. In March, ChatGPT walked through its reasoning; by June, it had stopped providing this insight, making it harder for the researchers to understand how it reached its decisions. GPT-4 and GPT-3.5 also handled sensitive questions differently over time: the June versions refused to answer such questions without giving any reason, leaving users in the dark.
Because ChatGPT and similar AI models are 'black boxes', the reasons for these changes are unclear. Outside researchers struggle to trace what caused them, since they lack access to the models' architecture, training data, and update history. OpenAI's ongoing model updates, such as the broader GPT-4 API access rolled out in July, make it even harder to pin down which change caused which shift in performance.
Firms that build the ChatGPT API into their products and services have much to lose from this kind of model drift. The shifting behavior of AI models can disrupt plans and break critical applications, so balancing convenience with trust, and having backup plans to reduce risk, is crucial.
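One common safeguard is a fallback path: if the primary hosted model fails or starts misbehaving, the application routes the request to a backup, such as a pinned older model snapshot or a self-hosted model. A minimal sketch in Python, where `call_primary` and `call_fallback` are hypothetical stand-ins rather than real API calls:

```python
def call_primary(prompt: str) -> str:
    # Hypothetical stand-in for the main hosted-model API call.
    # Here it always fails, to demonstrate the fallback path.
    raise TimeoutError("primary model unavailable")

def call_fallback(prompt: str) -> str:
    # Hypothetical stand-in for a backup: a pinned older model
    # snapshot or a self-hosted open-source model.
    return "fallback answer"

def answer(prompt: str) -> str:
    """Try the primary model first; on any failure, degrade gracefully."""
    try:
        return call_primary(prompt)
    except Exception:
        return call_fallback(prompt)
```

In a real deployment the except clause would also log the failure, so that repeated fallbacks surface as an operational signal rather than passing silently.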
In a thought-provoking episode of Trust Insights' In-Ear Insights program, hosts Christopher Penn and Katie Robbert explore whether ChatGPT's responses are becoming less accurate over time. Their discussion underscores the need to set clear goals for AI and to balance convenience with trust to prevent model drift issues.
AI model issues
The results of the study provide insight into the inner workings of AI models and the challenges of maintaining their performance over time. Tuning a large language model to improve it on specific tasks can degrade its performance in other areas. This highlights the need for continuous performance monitoring and the ability to adapt as models change.
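Continuous monitoring can be as simple as replaying a fixed evaluation set against the deployed model on a schedule and alerting when accuracy drops below a recorded baseline. A minimal sketch, where `query_model` is a hypothetical stand-in for a real model call and the evaluation prompts echo the study's prime-checking math task:

```python
# Fixed evaluation set: (prompt, expected answer) pairs.
EVAL_SET = [
    ("Is 7 prime? Answer yes or no.", "yes"),
    ("Is 8 prime? Answer yes or no.", "no"),
    ("Is 13 prime? Answer yes or no.", "yes"),
]

def query_model(prompt: str) -> str:
    # Hypothetical stand-in: a real version would call the deployed model.
    return "yes"

def eval_accuracy(query) -> float:
    # Fraction of evaluation prompts the model answers correctly.
    correct = sum(query(p).strip().lower() == a for p, a in EVAL_SET)
    return correct / len(EVAL_SET)

def drift_alert(query, baseline: float, tolerance: float = 0.05) -> bool:
    # True when accuracy falls more than `tolerance` below the baseline.
    return eval_accuracy(query) < baseline - tolerance
```

Running such a check daily (or on every vendor model update) turns silent drift into an explicit alert before it reaches users.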
The study highlights the problem of model drift, and researchers ought to continue addressing it. Building trust with users and stakeholders requires transparency in AI development, including insight into how and when models are updated.
Those looking to deploy chatbots and other AI-powered solutions need to take a proactive stance in this fast-changing landscape. A good AI integration plan defines goals, sets reasonable expectations, and puts safeguards in place against model drift.
In light of these findings, some experts advise considering open-source AI models that can be hosted on a company's own servers. Such a strategy gives firms more control over their AI solutions and shields them from unannounced vendor changes to hosted APIs.