
AI plateau? GPT-5 sparks debate over limits of generative models
After a disappointing launch, researchers question whether meaningful progress can still be made at scale.
In March 2023, just days after OpenAI stunned the world with its new AI model, hundreds of AI and computer science experts issued a dire warning calling for a halt to the development of more powerful systems for at least six months. “AI labs are locked in an out-of-control race to develop and deploy ever more powerful digital minds that no one – not even their creators – can understand, predict, or reliably control,” they warned.
But companies continued to push ahead with their models. OpenAI itself spent more than two years developing the next-generation model, GPT-5. At the end of last week, the company unveiled it, and the fears of senior scientists turned out to be misplaced. Not because OpenAI implemented robust safeguards to prevent the nightmare scenarios they warned of, but because the model itself is simply not that impressive. It is not a “powerful digital mind,” not a major leap forward, and perhaps not even a noticeable improvement over the previous generation.
OpenAI indirectly acknowledged this by re-releasing its GPT-4o model, the very model GPT-5 was supposed to replace. GPT-5’s underwhelming debut raises a fundamental question: are modern generative AI models already nearing their limits?
Struggling with simple math
Last week’s launch was meant to be OpenAI’s triumphant moment for 2025. After sparking a revolution with ChatGPT in 2022 and cementing its leadership with GPT-4 in 2023, the company had faced setbacks. Competitors like Anthropic were producing models equal to, and in some areas surpassing, OpenAI’s capabilities. From the other end of the market, Chinese companies like DeepSeek were building comparable models at a fraction of the cost. Meanwhile, GPT-5’s development was plagued by delays, technical challenges, and higher-than-expected expenses.
The launch was supposed to correct course. OpenAI promised a dramatic leap forward: a smarter, more reliable model. According to founder and CEO Sam Altman, if speaking with GPT-4 felt like conversing with a student, GPT-5 would feel like talking to an expert with a doctorate. The company pledged sweeping improvements in writing, programming, and health-related advice, along with a sharp reduction in hallucinations.
Yet cracks appeared almost immediately. Social media lit up with users noting that the model still fabricated information and struggled with basic math and spelling. One example: when asked how many times the letter “b” appeared in the word “blueberry,” GPT-5 replied “three.” In another, a user uploaded a photo of a zebra with five legs; GPT-5 responded that it was an optical illusion and that zebras have four legs.
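The letter-counting failure is striking precisely because the task is trivial for conventional software. As a point of comparison, a one-line check in Python (a hypothetical illustration, not part of any benchmark) gets it right:

```python
# Count occurrences of a letter in a word, the task GPT-5 reportedly missed.
word = "blueberry"
count = word.count("b")  # str.count tallies non-overlapping occurrences
print(count)  # prints 2, not the "three" GPT-5 answered
```

The gap exists because large language models process text as statistical tokens rather than as sequences of characters, so tasks that are deterministic for code can still trip them up.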
On OpenAI’s own subreddit, typically a space for supportive discussion, the most popular post after the launch argued that the company was now focused “on reducing costs, not redrawing the boundaries.”
When Altman engaged with Reddit users, he maintained that GPT-5’s writing quality was far superior to its predecessor’s. When he asked whether users thought it was worse, the overwhelming response was yes. “4 and 4.5 excelled at creative and dynamic writing. 5 is so sterile, it’s like asking a math professor to be Emily Dickinson,” one user replied.
Mathematicians were unimpressed too. “The launch was disappointing,” said Noah Giansiracusa, a mathematician at Bentley University. “There were some improvements, but they were much more marginal than I thought.”
Dr. Gary Marcus, professor emeritus at NYU and one of the world’s leading AI experts, called GPT-5 “disappointing” and its launch “chaotic.” “OpenAI basically blew itself up – and not in a good way,” he wrote on his blog. “Aside from a few influencers who praise every new model, the dominant reaction was major disappointment. People had grown to expect miracles, but GPT-5 is just the latest incremental advance. And it felt rushed at that.”
Another major complaint centered on OpenAI’s decision to remove the ability for paying subscribers to choose which model they used. Previously, subscribers could select from multiple models, tailoring answers for creativity, logic, deep research, or even cross-checking results. GPT-5 eliminated this flexibility, offering only the standard and “thinking” versions (and for Pro subscribers at $200/month, a “research-level” intelligence version).
OpenAI said ChatGPT would automatically route queries to the best model, a move that angered many. A routing glitch on launch day only fueled frustration. Thousands signed a petition demanding GPT-4o’s return, and on Tuesday Altman relented, restoring access to the older model for paid subscribers.
The reappearance of an older model that should have been eclipsed by GPT-5 underscores the challenges AI companies now face. When GPT-4 launched, the leap forward was so dramatic that top experts feared it could trigger existential risks. But after years of work and billions in investment, OpenAI has delivered only modest gains, and even those are contested by its most experienced users.
It is possible the major leaps in generative AI are behind us. From here, significant improvements may demand enormous investment for only marginal returns.
Fragile illusion
A study from the University of Arizona, posted online last week and not yet peer-reviewed, adds weight to this theory. Researchers trained an AI model and tested its ability to solve problems using “chain-of-thought” reasoning, a method where the model explains its logic step-by-step. They found this reasoning chain to be a fragile illusion, collapsing as soon as the model faced problems beyond its training data.
“The failure to generalize adequately outside distribution tells us why all the dozens of shots on goal at building ‘GPT-5 level models’ keep missing their target. It’s not an accident. That failing is principled,” wrote Marcus.
If so, it’s yet another warning sign for a sector already weighed down by massive infrastructure, operational, and staffing costs, and by the lack of a proven business model. Without the ability to deliver clear, significant breakthroughs, AI companies may find it increasingly hard to stand out, let alone sustain themselves, in a crowded market.














