
What if AI Doesn’t Get Much Better Than This?

In the years since ChatGPT’s launch in late 2022, it’s been hard not to get swept up in feelings of euphoria or dread about the looming impacts of generative AI. This reaction has been fueled, in part, by the confident declarations of tech CEOs, who have veered toward increasingly bombastic rhetoric.

“AI is starting to get better than humans at almost all intellectual tasks,” Anthropic CEO Dario Amodei recently told Anderson Cooper. He added that half of entry-level white collar jobs might be “wiped out” in the next one to five years, creating unemployment levels as high as 20%—a peak last seen during the Great Depression.

Meanwhile, OpenAI’s Sam Altman said that AI can now rival the abilities of a job seeker with a PhD, leading one publication to plaintively ask, “So what’s left for grads?”

Not to be outdone, Mark Zuckerberg claimed that superintelligence is “now in sight.” (His shareholders hope he’s right, as he’s reportedly offering compensation packages worth up to $300 million to lure top AI talent to Meta.)

But then, two weeks ago, OpenAI finally released its long-awaited GPT-5, a large language model that many had hoped would offer a leap in capabilities comparable to the head-turning advances introduced by previous major releases such as GPT-3 and GPT-4. The resulting product, however, seemed to be just fine.

GPT-5 was marginally better than previous models in certain use cases, but worse in others. It offered some nice usability updates, but also changes that some users found annoying. (Within days, more than 4,000 ChatGPT users signed a Change.org petition asking OpenAI to make the previous model, GPT-4o, available again, as they preferred it to the new release.) An early YouTube reviewer concluded that GPT-5 “was hard to complain about,” which is the type of thing you’d say about the iPhone 16, not a generation-defining technology. AI commentator Gary Marcus, who had been predicting this outcome for years, summed up his early impressions succinctly when he called GPT-5 “overdue, overhyped, and underwhelming.”

This all points to a critical question that, until recently, few would have considered: Is it possible that the AI we are currently using is basically as good as it’s going to be for a while?

In my most recent article for The New Yorker, which came out last week, I sought to answer this question. In doing so, I ended up reporting on a technical narrative that’s not widely understood outside of the AI community. The breakthrough performance of the GPT-3 and GPT-4 language models was due to improvements in a process called pretraining, in which a model digests an astonishingly large amount of text, effectively teaching itself to become smarter. The acclaimed improvements in both models came from increasing their size along with the amount of text on which they were pretrained.

At some point after GPT-4’s release, however, the AI companies began to realize that this approach was no longer as effective as it once was. They continued to scale up model size and training intensity, but saw diminishing returns in capability gains.

In response, starting around last fall, these companies turned their attention to post-training techniques, which take a model that has already been pretrained and refine it to do better on specific types of tasks. This allowed AI companies to continue to report progress on their products’ capabilities, but these new improvements were now much more focused than before.

Here’s how I explained this shift in my article:

“A useful metaphor here is a car. Pre-training can be said to produce the vehicle; post-training soups it up. [AI researchers had] predicted that as you expand the pre-training process you increase the power of the cars you produce; if GPT-3 was a sedan, GPT-4 was a sports car. Once this progression faltered, however, the industry turned its attention to helping the cars that they’d already built to perform better.”

The result was a confusing series of inscrutably named models—o1, o3-mini, o3-mini-high, o4-mini-high—each with bespoke post-training upgrades. These models boasted widely publicized gains on specific benchmarks, but no longer the large leaps in practical capabilities we once expected. “I don’t hear a lot of companies using AI saying that 2025 models are a lot more useful to them than 2024 models, even though the 2025 models perform better on benchmarks,” Gary Marcus told me.

The post-training approach, it seems, can lead to incrementally better products, but not the continued large leaps in ability that would be necessary to fulfill the tech CEOs’ more outlandish predictions.

None of this, of course, implies that generative AI tools are worthless. They can be very cool, especially when used to help with computer programming (though maybe not as much as some thought), or to conduct smart searches, or to power custom tools for making sense of large quantities of text. But this paints a very different picture from one in which AI is “better than humans at almost all intellectual tasks.”

For more details on this narrative, including a concrete prediction for what to actually expect from this technology in the near future, read the full article. But in the meantime, I think it’s safe, at least for now, to turn your attention away from the tech titans’ increasingly hyperbolic claims and focus instead on things that matter more in your life.
