(Photo of blueberries: Sunlight7/Getty Images)

GPT-5: “A legitimate PhD level expert in anything” that sucks at spelling and geography

OpenAI spent a lot of time talking about how “smart” GPT-5 is, yet it failed spectacularly at tasks a second grader could handle.

Jon Keegan

Yesterday, OpenAI spent a lot of time talking about how “smart” its new GPT-5 model was.

OpenAI cofounder and CEO Sam Altman said of GPT-5: “It’s like talking to an expert, a legitimate PhD level expert in anything,” and called it “the most powerful, the most smart, the fastest, the most reliable and the most robust reasoning model that we’ve shipped to date.”

Demos showed GPT-5 effortlessly creating an interactive simulation to explain the Bernoulli effect, diagnosing and fixing complex code errors, planning personalized running schedules, and “vibecoding” a cartoony 3D castle game. The company touted benchmarks showing how GPT-5 aced questions from a Harvard-MIT mathematics tournament and got high scores on coding, visual problem-solving, and other impressive tasks.

But once the public got its chance to kick GPT-5’s tires, some cracks started to emerge in this image of a superintelligent expert in everything.

AI models are famously bad at counting the letters in the names of various berries, and GPT-5 is no exception.

I had to try the “blueberry” thing myself with GPT5. I merely report the results.


— Kieran Healy (@kjhealy.co) August 7, 2025 at 8:04 PM

Another classic failure, being bad at maps, persisted with GPT-5 when it was asked to list all the states with the letter R in their names. It even offered to map them, with hilarious results:

My goto is to ask LLMs how many states have R in their name. They always fail. GPT 5 included Indiana, Illinois, and Texas in its list. It then asked me if I wanted an alphabetical highlighted map. Sure, why not.


— radams (@radamssmash.bsky.social) August 7, 2025 at 8:40 PM

To be fair to OpenAI, this problem isn’t unique to GPT-5, as similar failures were documented with Google’s Gemini.

In an embarrassing screwup during the livestream, one presentation showed charts that appeared to have some of the same problems.

If this was all described as a “technical preview,” these kinds of mistakes might be understandable. But this is a real product from a company that’s pulling in $1 billion per month. OpenAI’s models are being sold to schools, hospitals, law firms, and government agencies, including for national security and defense.

OpenAI is also telling users that GPT-5 can be used for medical information, while cautioning that the model does not replace a medical professional.

“GPT‑5 is our best model yet for health-related questions, empowering users to be informed about and advocate for their health.”

Why is this so hard?

The reason why such an advanced model can appear to be so capable at complex coding, math, and physics yet fail so spectacularly at spelling and maps is that generative models like GPT-5 are probabilistic systems at their core — they predict the most likely next token based on the volumes of data they have been trained on. They don’t know anything, and they don’t think or have any model of how the world works (though researchers are working on that).

When the model writes its response, you see whatever has the highest score for what should come next. But with math and coding, the rules are stricter and the examples it’s been trained on are more consistent, so it has higher accuracy and can ace the math benchmarks.
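
As a toy illustration of that “highest score” idea, here’s a minimal sketch in Python. The candidate tokens and their scores are invented purely for illustration and aren’t from any real model:

```python
# Toy sketch of greedy next-token prediction. The candidate tokens and
# raw scores (logits) below are made up for illustration only.
import math

logits = {"berry": 2.1, "bird": 0.3, "print": -1.5}  # hypothetical scores after "blue"

# Softmax turns raw scores into probabilities; greedy decoding then just
# takes the highest-probability token.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}
next_token = max(probs, key=probs.get)

print(probs)       # roughly {'berry': 0.84, 'bird': 0.14, 'print': 0.02}
print(next_token)  # 'berry'
```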

But drawing a labeled map or counting the letters in a word is weirdly tough, because those tasks require skills the model doesn’t really have and has to approximate step by step from patterns, which can lead to odd results. That’s a simplification of a very complex and sophisticated system, but it applies to a lot of the generative-AI technology in use today.
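
Part of why letter counting trips these systems up is that the model never sees individual characters at all: text is chopped into subword tokens before it reaches the network. Here’s a minimal sketch using OpenAI’s open-source tiktoken tokenizer library with a GPT-4-era encoding (GPT-5’s actual tokenizer isn’t public, so the exact split is just illustrative):

```python
# Minimal sketch, assuming `pip install tiktoken`. cl100k_base is an
# encoding that ships with tiktoken; whatever subword split it produces
# for "blueberry" makes the point that the model works with chunks of
# text, not individual letters.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode("blueberry")
chunks = [enc.decode([t]) for t in token_ids]

print(token_ids)  # a short list of integer IDs
print(chunks)     # subword pieces; counting the "b"s from these is indirect
```

From the model’s point of view, a question like how many b’s are in “blueberry” is about letters it never directly observed, which is why it has to reason its way to an answer rather than simply look.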

That’s also a whole lot to explain to users, but OpenAI boils those complicated ideas down to a single warning below the text box: “ChatGPT can make mistakes. Check important info.”

OpenAI did not immediately respond to a request for comment.

More Tech


Anthropic reportedly doubles current fundraising round to $20 billion

Anthropic has doubled its current fundraising round to $20 billion on strong investor demand, according to reporting from the Financial Times. The new fundraising round would value the company at a staggering $350 billion. That’s up 91% from September, when it raised at a valuation of $183 billion.

The company reportedly received interest totaling 5x to 6x its original $10 billion fundraising goal, and it’s expected to haul in several billion more than that tally before the current round closes.

Anthropic’s success with enterprise customers and the popularity of its Claude Code product are boosting the company’s momentum as it chases the current valuation leader of the AI startup pack: OpenAI.



Amazon says it’s doubling down on opening Whole Foods stores. That sounds familiar.

The company says it’ll open 100 Whole Foods locations in the next few years. That sounds similar to plans Whole Foods’ CEO laid out in 2024 for opening 30 stores a year. Since then, it appears to have added 14, total.


One year after the DeepSeek freak, the AI industry has adjusted and roared back

A look back at how the Chinese startup shattered conventions, changed the way Big Tech thought about AI, and blew a $1 trillion hole in the stock market that got filled right back up... and then soared to new levels.


Georgia lawmakers introduce data center construction moratorium amid statewide pushback

More and more communities across the US are wrestling with the pros and cons of having a data center come to town. Georgia has become a hotspot of resistance to the data centers planned by Big Tech, according to a new report from The Guardian. The Atlanta metro area led the nation in data center construction in 2024.

Georgia state representatives introduced legislation that would place a one-year moratorium on data center construction in the state. Ten Georgia municipalities have already passed local bans on data centers.

Per the report, at least three other states have seen similar data center moratorium legislation introduced in the last week, including Maryland and Oklahoma.

