OpenAI’s record-breaking test score might have cost $30,000 per puzzle
In December, OpenAI CEO Sam Altman announced that its new o3 “reasoning” model had, for the first time, achieved a winning score on the ARC-AGI benchmark, a notoriously difficult test that had stumped all prior AI models.
But that power came at a steep price. The ARC Foundation (which maintains the test) estimated that the price for the “high compute” configuration of the winning test was about $3,400 per puzzle.
But OpenAI has not yet released the computing costs of its winning tests.
Last week, the ARC Foundation updated its leaderboard of test results, and o3’s winning score was no longer on the chart:
“Only systems which required less than $10,000 to run are shown. Notably missing from this chart is o3 (high compute). For more information on this see our announcement blog post.”
The ARC Foundation thinks the actual o3 costs are more in line with the superexpensive o1-pro model, which is the most expensive in the industry. Based on the sky-high pricing of the o1-pro model, that means it may have cost up to $30,000 of computation to solve each puzzle. The ARC Foundation wrote:
“o3 pricing costs have been updated to use o1-pro pricing. We will update again once official o3 pricing is publicly available. The amount of compute was roughly 172x the low-compute configuration.”
OpenAI did not immediately respond to a request for comment.
But that power came at a steep price. The ARC Foundation (which maintains the test) estimated that the price for the “high compute” configuration of the winning test was about $3,400 per puzzle.
But OpenAI has not yet released the computing costs of its winning tests.
Last week, the ARC Foundation updated its leaderboard of test results, and o3’s winning score was no longer on the chart:
“Only systems which required less than $10,000 to run are shown. Notably missing from this chart is o3 (high compute). For more information on this see our announcement blog post.”
The ARC Foundation thinks the actual o3 costs are more in line with the superexpensive o1-pro model, which is the most expensive in the industry. Based on the sky-high pricing of the o1-pro model, that means it may have cost up to $30,000 of computation to solve each puzzle. The ARC Foundation wrote:
“o3 pricing costs have been updated to use o1-pro pricing. We will update again once official o3 pricing is publicly available. The amount of compute was roughly 172x the low-compute configuration.”
OpenAI did not immediately respond to a request for comment.