
Alibaba researchers devise efficient GPU pooling system, reducing GPU use 82%

Drastically reducing the number of GPUs needed to run AI models could have big consequences for the scale of huge data centers, while benefiting smaller organizations. It could also reduce demand for pricey new GPUs from Nvidia.

Jon Keegan, Matt Phillips

Researchers at Peking University and Alibaba have announced a new system that can drastically reduce GPU demand, by efficiently “pooling” computing across multiple models rather than assigning each model its own GPU.

Named “Aegaeon,” the system addresses a problem with assigning computing resources to the many AI models on the market: dedicating a set of GPUs to a specific model leaves precious processing cycles underutilized when the model is not receiving a lot of requests.

In the research paper, the authors noted that a small number of popular models, like Meta’s Llama, DeepSeek, and Qwen, dominate utilization, and 17.7% of GPUs serve only 1.35% of requests. That’s a lot of wasted GPU cycles.

The researchers use a system of “token-level auto-scaling,” which assigns computing at a granular level using tokens (the smallest unit of text an LLM processes, sometimes only a few letters) rather than at the “request” level, which might see one heavy computational task holding up the queue.
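To see why the scheduling granularity matters, here is a toy illustration in Python. It is not Aegaeon's implementation: the function names, the single simulated GPU, the "one time step per token" cost, and the round-robin policy are all assumptions made for this sketch, and it ignores the cost of switching between models, which a real system has to manage.

```python
from collections import deque

def request_level(requests):
    """Serve each request to completion before starting the next one."""
    clock = 0
    finish = {}
    for name, tokens in requests:
        clock += tokens  # assumption: one time step per generated token
        finish[name] = clock
    return finish

def token_level(requests):
    """Round-robin: generate one token for a request, then move to the next."""
    queue = deque(requests)
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        clock += 1
        remaining -= 1
        if remaining <= 0:
            finish[name] = clock  # request is done at this time step
        else:
            queue.append((name, remaining))  # back of the line
    return finish

# Hypothetical workload: a heavy request arrives just before a light one.
requests = [("long_request", 6), ("short_request", 2)]
print("request-level finish times:", request_level(requests))
print("token-level finish times:  ", token_level(requests))
```

In this toy setup the short request finishes at step 8 under request-level scheduling but at step 4 under token-level scheduling, while the long request finishes at step 8 either way.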

In Alibaba Cloud’s production tests, the Aegaeon system allowed the company to reduce GPU demand by 82%. What would normally take 1,192 GPUs, the researchers were able to do with just 213 Nvidia H20 GPUs.
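The 82% figure follows directly from the two GPU counts reported above. A quick check in Python (the variable names are just for illustration):

```python
gpus_before = 1192  # GPUs needed without pooling, per the reported test
gpus_after = 213    # Nvidia H20 GPUs needed with Aegaeon, per the reported test

# Fraction of GPUs no longer needed
reduction = (gpus_before - gpus_after) / gpus_before
print(f"GPU reduction: {reduction:.1%}")
```

That works out to roughly 82%, or about one GPU doing the work that previously took between five and six.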

The consequences of this system could be significant. If AI companies can do more with less, maybe those massive data centers running AI models don’t need to be so huge, and maybe they don’t have to find as many complicated financing schemes to pay for all those GPUs.

But this also means that smaller players could be more competitive, especially in places like China, where export controls are making the most powerful processors hard to come by.

It could also be bad news for Nvidia, though Aegaeon is built on Nvidia software. And on Monday, some analysts on Wall Street pointed to the reports on Aegaeon as a reason for the day’s weakness in some previously high-flying data center stocks.

Oracle was down sharply for the second straight session. Hard disk drive makers Seagate Technology Holdings and Western Digital — big beneficiaries of the data center trade this year — also declined, as did AI energy plays Constellation Energy and Vistra.

More Tech

Jon Keegan

Report: SpaceX planning for IPO late next year

SpaceX has told investors that it is planning for an IPO in late 2026, according to a report from The Information.

Elon Musk’s rocket company is in talks for a share sale for employees and investors that would put the company’s valuation at $800 billion, making it the world’s most valuable private company, recapturing that crown from OpenAI.

Per the report, all of SpaceX, including Starlink, would be listed as one company, rather than Starlink being spun off, an idea Musk had discussed a few years ago.


Rani Molla

Meta reignites on-again, off-again relationship with news organizations with multiple AI content licensing deals

Meta has a long and tumultuous relationship with news organizations: first flooding them with traffic, then cutting it off; declaring news a priority, then deprioritizing it in people’s feeds; even hiring its own team to curate breaking news before abruptly disbanding it.

Now it seems media companies are back in Meta’s good graces. The social media company has struck a number of content licensing deals with publishers — including USA Today, People, CNN, Fox News, and The Daily Caller — in order to use information from their articles in Meta’s AI tools, Axios reports. The company first inked an AI news deal with Reuters last year.

Meta has been integrating its AI chatbots across its suite of products, and these licensing deals, which the company reportedly plans to expand to more news organizations, will give users better access to real-time information.



Cloudflare just went down again, but apparently only for 20 minutes this time

Another day, another massive network outage taking down huge sections of the internet... and, once again, the cause of the hiccup was Cloudflare.

On Friday morning, the American IT giant reported that a change made to “how Cloudflare’s Web Application Firewall parses requests” caused its network to “be unavailable for several minutes.”

Roughly 20 minutes later, the company said that “a fix has been implemented,” which helped the stock pare its losses after shares fell as much as 6% in premarket trading, according to Bloomberg. Shares of Cloudflare are trading about 2% lower at the time of writing.

Users reported that sites including LinkedIn, Zoom, Fortnite, Shopify, and Coinbase were all made unavailable by the outage — or at least they would’ve reported that, if Downdetector weren’t also down, per The Verge. Even so, some are still seeing issues as the service supposedly gets back on its feet.

Cloudflare went down only last month, though that time the network was down for roughly three hours and took OpenAI, X, and League of Legends with it — and that incident followed in the digitally disruptive footsteps of Amazon Web Services, which saw a major outage in October lasting some 15 hours.


Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.