
Alibaba researchers devise efficient GPU pooling system, reducing GPU use 82%

Drastically reducing the number of GPUs needed to run AI models could have big consequences for the scale of huge data centers, while benefiting smaller organizations. It could also reduce demand for pricey new GPUs from Nvidia.

Jon Keegan, Matt Phillips

Researchers at Peking University and Alibaba have announced a new system that can drastically reduce GPU demand, by efficiently “pooling” computing across multiple models rather than assigning each model its own GPU.

Named “Aegaeon,” the system addresses a problem with assigning computing resources to the many AI models on the market: dedicating a set of GPUs to a specific model leaves precious processing cycles underutilized when the model is not receiving a lot of requests.

In the research paper, the authors noted that a small number of popular models, like Meta’s Llama, DeepSeek, and Qwen, dominate utilization, and 17.7% of GPUs serve only 1.35% of requests. That’s a lot of wasted GPU cycles.

The researchers use a system of “token-level auto-scaling,” which assigns computing at a granular level using tokens (the smallest unit of text an LLM processes, sometimes only a few letters) rather than at the “request” level, which might see one heavy computational task holding up the queue.
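To see why scheduling at the token level can keep one heavy task from holding up the queue, here is a toy sketch. It is not Aegaeon's actual scheduler: the function names, the job list, and the assumptions that each token takes one time unit and that switching between jobs is free are all illustrative simplifications.

```python
from collections import deque

def request_level(jobs):
    """Run each job to completion in arrival order; return finish time per job."""
    clock = 0
    finish = {}
    for name, tokens in jobs:
        clock += tokens  # assumption: one time unit per generated token
        finish[name] = clock
    return finish

def token_level(jobs):
    """Round-robin: generate one token for a job, then move to the next job."""
    queue = deque(jobs)
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        clock += 1          # generate a single token for this job
        remaining -= 1
        if remaining == 0:
            finish[name] = clock
        else:
            queue.append((name, remaining))  # back of the line
    return finish

# Hypothetical workload: one long request arrives ahead of two short ones.
jobs = [("long", 100), ("short_a", 3), ("short_b", 3)]
print(request_level(jobs))  # {'long': 100, 'short_a': 103, 'short_b': 106}
print(token_level(jobs))    # {'short_a': 8, 'short_b': 9, 'long': 106}
```

In this toy setup the total work is the same either way (106 time units), but the short requests finish at 8 and 9 instead of 103 and 106, while the long request finishes slightly later. A real system would also have to account for the cost of switching between jobs or models, which this sketch ignores.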

Using the Aegaeon system in Alibaba Cloud’s production tests, the company was able to reduce GPU demand by 82%. What would normally take 1,192 GPUs, the researchers were able to do with just 213 Nvidia H20 GPUs.
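The 82% figure follows directly from the two GPU counts reported. A quick check, using only those numbers (the function name is just for illustration):

```python
def reduction_pct(before, after):
    """Percentage reduction going from `before` units to `after` units."""
    return (before - after) / before * 100

gpus_before = 1192  # GPUs needed without pooling, per the article
gpus_after = 213    # GPUs needed with Aegaeon, per the article

print(f"{reduction_pct(gpus_before, gpus_after):.1f}%")  # 82.1%
```

Put another way, each GPU in the pooled setup is doing the work of roughly 5.6 GPUs in the dedicated setup (1,192 divided by 213).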

The consequences of this system could be significant. If AI companies can do more with less, maybe those massive data centers running AI models don’t need to be so huge, and maybe they don’t have to find as many complicated financing schemes to pay for all those GPUs.

But this also means that smaller players could be more competitive, especially in places like China, where export controls are making the most powerful processors hard to come by.

It could also be bad news for Nvidia, though Aegaeon is built on Nvidia software. And on Monday, some analysts on Wall Street pointed to the reports on Aegaeon as a reason for the day’s weakness in some previously high-flying data center stocks.

Oracle was down sharply for the second straight session. Hard disk drive makers Seagate Technology Holdings and Western Digital — big beneficiaries of the data center trade this year — also declined, as did AI energy plays Constellation Energy and Vistra.

More Tech


Apple reportedly delays its foldable phone to 2029 or later

Apple has pushed back the debut of its $3,000 foldable phone, part of its three-year plan to update how the iPhone looks, to 2029 or even later, Bloomberg reports. Bloomberg originally reported that the iPhone maker had hoped to release the foldable phone in 2026, but thanks to “engineering challenges tied to weight, features and display technology,” customers will have to wait a few years longer.

For what it’s worth, as is the case with its upcoming touchscreen MacBook Pro, many of Apple’s competitors, including Samsung and Google, already have foldable phones.


OpenAI has an army of ex-investment bankers making financial models to train ChatGPT

OpenAI is looking for its killer app for the business world. After all, you can only sell so many $20 monthly subscriptions to consumers, which currently account for 70% of its $13 billion in annual recurring revenue.

Bloomberg is reporting that OpenAI is beefing up ChatGPT’s financial chops to target the deep pockets of the banking industry.

According to the report, “Project Mercury” has lined up over 100 former investment bankers getting paid $150 an hour to help teach OpenAI’s models how to do the grueling work of junior bankers, including tweaking PowerPoint slides and building financial models in Microsoft Excel.


Warner Bros. Discovery just raised the price of HBO Max

Warner Bros. Discovery, which announced today it’s open to being bought, also said it’s raising prices for its HBO Max streaming subscribers.

Effective immediately for new customers and at the next renewal date for existing ones, subscribers to the ad-supported tier will pay an extra dollar a month (to $10.99) and those who don’t want ads will see prices go up $1.50 a month (to $18.49). It joins the ranks of Disney, Apple, and NBC Universal, which also recently raised prices. WBD is also reportedly cracking down on password-sharing.

Here’s how the prices of their services compare now:


Amazon aims to automate 75% of its operations and avoid hiring 600,000+ people

Amazon might be one of the few companies hiring ahead of the holiday season, but the e-commerce giant hopes to limit headcount additions in the years ahead as it leans more deeply into automation, according to The New York Times’ interviews and a survey of internal documents.

Some numbers from the report:

  • Amazon thinks robots can help it forgo hiring more than 160,000 people in the US by 2027.

  • That would mean $0.30 in savings on each item that Amazon sells.

  • The company would ultimately like to automate 75% of its operations.

  • Automation could potentially lessen its hiring of humans by more than 600,000 by 2033.

  • It expects to sell 2x as many products in 2033.

  • Currently Amazon employs 1.2 million people.

Happy holidays!

Rani Molla

Apple closes at record high for first time in 2025

After spending the day at intraday highs, Apple set an all-time closing high of $262.24 Monday, following reports of increased iPhone 17 sales and an analyst upgrade. Loop Capital raised its price target to a Street high of $315.

The stock’s previous all-time closing high was in December 2024.

Apple reports its fiscal year 2025 results later this month, and analysts expect them to show the company’s all-important iPhone sales returning to growth.


Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.