“If you, as a CIO, are not speaking with your operations and facilities teams around forecasting power requirements versus power availability, start immediately,” said Matt Kimball, VP/principal analyst for Moor Insights & Strategy. “Having lived in the IT world, I am well aware of how separate these organizations can be, where power is just a line item on a budget and nothing more. Talk to the team that’s managing power, cooling and datacenter infrastructure — from the rack out — to better understand how to use these resources most efficiently.”
It’s not just computing capacity that contributes to the cost of AI: IT needs to reexamine existing storage operations too, Kimball said.
“I would take a long look at my storage infrastructure and how to better optimize on and off prem. The infrastructure populating most enterprise datacenters is out of date and underutilized. Moving to servers that have the latest, densely populated CPUs is a first start,” he said. “Moving on-prem storage from spinning media to all flash has a higher up-front cost, but is far more energy efficient and performant. It’s easy to buy into the NVIDIA B300 or AMD MI355X craze. Or the Dell, HPE, or Lenovo AI factories. But is this much horsepower required for your AI and accelerated computing needs? Or are, say, RTX 6000 PRO GPUs good enough? They are far more affordable and about 40% of the power consumption compared with a B300.”