Enterprise Datacenters Only Use 56% of Their Capacity

I wrote about Future Facilities on the day we acquired them. See my post, Cadence Acquires Future Facilities, a Pioneer in Datacenter Digital Twins.

I was recently on a Zoom call with Hassan Moezzi, until recently CEO of Future Facilities, now presumably Vice-President of something at Cadence. I asked him about how Future Facilities presents itself to potential customers. He told me what seemed an unbelievable statistic. According to the 451 Global Digital Infrastructure Alliance:

Enterprise datacenters only use 56% of their capacity

That just struck me as something with enormous financial impact: $100 bills lying on the ground waiting to be picked up. An average datacenter is, say, 100,000 square feet at $1,000 per square foot, or $100 million. So for every three datacenters built in the world, with better utilization only two are really required, saving (gulp) $100 million.
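Spelling out that back-of-envelope arithmetic in a few lines of Python (the square footage and cost per square foot are the same round numbers as above, not real project data):

```python
# Back-of-envelope check on the utilization arithmetic.
# Assumptions (round numbers from the post): 100,000 sq ft per datacenter
# at $1,000 per sq ft, and 56% average utilization of installed capacity.

SQ_FT_PER_DC = 100_000
COST_PER_SQ_FT = 1_000        # dollars
UTILIZATION = 0.56

cost_per_dc = SQ_FT_PER_DC * COST_PER_SQ_FT      # $100M per datacenter

# Three datacenters at 56% utilization carry 1.68 datacenters' worth of load.
load_carried = 3 * UTILIZATION

# Two datacenters could carry that same load at 84% utilization each,
# so the third building -- roughly $100M -- is avoidable.
utilization_needed_for_two = load_carried / 2

print(f"Cost per datacenter:        ${cost_per_dc / 1e6:.0f}M")
print(f"Load carried by three DCs:  {load_carried:.2f} DCs' worth")
print(f"Utilization needed for two: {utilization_needed_for_two:.0%}")
print(f"Saving per three built:     ~${cost_per_dc / 1e6:.0f}M")
```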

There are three aspects to putting equipment into a datacenter: physical, electrical power, and thermal/cooling.

  • Physical is generally not a big issue since everyone knows how big every box is, and whether there is empty space. Of course, there is a "Tetris" problem in that the blocks don't always fit together perfectly, but in a big datacenter that is fairly minimal.
  • Electrical power is generally not a big issue since everyone also knows how much power provisioning is required for each rack/box, and it is fairly easy to measure. That works at both the small scale ("how much power is this rack using?") and the large scale ("how much power is the whole datacenter using?").
  • Thermal is the big challenge. The reason datacenters don't get close to their capacity is the perceived risk that adding more equipment might cause thermal/cooling issues, resulting either in failures of individual units or in dramatic events where the failure of, say, a cooler triggers further failures that cascade.

On the first two points, it is straightforward to determine whether additional equipment can be added to a datacenter without running out of space or power. But the third one is a problem: at the thermal level, predicting in advance whether there will be an issue is much harder. The electrical power provisioned for a box translates into some amount of heat that has to be removed. But the provisioned power is typically higher than even the maximum power the box ever draws, let alone anything close to its average. IT managers are risk-averse, so the "solution" of putting the equipment into the datacenter and discovering whether or not there is a thermal problem is not really an option. That's the equivalent, in the EDA world, of taping out the chip to avoid having to do more verification. Airflow challenges are made worse by the fact that air is invisible, and heat even more so.
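To make the contrast concrete, here is a minimal sketch of those three checks for one hypothetical rack. Every number in it (the free rack units, the 12 kW budget, the 60% ratio of measured peak to nameplate) is an illustrative assumption, not measured data:

```python
# Sketch: space and power checks are simple bookkeeping; heat is not.
# All numbers below are illustrative assumptions for one hypothetical rack.

rack_units_free   = 42          # empty U positions available
rack_units_needed = 20          # size of the proposed equipment

power_budget_kw    = 12.0       # power provisioned for the rack
nameplate_power_kw = 10.5       # sum of the equipment's nameplate ratings
measured_peak_kw   = 0.6 * nameplate_power_kw   # real peak draw is usually
                                                # well below nameplate

# The easy checks: do we have the space and the provisioned power?
fits_physically   = rack_units_needed <= rack_units_free
fits_electrically = nameplate_power_kw <= power_budget_kw

# The hard question: essentially all of the *actual* draw becomes heat,
# but where that heat goes depends on airflow through the whole room,
# which a per-rack subtraction cannot tell you.
heat_load_kw = measured_peak_kw

print(f"Physical check passes:   {fits_physically}")
print(f"Electrical check passes: {fits_electrically}")
print(f"Heat to remove at peak:  {heat_load_kw:.1f} kW "
      f"(vs {nameplate_power_kw:.1f} kW if cooling were sized to nameplate)")
```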

Let's make an example more concrete. Assume you want to add a new rack full of Dell compute servers with a top-of-rack router. The rack will affect airflow in the entire datacenter by blocking some of it, and it will also create heat. The big question for the IT staff is whether they can guarantee that the Dell server boards and the router will get enough air, at a low enough temperature, to meet their specifications, and that the extra heat won't adversely affect any of the other equipment already in the datacenter.
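To get a feel for the numbers, the standard sensible-heat relationship for air (heat = mass flow × specific heat × temperature rise) gives the airflow a rack needs just to carry its own heat away. The 8 kW load and 12 K allowed temperature rise below are assumptions for illustration only:

```python
# How much airflow does one hypothetical rack need to carry its heat away?
# Sensible heat: Q = m_dot * cp * dT, with m_dot = rho * volumetric_flow.

RHO_AIR = 1.2      # kg/m^3, air density near sea level at ~20 C
CP_AIR  = 1005.0   # J/(kg*K), specific heat of air

heat_load_w  = 8_000.0   # assumed rack heat load, W
allowed_dt_k = 12.0      # assumed rise from inlet to exhaust, K

# Required volumetric airflow through the rack:
flow_m3_per_s = heat_load_w / (RHO_AIR * CP_AIR * allowed_dt_k)
flow_cfm      = flow_m3_per_s * 2118.88   # cubic feet per minute

print(f"Required airflow: {flow_m3_per_s:.2f} m^3/s (~{flow_cfm:.0f} CFM)")
# Knowing that one rack needs roughly 1,200 CFM is the easy part. Whether
# the room can actually deliver that air at a low enough inlet temperature,
# without starving or reheating neighboring racks, needs a room-level model.
```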

In traditional Cadence EDA, you often face a choice between excessive pessimism and accuracy. It is the same here: either you use handwaving arguments (or, more likely, Excel spreadsheets) to convince yourself you have plenty of cooling/thermal margin, or you do the modeling accurately.

Future Facilities

It is at this point that Future Facilities comes in with its datacenter digital twins. 6SigmaDCX, its datacenter product, can analyze the implications of building out a datacenter, or of making changes to a datacenter over time. There are two parts to the technology. First, there are models of pretty much anything you might want to put into a datacenter: individual server boards, network switches, chillers, and so forth. Second, there is analysis of thermal movement within the datacenter, which comes mostly from air movement but can also come from liquid-cooled equipment. Of course, a rack of equipment can affect both aspects: it creates heat inside the rack, and it also blocks the flow of air through the datacenter, or changes it through its intakes and exhausts.

Using 6SigmaDCX allows you to give an accurate answer to the Dell server rack example I posed above. The Excel spreadsheets may not be convincing enough for the datacenter manager to allow the rack into the building, hence the 56% number we started with. But 6SigmaDCX will often show that there is plenty of thermal and airflow headroom, pushing that 56% number up. I'm not naive enough to think that each percentage point of utilization gained in a $100M datacenter is worth a full 1% of its cost, but I do know it is not nothing, or even close.



Comments

Mark Lohry, NASA Langley Computational Aerosciences (2 years ago):

An implicit assumption here seems to be that 100% utilization is the goal or desirable. What's the cost of a developer sitting on their hands because they're unable to use datacenter allocations, or of an engineering simulation you need to run being delayed?

Great technology
