On-premises AI enterprise workloads? Infrastructure, budgets starting to align
Sponsored by Hitachi Vantara
On-premises enterprise AI workloads are getting more attention as technology giants bet that enterprise demand will take off in 2025, driven by data privacy, competitive advantage, and budgetary concerns.
The progression of these enterprise AI on-premises deployments remains to be seen, but the building blocks are now in place.
To be sure, the generative AI buildout so far has been focused on hyperscale cloud providers and companies building large language models (LLMs). These builders, many of them valued at more than a trillion dollars, are funneling that spending to another trillion-dollar giant: Nvidia. That GPU reality is a nice gig if you can get it, but HPE, Dell Technologies, and even the Open Compute Project (OCP) are thinking ahead toward on-prem enterprise AI.
During HPE's AI day, CEO Antonio Neri outlined the company's market segments including hyperscalers and model builders. "The hyperscaler and model builders are training large language AI models on their own infrastructure with the most complex bespoke systems. Service providers are providing the infrastructure for AI model training or fine-tuning to customers so they can place a premium on ease and time to deployment," said Neri.
Hyperscalers and model builders are a small subset of customers, but can have more than 1 million GPUs ready, added Neri. The third segment is sovereign AI clouds to support government and private AI initiatives within distinct borders. Think of these efforts as countrywide on-prem deployments.
The enterprise on-premises AI buildout is just starting, said Neri. Enterprises are moving from "experimentation to adoption and ramping quickly." HPE expects the enterprise addressable market to grow at a 90% compound annual growth rate, representing a $42 billion opportunity over the next three years.
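As a rough sanity check on that forecast (and assuming the $42 billion figure is the year-three market size rather than a cumulative total, which the statement leaves ambiguous), a 90% CAGR implies today's enterprise addressable market is around $6 billion:

```python
# Back out the implied current market size from HPE's forecast.
# Assumption: $42B is the market size three years out, not a cumulative total.
cagr = 0.90
target_billions = 42
years = 3
implied_base = target_billions / (1 + cagr) ** years
print(f"Implied current market: ~${implied_base:.1f}B")  # ~$6.1B
```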
Neri said:
"Enterprises must maintain data governance, compliance, security, making private cloud an essential component of the hybrid IT mix. The enterprise customer AI needs are very different with a focus on driving business productivity and time to value. Enterprises put a premium on simplicity of the experience and ease of adoption. Very few enterprises will have their own large language AI models. A small number might build language AI models, but typically pick a large language model off the shelf that fits the needs and fine-tune these AI models using their unique data."
Neri added that these enterprise AI workloads are occurring on-premises or in colocation facilities. HPE is targeting that market with an integrated private cloud system with Nvidia and now AMD.
HPE's Fidelma Russo, GM of Hybrid Cloud and CTO, said enterprises will look to buy AI systems that are essentially "an instance on-prem made up of carefully curated servers, networking and storage." She highlighted how HPE has brought LLMs on-premises for better accuracy and training on specific data.
These AI systems will have to look more like hyper-converged systems that are plug-and-play because enterprises won't have the bandwidth to run their own infrastructure and don't want to pay cloud providers so much. These systems are also likely to be liquid-cooled.
Neil MacDonald, EVP and GM of HPE's server unit, also walked through the challenges enterprises face in standing up on-prem AI infrastructure.
Dell Technologies' recent launches of AI Factories with Nvidia and AMD highlight how enterprise vendors are looking to provide future-proof racks that can evolve with next-generation GPUs, networking, and storage. These racks obviously appeal to hyperscalers and model builders but play a bigger role by giving enterprises confidence that they aren't signing up for a never-ending upgrade cycle.
To that end, the Open Compute Project (OCP) added designs from Nvidia and various vendors to standardize AI clusters and the data centers that host them. The general idea is that these designs will cascade down to enterprises looking toward on-premises options.
George Tchaparian, CEO at OCP, said the goal of creating a standardized "multi-vendor open AI cluster supply chain" is that it "reduces the risk and costs for other market segments to follow."
Rest assured that the cloud giants will be talking about on-premises-ish deployments of their clouds. At the Google Public Sector Summit, the company spent time talking to agency leaders about being the "best on-premises cloud" for workloads that are air-gapped, separated from networks, and can still run models. Oracle’s partnership with all the big cloud providers is fueled in part by being a bridge for workloads that can’t go to the public cloud.
The cynic in me would dismiss these on-premises AI workload mentions and assume everything will go to the cloud. But there are two realities that make me more upbeat about on-prem AI: data governance keeps plenty of workloads out of the public cloud, and the accounting increasingly favors owned infrastructure.
The first item is relatively obvious, but the accounting one is more important. On a Constellation Research client call about the third-quarter AI budget survey, there was a good bit of talk about the stress on enterprise operating expenses.
Simply put, the last two years of generative AI pilots have taken budget from other projects that can't necessarily be put off much longer. Given the amount of compute, storage and cloud services required for generative AI science projects, enterprises are longing for the old capital expenditure approach.
If an enterprise purchases AI infrastructure, it can depreciate those assets, smooth out expenses and create more predictable costs going forward.
The problem right now is that genAI is evolving so fast that a capital expenditure won't have a typical depreciation schedule. That's why these future-proof AI racks and integrated systems from HPE and Dell start to matter.
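To make the capex argument concrete, here's a minimal sketch comparing cumulative cloud spend against the profit-and-loss view of a depreciated on-prem purchase. All figures are hypothetical, for illustration only:

```python
# Back-of-the-envelope: cloud opex vs. on-prem capex for AI infrastructure.
# All dollar figures below are hypothetical.
CLOUD_OPEX_PER_YEAR  = 4_000_000  # hypothetical annual cloud bill for GPU capacity
ONPREM_CAPEX         = 9_000_000  # hypothetical upfront hardware purchase
ONPREM_OPEX_PER_YEAR = 1_000_000  # hypothetical power, cooling, staff
DEPRECIATION_YEARS   = 5          # typical straight-line schedule for IT gear

annual_depreciation = ONPREM_CAPEX / DEPRECIATION_YEARS

for year in range(1, DEPRECIATION_YEARS + 1):
    cloud_cumulative = CLOUD_OPEX_PER_YEAR * year
    # On the P&L, the capex shows up as a smooth depreciation expense each year
    onprem_pnl_cumulative = (annual_depreciation + ONPREM_OPEX_PER_YEAR) * year
    print(f"Year {year}: cloud ${cloud_cumulative/1e6:.1f}M "
          f"vs. on-prem P&L ${onprem_pnl_cumulative/1e6:.1f}M")
```

The crux is the DEPRECIATION_YEARS assumption: a standard five-year schedule smooths the expense nicely, but if GPU generations obsolete the hardware in two years, the on-prem math deteriorates. That is exactly the risk the future-proof rack designs are meant to reduce.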
With AI building blocks becoming more standardized, enterprises will be able to have real operating expense vs. capital expense conversations. CFOs are arguing that on-prem AI is simply cheaper. To date, generative AI has meant operating expenses that enterprises can't manage well and budgets that aren't sustainable. The bet here is that capital budgets will make more sense once the hyperscale giants standardize a bit.
Bottom line: AI workloads may wind up being even more hybrid than cloud computing.
Impressions of Google Public Sector Summit
I attended the Google Public Sector Summit in Washington, DC this week, and here are some high-level impressions.
The Google Public Sector unit is independent from Google Cloud with its own governance. Nevertheless, Google Public Sector leverages the Google Cloud stack. Here’s how Google Public Sector CEO Karen Dahut explained it: “When we came into this market, what we found was traditional gov clouds. They're walled off and lack parity. It lacks the compute scale and doesn't have resiliency. What if we made our commercial cloud available to the government by a software-defined community cloud with all of the guardrails built in? OMB came to that same conclusion independent from us.”
In many ways, Google Public Sector is like Google Cloud in that it thrives in the big data and AI layer. The public sector, however, is heavy on edge AI and use cases that can be complicated. Once those use cases are solved, though, Google Public Sector can scale to other government agencies and state and local governments. Here’s the recap, with takeaways from the top-notch customers at the conference.
From our underwriter:
Suva, Switzerland's national accident insurer, partnered with Hitachi Vantara to modernize its IT infrastructure and accelerate its digital transformation. Suva, which insures over two million employees, has deployed Hitachi's Virtual Storage Platform (VSP) 5500 to support real-time analytics, automation, and machine learning, enhancing both operational efficiency and customer service.
NOTEBOOK
• Ashwin Rangan, who has been in the CxO game for three decades at ICANN, Rockwell International, Walmart and Bank of America, has seen his share of technology cycles, and generative AI is just the latest. In an interview, Rangan, currently Managing Director of the Insight Group and BT150 member, outlined the progression of the CIO role and connected the dots between today's AI-driven technology inflection points and past innovation curves.
• Adobe launched its Firefly Video Model in beta, added Adobe GenStudio for Performance Marketing and layered generative AI features throughout its suite of products. Also see our interview with SuperNova finalist Joe Prota on how IBM implemented Adobe Firefly on the front end of the creative process to speed up ideation and iteration.
• Small modular reactors (SMRs) are becoming a craze. Amazon is investing in X-Energy Reactor Company's $500 million venture round as it becomes clear that AI factories will be increasingly tethered to nuclear reactors. Google is the latest cloud giant to tap nuclear power for its AI workloads. The company said it inked a purchase agreement to use Kairos Power SMRs to power data centers.
• Citibank reported its third-quarter earnings and gave an update on its latest transformation efforts. The company said it retired about 450 legacy applications through September year-to-date and more than 1,250 since 2022. The company also launched an operations capacity planning tool to replace more than 20 different systems to forecast processing volumes. Citibank also said it reduced data center consumption by moving to private cloud.
• Databricks broadened its partnership with Amazon Web Services in a move that will put Databricks Mosaic AI on AWS for custom models. In addition, Databricks will use AWS Trainium chips as its preferred infrastructure for model training.
• UiPath said its platform will be integrated with SAP Build Process Automation and sold as SAP Solution Extensions. The deal will give UiPath an avenue to automate processes across SAP systems as well as non-SAP systems. The move is aimed at simplifying and accelerating SAP S/4HANA Cloud migrations in addition to being an overall process automation play.
• The German-speaking SAP User Group (DSAG) said that SAP on-premises customers are being discriminated against because the software vendor is requiring that new innovations, notably generative AI, be delivered on its cloud platform.
• Tired of being asked for feedback? You’re not the only one. A QualtricsXM report found that less than a third of consumers give feedback directly and they are least likely to post something on social media. The somewhat ironic thing is that QualtricsXM is often the company asking you for feedback behind the scenes.
INSIGHTS ARCHIVE
Like getting our weekly round-up? Subscribe to Constellation Insights for consistent news and analysis on technology trends, earnings, enterprise updates and more.
Want to work with Constellation analysts? Our Sales team would love to get in touch!