Cloud or Edge: Where Should AI Inference Workloads Run?
We expect that within 2-3 years, 85% of enterprise AI workloads will be inference-based, rather than the current predominance of training workloads. This shift will have significant implications for how organizations run AI workloads and where they deploy the resources to run them. Most current training workloads require high-end, specialized GPUs and run in the cloud at hyperscalers, which offer quick time to implementation, scalable compute resources, broad software and model availability, and ease of deployment. Most enterprise AI workloads running today are still experimental and/or small scale. As AI moves to production-level, inference-based solutions, high-end GPUs become less important and standard server SoCs become more appropriate. Many variables must be evaluated to pick the best infrastructure for these workloads.
In the chart below, we examine several factors that should be evaluated to determine whether production inference-based AI workloads are best run in a centralized cloud environment or in an Edge deployment localized to the users of the solution. We indicate which option we expect to hold the advantage for enterprises deploying production systems, especially those seeking maximum productivity and security at minimal total cost of ownership. Each organization is unique, but we believe these generalized guidelines are a valid place to start.
Figure 1: Cloud vs. Edge Deployment Evaluation of AI Inference Workloads
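As a rough illustration of how such an evaluation might be operationalized, the sketch below scores a workload against weighted criteria and returns a recommendation. The criteria names, weights, and scores are hypothetical placeholders for illustration only, not the actual factors or conclusions from the chart above.

    # Illustrative sketch only: the criteria, weights, and scores below are
    # hypothetical placeholders, not the actual factors from Figure 1.

    # Weight each criterion; score each from the Edge's perspective:
    # +1 favors Edge, -1 favors Cloud, 0 is neutral.
    CRITERIA_WEIGHTS = {
        "latency_sensitivity": 3,   # real-time inference favors local Edge compute
        "data_privacy": 3,          # sensitive data favors keeping inference on-prem
        "total_cost_of_ownership": 2,
        "burst_scalability": 2,     # elastic demand spikes favor hyperscaler capacity
        "model_update_cadence": 1,  # frequent model redeploys favor Cloud tooling
    }

    def recommend_deployment(scores: dict) -> str:
        """Return 'Edge' or 'Cloud' from weighted criterion scores (+1/0/-1)."""
        total = sum(CRITERIA_WEIGHTS[name] * score for name, score in scores.items())
        return "Edge" if total > 0 else "Cloud"

    # Example: a latency-sensitive, privacy-heavy production inference workload.
    example = {
        "latency_sensitivity": 1,
        "data_privacy": 1,
        "total_cost_of_ownership": 1,
        "burst_scalability": -1,
        "model_update_cadence": 0,
    }
    print(recommend_deployment(example))  # -> Edge

In practice, an organization would replace these placeholder weights with its own priorities drawn from the criteria in the chart; the point is that the decision is a weighted trade-off, not a single-factor choice.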
Bottom Line: We have outlined a number of evaluation criteria in the chart above for determining whether Cloud (including hybrid or remote cloud) or Edge deployment of AI inference workloads is the better alternative. Each organization may have different requirements, and these are guidelines, but for many enterprises, deploying AI inference workloads on standard systems at localized Edge computing resources provides a much better solution for price, performance, and security/privacy.
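To make the price dimension concrete, here is a back-of-envelope comparison. Every figure below is an illustrative assumption, not measured or quoted pricing; real costs vary widely by vendor, region, utilization, and hardware.

    # Back-of-envelope TCO sketch. All prices are illustrative assumptions,
    # not vendor quotes; substitute your own figures.

    HOURS_PER_MONTH = 730

    # Assumed cloud cost: renting a GPU instance around the clock.
    cloud_gpu_hourly = 3.00                      # $/hr, hypothetical on-demand rate
    cloud_monthly = cloud_gpu_hourly * HOURS_PER_MONTH

    # Assumed edge cost: a standard inference-capable server amortized over
    # 3 years, plus an assumed monthly power/maintenance figure.
    edge_server_capex = 15_000                   # $, hypothetical hardware cost
    edge_monthly = edge_server_capex / 36 + 150  # amortization + opex assumption

    print(f"Cloud: ~${cloud_monthly:,.0f}/mo  Edge: ~${edge_monthly:,.0f}/mo")
    # With these assumptions, an always-on inference workload favors Edge;
    # a lightly utilized or bursty workload could easily flip the result.

The design point here is utilization: steady, always-on inference amortizes owned Edge hardware well, while intermittent or bursty demand favors paying for Cloud capacity only when it is used.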
Copyright © 2025 J.Gold Associates, LLC.
J.Gold Associates provides advisory services, syndicated research, strategic consulting, and in-context analysis to help its clients make important technology choices, improve product deployment decisions, and shape go-to-market strategies. Sign up to receive our newsletter on LinkedIn, and join our mailing list to receive updates on our research and overviews of our reports. Email us at: info (at) jgoldassociates (dot) com
Comment: Interesting analysis, Jack. The only one I differ on is Flexibility. Given the different types of workloads the device edge, on-prem edge, or network/cell edge encounter, I'd posit that the edge offers more flexibility of heterogeneous AI compute than the GPU-centric cloud. Unless you had a different connotation for flexibility.