Edge AI: The Network may be less important than you think
(This is an amended and edited version of a post first published on my client Deeplite's blog here. It also incorporates learnings from a webinar I moderated for them on July 13th, with speakers from ARM, H1 and PJC Ventures - a recording is here)
Introduction
A recurring theme for my work on Edge Computing is "orders of magnitude". Depending on the company and individual involved, edge discussions span 10 or more orders of magnitude of power, distance and latency: milliwatts to megawatts, millimetres to 100s of km, femtoseconds to days. I've written various times on this, such as this post on latency.
Recently, I've been looking at the "small" end of the edge space - what is happening in terms of compute on end-devices or nearby gateways. In particular, I've looked into how AI-based applications can improve their inferencing efficiency, to the extent they may not need low-latency "realtime" network access, or cloud support, for tasks such as image/video recognition, or audio analytics and speech processing.
(Note: many people in the mobile/telecom and datacentre industries don't realise the difference between training and inference for AI, especially for deep neural-network models. Training a model needs lots of data and processing power, but is not time-critical and is only done once or a few times. Inference is the actual *use* of that model to do stuff - recognise images, interpret speech and so on.)
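For readers outside the AI world, here is a minimal PyTorch-style sketch of that split. The tiny model and synthetic data are placeholders purely to show where the heavy lifting sits - a hedged illustration, not a real workload.

```python
# Minimal sketch of the training-vs-inference split (illustrative only).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))

# --- Training: data- and compute-heavy, done once or occasionally, usually in the cloud ---
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(100):                      # many passes over (here, synthetic) data
    x, y = torch.randn(256, 64), torch.randint(0, 2, (256,))
    loss = loss_fn(model(x), y)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

# --- Inference: the trained model is used repeatedly, and can run on-device ---
model.eval()
with torch.no_grad():                     # no gradients needed; far cheaper per call
    prediction = model(torch.randn(1, 64)).argmax(dim=1)
```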
That has implications not just for the aspirations of cloud/edge providers, but also for 5G (and fixed) network traffic overall. A sizeable proportion of expected mobile data (especially uplink) is imagined to be from cameras, sensors and other sources uploading or streaming bulk content to cloud-based AI and "big data" platforms. But if most of it gets handled locally and doesn't transit the network at all, that's a meaningful shift. (This isn't new - almost 5 years ago I wrote this article on the same broad topic).
For example, it's common for 5G discussions and events to cite self-driving cars "generating 4TB of data per hour" or similar stats, as justification for roadway coverage/capacity and edge compute. Yet if 99.99% of the data stays on the vehicle, and most quick decisions are taken by "self-sufficient" models locally, then that has huge ramifications on everything from connected vehicle revenue projections, to 5G radio spectrum needs.
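A quick back-of-envelope calculation shows the scale of that shift, using the quoted 4TB/hour figure and the hypothetical 99.99% local share from the example above.

```python
# Back-of-envelope: what actually crosses the network if 99.99% of the
# vehicle's data stays local (figures from the example above, not measurements).
data_per_hour_tb = 4.0                 # quoted "4TB of data per hour"
fraction_uploaded = 0.0001             # i.e. 99.99% handled on the vehicle

uploaded_bytes = data_per_hour_tb * 1e12 * fraction_uploaded
avg_uplink_mbps = uploaded_bytes * 8 / 3600 / 1e6

print(f"Uploaded per hour: {uploaded_bytes / 1e6:.0f} MB")   # ~400 MB/hour
print(f"Average uplink:    {avg_uplink_mbps:.2f} Mbit/s")    # under 1 Mbit/s on average
```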
This also has implications for monetising low-latency capabilities, if some of the most demanding future use-cases don't need the network. Perhaps more importantly in the longer term, optimised on-device AI may require far less power and therefore be greener - a central theme of the webinar I mentioned at the start.
Speaking the same language
The semantics of Edge are partly to blame here. Words can have multiple meanings. As a result, people involved with adjacent areas of the technology industry often misunderstand each other, even when using the same terms. Each group has its own frame of reference, history and technical domain expertise.
In particular, areas of technology cross-overs and convergence are often fraught with category errors, flawed assumptions – or just poor communications. There is a significant risk that this is occurring in the area of Edge AI. At least four different groups interpret that term in very distinct ways.
For example, many professionals in the cloud and network world have no idea about what can be achieved with optimized Edge AI on devices, either currently or what is likely soon – and what that implies for their own visions of the future. The telecoms industry, in particular, appears to be at risk of missing an important "disruption from adjacency".
This article is aimed at helping these people talk to each other, better understand each other's needs and expectations - and avoid poor decisions caused by a lack of awareness of broader tech trends.
If you asked representatives of the following industries to play "word association" with the phrase "Edge AI", they might suggest very different explanations:
Deep Neural Network (DNN) specialist
Someone involved in image detection or speech analysis might mention the trends towards "model compression" or "AI optimization", with heavy, resource-consuming or slow cloud-based inferencing shrunk down to work more efficiently on a CPU, GPU or microcontroller on a device – for instance a camera or smartphone. This is "AI at the edge" for them. It may solve multiple problems, from lower latency to reduced energy consumption (and better economics) for AI.
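To make the "model compression" idea concrete, here is a minimal sketch using post-training dynamic quantization in PyTorch on a placeholder model. It is just one of several techniques (alongside pruning, distillation and neural architecture search), shown as a hedged illustration rather than any vendor's specific method.

```python
# One common compression step: convert 32-bit float weights to 8-bit integers.
import os
import torch
import torch.nn as nn

fp32_model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Post-training dynamic quantization of the Linear layers' weights.
int8_model = torch.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(model, path="tmp_model.pt"):
    torch.save(model.state_dict(), path)
    size = os.path.getsize(path) / 1e6
    os.remove(path)
    return size

print(f"FP32 model weights: {size_mb(fp32_model):.2f} MB")
print(f"INT8 model weights: {size_mb(int8_model):.2f} MB")   # roughly 4x smaller
```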
IoT System developer
Someone involved with building a connected vehicle, or the quality-control for a smart factory, might think about a local compute platform capable of combining feeds from multiple cameras and other sensors, perhaps linked to autonomous driving or closed-loop automation control. Their "edge AI" resides on an onboard server, or perhaps an IoT gateway unit of some sort.
Mobile network operator (MNO)
A telecom service provider building a 5G network may think of Edge AI both for internal use (to run the radio gear more efficiently, for instance) and as an external customer-facing platform exploiting low-latency connections. The "mobile edge" might be targeted at a connected road junction, video-rendering for an AR game, or a smart city's security camera grid. Here, "Edge AI" is entwined with the network itself – its core functions, "network slicing" capabilities, and maybe physically located at a cell-site or aggregation office. It is seen as a service rather than an in-built capability of the system.
Datacentre & cloud providers
For companies hosting large-scale compute facilities, AI is seen as a huge source of current and future cloud demand. However, the infrastructure providers often don't grasp the differences between training and inferencing, or indeed the finer details of their customers' application and compute needs. "Edge" may just mean a datacentre site in a tier-3 city, or perhaps a "mini datacentre" serving users within a 10-100km radius.
These separate visions and definitions of "Edge AI" may span as much as 10 orders of magnitude in terms of scale and power – from milliwatts to megawatts. So, unsurprisingly, the conversations would be very different – and each group would probably fail to recognize each other's "edge" as relevant to their goals.
These are not the only categories. Others include chip and module vendors, server suppliers, automation and integration specialists, cloud/edge platforms and federation enablers and so forth. Added to these are a broad array of additional "edge stakeholders" – from investors to government policymakers.
Why does this matter? Because AI applications ultimately fit into broader ecosystems, transformation projects, consumer and business products or even government policy and regulatory regimes. In most cases, all of these groups will need to organize themselves into a value chain, or at least depend on each other.
The developer perspective
Often, edge-AI market participants focus - understandably - on what they perceive as their unique capabilities, whether that is their preferred models, their physical premises, network/system speeds, or their existing customer relationships. Internally, they are looking for new revenue opportunities and use-cases to help justify their investments, as well as to gain more "customer ownership".
But the questions which don't get asked often enough are "What does the developer – and the final end-user – really value? What are their constraints? And how will that drive their decision choices, now or in the future?"
For instance, consider an application developer working on an AI-powered object recognition tool. At the moment, their product has a few problems to resolve. In particular, the response times are laggy, which reduces the effectiveness and market opportunity of the overall solution. Given the round-trip time of video images to and from the cloud, plus the significant processing load and inference time, they can only get one reliable response per second, and the implied cost means it's only suitable for certain high-value tasks.
That's fine for monitoring crowds and lost property in a railway station – or detecting a particular parasitic beetle on a crop-leaf – but isn't useful for spotting defects on a fast production-line conveyor or to react to a deer jumping in front of an autonomous vehicle.
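A rough latency-budget sketch illustrates why the cloud round-trip caps throughput at roughly one response per second. Every millisecond figure below is an assumption for illustration, not a measurement.

```python
# Rough latency budget for the cloud round-trip scenario described above.
frame_encode_ms    = 50     # compress/package the image on the device
uplink_ms          = 150    # send frame to the cloud (size- and network-dependent)
cloud_inference_ms = 500    # queueing + large-model inference in the cloud
downlink_ms        = 100    # return the result
app_overhead_ms    = 100    # auth, serialisation, application logic

total_ms = (frame_encode_ms + uplink_ms + cloud_inference_ms
            + downlink_ms + app_overhead_ms)
print(f"End-to-end: {total_ms} ms -> ~{1000 / total_ms:.1f} reliable responses/second")
# By contrast, a compressed on-device model at, say, 30-50 ms per inference
# could sustain 20-30+ responses/second with no network dependency.
```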
They may also need to adapt the model. For instance, a security camera picks up "false positives" because it's not just shoplifters who spend a long time in one aisle - shelf-stackers do too. But staff all have a trolley or an orange uniform, so the model can be retrained to ignore them.
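As a toy illustration of handling such false positives, here is a hypothetical post-processing filter. In practice the developer might instead retrain or fine-tune the model on labelled staff examples; the detection fields below (attributes, dwell time) are invented for the sketch and do not correspond to any real product's API.

```python
# Hypothetical filter: suppress "loitering" alerts for people who look like staff.
def should_alert(detection: dict) -> bool:
    staff_markers = {"trolley", "orange_uniform"}
    if staff_markers & set(detection.get("attributes", [])):
        return False                          # likely a shelf-stacker, not a shoplifter
    return detection.get("dwell_time_s", 0) > 120

detections = [
    {"attributes": ["orange_uniform"], "dwell_time_s": 300},   # staff member
    {"attributes": [], "dwell_time_s": 240},                   # unknown person
]
alerts = [d for d in detections if should_alert(d)]
print(len(alerts))   # -> 1: only the second detection triggers an alert
```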
For the next version of their product (or a model update), they have a range of different improvement and optimization paths they could pursue - for example, compressing the model so inference can run on the device or a nearby gateway, paying for more (or faster) cloud compute, or relying on lower-latency connectivity to the cloud.
However, latency is not the only criterion to optimize for. In this example scenario, the developer's cloud-compute costs are escalating, and they are facing ever more questions from investors and customers about privacy and CO2 footprint. These bring additional trade-offs to the decision process. (The actual CO2 footprint of anything is horribly complex to estimate - you need to factor in the sources of power as well as the demand for it. Bear in mind too that battery manufacture has its own CO2 cost, so ambient energy "harvesting" may be better still for local on-device compute.)
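To show the shape of that energy trade-off, here is a deliberately crude back-of-envelope comparison. Every number is an assumption; real figures vary enormously with hardware, radio conditions, datacentre efficiency and the electricity mix at each end.

```python
# Illustrative energy comparison per image analysed (all numbers assumed).
ON_DEVICE_J_PER_INFERENCE = 0.05      # compressed model on an efficient NPU/MCU
RADIO_J_PER_MB            = 1.0       # energy to push data over a cellular uplink
IMAGE_SIZE_MB             = 0.5
CLOUD_J_PER_INFERENCE     = 2.0       # large model, incl. datacentre overhead (PUE)

cloud_path_j  = IMAGE_SIZE_MB * RADIO_J_PER_MB + CLOUD_J_PER_INFERENCE
device_path_j = ON_DEVICE_J_PER_INFERENCE

print(f"Cloud path : {cloud_path_j:.2f} J/image")
print(f"On-device  : {device_path_j:.2f} J/image")
print(f"Ratio      : {cloud_path_j / device_path_j:.0f}x")
# The comparison can flip once you account for battery charging losses, battery
# manufacture and how green the grid is at each end - hence "horribly complex".
```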
Indeed, at a high level, there are numerous technical and practical constraints involved, such as:
- latency and response-time requirements of the application
- compute, memory and power available on the device or gateway
- connectivity cost, coverage and reliability
- cloud-compute and data-transport costs
- privacy, security and data-sovereignty requirements
- model accuracy, and the cost of errors or false positives
- energy consumption and CO2 footprint
Looking through this list – and also considering all the other AI-related tasks, from audio/speech analysis to big-data trend analysis for digital twins – there is no singular "answer" to the best approach to Edge AI. Instead, it will be heavily use-case-dependent.
Also, clearly not all Edge/cloud/wireless applications are about AI either - many may relate to legal requirements for data collection, closed-loop automation, or device-to-device communications and analysis.
The implications of on-device AI and model compression
There are numerous approaches to optimizing AI models, both for server-side compute and for on-device deployment. From the previous discussion, it can be seen that if localized inferencing becomes more feasible, it will likely expand to many use-cases - especially those that can run independently on single, standalone devices. This has potentially significant benefits for AI system developers - but also less favorable implications for cloud and low-latency network providers.
Consider something intensely private, such as a bedside audio analyzer that detects sleep apnoea, excessive snoring and other breathing disorders. The market for such a product could expand considerably if it came with a guarantee that personal data stayed on-device rather than being analyzed in the cloud. The model could be trained in the cloud, but inferencing performed at the edge. If appropriate, it could share results with medical professionals and upload raw data later if the user permitted it - but local processing would be a strong selling point initially.
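Here is a minimal sketch of that "private by default" pattern. The capture, classification and upload functions are hypothetical stand-ins invented for illustration, not a real SDK.

```python
import random

# --- hypothetical stand-ins for a real audio pipeline and on-device model ---
def capture_audio(window_seconds=30, windows=4):
    for _ in range(windows):
        yield [random.random() for _ in range(16000 * window_seconds // 100)]

def classify_breathing(window):
    # placeholder for a compressed model's on-device inference call
    return random.choice(["normal", "heavy_snoring", "apnoea"])

def upload(endpoint, payload):
    print(f"Uploading {len(payload)} event summaries to {endpoint}")

def monitor_sleep(user_consents_to_upload=False):
    nightly_events = []
    for window in capture_audio():                    # raw audio stays in device RAM
        label = classify_breathing(window)            # inference happens locally
        if label in {"apnoea", "heavy_snoring"}:
            nightly_events.append({"label": label})   # keep only event metadata
    if user_consents_to_upload:                       # results leave only with consent
        upload("clinician_endpoint", nightly_events)
    return nightly_events

events = monitor_sleep(user_consents_to_upload=False)
```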
Yet when I regularly speak to representatives of the datacentre and telecoms worlds, especially in connection with new network types such as 5G, there is very little awareness or understanding of the role of on-device compute or AI – or how rapidly it is evolving, with improvements in processor hardware or neural network optimization.
Even in more camera-centric telecoms sectors such as videoconferencing, there seems to be little awareness of a shift back from the cloud to the edge (or of exactly where that edge is). There has been some recent recognition of the conflicts between end-to-end encryption and AI-driven tasks such as background blurring and live audio-captioning - but that is just one of the trade-offs that may be shifting.
Conclusions
The shift to Edge AI has huge possible benefits for developers and IoT providers. But it may have some negatives for 5G, edge-cloud and other connectivity-oriented specialists, at least for some of their target use-cases. I think we'll see distinctions between use-cases where inference runs entirely on the device, those that need a local gateway or on-premise server, and those that genuinely benefit from network-edge or centralised cloud resources.
The medium-term issues that seem to be underestimated are around energy budgets and privacy. If model compression and on-device Edge AI can prove not just "greener" in terms of implied CO2 footprint, but also reduce the invasiveness of mass data-collection in the cloud, then it may be embraced rapidly by many end-user groups. It may also catch the attention of policymakers and regulators, who currently have a very telecom/cloud-centric view of edge computing.
Despite this shift, it is important not to exaggerate the impact on the wider cloud and network market. This changes the calculus for some use-cases (especially real-time analysis of image, video and similar data flows) – but it does not invalidate many of the broader assumptions about future data traffic and value of high-performance networks, either wireless or wired.
But again, there's a "semantics" issue to resolve here. Poor communication is often at the root of poor assumptions.
When all participants in the market understand each other's language and technology trajectories, we should see fewer poor assumptions and less unrealistic hype. There are huge advances occurring across the board - from semiconductors to DNN optimization to network performance. But no single one is an all-purpose hammer - they are tools in a developer's toolkit.
#edgeAI #edgecomputing #cloud #5G #neuralnetworks #machinevision #deeplearning #IoT #imagerecognition #voiceanalytics #video #camera #AI
________________________________________________________________________
Interested in the topic and want to learn more? I specialise in this type of cross-silo, big-picture view of technology trends, especially where they intersect with wireless connectivity in some fashion. Please get in touch with me, either for internal advisory / brainstorming work, or external communications such as events, webinars and publications. (The sponsor of the original blog and webinar is Deeplite AI - drop 'em a line about Edge AI in particular, and please mention I sent you)
A really good article! These misunderstandings are kind of amusing (but sad). Just want to make two points: 1) Normal solution providers want to depend on as little as possible. Being dependent on a "hidden" network is just a big risk. That points towards placing solutions in devices rather than networks. 2) BUT if the devices are battery powered, the effective PUE is around 6-7. It comes from inefficient charging, due to the priority on energy density and charging time. This means you can consume 3-5 times more energy running the inference in a more efficient DC and still be on par (not counting the energy consumed sending the data).
This is a brilliant article Dean. These walled gardens exist everywhere and always lead to a massive waste of money, time and focus. When working for 3GPP/WiFi companies, I noticed high walls between RAN, core and OSS/BSS. Within each area, say core, you find more walls, with people who have spent 20-30 years working only with SS7, PCRF or AAA (but who can't understand a traced call flow and have no clue what impacts the user experience e2e). I also worked with integration & verification of RAN features for a while. None of the testers were using the customer OSS tools - everyone used a CLI tool created by a RAN guy. Just one of many examples. Then, when working with IoT, you realize the exact same problem exists in cities, buildings, companies, hospitals and enterprises. Walled gardens. A lot of people are insanely good at narrow tasks; very few are broad and understand the full picture. I believe this is a main reason why there's so much bad IT out there. I often recommend younger students to aim for the training/education part of companies when looking for jobs. That's probably the best place to start if you want to build deep and broad know-how.
Great read Dean Bubley, and good work in building bridges between domains. You mention AR there briefly. Still some years down the road for sure... but wouldn't many more advanced AR applications (e.g. gaming) be multi-party, multi-device, bidirectional and low-latency in a way that isn't suitable for on-device execution only, nor for centralized cloud execution? Not only an AI use case (though that could be part of it). Not really a today use case except for niche uses - but potentially a very big one eventually?
Power consumption and large computing power at the edge can be addressed by the ARM architecture, which is much more energy efficient. And those chips can be paired with CUDA cores, which translates into insane AI performance at the edge. But I see the largest problem in potential customers' mindsets and legacy applications. I think the future of edge applications is containerized or serverless workloads running on energy-efficient hardware architectures (meaning ARM, or maybe soon competitive RISC-V-based products). Customers will have to rearchitect, or build edge-native solutions from scratch. That is the largest challenge I see today for wider adoption.
Certainly, improving the communicability of concepts between those with IT backgrounds and those from telco is going to be important.