AI at the Edge
For a recent Meetup in Milan, I was asked to imagine what deploying AI at the Edge of the Internet would mean. I only had a few days to build an effective story, specifically on how the advertising market can benefit from extreme personalization of content.
As I always tend to approach a topic from its architecture, I started studying the subject by looking at Deep Learning and LLM architecture diagrams.
I won’t go into details about what DL and LLMs are, but while researching the topic I noticed that most AI technologies have one thing in common: they rely on massively parallel computation to produce their results. GPUs became the preferred vehicle for AI models because of their ability to run nearly identical operations simultaneously on many data samples. With the growth in the size of training data sets, the massive parallelism available in GPUs proved indispensable. You can find many good resources on this online.
For example, for GPT-3, studies indicate that training took 34 days on the equivalent of 1,024 GPUs, at an approximate cost of $4.6M in compute alone.
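To make the parallelism argument concrete, here is a minimal sketch (my own illustration, not taken from those studies) that times the same large matrix multiplication, the core operation behind DL and LLM workloads, on a CPU and, if one is available, on a GPU with PyTorch:

```python
import time
import torch

# A single large matrix multiplication, the workhorse of neural networks
x = torch.randn(4096, 4096)
w = torch.randn(4096, 4096)

# Run it on the CPU
t0 = time.time()
y_cpu = x @ w
print(f"CPU matmul: {time.time() - t0:.3f}s")

# Run the same operation on a GPU, if present
if torch.cuda.is_available():
    xg, wg = x.cuda(), w.cuda()
    torch.cuda.synchronize()  # wait for transfers before timing
    t0 = time.time()
    y_gpu = xg @ wg
    torch.cuda.synchronize()  # GPU calls are async; wait for completion
    print(f"GPU matmul: {time.time() - t0:.3f}s (first call includes warm-up)")
```

The GPU wins because the thousands of independent multiply-accumulate operations in the matmul map directly onto its thousands of cores.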
Given the importance of the GPU, Akamai recently announced its plan to roll out cloud infrastructure and services powered by NVIDIA, based on the NVIDIA RTX 4000 Ada Generation GPU. Beyond video processing, this infrastructure can be used for generative AI and machine learning.
I then started reading up on the concept of Edge AI, and I found this interesting article from Nvidia. They state that “Edge AI is the deployment of AI applications in devices throughout the physical world. It’s called Edge AI because the AI computation is done near the user at the edge of the network, close to where the data is located, rather than centrally in a cloud computing facility or private data center.”
Their article is interesting because it also contains some examples of Edge AI use cases. Of course, the way NVIDIA defines the Edge is tied to the deployment of their GPUs/systems in specific locations or devices, generally closer to where the users are.
In the article we also start to see the difference between training and inference: training is the process of learning and optimizing a model from data; when you present a trained model with a problem and it gives you an answer, that’s called inference.
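To make the distinction concrete, here is a minimal sketch (my own illustration, using scikit-learn and synthetic data) showing training followed by inference:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training: learn model parameters from labelled data (expensive, done once)
X = np.random.rand(1000, 4)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # synthetic labels
model = LogisticRegression().fit(X, y)

# Inference: present the trained model with a new sample and get an answer
# (cheap, done once per request)
sample = np.array([[0.9, 0.4, 0.1, 0.7]])
print(model.predict(sample))        # predicted class
print(model.predict_proba(sample))  # class probabilities
```

Training and inference have very different cost profiles, which is exactly why the question of where to run each one matters.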
Good, we introduced some concepts, and now we should ask ourselves: Where should I place AI computing? Should I place it in centralized systems or at the edge on specific devices? It depends.
AI in the centralized cloud (or data center) is a good fit for very heavy tasks like training AI models on GPUs. However, it can be expensive, and it is not ideal for inference tasks, because we would face added latency (the round trip from client to server) and minimal offload benefit.
Pure inference, on the other hand, can also run at the Edge directly on end-user devices, but this brings its own challenges: inconsistent and underpowered hardware, over-downloading of model bytes to every device, and, in most cases, the need to control the specific hardware.
But what if we introduce the Akamai Edge, placed in many locations around the world, closer to the concentrations of users?
We can achieve the best of both approaches: inference that runs close to users (low latency), on consistent hardware that we control, without pushing model bytes down to every device.
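To put rough numbers on the intuition (the figures below are illustrative assumptions, not measurements), compare the latency budget of a centralized deployment with an edge PoP close to the user:

```python
# Back-of-envelope latency budget (illustrative numbers, not measurements).
# Assumes ~50 ms of model compute; RTTs are rough public-internet figures.
rtt_central_ms = 120   # client -> centralized cloud region -> client
rtt_edge_ms = 15       # client -> nearby edge PoP -> client
inference_ms = 50      # model forward pass, same hardware either way

print(f"Central cloud: {rtt_central_ms + inference_ms} ms per request")
print(f"Edge PoP:      {rtt_edge_ms + inference_ms} ms per request")
```

When the network round trip dominates the compute time, moving inference close to the user cuts end-to-end latency by more than half.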
Recently, Akamai announced plans to embed cloud computing capabilities into its massive edge network. Akamai’s Generalized Edge Compute initiative (Gecko) aims to embed compute, with support for virtual machines, into 100 cities by the end of the year.
"Akamai is delivering on the promise it made when it acquired?Linode?by quickly integrating compute into its security and delivery mix," said?Dave McCarthy,?IDC, Research?Vice President, Cloud and Edge Services. "What they're now doing with Gecko is an example of the more distributed cloud world we're heading toward, driven by demands to put compute and data closer to the edge."
How can AI inference workloads benefit from this distributed architecture?
Since the massive Akamai network is mostly CPU-based, it’s important to accelerate AI workloads using automated model sparsification technologies, delivered as a CPU inference engine. Akamai and Neural Magic announced a strategic partnership intended to supercharge deep learning capabilities on Akamai’s distributed computing infrastructure.
Neural Magic’s solution (the company has since been acquired by Red Hat) enables deep learning models to run on cost-efficient CPU-based servers rather than on expensive GPU resources. This allows the two companies to deploy these capabilities across Akamai’s globally distributed computing infrastructure, offering organizations lower latency and improved performance for data-intensive AI applications.
My colleague Alesandro Slepčević tried the Neural Magic engine DeepSparse, a sparsity-aware inference runtime that delivers GPU-class performance on commodity CPUs, purely in software, anywhere. You can find his tests here.
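To give a flavor of what this looks like in practice, here is a minimal sketch based on DeepSparse’s documented Pipeline API (the SparseZoo model stub below is illustrative; check Neural Magic’s docs for current stubs):

```python
# pip install deepsparse
from deepsparse import Pipeline

# Create a sentiment-analysis pipeline backed by a sparsified, quantized
# model from the SparseZoo (stub shown is illustrative, not guaranteed current)
pipeline = Pipeline.create(
    task="sentiment-analysis",
    model_path="zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none",
)

# Inference runs on commodity CPU cores; no GPU required
print(pipeline("Edge inference on CPUs looks promising"))
```

The key idea is that pruning and quantization shrink the model enough that CPU caches and vector instructions can keep up, which is what makes CPU-only edge nodes viable for inference.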
Now, let’s try to imagine some use cases. Is it possible today to deploy AI at the Edge?
My colleague Joseph Glover recently developed a simple POC, showcasing that CPUs, with their parallel processing capabilities, offer a viable and accessible option for AI inference tasks.
You can try it yourself at this address: https://moviemind.info/
Moviemind is a personal movie recommendation application powered by a blend of Open-Source Innovation and Self-Sufficiency. You can read more here.
I also deployed the code (here is the GitHub link) to a machine on the Akamai Edge in Milan, and put the Akamai Edge solution ION on top of it to accelerate it. You can see the results: since the cloud and the edge live on the same network, you get similar latency.
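For intuition on why this kind of recommender runs comfortably on CPUs, here is a hypothetical sketch (not the actual Moviemind implementation) of content-based recommendation via cosine similarity over precomputed embeddings:

```python
import numpy as np

# Hypothetical precomputed movie embeddings (title -> vector); in a real app
# these would come from an embedding model and be stored at the edge node.
movies = {
    "Heat": np.array([0.9, 0.1, 0.3]),
    "Toy Story": np.array([0.1, 0.9, 0.2]),
    "Ronin": np.array([0.8, 0.2, 0.4]),
}

def recommend(liked_title, k=2):
    query = movies[liked_title]
    scores = {}
    for title, vec in movies.items():
        if title == liked_title:
            continue
        # Cosine similarity: a handful of vector ops per candidate, cheap on CPU
        scores[title] = float(
            query @ vec / (np.linalg.norm(query) * np.linalg.norm(vec))
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

print(recommend("Heat"))  # e.g. [('Ronin', ...), ('Toy Story', ...)]
```

Serving lookups like this per request is far lighter than training, which is exactly the workload split discussed above.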
OK, the first steps are done! Then, since it was a #videotech Meetup, I had to imagine some advertising (ADV) use cases.
I thought I would list some of my characteristics, such as age, gender, and the city where I live, and give a generative image AI tool a text prompt to create an image ADV:
“Advertisement with 39-year-old boy drinking a Cola on a yellow Fiat Panda in the city of Turin with hot weather”
I first tried Midjourney and then also Artflow (with my character). See the results!
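Programmatically, assembling such a prompt from a user profile is straightforward. Here is a hypothetical sketch (Midjourney has no public API, so OpenAI’s image API is used purely as a stand-in; the profile fields are my assumptions):

```python
from openai import OpenAI  # stand-in image API; Midjourney has no public API

# Hypothetical per-user attributes, gathered with consent
profile = {
    "age": 39, "gender": "man", "city": "Turin",
    "car": "yellow Fiat Panda", "weather": "hot",
}

# Build the personalized ADV prompt from the profile
prompt = (
    f"Advertisement with a {profile['age']}-year-old {profile['gender']} "
    f"drinking a Cola on a {profile['car']} in the city of {profile['city']} "
    f"with {profile['weather']} weather"
)

client = OpenAI()  # requires OPENAI_API_KEY in the environment
image = client.images.generate(model="dall-e-3", prompt=prompt, size="1024x1024")
print(image.data[0].url)
```

The personalization logic itself is trivial; the interesting part is where the generation runs and how fast it can be made.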
Imagine driving personalized ADV at the Edge, with low-latency computing close to the users. Today, AI image generation is not yet fast enough, but think about the opportunities of running this workflow in real time.
Of course, there are also other use cases: think about e-commerce applications where customers benefit from AI tools that improve their app experience. Or, as discussed with Alan Evans, think about an LLM-powered chat architecture distributed at the edge, enabling real-time conversations with your users and customers (a minimal sketch follows below). For example, see here what our partner Macrometa is doing.
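As a thought experiment, an edge node could terminate the chat connection and talk to a co-located model, keeping the whole round trip inside the PoP. A minimal sketch (hypothetical; this is not Macrometa’s architecture, and it assumes an OpenAI-compatible LLM runtime listening locally on the edge node):

```python
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

# Assumed co-located, OpenAI-compatible LLM runtime on the same edge node
LLM_URL = "http://localhost:8080/v1/chat/completions"

@app.post("/chat")
def chat():
    user_message = request.json["message"]
    # Forward to the local model: the inference round trip never leaves the PoP
    resp = requests.post(LLM_URL, json={
        "model": "local-model",
        "messages": [{"role": "user", "content": user_message}],
    })
    return jsonify(resp.json())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

The client only pays the short hop to the nearest PoP, which is what makes real-time conversational experiences plausible at scale.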
Because we were at the #VideoTech Meetup, I also tried some tools to create AI videos, like Creatify and Synthesia. I used the previously generated images (plus others) with an avatar reading a text generated with AI.
Again, at the moment these tools take some time to render the videos, but think about the possible applications of having this technology in real time, even interacting with the users.
Finally, you can also watch this great interview with Jay Jenkins, in which he talks about Gecko and its possible use cases, including AI. Many great insights!
Conclusions
Exciting times at Akamai Technologies! With the rollout of the new NVIDIA GPUs and with Edge Computing (Gecko) coming to 100 cities by the end of the year, Akamai customers can choose the best architecture for their AI workloads. Together with its partners, Akamai can support AI inferencing at the Edge, powered by its massive network of 4,100 points of presence around the globe.