Where’s my underwear? High performance storage in the age of gen AI.


The advent of generative AI is causing a dramatic rethink of the types of compute architectures required to support the enormous volumes of data it consumes. Training and tuning foundation models to adapt them to specific tasks, and then creating the applications themselves, require the adoption of technologies that can deliver high performance and low latency while also ensuring scalability.

Beyond the obvious mass adoption of GPUs and AI-specific custom silicon, such as the Arm Ethos-N series, there are other areas to consider: the distributed computing frameworks employed, the networking infrastructure supporting them, the extract, transform, and load (ETL) pipelines that prepare the data for consumption or storage, and finally the storage mechanism itself. That last one is what I’m going to dig into in this post.

Having just returned from a family vacation, I can positively attest to the fact that not all storage techniques are equal. Take a look at my suitcase versus my wife’s versus my kids' and you’ll see a distinct disparity in packing techniques. With mine, items are easily accessible but do not make optimum use of space, while the opposite is true for my wife, who’s admittedly a lot less OCD than me. Is that important? Well, it depends on if you need low-latency access to items or scalability.

The same is true for storage. There are different approaches that can be adopted based on the type of data being deposited and the characteristics required for its retrieval. Is it structured like a spreadsheet, unstructured like text documents and media files, or semi-structured like code? Is it more important to access data quickly or to scale? Does cost efficiency outweigh ease of management? Ultimately, all these attributes - and more - must be weighed to determine the optimal solution for each application.

While file and database storage options are ideal for classic content management and transactional systems, respectively, we typically look to block and object storage when it comes to generative AI. Block storage is favored for the training phase of a foundation model, where fast, consistent, and reliable data access is required. The net result is improved training efficiency and better utilization of the all-important (and expensive) underlying resources. Built from clean, high-quality, and highly curated records, training datasets are usually labeled and standardized in structured formats, making them ideal for block storage.

As the name suggests, block storage works by dividing data into fixed-size blocks. Each block has a unique address, allowing direct access to the data stored there. If this all sounds familiar, it’s because it is: block storage is essentially the same technique that’s been employed since 1956, from early IBM mainframes to today’s server SSDs. Consequently, you’ll recognize the various block storage standards, such as SCSI, SATA, and NVMe - each flavor with its own key features, but almost all requiring some form of third-party software overlay or hardware underlay to meet the requirements for error correction, corruption prevention, encryption, and replication.
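To make the "fixed-size blocks with unique addresses" idea concrete, here is a toy in-memory sketch. It is purely illustrative - real block storage lives in drivers, controllers, and firmware - and every name in it (`ToyBlockDevice`, `write_block`, and so on) is invented for this example.

```python
# Illustrative sketch only: a toy in-memory "block device" showing the
# core idea of block storage -- fixed-size blocks addressed by number.
# All class and method names here are invented for illustration.

BLOCK_SIZE = 4096  # 4 KiB, a common block size

class ToyBlockDevice:
    def __init__(self, num_blocks: int):
        # One flat byte array stands in for the raw medium.
        self._store = bytearray(num_blocks * BLOCK_SIZE)

    def write_block(self, lba: int, data: bytes) -> None:
        # Each write occupies exactly one block; short data is padded.
        if len(data) > BLOCK_SIZE:
            raise ValueError("data exceeds block size")
        offset = lba * BLOCK_SIZE  # direct address: block number * size
        self._store[offset:offset + BLOCK_SIZE] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, lba: int) -> bytes:
        # Direct access: no directory walk, just arithmetic on the address.
        offset = lba * BLOCK_SIZE
        return bytes(self._store[offset:offset + BLOCK_SIZE])

dev = ToyBlockDevice(num_blocks=8)
dev.write_block(3, b"training shard 0042")
print(dev.read_block(3).rstrip(b"\x00"))
```

The point of the sketch is the address arithmetic: given a logical block address, the location of the data is computed directly, which is exactly the property that makes block storage fast and predictable.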

I’m no boomer, but a common thread through all my posts is this notion of old fundamental technologies continuing to find purpose in modern applications. I always get a kick out of it. And when correctly implemented, block storage can provide not only the performance, consistency, and reliability noted previously but also the fine-grained control and integration with modern high-performance compute environments that generative AI model training demands. These environments include the hyperscale clouds, which all have block storage offerings: Amazon’s Elastic Block Store, Google’s Persistent Disk, Microsoft’s Azure Managed Disks, plus IBM’s and Oracle’s block volume services. Now get off my lawn.

Oh - wait - we’re not finished yet. Indeed, we’ve barely started. The training of foundation models is, of course, only a small part of the generative AI story. It’s their application that will continue to revolutionize how we live and work. We can loosely categorize these in buckets that include text, image, audio, and video generation, 3D modeling and animation production, document and report creation, plus numerous other AI-powered creative tools. These applications make fundamentally different demands of their underlying infrastructure - not least in the area of storage.

Unlike training data, generative AI applications run a little more rogue. Their datasets are drawn from a variety of sources and arrive in diverse formats, making them more unstructured in nature. The data is more real-time, noisy, and incomplete and, unlike training data, forgoes rigorous preprocessing. With origins tracing back to the late 1990s, object storage techniques scale elastically to accommodate the large volumes of unstructured data typical of generative AI applications - and they do so far more cost-effectively than block storage.

Object storage implementations also support extensive metadata, aiding the organization and retrieval of information, which can be performed using standard (and very familiar) HTTP/HTTPS. Along with robust RESTful APIs and SDKs, this aids integration with web-based interfaces and supporting services. The downside to adopting these protocols is generally higher overhead, and therefore higher latency, than block storage, but this is offset by high aggregate throughput. Once again, all the public cloud providers have object storage services, including Azure Blob Storage, Google Cloud Storage, and Oracle’s OCI Object Storage, to name a few. However, with its robust programming interfaces, Amazon’s S3 (Simple Storage Service) is generally viewed as the most widely adopted of the crowd.

But any developer embracing a hyperscaler’s object storage implementation faces a dilemma: be shackled to a single hosting platform or rewrite their codebase for each. While the multi-cloud debate still rages, it’s fair to say that software vendors and their customers generally feel more comfortable with an application that’s easily ported - if not run simultaneously - across disparate clouds. The answer may be to adopt an independent object storage implementation that can be spun up on any cloud instance, public or private. Throw in support for the S3 interface, the heir apparent to de facto standard status, and the issue of cloud supplier lock-in can be largely negated.

At this point, a self-destructive degree of honesty compels me to make a disclosure: This post came about because of research I performed when interviewing for a technical marketing position at MinIO – an open-source, multi-cloud, S3-compatible object storage offering. As anyone reading this superficial post with any actual experience in this space can probably guess (coupled with the admission that I had to research it in the first place) I bombed out of the hiring process, rather unceremoniously but expectedly, in the first round.

MinIO is by no means the only open source, multi-platform, S3-compatible object storage solution. Alternatives such as Ceph, OpenIO, and SeaweedFS all promote the same fundamental features, give or take. Reed-Solomon coding (developed in 1960) provides high-performance error correction by adding redundant parity symbols, in much the same way data transmissions are protected. Hashing algorithms are employed to detect the gradual degradation and corruption of information over time so it can be repaired. Large-scale replication and strong encryption protect and secure data, while federation allows multiple clusters to operate and be managed as a single system.
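The parity and hashing ideas above can be shown in miniature. Real Reed-Solomon coding works over finite fields and can survive the loss of multiple shards; the sketch below uses the far simpler single-parity XOR scheme (as in RAID 5) together with a SHA-256 content hash, purely to illustrate the two primitives. The function names are invented for this example.

```python
# Illustrative sketch only: real Reed-Solomon coding uses finite-field
# arithmetic and tolerates multiple lost shards. Here we show the simpler
# single-parity XOR idea plus a content hash for integrity checking.

import hashlib

def xor_parity(shards: list[bytes]) -> bytes:
    # Parity shard: byte-wise XOR of all data shards.
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)

def reconstruct(surviving: list[bytes], parity: bytes) -> bytes:
    # XOR-ing the parity with the surviving shards recovers the lost one,
    # because every byte of the lost shard cancels out of the parity.
    return xor_parity(surviving + [parity])

data_shards = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_parity(data_shards)
digest = hashlib.sha256(data_shards[1]).hexdigest()  # fingerprint to detect corruption

# Simulate losing shard 1, then rebuild it from the survivors plus parity.
recovered = reconstruct([data_shards[0], data_shards[2]], parity)
assert recovered == b"bbbb"
assert hashlib.sha256(recovered).hexdigest() == digest  # integrity verified
print("recovered:", recovered)
```

The hash plays the role the article assigns to hashing in these systems: it does not prevent bit rot, but it lets the system notice a corrupted shard and trigger reconstruction from the redundant data.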

So, like most evolutions, we take some technology from the past and apply a liberal sprinkling of modern innovations to create something new. The stakes are obviously a little higher than my underwear, though mine tend to ride up a little, if I’m totally honest – and, as previously ascertained, I am to a detrimental degree. Generative AI requires more than a boatload of specialized compute hardware. Other aspects, like storage, must be carefully considered and will vary based on specific requirements. It’s fair to say, though, that object storage solutions are about to enjoy a renaissance, of sorts, as more AI applications evolve. Which variant ultimately gains the largest share of this space remains to be seen.
