WEKA: Visionary, Again!
David A. Chapa
Technology Visionary, Chief Evangelist, Storyteller, Strategist, Sales, Product Management, Product Marketing & Competitive Intel. #MrRecovery
This is pretty exciting for me as a "noob" to WEKA. I've only been here for three months, and I've just learned that WEKA has been named a Visionary in the Gartner® Magic Quadrant™ for Distributed File Systems and Object Storage for the second consecutive year. It's an impressive accomplishment, for sure.
I remember hearing about WEKA several years ago, probably as early as 2017, and then hearing more as former colleagues joined the company. I listened to their elevator pitch and thought it was very cool. Here was a company delivering a data platform with the parallel file system performance HPC environments expect, but with the scale and simplicity general IT expects from everyday file solutions. To me, that was unique: a solution built for HPC, yet easy enough to use and manage for general IT workloads. Nonetheless, I was cautious with my enthusiasm.
Let me explain.
Why Hearing "HPC" Makes Me Question a Solution Provider's Claims
Let's face it: just about every solution provider with flash on its bill of materials claims to support high performance computing (HPC) workloads. However, it is important to understand just what HPC is. Oracle, for example, defines it this way: "High Performance Computing (HPC) refers to the practice of aggregating computing power in a way that delivers much higher horsepower than traditional computers and servers. HPC, or supercomputing, is like everyday computing, only more powerful. It is a way of processing huge volumes of data at very high speeds using multiple computers and storage devices as a cohesive fabric. HPC makes it possible to explore and find answers to some of the world’s biggest problems in science, engineering, and business." I've read several definitions, and I believe this is the best of them.
One of the highlights of my career was touring the data center that housed one of the top supercomputers in the world back in the early 2000s. I learned a phrase from our hosts that day that has stayed with me ever since. It went something like this: "Idle cores in a supercomputer are like tossing $100 bills out of your car window at 80 mph. It's just a waste."
In other words, if you aren't feeding the supercomputer data fast enough to keep it busy, the system goes idle, even if only for a second or two. That doesn't sound too horrible, but when you're talking about something that costs millions of dollars, it is serious business. For perspective, the fastest supercomputer as of this year is Frontier, located at Oak Ridge National Laboratory in Tennessee.
It boasts 9,472 AMD EPYC processors with 64 cores each (606,208 CPU cores in total) and 37,888 Radeon Instinct MI250X GPUs (8,335,360 GPU cores). This machine is a beast, with each of its cabinets weighing in at around four tons. I'm looking forward to the outcomes we will see from this first-ever exascale computing infrastructure, which cost somewhere around $600M to build. At that price, every idle moment is valuable time that could otherwise serve its mission of scientific discovery, clean energy, and national security. When you talk about idle cores, the image of money flying out the window at 80 mph starts to make sense.
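To put the "$100 bills at 80 mph" image in perspective, here is a quick back-of-envelope sketch in Python. The five-year amortization window is my own assumption for illustration, not an official figure, and it ignores power, staffing, and facility costs, all of which would only make idle time more expensive.

```python
# Back-of-envelope: what one second of idle time "costs" on a $600M system.
# Assumption (mine, for illustration): the build cost is amortized over a
# 5-year service life; power, staffing, and facilities are ignored.

BUILD_COST_USD = 600_000_000   # approximate Frontier build cost
SERVICE_LIFE_YEARS = 5         # assumed amortization window
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

cost_per_second = BUILD_COST_USD / (SERVICE_LIFE_YEARS * SECONDS_PER_YEAR)
print(f"Amortized hardware cost per idle second: ${cost_per_second:.2f}")
# Prints roughly $3.81: even "just a second or two" of idle time is real
# money, before you count the science that isn't getting done.
```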
How has HPC evolved over the years?
I'm not sure exactly when it happened, perhaps somewhere in the early 2000s, but HPC started showing up on nearly every major solution provider's line card as a supported use case. If you look at the history of the Supercomputing conference (now simply SC) over the 30-plus years it has been around, you can see, keynote by keynote, the shift toward more mainstream adoption of HPC across the board. What we as an industry started to understand is that aggregated computing was no longer found only in major national labs and academia; the hallmarks of traditional HPC environments were showing up in industries like Media and Entertainment, Financial Technology (FinTech), Oil and Gas, Genomics, and even Retail.

I remember visiting one of the very large animation companies out in California and seeing their "rendering farm," as they called it: a few hundred Linux servers in a cluster. When I first saw it, I mentioned to our customer/host that the design was very similar to some of the high performance computing environments I had seen. While we agreed to disagree at the time, it ultimately did match the pattern of most HPC workloads. There are many more examples of workloads from the vertical markets above that parallel the definition of HPC.
However, in my personal experience working for some of these solution providers over the years, many of them couldn't really deliver the performance these HPC or "clustered server" environments required. The performance was good enough and the price affordable, but "idle core" syndrome plagued many of those solutions, and customers didn't have a good selection of alternatives at the time. The choice was either a mainstream general IT solution that was good enough and checked all the boxes for ease of use and manageability, or a 20-to-30-year-old open source solution you would "roll your own" with, in hopes that you could manage and maintain those brittle file systems for your project(s). So, through the years, there were unspoken compromises inside those IT environments, where solutions were adopted for simplicity over the absolute performance requirement. It was good enough.
When Good Enough Is Not Enough
Many years ago I worked for an advertising agency in Venice, CA called Chiat/Day, whose motto was "Good Enough Is Not Enough." In fact, it was a way of life across the entire business, not just the creative side. Perhaps that is what instilled in me the curiosity to look for the best and never settle for mediocre. And perhaps it is another reason I questioned so many of my former employers when I found use cases targeting HPC: in my mind, it wasn't about merely being better than the previous solution, it was about meeting the standard of leaving no core idle, or coming as close to that as possible. Only two of my prior employers could honestly say they played in the realm of HPC, and both of their solutions were so long in the tooth that the systems felt antiquated, and they remain so to this day.
WEKA breaks the "good enough" compromise by giving our customers the ability to dial in their performance requirements while keeping the platform simple to manage, use, and scale, whether running WEKA on premises or in the cloud. Customers get a data platform with parallel file system performance plus the ease of use and stability of the general-purpose IT file solutions in the market. It is a win/win: feed your cores at whatever rate your performance requirements demand, without hiring a specialized babysitter to oversee the antiquated, custom-configured file systems I alluded to earlier.
The WEKA Data Platform Is Purpose-Built for HPC
As I said at the outset, many solution providers will say they support HPC use cases, but in my personal experience, the reality is that many of these solutions end up as an archive tier in an HPC environment, not one of its key components.
WEKA, on the other hand, delivers on the trifecta of speed, simplicity, and scale. We can achieve millions of IOPS and read/write bandwidth that meets, and even exceeds, performance expectations for the most demanding, latency-sensitive workloads; that is a promise many others in this industry struggle to keep. Our data platform is not an old architecture dressed up in containers and "marketed as an AI platform"; rather, it is a modern AI/ML and deep learning platform that scales linearly with your performance and capacity requirements.
Why Being a Visionary 2 Years In A Row Matters
In my opinion, the Visionary quadrant is perhaps the most desirable position for any company breaking into a Gartner Magic Quadrant for the first time. If you read Gartner’s methodology to understand what each of the quadrants means, you will see that Visionaries understand "where the market is going or has a vision for changing the rules of the market." That could not be more true of what I see here from the inside. WEKA is focused on what customers are saying, and that is reflected in how the WEKA Data Platform is and has been developed. We also draw on a great deal of market research to help us triangulate where the market is trending. That's what Visionary means. The definition does go on to say that companies in this category do not yet execute comparatively well, or do so inconsistently. However, if you compare where we landed last year with where we are this year, WEKA is moving up on execution and is one of the top three Visionaries in the quadrant. I highly recommend reading about the other quadrants and what each one means; it's eye-opening.
Chapa, signing off