If data platforms were cars, Snowflake would be...


I know you are probably wondering what in the world cars have to do with the Snowflake data platform... Let me save you some time & say... NOTHING!!!

However, I am a car guy, and my favorite way to explain technology to business people is with analogies that most people can relate to & understand. As a result, I tend to use cars a lot as a prop for my technology explanations & comparisons, garnished with many other everyday items & gadgets to get my point across. So if you are wondering what is so special about Snowflake that the largest companies in the world are shifting their data platforms to it and running their entire businesses on it, here is why, by way of car analogies:

If data platforms were cars, here is how they would compare.


Your tired old on-prem data warehouses & Hadoop data lakes would be like 1950-1970 era cars. They would be huge & heavy, with big engines making small horsepower. Compared to today's cars, they would be slow, would not handle well, would not provide much protection in an accident, and would require a lot of ongoing maintenance and tuning to keep that carburetor engine running properly. If you wanted your car to be more agile & faster, it would take a lot of time & expensive modifications, and in the end you would end up with a one-off custom kit car with you being the only person who knows how to run it properly. Most of the time, the best way to get a car from that period to handle better, go faster and be more reliable would be to replace it & buy a better car (upgrade).


Your first-gen cloud-hosted data warehouses & lakes would be like cars from 1980-2000. They would mostly have fuel-injected engines with basic sensors & computers to automatically manage & tweak the engine output (some automation), getting better power from a smaller engine and being more reliable with less maintenance to keep them running well. You would also be a lot safer in an accident. You could easily slap on a turbo or supercharger kit to make it go much faster, or change the shocks to make it handle better. Overall, you get better performance, agility & safety from a much smaller and more reliable package, with much less work to make it go faster & turn better. The closer the production date is to the year 2000, the better and easier it gets. Still, there is only so much you can do to make that car go faster within its limits (meaning scaling was easier but still limited, and usually a one-way ticket that was always UP or OUT and stayed that way until the next big change). Once you hit that limit, your options are limited before you blow up the engine. You either replace the engine with a bigger one or replace the car yet again with a better one.


Current-gen cloud PaaS or SaaS platforms are much closer to today's modern internal combustion engine cars. They are much more reliable, faster & safer. Their engines (compute power) would have cylinder activation tech where you can switch between 4, 6, 8, 12, or more cylinders (nodes) to change the power output without having to modify the engine itself. But this is usually a business-disruptive process, meaning you would have to pull over and shut the car down briefly to make the change. So most owners would just stick with the engine size they got unless they absolutely needed to change it, because nobody wants to pull over on the highway and wait 10-15 mins with a bunch of impatient passengers (business users) on board. These cars would also be much more technologically advanced, with a ton of different features & gadgets, where some are very useful (auto windshield wipers) and others require a Ph.D. to operate and hardly ever get used (like the launch control on BMW M cars or gesture & voice control features). Essentially a sophisticated, high-tech & faster set of cars, but not necessarily easier to use than the previous-gen cars for most of us (I still prefer my 2000 BMW E46 to newer models; it didn't have as many features, but the ones it had & the ones I cared about worked great, and I never used much of the newer features on later models). Also, because of the complexity of the tech and gadgets, if something goes wrong, figuring out & fixing these complex machines requires a highly skilled person. Such people are hard to find, so you usually won't find them at your local garage. They are called master technicians. They work in dealerships, get paid really well, and are usually the guys who walk around the shop sipping their cappuccinos with a tablet in hand that is connected to your car over Bluetooth. They are the ones who come and tell you it will be a $250 diagnosis charge but they were not able to replicate the problem you were experiencing. So current-gen cloud data platforms are like a faster, better & more sophisticated car with less maintenance on your part, but not necessarily easier to use, nor easy to fix or configure without a hard-to-find, expensive expert.


This was all about speed, handling & maintenance. What about concurrency? This is life; stuff happens when you least expect it. You may have started as a happy single person with a fast 2-seater convertible. Then you get married and keep the car. Then you get news of a little one arriving in 9 months, so you run and trade in your convertible for a 4-door sedan or a small SUV. Then the doctor tells you it is twins, so you give up the small SUV and upgrade to a bigger SUV (much more money). They get a little older and want their friends to ride along to soccer practice every time, so now you definitely need a minivan or a very big SUV (more money). Then they suddenly don't like their friends anymore, or the friends decide to launch a fully loaded orange soda cup in the car as a joke and you ban them for life from riding in your vehicle. Now you are stuck with a big SUV that you are still paying for. (This would be a typical self-service BI & Analytics workload.) Or... maybe you are divorced with kids and you marry someone who also has kids (Mergers & Acquisitions), or your in-laws show up (Monday morning rush), and you suddenly end up with 2 or 3 times the number of people and may need a mini-bus. No one really wants to buy or drive a mini-bus just in case they need the capacity a few times a month. With current traditional data platforms, this is essentially what IT has to deal with. They need to estimate the worst-case capacity for the next few years and buy something big enough. And when things change and what they have is not enough, they have to buy a bigger system. If they planned it right but a divorce happens (you lose headcount due to unforeseen circumstances), or maybe IT was just really bad at capacity planning, you are stuck with a bigger box that you have to pay for & won't need for a while.

And what about the cloud providers and being locked into only one of them? What if you are a FORD person but your business partner is a CHEVY guy, or you are a BMW guru but you got a job at a company that only works with AUDI? Does this mean you can't work together, or that you have to learn the other's car technology and parts before you can?

Let's talk Snowflake. So what kind of car would Snowflake be and why is it so damn popular?


Snowflake would be a brand new mode of transportation rather than just a car you lease: something between a Tesla, a futuristic car with features that are yet to be invented, and a private jet. It would also be delivered as a service & not sold as an actual product.

What do I mean?

  • It would be like a TESLA because it would be very simple & super easy to use. If you can use your touch screen phone, you can operate a TESLA. The learning curve would be next to none.
  • It would also be near 0 maintenance like a TESLA, with almost nothing you have to worry about upgrading or maintaining. No oil changes, no transmission tune-ups, no unnecessary visits to the dealer for software recalls. It is fast and stays just as fast without you maintaining the car. Most of the maintenance is either not needed at all or delivered automatically over the air; if a fix is required, you are not even aware that the problem was already fixed while you were driving.
  • It would be continuously updated like a TESLA, where you see new features every time you get in the car. Same with Snowflake: the platform you are using now may be much improved or have newer features compared to what you used an hour ago. No downtime or waiting for quarterly updates to get new stuff. New stuff just shows up as soon as it is available and you start using it. (Sorry, no gas passing sound effects in Snowflake.)
  • Unlike a TESLA or a supercar, which have engines with specific HP outputs and a fixed battery/gas tank capacity, Snowflake as a car would have a miracle engine that could switch up & down between 1 & 512 cylinders in an instant while you are driving, without disrupting your drive. Imagine a Ferrari V12 pulls up next to you on the highway; you hit a button, switch to a V32 or V64 engine, become a speck in the far distance within seconds, and then go back to V4 economy mode after a good laugh. Or you could be delivering pizzas at lightning speed. The faster you deliver, the more pizzas you get paid for. Unlike traditional cars, it would have a limitless fuel capacity, so how far & how fast you can go would be unlimited and would not have to be pre-planned. Essentially, compute power in Snowflake is pretty much a serverless experience where you create it & use it when you need it. You can also change its size in seconds when you need to, without impacting anyone using it while you are resizing. This lets you do amazingly complex stuff in amazingly short time frames for similar money, without having to pay amazingly high costs.
  • Unlike a TESLA and other cars, Snowflake would be a totally different experience & more like a service than a specific product when it comes to handling concurrency. With a traditional car (or a current-gen traditional PaaS data platform), you end up leasing a car (product) with a specific capacity/size, whereas Snowflake is delivered and purchased as a service. Think of it as a subscription service for this super cool magical car where you can change the engine size instantly to make it go faster at any time, but one that doesn't limit you to just a single car. If you end up having to drive your own kids, your kids' friends, the in-laws, or even the entire soccer team and their fans, you can summon more cars in seconds, either manually or by having the service automatically add new cars to your fleet as new people show up. And the great thing is that you would only use/pay for those additional cars while they are carrying these new people. Once you are done, those cars would just magically disappear, and you end up driving & paying for the right size car with the exact HP that you want & need, and not have to drive and pay for a bus for a year.
  • What about workload management? How do you handle rowdy & obnoxious people? (I am talking to all the Dashboard experts who use SELECT * FROM GIANT_TABLE and bring the servers down.) The traditional platform approach is to get a big enough product that can handle everyone and limit the portion of resources each group can use. (Like leasing a bus to carry everyone: you can group the annoying ones together in the back, but that is not going to keep them from disrupting the rest of the group.) Snowflake is unique in this respect. Since it is delivered as a service and not as a product with a set size & scale, you can instantly summon as many cars in different sizes as you want. This means you can get a small 4-seater just to stuff the annoying people in, a stretch limo for the finance group, and a Ferrari for the Data Science & Marketing team, with all cars sharing the same driver at the same time. There is no limit to how you can separate your workloads using different clusters of compute, yet they all work from the same single copy of data.
  • And finally, being locked into a single provider. Snowflake is fully cloud-agnostic and can run on AWS, Azure or Google, OR all of them at the same time. This means users get the exact same UI and user experience regardless of the cloud and never have to learn cloud-provider-specific skills to run or use Snowflake. Essentially, you would never see a job posting looking for an AWS or Azure Snowflake Admin or Expert. Snowflake is the same Snowflake regardless of where it runs, so you get on with your business without worrying about provider-specific things. If this were a car, it would be one you could order built from FORD, GM, or CHRYSLER parts, but the resulting car would look, function, and operate identically regardless of what you choose, giving you the freedom not to be locked into one company as the sole provider of parts.
  • And the way you pay for Snowflake is revolutionary as well. Traditional data platforms are like leasing a car: you choose a car with a specific size & features, then you lease it for 24 or 36 months at a fixed cost. With this model, you are leasing that specific car. That exact car is what you pay to use 24x7 for the next few years. You have to maintain it with oil & tire changes, and wait at the dealership when it needs repairs or a regular service check-up. Some services are free and others you still have to pay for. And if your needs change (you need a faster, bigger or maybe a smaller car), you have to trade it in and get a new car instead. Snowflake is totally different. It is a service that you pay for, not an actual product. Instead of leasing a specific car, you are essentially subscribing to the Snowflake service and have instant access to any car in any size, configuration & performance. You ask for a specific car type exactly when you need it & it magically appears in seconds. You get billed by the second, only while you are using it. When you get to where you want to go & the engine stops running, you stop paying for it. Since you are not tied to a specific car & it is magical, you can change your mind at any time and ask for a different car, and it will instantly change even while you are driving it. It could be bigger & faster or smaller than the one you had. Or you can even ask for multiple cars at the same time if you need to transport many people. There is also no maintenance to worry about. Every time you shut the engine down and then summon the car back later, you get a brand new car and not the one you used before. If the car breaks down or needs service, you just get a replacement instantly. Here is the best part... because you pay only while the engines are running, down to the second, and not when they are idle, you end up spending a lot less over the course of a year for a car similar to the one you would normally lease for 24x7 usage, since you don't usually drive it 24x7. Paying only for when you use it and not when it sits idle means costs will be lower for a comparable car. On the other hand, because you are not tied down to any specific car, you could decide to spend the full amount you would normally spend on a traditional lease and get a much faster or bigger car instead. Even though it costs more per second than a comparable size car, you can drive much faster or bigger cars for the same money, since you are paying only while the engines are running, which is usually a fraction of the 24 hours in a day. This also allows you to be flexible & agile, as you can summon any size car as your needs change and not be locked into a specific car for 2 or 3 years.
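For the more code-minded passengers, here is a rough Python sketch of the "summon more cars as people show up" idea from the concurrency bullet. It is a toy model, not how Snowflake actually decides when to add clusters: the function names, the seats-per-car figure, the fleet limits, and the demand numbers are all made up for illustration.

```python
import math


def cars_needed(passengers: int, seats_per_car: int,
                min_cars: int = 1, max_cars: int = 10) -> int:
    """Toy scale-out rule: enough cars to seat everyone, within fleet limits."""
    if seats_per_car <= 0:
        raise ValueError("seats_per_car must be positive")
    wanted = math.ceil(passengers / seats_per_car)
    # Never drop below the minimum fleet, never exceed the maximum.
    return max(min_cars, min(max_cars, wanted))


def fleet_car_minutes(demand_per_minute, seats_per_car, **limits) -> int:
    """Total car-minutes used when the fleet is re-sized every minute."""
    return sum(cars_needed(p, seats_per_car, **limits) for p in demand_per_minute)


# Hypothetical Monday morning: passengers (concurrent queries) per minute.
demand = [3, 4, 20, 35, 12, 4]

# Elastic fleet: cars appear and disappear as the crowd changes.
elastic = fleet_car_minutes(demand, seats_per_car=8)

# The "bus" approach: size for the worst minute and keep it the whole time.
fixed_bus = cars_needed(max(demand), 8) * len(demand)

print(elastic, fixed_bus)
```

With these made-up numbers, the elastic fleet uses 13 car-minutes while sizing for the peak the whole time uses 30, which is the "don't pay for a bus all year" point in miniature.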

Pretty cool, right? Run the same size car but pay less because you only drive a few hours a day vs. the 24 that you pay for with a regular lease. OR spend about the same as you would with a traditional lease but end up driving a variety of cars that are much faster and bigger than what you would normally get, as you only pay while the engines are running (more cost per second, but a shorter engine run time, as faster cars get you there quicker).
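If you want to see the arithmetic behind those two options, here is a small Python sketch. Every number in it is hypothetical (the hourly rate, the 6 hours of driving per day, the 4x car), and it assumes the faster car costs 4x per hour and finishes the same trip 4x quicker, which real workloads will not always do. It is not Snowflake's price list, just the shape of the math.

```python
SECONDS_PER_HOUR = 3600


def usage_cost(rate_per_hour: float, seconds_running: int) -> float:
    """Cost when billed per second, only while the engine is running."""
    return rate_per_hour * seconds_running / SECONDS_PER_HOUR


def always_on_cost(rate_per_hour: float, hours: int = 24) -> float:
    """Cost of a lease-style car that is paid for around the clock."""
    return rate_per_hour * hours


RATE = 2.0      # hypothetical cost per hour of the "regular" car
DRIVING = 6 * SECONDS_PER_HOUR   # hypothetical: engine runs 6 hours a day
SPEEDUP = 4     # assumption: a 4x car costs 4x per hour and is 4x faster

# Option 0: traditional lease, paid for 24 hours a day.
lease = always_on_cost(RATE)

# Option 1: same car, but paying only for the 6 hours it actually runs.
same_car = usage_cost(RATE, DRIVING)

# Option 2: a 4x car doing the same trips in a quarter of the time.
faster_car = usage_cost(RATE * SPEEDUP, DRIVING // SPEEDUP)

# Option 3: spend the whole lease budget on the 4x car for 6 hours a day.
big_car_full_budget = usage_cost(RATE * SPEEDUP, DRIVING)

print(lease, same_car, faster_car, big_car_full_budget)
```

Under these assumptions, the same car on pay-per-use costs a quarter of the lease, the 4x car does the same work for that same quarter, and spending the full lease budget buys 6 hours a day of the 4x car.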

Ok, now that you know what kind of car Snowflake would be, what kind of car would you rather have?


I personally would rather have a garage full of choices (cars & planes) where I can pick what I need at any point within seconds & pay only for the duration of use, down to the second, vs. having to pick one car, pre-pay for 24x7 usage, and be locked into that car for a few years at a time, as long as the TCO is the same or better.

If you like this article, feel free to like it and share it with your network... If not, pretend this didn't happen & you have not wasted 10 mins of your life reading this.

Suresh V.

Lead/Architect Data Engineer | Mastering SAP BW, HANA, SAP Modules | Cloud Data Technologies Specialist in Snowflake, Data Bricks | Proficient in ETL/ELT: Fivetran, Qlik Replication, BODS, HANA SDI, SDA

3 years ago

Totally make sense. Super Analogy!!!!

Matt Hein

Florida Sales Manager at Sigma Computing | Enterprise BI, Spreadsheet UI

3 years ago

Great article. I use the car analogy all the time. Traditional DW = Custom Race Car - souped-up, can go fast on the track, a lot of custom/moving parts, the driver needs to set up and maintain it themselves, will need to go to the shop monthly (no insurance or high premiums), not versatile (can't fit more people, not good for a Sunday cruise, and can't handle snow : ). Also, will need to spend the weekend cleaning and washing it off. Snowflake Data Platform = Open your garage and have unlimited access to lease any sedan/SUV/sports car you'd like. Whether it's commuting to work, bad weather, a camping trip with the entire family, or an F1 race - the driver decides whenever they open the garage what they want to drive that day. Only paying for the number of seconds they are driving the cars. The fleet of cars is always performant, clean, ready for a spin, the latest models and can even use auto-pilot! I'd rather focus my time on where I'm going, not spending time making sure I'm able to get there. Put the key in and go!

David Spezia

Data Geek, Speaker, Blogger, Sales Engineer, Angel Advisor

3 years ago

I don’t think Snowflake is just one car; it’s a rental car stand that lets you choose ANY car on the lot, or the helicopter, boat or train in the “secret” back lot. Choose the right tool for the purpose (warehouse size and SQL command). You can get the beater, family car, economy, truck or semitrailer all from the same stall...in 150ms. Sometimes it’s Uber and you don’t drive the vehicle (Snowpipe, Search Optimization, Clustering or MV Maintenance). So I think it might be tough to pin it to just one type of car. It’s a multi-modal transportation platform that goes fast as hell!

The car analogy is great and this article is very insightful (and funny too!)

Klemen Logar

Architecting the Future Data and AI Solutions for a New Tomorrow

3 years ago

Must share, great analogy.
