Transitions
Once upon a time in the land of traditional compute, I would have been excited about the latest announcement coming from Cupertino, California. The M(n) series chips have made Apple laptops quite desirable for working remotely. The caveat is that most of the "power" of the M(n) series goes unused, because it simply doesn't run ML workloads efficiently. Sure, I can run Ollama locally and work with quantized models, but efficient is not the word I'd use.
For example, I've been teaching students how to write a RAG agent locally, generating the embeddings for the Chroma database using Gemma and Llama, both fine models with great context windows. But, and yes, there's a but, the amount of time it takes to translate chunks into embeddings locally is painful on the M(n) chips (tens of minutes for 8 documents) and dramatically worse on older machines without Metal or other GPU support.
The point is that very few machines are capable of executing these models effectively and efficiently without cloud computing or heavy-metal resources.
This inflection point sent me comparing eras, and made me realize that until these resource constraints are commoditized, the Cloud is indeed the new Mainframe and/or Supercomputer. Aside from the academic world, only pay-to-play enterprises will be able to effectively run the complex compute tasks that will shape their industries.
In the 1970s, companies like Walmart, The Home Depot, and First National Bank (among many other financial institutions) began adopting the computational power of Mainframes and large compute, with a roughly ten-to-twenty-year ROI. Those investments arguably pushed these companies into the top brands they are today.
How does this apply? Well, to put it simply, the Mainframe / Tandem / High Performance computer was the only heavy metal capable of running the computations required to balance books, inventory, supply chains, and transactional boundaries, and that ushered in the digital age. Now, with GPUs not being manufactured fast enough to meet demand and large-scale commercial contracts, the cloud is the only way to realize the future of the next ten to twenty years, because no other companies will have the capabilities to execute in this space. I say that confidently because the technical debt paid by all of these founders has led to an understanding of how to distribute payloads effectively and efficiently.
This is further underlined by the example above: using Google Gemini with the exact same code I teach students, we can generate embeddings for thousands of PDFs in the same time frame. That's the game-changing advantage.
So, where does this bring us? Instead of running out and buying a new computer (unless you actually need one), consider whether that investment would be better placed in a cloud subscription, using those resources to run complex compute and take advantage of things like Model Garden on the Vertex AI platform, so you don't have to hire a team of engineers just to experiment.
I am excited about the world of Generative AI, and more importantly the fields of AI such as voice, vision, and text. But the realization that these capabilities are only available to large-scale computational powerhouses does have me rethinking my investments on both personal and professional levels. I find myself encouraging students to find hardware that can save them time, and if they can't, to find subscriptions that make their goals attainable. Because if you're out there denouncing the possible outcomes, then you're most likely misunderstanding the distance to the horizon.
** Note - running on local heavy metal has a similar effect as the cloud, but would not match the ROI of simply paying for execution costs once maintenance over time is factored in.
If you'd like to try these examples:
Local Execution (Using Ollama):
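Here's a minimal sketch of the local workflow, assuming the `ollama` and `chromadb` Python packages, a running Ollama server, and an embedding-capable model already pulled (the model name below is a placeholder, not the exact one from my class):

```python
# Local RAG embedding sketch: chunk text, embed each chunk through a locally
# served model via Ollama, and store the vectors in a Chroma collection.
# Assumes `pip install ollama chromadb` and a running Ollama server.

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed_documents(texts: list[str], model: str = "llama3") -> None:
    """Embed chunks locally and store them in Chroma."""
    import ollama    # local inference client
    import chromadb  # vector store

    collection = chromadb.Client().create_collection("docs")
    for doc_id, text in enumerate(texts):
        for n, chunk in enumerate(chunk_text(text)):
            # One model round trip per chunk -- this is the step that takes
            # tens of minutes on M(n) laptops for even a handful of documents.
            vector = ollama.embeddings(model=model, prompt=chunk)["embedding"]
            collection.add(ids=[f"{doc_id}-{n}"],
                           embeddings=[vector],
                           documents=[chunk])

# embed_documents(["...your document text..."])  # needs a local Ollama server
```

The chunk-by-chunk loop is exactly where the local bottleneck lives: every chunk is a full inference pass on your own hardware.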
Cloud Execution (Same workflow but using Gemini):
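And the same workflow pointed at Gemini, a sketch assuming `pip install google-generativeai` and an API key; the model name is a placeholder for whichever embedding model is current on the platform:

```python
# Cloud RAG embedding sketch: identical chunking, but embedding calls go to
# Google's hosted models, which accept whole batches per request.
# Assumes `pip install google-generativeai` and a valid API key.

def batched(items: list[str], n: int = 100) -> list[list[str]]:
    """Group chunks into batches so each API call embeds many at once."""
    return [items[i:i + n] for i in range(0, len(items), n)]

def embed_in_cloud(chunks: list[str]) -> list[list[float]]:
    """Embed chunks with a hosted Gemini embedding model."""
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder
    vectors: list[list[float]] = []
    for batch in batched(chunks):
        # One request embeds the whole batch -- the per-chunk round trips
        # that dominate locally collapse into a handful of calls here.
        result = genai.embed_content(
            model="models/text-embedding-004",  # placeholder model name
            content=batch,
            task_type="retrieval_document",
        )
        vectors.extend(result["embedding"])
    return vectors

# embed_in_cloud(["...chunked document text..."])  # needs an API key
```

Batching is the design difference that matters: the structure of the code barely changes, but thousands of PDFs become a few hundred requests instead of hundreds of thousands of local inference passes.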