The Tech Behind Google Cloud Next ’24
Google recently hosted its Cloud Next event, and I completely understand if you couldn't watch the entire thing. However, there's a good reason the keynote lasted nearly three hours.
Let’s get straight to it.

First Up: Gemini
Gemini is Google’s largest and most capable model family so far. Gemini 1.5 Pro in Vertex AI is a breakthrough in long-context understanding: it ships with a 1-million-token context window, and Google’s research has demonstrated consistent recall at up to 10 million tokens. And like many AI initiatives of today (yes, I mean humanoid robots), it is natively multimodal, processing audio, video, text, code, and more.
And no, this one is not just a flashy demo; Gemini 1.5 Pro is already in public preview for Cloud customers and developers. On the partnership front, names like Mercedes-Benz and Goldman Sachs are already putting this new technology to work!
Let’s ground all this in real-life examples real quick. A gaming company can now offer players video analysis to improve performance. Similarly, an insurance company can combine various data types with a multimodal model and, say, generate an incident report, streamlining and enhancing its services.
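Those long-context numbers are easier to feel with quick arithmetic. The sketch below estimates whether a multimodal prompt fits in the 1M-token window, using the rough per-modality rates Google cited when announcing Gemini 1.5 (about 1 hour of video, 11 hours of audio, or 700,000 words per million tokens). The rates are back-of-the-envelope assumptions for illustration, not official tokenizer constants.

```python
# Rough capacity check for Gemini 1.5 Pro's 1M-token context window.
# Per-modality rates below are approximations from Google's announcement,
# used here only for illustration.

CONTEXT_WINDOW = 1_000_000  # tokens

TOKENS_PER_VIDEO_MINUTE = 1_000_000 / 60         # ~1 hour of video per window
TOKENS_PER_AUDIO_MINUTE = 1_000_000 / (11 * 60)  # ~11 hours of audio per window
TOKENS_PER_WORD = 1_000_000 / 700_000            # ~700k words per window

def fits_in_window(video_min=0, audio_min=0, words=0):
    """Return (estimated_tokens, fits) for a mixed multimodal prompt."""
    tokens = (video_min * TOKENS_PER_VIDEO_MINUTE
              + audio_min * TOKENS_PER_AUDIO_MINUTE
              + words * TOKENS_PER_WORD)
    return round(tokens), tokens <= CONTEXT_WINDOW

# Example: a 30-minute gameplay video plus a 5,000-word design doc
print(fits_in_window(video_min=30, words=5_000))  # fits with room to spare
```

So the gaming example above is realistic: a half-hour match recording plus supporting documents sits comfortably inside a single prompt.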
A Massive Computing Data Center: AI Hypercomputer, Harnessing Compute and Data at a Scale Never Imagined Before
“Google’s AI Hypercomputer is an orchestra of hardware and software, from programming languages to compilers, runtime, serving stacks, to the chips and networks that make it all possible at an unprecedented scale. The system-level integration is up to 2x more efficient at scale relative to baseline solutions that simply deliver raw hardware and chips.” - Amin Vahdat, VP and GM, ML, Systems, and Cloud AI at Google
They are enhancing performance and capacity at every layer of the stack, with Google-designed TPUs and NVIDIA GPUs leading the charge.
Their TPU v5p, with 4 times the compute of the previous generation, is designed for AI and ML workloads, which at their core come down to rapid matrix multiplication.
This hardware is tailored for enterprise companies to train and deploy AI models efficiently.
For those less familiar with technical jargon:
4x compute capacity = faster and more efficient model training = faster and better AI.
Example: Implementing TPUs in medical imaging, such as MRI or X-ray analysis, speeds up diagnosis, which is vital in life-threatening situations and understaffed medical facilities.
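To make the "4x compute = faster training" equation concrete, here is a tiny sketch: training step time is dominated by matrix multiplies, and a matmul of an (m x k) by a (k x n) matrix costs about 2·m·k·n floating-point operations. The peak-FLOP/s and utilization numbers below are hypothetical, chosen only to show that step time scales inversely with compute capacity; they are not published TPU specs.

```python
# Why 4x compute capacity matters: ML step time is dominated by matrix
# multiplication, and scales roughly inversely with hardware peak FLOP/s.
# All hardware numbers here are illustrative assumptions.

def matmul_flops(m, k, n):
    """FLOPs for an (m x k) @ (k x n) matmul: one multiply + one add per term."""
    return 2 * m * k * n

def step_time_seconds(flops, peak_flops_per_s, utilization=0.4):
    """Rough step time at a given hardware peak and realistic utilization."""
    return flops / (peak_flops_per_s * utilization)

# One large transformer-style matmul: 8192 rows against 8192x8192 weights
flops = matmul_flops(8192, 8192, 8192)

baseline = step_time_seconds(flops, peak_flops_per_s=100e12)  # hypothetical chip
upgraded = step_time_seconds(flops, peak_flops_per_s=400e12)  # 4x the compute

print(f"{baseline / upgraded:.1f}x faster")  # → 4.0x faster
```

The utilization term is why real-world speedups track, but rarely exactly match, the headline hardware multiple.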
A3 supercomputers with NVIDIA H100 GPUs, purpose-built for AI
First off, let's cover the basics: the H100 is the most sought-after GPU in the tech industry.
It sold out last year, with giants like Meta and Microsoft stockpiling it. However, this is now old news thanks to Blackwell.
About Blackwell: you know if you know, but if you don’t, no problem; I will cover it separately. For now, let’s note that NVIDIA claims up to 30x faster LLM inference than Hopper, their previous best microarchitecture.
Back to Google’s A3.
The H100-powered A3 GPU supercomputer is ideal for LLMs and generative AI tasks. In a nutshell, with advanced GPUs, increased memory, and enhanced networking, A3 VMs deliver top-notch performance, accelerating ML model training and inference. Fully-managed AI infrastructure via Vertex AI offers convenience, while custom software stack options cater to specific requirements.
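To see why this class of hardware matters for LLMs, here is a back-of-the-envelope training-time estimate. It uses the common approximation that training costs about 6 FLOPs per parameter per token, plus assumed numbers for per-GPU peak throughput and utilization; none of these are A3 benchmarks, just rough figures to show how cluster size changes the timeline.

```python
# Back-of-the-envelope: how GPU count changes LLM training time.
# Assumes the common "6 * parameters * tokens" FLOP estimate, a rough
# 1e15 FLOP/s per-GPU peak, and 35% utilization -- all illustrative
# assumptions, not measured A3 numbers.

def training_days(params, tokens, num_gpus, peak_flops=1e15, mfu=0.35):
    total_flops = 6 * params * tokens          # total training work
    flops_per_s = num_gpus * peak_flops * mfu  # sustained cluster throughput
    return total_flops / flops_per_s / 86_400  # seconds -> days

# A 7B-parameter model trained on 1 trillion tokens
small_cluster = training_days(7e9, 1e12, num_gpus=64)
big_cluster = training_days(7e9, 1e12, num_gpus=512)
print(round(small_cluster, 1), "days vs", round(big_cluster, 1), "days")
```

Under these assumptions, 8x the GPUs turns a three-week run into under three days, which is exactly the lever Google is selling with A3 and Vertex AI's managed infrastructure.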
Think about it: Once fully optimized in the automotive industry, this technology could potentially enhance road safety and prevent numerous accidents, especially with self-driving vehicles.
In conclusion, Google is making hot moves in the High-Performance Computing space, which is essentially the power bank of Gen AI.
Training AI models requires substantial computing power, and HPC, relying on impressive processing units like A3 and v5p, will define our Race to the Future 🚀
C’mon, I said Race to the Future. We need more rocket emojis here, right? 🚀🚀🚀🚀🚀
On the other hand, there is speculation, and the technical challenge of the century, around AI bias. Practically nobody is happy about it.
For one, I would like to remain positive: we are all trying to make things better, and so is Google. Let's give them a break :) With the power of model training and pouring accurate data into the pool, we will get there!
As I near the end, you might be thinking:
'Hey! There is A LOT more in that 3-hour-long Google Cloud Next ’24 keynote!'
I can’t help you there because when I write long articles, y’all don’t read them… (Imagine here a very moody and disappointed emoji of me lol)
But there shall be a part 2 or even 3… don’t worry :)
In the meantime, do me a favor.
No, I don’t want likes or reshares.
Just look at how we covered nearly 1.5 hours in under 5 minutes. Huge shout-out to the team: Mihael Gubas & Sanela O. Super proud!
Let me know what you think in the comments.
Happy Wednesday! Go Google!
References:
Google Cloud. (2023, May 11). Announcing A3 supercomputers with NVIDIA H100 GPUs, purpose-built for AI. https://cloud.google.com/blog/products/compute/introducing-a3-supercomputers-with-nvidia-h100-gpus
Google Cloud. (2023, March 22). Introducing G2 VMs with NVIDIA L4 GPUs — a cloud-industry first [Blog post]. https://cloud.google.com/blog/products/compute/introducing-g2-vms-with-nvidia-l4-gpus
NVIDIA Corporation. (n.d.). NVIDIA H100 Tensor Core GPU. https://www.nvidia.com/en-us/data-center/h100/
Almas. (2024, March 28). Gemini 1.5: Everything You Need to Know About it. Medium. https://medium.com/@almaswebconsulting/gemini-1-5-everything-you-need-to-know-about-it-3ca9da954a61
The Wall Street Journal. (2023, December 27). How Chips That Power AI Work [Video file]. https://www.wsj.com/video/how-chips-that-power-ai-work-wsj-tech-behind/0A6E661F-C5A1-439E-A41D-3B1352EE4E72
[Note: The article was proofed by ChatGPT, and attachments are generated by OpenArt and not subject to copyright as per 2024 T&Cs.]
[Note: Since some of the sources lack publication dates, "n.d." (no date) is used instead.]
#GenerativeAI #TechNews #HPC #DataCenters #GoogleCloud