TL;DR - 2023 OCP Global Summit
Whether you're a startup seeking to lay a strong tech foundation, a mid-size company aiming to optimize infrastructure, or a large corporation lacking a comprehensive technology strategy, my expertise can help. With broad experience as a technology executive, specializing in sustainable digital infrastructure and scaling out operations, Hume Consulting offers advisory services to design technology strategies, diagnose and resolve operational inefficiencies, and use a data-driven approach to add value across your business.
Let's collaborate to bring focus and strategic depth to your technology endeavors.
I was super thankful to be able to attend Open Compute Project's (OCP) Global Summit in San Jose last year and planned to head down again this year. Unfortunately, a client engagement took priority this year. However, the great team at OCP have started publishing this year's content online, starting with the kick-off keynote, and there's plenty to be excited about.
I plan to cover topics such as immersion cooling, data center, HPC, and networking, as more workshops, presentations and sessions are released online.
What is OCP?
Open Compute Project is exactly what it says on the tin.
It's a collaboration across the tech community, focused on redesigning technology to support the changing demands on infrastructure. Its roots are at Facebook circa 2009, where a team challenged with supporting an exploding social media platform designed an at-scale infrastructure that was more efficient to build and run. This innovation was integral to the founding of the OCP Foundation in 2011.
In a historically proprietary industry, OCP has brought together the heavyweights from across the globe, encouraging openness in collaboration, community and standards. The board has members from Microsoft, Intel, Meta and Google.
2023 Global Summit review
I've embedded the full keynote below, but a few key things interested me during the presentations.
#INTEL
"AI Training models are 10x YoY" - Zane Ball (Intel)
My rough math: at 1 CPU per RU and at least 400W per processor (it could be higher) on the Intel 3 fabrication process, that is about 15kW for just the CPUs in the rack. Adding what is currently standard for air cooling, 1 GPU per RU, contributes another 26kW, for a total of more than 40kW.
I appreciate Zane's (and Intel's) openness when talking about alternate cooling technologies like liquid and immersion. Importantly, in 2021 Intel partnered with Lubrizol to warranty their chips in immersion cooling.
Zane also covers the important topic of practically implementing smaller, more targeted AI models ("expert models"), for example Meta's LLaMA 2, to get similar outcomes using a fraction of the compute (and power).
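As a sanity check on the rough math above, here is a minimal Python sketch that works backwards from the quoted figures. It is only an illustration: the function names are my own, and the implied rack-unit count and per-GPU wattage are derived from the numbers in the text, not taken from Intel's presentation.

```python
# Figures quoted in the text above (rough estimates, not vendor data).
CPU_WATTS_PER_RU = 400.0   # "at least 400W per proc", one per rack unit
CPU_RACK_KW = 15.0         # "about 15kW, for just the CPU in the rack"
GPU_RACK_KW = 26.0         # "adds another 26kW" at one GPU per rack unit


def implied_rack_units(rack_kw: float, watts_per_ru: float) -> float:
    """How many populated rack units a rack-level power figure implies."""
    return rack_kw * 1000.0 / watts_per_ru


def implied_watts_per_ru(rack_kw: float, rack_units: float) -> float:
    """Per-rack-unit power implied by a rack-level figure."""
    return rack_kw * 1000.0 / rack_units


rack_units = implied_rack_units(CPU_RACK_KW, CPU_WATTS_PER_RU)
gpu_watts = implied_watts_per_ru(GPU_RACK_KW, rack_units)
total_kw = CPU_RACK_KW + GPU_RACK_KW

print(f"Implied populated rack units: {rack_units:.1f}")
print(f"Implied power per GPU: {gpu_watts:.0f} W")
print(f"Total rack power: {total_kw:.0f} kW")
```

Under these assumptions the figures imply roughly 37.5 populated rack units and just under 700W per GPU, which lands the rack at 41kW, consistent with the ">40kW" estimate.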
#MARVELL
"Network is the new bottleneck"- Loi Nguyen (Marvell)
Even for a 32,000-GPU cluster, there is a minimum of 7:1 oversubscription, because no switch is big enough to connect every GPU without blocking.
One of my favorite presentations!
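For readers less familiar with the term, oversubscription is the ratio of bandwidth facing the hosts to bandwidth facing the rest of the network. The short Python sketch below shows the calculation. The port counts and the 800 Gb/s port speed are hypothetical values I picked to produce a 7:1 ratio; they are not figures from Loi's presentation.

```python
from fractions import Fraction


def oversubscription(downlink_gbps: int, uplink_gbps: int) -> Fraction:
    """Ratio of host-facing bandwidth to uplink bandwidth on one switch."""
    if downlink_gbps <= 0 or uplink_gbps <= 0:
        raise ValueError("bandwidth values must be positive")
    return Fraction(downlink_gbps, uplink_gbps)


# Hypothetical 64-port switch with 800 Gb/s ports:
# 56 ports face the GPUs and 8 ports face the next network tier.
PORT_GBPS = 800
ratio = oversubscription(56 * PORT_GBPS, 8 * PORT_GBPS)
print(f"{ratio.numerator}:{ratio.denominator}")  # 7:1

# A non-blocking design splits ports evenly between hosts and uplinks.
non_blocking = oversubscription(32 * PORT_GBPS, 32 * PORT_GBPS)
print(f"{non_blocking.numerator}:{non_blocking.denominator}")  # 1:1
```

A 1:1 ratio means every GPU can send at full line rate at the same time; at 7:1, only a seventh of the host-facing bandwidth can leave the switch at once.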
#BROADCOM
"The network is the compute" - Ram Velaga (Broadcom)
Ethernet continues to evolve: even NVIDIA, which pushes InfiniBand, is creating its own high-speed (51.2T) Ethernet switch in response to customer need.
I'm very excited about Ultra Ethernet. You can find out more from the Ultra Ethernet Consortium.
#PROMERSION
(the immersion project is the) "largest project in Open Compute...is no longer hypothetical" - Rolf Brink (Promersion)
Rolf gives a great overview of the cooling industry and presents a realistic position: no single cooling solution will be the only solution, and all data centers will be dealing with some form of liquid cooling in the next 5 years.
#META
"...we are very, very far from having a single solution that could work for all of these different kinds of workloads" - Dan Rabinovitsj (Meta)
This is by far my favorite chart during the keynote.
Whilst every chart was a hockey stick, showing growth in spend, power, GPUs, heat, etc., Dan visualized that a one-size-fits-all strategy, at least for AI, is incredibly hard, expensive, and ultimately inefficient.
Wrap-up
OCP is a significant initiative that has reshaped the tech community's approach to infrastructure. Its collaboration and emphasis on open standards have drawn in major industry players, fostering innovation and cooperation.
Intel's focus on AI training models and the growth in global data center power usage is a noteworthy trend. Marvell's insights into the challenges of network capacity in data centers are also crucial, and Broadcom's evolution of Ethernet standards is an interesting move. The immersion cooling project's growth and the practicality of different cooling solutions, as presented by Promersion, are important in the context of energy-efficient data centers. Lastly, Meta's emphasis on the complexity of infrastructure optimization for AI workloads is a compelling perspective.
Have questions about digital infrastructure, future trends or AI? I'd love to connect and help.
Driving the global growth and adoption of liquid cooling technologies for data centers
1 year ago: Hi Nick Hume, great summary! What I liked a lot was the synergy in all the keynotes and technical presentations. For many years we have been talking about liquid cooling being part of the future. This is the year in which we are discussing it in terms of "NOW". The challenges are real, the work is being done and the solutions are out there being deployed. Not tomorrow, but today! This is how all the keynotes and technical sessions were beautifully tied together this summit. It shows how we are at a paradigm shift and stepping into a whole new era for the datacenter industry. Another thing I really loved was the liquid bar which was pulled together at the last minute by Allison Boen... It was great to catch up with so many on Tuesday evening while enjoying the effects of liquid cooling and some snacks! I'm looking forward to your future coverage on immersion cooling, HPC, and networking as the rest of the summit's content is released. It's professionals like you who bring depth and breadth to these discussions, making them accessible and engaging for the rest of us.
Immersion Cooling Advisor & Influencer | 30 Years Infrastructure Power & Cooling @ Alcatex | Shell Immersion Cooling Fluid Brand Ambassador | SVP @ DatanovaX | OCP Immersion Community Outreach Lead
1 year ago: Nick Hume I completely agree with Rob Coyle that there is so much to unpack and I personally am still on my "Immersion High" from the content. Starting with the Keynotes - Rolf Brink really brought the excitement for Liquid cooling content, but he followed an equally impressive Keynote by Intel Corporation, who mentioned their collaboration with Shell and their Immersion Cooling Fluid! Now that was EPIC in my book. My personal favorite was the Liquid Bazaar event I mentioned in my post, and the open mic opportunity to ask questions was standing room only - speaks to the interest around that subject. There is more to glow about... but I agree... An “Immersed” with Allison Boen is coming up on this topic! I love the community around OCP and I was so grateful to meet so many in person that I have literally only met on a video call!
Keeping Computing Cool - Talk to me about Immersion Cooling
1 year ago: Nick, I brought my boss so he could get up-to-speed. Seeing OCP Global Summit through his eyes was awesome. We have a great team of people moving the ball forward... scrum might apply to software but it applies to rugby too. Glad to be on Team Immersion.
Breaking down barriers to sustainability, accessibility, and cost-efficiency... an open approach to data center infrastructure is essential for achieving a more sustainable future.
1 year ago: Thanks Nick! So much to unpack from the event. I am still coming down from the excitement! Not only was this the biggest event from an attendance perspective, but we also had some great announcements that you covered in your article. We also had more hands-on demos in our Experience Center than ever before. It's hard to pick just one or two things to highlight, but we saw lots of work in accelerators, supporting hardware and multiple cooling strategies related to the demand of AI. Any specific news from our event that you are most curious about?