
Summary of the 6th Community over Code Performance Engineering Track (October 7, 2024, Denver, Colorado, USA)

Paul Brebner

Open Source Technology Evangelist at Instaclustr by NetApp

Published: October 23, 2024

After much anticipation, the 6th Community over Code Performance Engineering track was held on October 7, 2024 in Denver, Colorado, USA. I've only just returned to Australia (hence the slight delay in writing this track report), but after the conference, I tracked down my new favourite steam locomotive, the Union Pacific "Big Boy", in the Forney Museum of Transportation. This is one massive locomotive (impossible to photograph in practice): it is over 40m long, has 16 drive wheels (4-8-8-4 class), weighs 1.2 million pounds (600 tons), and produced a whopping 7,000 horsepower - way more than diesel locomotives of the time. The Big Boys were probably the highest-performing steam locomotives ever built and would be a good candidate for the new Performance Engineering track train mascot!

This time around Roger Abelenda and I (Paul Brebner) were the co-chairs. I briefly introduced the track and explained the motivation (Apache projects have many performance and scalability challenges, some projects have solutions in the form of tools and best-practices and experiences that can be shared, and there are plenty of opportunities for cross-fertilization, particularly between old and new projects, including incubator projects).

From an innovation perspective, I have been hoping for talks that explore Open Source + Performance innovation (e.g. code analysis, simulation, etc.), and noted that we have had one talk in the past that came close (byte code analysis for Camel), and that LLMs are likely to have an impact in the future.

The talks this time around were:

Paul Brebner (co-chair), Making Apache Kafka even faster and more scalable

Roger Abelenda (co-chair), Skywalking Copilot: A performance analysis assistant

Ritesh Shukla, Tanvi Penumudy, presented by Ethan Rose, Overview of tools, techniques and tips - Scaling Ozone performance to max out CPU, Network and Disk

Shawn McKinney, Load testing with Apache JMeter

Chaokun Yang, Introduction to Apache Fury Serialization

My talk on Kafka performance highlighted the performance impact of recent Kafka architectural changes (KRaft and Tiered storage), with a summary of Zipf's law and Kafka cluster size distribution from my C/C EU talk. Some general conclusions included that Kafka is still hard to benchmark, we need more "science" to compare results, benchmarking of cached/tiered systems is (still) tricky, etc.

Roger's talk on Apache Skywalking copilot ticked the boxes for open source performance engineering innovation for me, and also used LLMs! He demonstrated a new performance assistant for Apache Skywalking (an APM tool) that can help users find and analyse alarms, traces, metrics, topologies and metric charts, and talk generally about performance. This was very clever and has enormous potential, I think (and all made possible by open source - there's probably no way this could be done as easily, if at all, with closed-source APM tools).

Next up we had another in a series of great talks on Ozone performance. Unfortunately Ritesh and Tanvi couldn't attend in person, but Ethan Rose did a great job presenting, with Ritesh joining virtually for Q&A at the end. Ethan covered flamegraphs vs. metrics (and why metrics plus flamegraphs are better), how to design good dashboards (redundancy is ok, LLMs are your friend for Grafana), the best order for tooling, and challenges/solutions in scaling open source projects - all lessons learned from performance engineering of Ozone but widely applicable to other projects!

Shawn McKinney gave a great introduction to Apache JMeter, covering an overview of load testing in general, and going into more depth on load testing an LDAP system. This type of talk demonstrates the value of good introductory material with examples as not everyone is always at the same level across projects - I know I learned a lot. Note to self - maybe I should do an introduction to Performance Engineering next time!

We had our 2nd Shawn (Shawn Yang) present the final talk of the day, on Apache Fury, a blazingly-fast multi-language serialization framework. This was a great talk on performance by design, and ticked yet another box for me as our first presentation in the track (in the US at least) from an Incubator project - great work! I think it works so well because it uses best serialization/deserialization practices per data type combined with some other magic. The examples Shawn gave included Flink, so I wonder if it would also work with Kafka?

We lost a 6th talk in the track (due to visa issues), leaving me time to attend the final talk of the Streaming track, which also had a performance flavour: Matthias J. Sax, "The Nuts and Bolts of Kafka Streams: An Architectural Deep Dive". Kafka Streams is a powerful technology for stream processing, and we can look forward to ongoing improvements to scalability, reliability, high availability, performance, etc.

So, by the end of the day here is the list of technologies we've covered in this track so far:

Apache Kafka

Apache JMeter & Selenium

Kubernetes

Apache Arrow

Java Profiling

Apache Flink

Apache Spark/ML

Apache Hadoop

Apache Ozone

Apache Cassandra

Apache Camel

Apache Lucene

Apache Iceberg

Apache Impala

Oxia

Apache Skywalking

Apache Fury

Thanks again to the speakers, attendees (about 150 in total this time) and Apache Software Foundation Community over Code conference organisers. We hope to run the event again, so put your thinking caps on and start coming up with some possible talk titles and abstracts. If you would also like to be involved in reviewing etc., let us know.

The presentations will be available online eventually - I'll add them here when I find out where.

In the meantime here's a link to the intro and my talk: https://www.slideshare.net/slideshow/making-apache-kafka-even-faster-and-more-scalable/272645669

Just like Performance Engineering, driving the "Big Boy" locomotive was non-trivial - just look at all the controls (although the coal feed was automatic) - with the potential risk of the boiler exploding - fun!

Inside the cab of the "Big Boy" locomotive

Roger Abelenda

Chief Technology Officer at Abstracta

4 months ago

Great summary as you always do. Thank you Paul for being such a good leader and pioneer in Apache community on performance topics.


