Flink Forward Virtual Conference 2020 - Day 1 Recap
Dinesh Chandrasekhar
Industry Analyst | Marketing Executive | Master Story Teller | Public Speaker
With every one of my favorite tech conferences canceling its in-person event this year, things were getting gloomier. The last straw was when the much-awaited Flink Forward event, to be held in San Francisco, was also canceled last month. However, the organizers have put new life into the Flink community by creating an entire virtual conference and giving us a much-needed boost. April 22nd was the first day of the conference. With the entire conference being free, there were over 1,300 attendees across various tracks and sessions. The speakers, from a range of companies, had amazing content to share with us. This is my quick recap of some of the highlights that I enjoyed on Day 1. I still plan to listen to the recordings of the sessions that I missed today.
The day started with three amazing keynotes -
- Stephan Ewen, CTO & Co-founder of Ververica, kicked off with a fascinating talk on Stateful Functions 2.0. Questioning the basic notion that storage and compute need to be together, Stateful Functions 2.0 promises to liberate the compute from Flink processes and run it as separate processes. The future looks very exciting, with a whole range of use cases waiting to take advantage of this serverless execution model. Stephan invited members of the community to be part of this project to make it even better.
- The second keynote was delivered by Konstantin Knauf, Head of Products at Ververica. In the true interest of driving more Flink adoption across the community, Ververica has introduced a community edition of its platform. This free edition allows you to get started with Flink in under 5 minutes. All that you need is a Kubernetes cluster and a distributed file system to get started.
- The third keynote was delivered by Cloudera’s own Joe Witt, VP of Engineering, and Marton Balassi, Senior Engineering Manager for Flink. Joe presented Cloudera’s broader Data-in-Motion philosophy of how an end-to-end streaming platform combines the power of Apache NiFi + Apache Kafka + Apache Flink. He also explained how our platform has evolved over time and where it is headed next. Marton followed up by reaffirming his commitment to the Flink community. He showcased what we have delivered against the commitment he made in his keynote at Flink Forward Berlin 2019. He also gave a sneak preview of some really exciting upcoming features, such as end-to-end schema and lineage management (integration of Schema Registry and Atlas) as well as support for Flink SQL within Cloudera Streaming Analytics. He also explained that the Data-in-Motion vision is all about simplifying the user experience across flows, topics and streaming applications.
After the keynotes, there were three back-to-back sessions across five virtual rooms. I attended a few live and chose to listen to the recordings of the rest. Out of the ones I attended, here are some key highlights -
- Managing real-time data teams by Jesse Anderson - Even though I am not a manager of a data team, it was interesting to understand the challenges that data teams face organizationally, functionally, and in working with other teams. This was almost like a masterclass in a business school. Tying the current COVID situation into the topic also gave the talk a realistic angle. Liked it.
- Needle in the haystack: Monitoring the health of a huge Kafka fleet with Flink - This was a talk delivered by an architect from AWS. As someone who owns a competitively superior offering of Kafka and related ecosystem services, I was eager to see how AWS was doing. By ingesting log streams from Kafka clusters and passing them into Flink using Kinesis, they are leveraging Flink to scan the various streams for failure patterns and raise the appropriate alerts. While I do understand the value of using Flink here, we (at Cloudera/Hortonworks) have solved such Kafka operational issues and DevOps visibility challenges using Streams Messaging Manager. It goes the extra mile of being able to showcase data lineage tracking for audit and compliance purposes as well.
- Flink SQL in 2020: Time to show off - This was a session by a team at Ververica. It was a really cool interactive session where they showed how Flink SQL can be used to run live queries on continuous data from different data sources, such as files and Kafka topics. They showcased the different types of JOINs that are possible with the SQL constructs. They also showed how to do pattern matching and recognition using specific SQL clauses. The excitement in this session was a great proof point that we (at Cloudera) are also working on the right things on our roadmap. Our customers have also been asking for SQL support to enable their analysts to query live streaming data easily. We will be coming out with this capability shortly as well.
I am eagerly looking forward to Day 2 and Day 3 of Flink Forward for more exciting sessions. Cloudera engineers have a few sessions that they are presenting as well. I am listing them here so that you won’t miss them -
- Energize Multi Tenancy Flink on K8s with YuniKorn by WeiWei Yang and Wilfred Spiegelenburg, April 23rd, 11 AM PDT
- Testing production streaming applications by Gyula Fora and Matyas Orhidi, April 24th, 12 AM PDT, 9:00 AM CEST
In addition to this, we have been running a series of Flink webinars titled the “Flink PowerChat series”. We had our first PowerChat on “Introduction to Apache Flink” on April 14th, 2020. Our second Flink PowerChat is coming up on May 7th, 2020. It will be on the topic of the “Top 5 advanced stream processing concepts in Apache Flink”. Register for the webinar today.