My First 365 Days at Databricks

Mayur Palta

GTM | Field Engineering | Competitive Strategist | Board Member | Author | Harvard Business School | Ex-AWS, Ex-Oracle

Published: August 15, 2022

Though I am celebrating my one-year anniversary, I have been binge-watching Databricks for the last 5.5 years (out of its 9-year existence), ever since I came close to joining Databricks but decided to join AWS (Amazon Web Services) instead. Now, connecting the dots looking back, it all makes sense. The lessons I learned at AWS over 4.5 years, from leading a solutions architect team to being one of the first few team members building and scaling AWS’s competitive intelligence flywheel, proved to be valuable. All the pieces of the puzzle fit together.

Today I want to share with you three stories from my work at Databricks and what makes Databricks’ culture of innovation unique. That’s it. No big deal. Just three stories and three peculiar things about Databricks’ culture.

The first story is about Customer Obsession, put into practice in the form of working backwards from customers. At the Spark Summit 2018, Michael Armbrust, distinguished software engineer at Databricks, had a casual conversation with Dominique Brezinski, distinguished engineer at Apple. Dominique, aka Dom, who heads up efforts around intrusion monitoring and threat response, was picking Michael’s brain on how to address the processing demands created by Apple’s massive volumes of concurrent batch and streaming workloads (petabytes of log and telemetry data per day). Apple could not use data warehouses for this use case for three reasons: data warehouses are cost-prohibitive for massive event data; they do not support the real-time streaming use cases that were essential for intrusion detection; and they lack support for advanced machine learning, which is needed to detect zero-day attacks and other suspicious patterns. Building on a data lake was therefore the only feasible option at the time, but Apple was struggling with pipelines failing under a large number of concurrent streaming and batch jobs, and could not ensure transactional consistency and data accessibility for all of its data. So the two of them came together to discuss the need for the unification of data warehousing and AI, planting the seed that bloomed into Delta Lake as we now know it. This is a classic example of how Databricks works backwards from customers to help data teams solve the world’s toughest problems, from drug discovery to precision agriculture to sustainability, you name it.

Today, Delta Lake is the most comprehensive Lakehouse format, used by over 7,000 organizations (processing exabytes of data per day), with over 7 million monthly downloads (a 10x growth in monthly downloads in just one year). Why is Delta Lake so widely adopted? In simple words, Delta Lake enables organizations to build data lakehouses, which enable data warehousing and machine learning directly on the data lake. The Delta Lake project is also thriving, with over 190 contributors across more than 70 organizations, nearly two-thirds of whom are from outside Databricks, and we’ve seen a 633% increase in contributor strength (as defined by the Linux Foundation) over the past three years. It’s this level of support that is the heart and strength of this open-source project.

Fast forward from the genesis of Delta Lake to June 2022: Databricks decided to open source all of Delta Lake with the announcement of Delta Lake 2.0. #customerobsession

This is another key example of customer obsession, where actions speak louder than words. Customers have shared with us that they want to avoid lock-in by adopting a true multi-cloud and open platform. This decision to open-source all of Delta Lake will potentially democratize the use and adoption of data lakehouses. I have been very fortunate over the last year to partner closely with the core cross-functional team (product, engineering, developer relations, marketing, and product advisory board) in factually differentiating this key innovation from alternatives that are neither as mature nor as production-ready. For independent third-party findings and observations on the same, refer to https://databeans-blogs.medium.com/delta-vs-iceberg-performance-as-a-decisive-criteria-add7bcdde03d

The second story is about Data-Driven Decision Making. Over the last year, I had the distinct honor of embarking on a truth-seeking journey into why customers choose Databricks and why customers sometimes choose other alternatives. In simple words, this meant answering the question “why we win/why we lose” and building and scaling our win loss analysis (qualitative and quantitative) muscle. Win loss insights tend to be the common Achilles’ heel for most fast-growing technology startups. When technology startups are growing gangbusters (80% to 100+% YoY growth rates), most neglect to ask why they are winning so much. Such insights are even more critical at this stage of accelerated growth, to separate signal from noise and to double down on investments in the right areas.

The first win loss review I participated in, at another company many years back, came across a bit like the seller had sold ice to Eskimos. It was all about how the seller turned an almost lost customer into a mega win. Nothing about this situation was perceived as wrong because this company had a “hero culture”. Primarily, acts of heroism like closing the largest 7-to-8-figure deal on the last day (at the final minute) were celebrated and rewarded at this company. As a result, this prior company struggled a bit to deliver predictable revenue.

At Databricks, data-driven decision making is deeply ingrained in the very fabric of its culture. Bricksters call it “let the data decide”. This explains why I experienced several tailwinds in building and scaling our win loss program. From product leaders to field/field engineering to marketing leaders, everyone jumped in to help. Being customer obsessed and data driven, we chose to align our win loss program to the entire customer journey. We decided to focus on a multitude of dimensions for quantitative as well as qualitative data, such as business context, main drivers for winning or losing, decision criteria, lessons from proofs of concept and bake-offs, and quantifiable impact on the customer’s business. Now if someone encounters a similar competitive situation, they can lean on these learnings. It gives you the sense that you have seen the play before.

Though there could be dozens of sources from which to derive meaningful win loss insights, such as CRM, field interviews, NPS surveys, and customer research, it is about reading the context for your own unique business and understanding the nuances of the customer’s journey before you start diving deep into the details. A fellow win loss expert called this “nail it before you scale it”.

Thanks go to industry win loss experts from SCIP and the publishers of books on this topic for their guidance, mentoring, and direction. Without learning about what not to do, and without the tailwinds of Databricks’ unique data-driven culture, choosing an appropriate course of action could have been challenging. Though we have a long journey ahead of us on this truth-seeking mission of win loss insights, we managed to move the needle a little bit over the last year.

The third story is about how teamwork makes the dream work. Rarely do you get the opportunity to work with the smartest yet humblest minds of the technology industry, who believe that teamwork makes the dream work. At Data+AI Summit 2021 (last year), Databricks announced several key innovations, like a data warehouse that broke the world record before going to GA (Databricks SQL), a unified governance solution for all data and AI assets (Unity Catalog), and the industry’s first open standard for secure sharing of data assets (Delta Sharing). As a result, I had the honor of partnering with the cross-functional teams leading these new product introductions and factually differentiating them in each of the unique competitive landscapes they operate in.

In mid-September last year, I was in the middle of authoring a technical white paper that would be available to customers. After putting the kids to bed at around 9, I continued working on this white paper until midnight and submitted it for feedback. The next morning, at 7 AM, I noticed several valuable pieces of feedback on the document. As I rubbed my eyes reading the feedback, I realized it was none other than the CEO of Databricks, Ali Ghodsi, himself who had not only reviewed a 6-pager at 3 AM but also taken the time to provide insightful and constructive feedback. Rarely do executives take the time to stay in touch with the ground realities of the business or make themselves available. Not so at Databricks: the leadership team walks the talk, operates with attention to detail, and is accessible to the front lines. What we can learn from this short story is what my leadership coach (who also happened to be my manager) reminded me of in a casual conversation a few years back. She said, “to be an effective leader, one must be accessible”.

Exactly 1 year and 13 days ago, I joined Databricks after 4.5 years at Amazon Web Services. Amazon is well known for its strong culture: either you fit into the Amazon culture, or you don’t. While hiring at Amazon as a Bar Raiser in Training, the two leadership principles that remained top of mind for me (the common denominator for hiring across roles) were customer obsession and ownership. At Databricks, it has been a pleasant surprise that the Databricks Executive Staff (E-Staff) took some of the most relevant leadership lessons learned from other strong-culture organizations like Amazon and improved on them even further. Over the last year, each interaction has reassured me that Bricksters continue to raise the bar on customer obsession and ownership.

During the interview process, there is one question I remember asking each interviewer, “how would you describe the culture at Databricks? And what makes it unique?”

Here are 3 peculiar things about Databricks culture:

Firstly, Databricks E-Staff and leaders across the company operate with an awesome level of transparency. Many thought leaders in business may debate whether this level of transparency is good or bad. Some of you may have experienced information being guarded behind inner circles or at specific ranks of an organization in your prior or current adventures. One thing this level of transparency has instilled in the very DNA of Databricks is a feeling of mutual trust and reciprocity. Some may also feel that, as a Brickster, you operate as a cofounder and are here to create a lasting impact on the business.
Secondly, Databricks prioritizes employee experience. To oversimplify where an organization is focused: some organizations prioritize their resources (time, capital, energy) to optimize for stock price, some for customer outcomes, while some over-index on product. Imagine thinking about an organization’s priorities in the form of a stack rank. Typically, technology companies tend to over-index on either customer outcomes or product, and some optimize for stock price; rarely will you find a focus on employees among the top 3 of this stack rank, though that has been changing fast in the last few years with labor shortages and fierce competition for talent. This ambidexterity to do right by employees (fairness) is still a muscle most organizations are developing. This is an area in which Databricks is doing particularly well (for data points, refer to reviews on Glassdoor).
Thirdly, the Databricks hiring process is oriented around long-term thinking and involves several backdoor reference checks before someone is offered a role. This ensures that hiring decisions are somewhat predictable. A related peculiarity is keeping the bar high on hiring and thinking a bit longer term to do the right thing. For example, we kept the search for a role on my team open for 10+ months, insisting on keeping the bar high despite the rising short-term demands created by the unfilled role. In the current uncertain macroeconomic environment, with labor shortages, it is easy to give in to the temptation to hire someone who can and will do the job; however, Databricks continues to keep the bar high and maintains a long-term view on what is the right thing to do.

In addition to these 3 peculiar cultural aspects, I got to observe closely how a new category like “lakehouse” gets created, nurtured, and adopted as a standard.

To recap, I am grateful to Databricks for this amazing opportunity to build: to build the competitive muscle, to build the data-driven win loss program on why we win or lose, and to contribute to communities that take awesome innovations like Delta Lake, Databricks SQL, Unity Catalog, Delta Sharing, Delta Live Tables, and many more from Data+AI Summit 2022 to market.

I share this sense of gratitude towards the Databricks leadership team for their trust and continued support in building scaling mechanisms and taking on initiatives like building a win loss program, engaging with the product advisory board, making sense of evolving Data+AI industry dynamics, engaging with industry-leading analysts, contributing thought leadership (Building CI Flywheel using Data+AI, Understanding Data Lakehouse KIMO Data anno 2021), partnering with the global Competitive Intelligence (CI) community (Udemy Getting started with Competitor Intelligence, Apple podcast), and bringing back the collective wisdom in areas like competitive intelligence, win loss analysis, and ethics in CI.