100 Days at Databricks
As I hit the 100-day mark at Databricks, I want to review the journey so far with some of the bigger themes that stood out.
The Beauty of Creative Destruction
I'm very technical, but ultimately I'm a field engineer, a role that sits under the sales organization. It's been an eye-opener. To make things very clear for everyone - Databricks is a consumption-based business. We get paid when you use things. You use things when you are happy and they deliver value. One of the big values at Databricks is being customer obsessed, and it is a daily reality. For example, I was able to help a customer move from 10+ endpoints to a single one by leveraging PyFunc flavors and a meta-model wrapper that pulls other models from Unity Catalog into a single endpoint that they keep live. This does zero to help our consumption, but it delivers real value for them. We celebrated this, despite losing consumption as a result.
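For the curious, here is a minimal sketch of that routing idea. It is not the customer's actual implementation. In a real deployment the class would most likely subclass `mlflow.pyfunc.PythonModel` and load each sub-model from Unity Catalog inside `load_context`. To keep the sketch runnable without MLflow or a workspace, the loader is injected as a plain function. The class name `MetaModel`, the `fake_loader` helper, the `model_name` / `features` request fields, and the catalog, schema, and model names in the URIs are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Stand-in for a loaded model: any callable that maps features to a score.
Predictor = Callable[[List[float]], float]


class MetaModel:
    """Routes each request row to one of several sub-models.

    Mirrors the load_context / predict shape of an MLflow PyFunc model,
    but deliberately does not import MLflow (assumption for this sketch).
    """

    def __init__(self, model_uris: Dict[str, str], loader: Callable[[str], Predictor]):
        self.model_uris = model_uris
        self.loader = loader
        self.models: Dict[str, Predictor] = {}

    def load_context(self, context: Any = None) -> None:
        # Load every sub-model once, when the endpoint starts.
        self.models = {name: self.loader(uri) for name, uri in self.model_uris.items()}

    def predict(self, context: Any, model_input: List[Dict[str, Any]]) -> List[float]:
        results = []
        for row in model_input:
            name = row["model_name"]
            if name not in self.models:
                raise KeyError(f"Unknown model: {name}")
            results.append(self.models[name](row["features"]))
        return results


def fake_loader(uri: str) -> Predictor:
    # Hypothetical loader: returns a toy function instead of a real model.
    if uri.endswith("pricing/1"):
        return lambda xs: sum(xs)
    return lambda xs: max(xs)


router = MetaModel(
    {
        "pricing": "models:/main.demo.pricing/1",  # hypothetical Unity Catalog names
        "risk": "models:/main.demo.risk/1",
    },
    fake_loader,
)
router.load_context()
predictions = router.predict(
    None,
    [
        {"model_name": "pricing", "features": [1.0, 2.0]},
        {"model_name": "risk", "features": [0.2, 0.9]},
    ],
)
print(predictions)  # [3.0, 0.9]
```

The design choice is simple: one endpoint stays warm, and the request itself says which model should answer. Adding a model means adding one entry to the dictionary rather than standing up another endpoint.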
The pace of innovation here is breathtaking. New features are rolling out at an incredible rate, but our core philosophy remains refreshingly straightforward:
- Focus on use cases
- Ship things that deliver real value as fast as possible
- Prioritize open data, open frameworks, and rock-solid governance
This approach ensures we only succeed when our customers derive real value from our products. It is smart, in my opinion, because the scale of creative destruction in technology is insane. In oil and gas, we plan projects for 50 years. Some projects in technology only last for 50 days. Take, for example, the DBRX Instruct model. It was a state-of-the-art model ... for several weeks. It took millions of dollars to train and develop. No one views that project as a failure - just another example of the culture of innovation and research.
Adapting to Rapid Change
The evolution of our products, particularly in the Generative AI space, is nothing short of mind-blowing. It's both exciting and challenging to keep up with the pace, but it's a testament to our ability to pivot and respond to market demands quickly. For example, from the time I started, a vast three months ago, we've changed how models are served, introduced an agent framework, and revamped the MLflow evaluation framework. Oh, and LangChain completely changed their framework and deprecated every solution accelerator we built on it.
To survive in this world, focusing on getting things done pragmatically is essential. Our serverless features are a prime example of how we're simplifying complex processes for users by abstracting away the intricacies of Spark execution and cluster management. For every unicorn out there, there are 100 non-experts who are just as smart but need a simple platform to put their ideas into action. They don't want to understand Spark execution, tracking server file structures, or even the statistics behind machine learning models. Worried about what packages are in DBR 14.3? Who cares. Worried about execution plans under the hood? F(orget) them. Scared about how you federate data sources? Let the platform admin deal with it in a consistent way, mount to volumes, and get to work. Even if you are a unicorn, you probably should be focusing on higher-value things - which is why the SaaS / PaaS ecosystem is one of the biggest economic drivers in the world today.
In this fast-paced environment, prioritization becomes crucial. There are a million things you can do. Trust me, I'm skilled but not exceptional, and you can do anything I post about if you give yourself some time to learn it. Yes, I'm talking to you, non-analytics people. Don't let the jargon break you down - we have AI for that. But you need to pick your battles. I may be the worst at this - I want to say yes to everything. I want to learn everything. And I'm always optimistic about how long things will take. So one of the key skills I'm working on at Databricks is prioritizing which technical streams are going to be most impactful for our customers. Help me with this, and then help yourself. Pick something and understand it deeply if possible. Time Series, Bayesian Optimization, Geospatial, Generative AI - whatever floats your boat. Tie it together and go for a float.
Small Teams, Big Impact
Despite Databricks' rapid growth to over 8,000 employees, the company has maintained a culture that empowers individuals and small teams to make significant contributions. This approach has led to impressive outcomes, such as the development of Databricks Apps by a small but talented team. This development was done in months, not years. And it works nearly flawlessly with a simple philosophy. I don't think this would be possible if you tripled the team.
Databricks runs quarterly hackathons and technical training weeks to provide opportunities for individual creativity and skill development. These initiatives, along with the formation of self-organized "tribes" of experts, foster an environment where passionate individuals can thrive and drive innovation. A great example of this is how quickly I was able to get involved with the geospatial and time series teams. Everyone is passionate and wants to improve things, and the biggest challenge is honestly saying no, which is something I need to get better at.
Closing
It's been a lot, and I will probably have to be strategic about which battles I pick for the next three months. But my energy is still really high, mostly as a result of the awesome people I get to talk with, both at Databricks and at our customers. I'm planning on getting back to time series and generative AI for the next couple of months and would love to hear what you're interested in. I'll likely focus on some smaller code-based examples with a bit of poetic waxing - so consider yourself warned if you don't speak up.
Subsurface Geologist - Field Development - Expert Petrel Modeler
4 months ago: So very well written. Have you considered journalism?
Geotechnical Engineer at Stantec
4 months ago: Hey Scott, do you have an AI Scott with a blogger persona who writes these articles with such discipline? Good read though
Field Engineering Director at Databricks
4 months ago: I can’t believe you’ve already been here 100 days Scott, seems like you just joined a few months ago. Your blog posts are always great and I love your perspective. Glad to have you on the team!
Data Scientist | Data Engineer | Machine Learning Engineer | Software Developer | Workforce Analyst | People Scientist | HR | Python | Spark | PostgreSQL | Pandas | Strategic Collaborator | 2X Databricks Certified
4 months ago: This is an excellent post (and congrats on 100 days!) I was just telling someone the other day that Databricks seems to release something new almost every day, and while that’s overwhelming, coming from the perspective of someone who has built products that consistently added new features, I can’t agree more with your statement that much of the speed of development is related to smaller, self organizing teams.
Data Science Consultant | @alastairmuir.bsky.social | Risk Analysis and Optimization | Causal Inference
4 months ago: It’s always challenging to keep up with you. Lead the way