Exploring Fabric: putting Microsoft's new analytics platform to the test
It's been a couple of days since Microsoft let the world know about Fabric, their new unified end-to-end data analytics platform. It was quite the announcement! Fabric's proposition is bold. Shiny visuals and slick presentations have made data practitioners excited and curious, myself included. But a shiny new analytics platform is like a shiny new car: you test drive before you buy. In this post I share my first hands-on experience.
Unification?
Before diving into the tool, let's first understand the problem Microsoft aims to solve. Core to Fabric's value proposition is delivering a unified, end-to-end experience: no need to use any other analytics tool because all the functionality is available within a single platform. That sounds lovely, as today's reality is very different. The number of specialized tools is enormous, and data engineering often feels more like systems engineering. Just have a look at the MAD (ML/AI/Data) Landscape for 2023 and you'll understand. The added value of a data team would be so much bigger if it could focus on building data products, rather than integrating different tools.
Free trial :)
Fabric is in public preview, and Microsoft offers a free trial. Great! It's easy enough to sign up and activate your trial.
Exploring end-to-end capabilities
More than anything, I'm curious if Fabric can deliver on its promise of unification. End-to-end analytics requires many different ingredients, and I explored some of them: data ingestion, visualization (reporting), and MLOps.
I will now walk through the steps I took and share relevant insights.
Data ingestion
I first created a lakehouse using Fabric's UI.
Next step: getting data into the lakehouse. I chose to work with the popular diabetes sample dataset.
Well, that was simple enough. All UI work, not a single line of code written.
Visualization (reporting)
Getting data in was as simple as ABC. Now the visualization. With Power BI taking a prominent place in Fabric (I heard it's the Power BI team that leads the Fabric efforts, maybe that has to do with it), I expected this to be a first-class experience. I was right. For every lakehouse you make, a Power BI "dataset" gets created automatically. After a few clicks you're in the Power BI interface ready to build your report.
MLOps
So far so good, Fabric made ingestion and visualization super easy. What about Machine Learning? After all, I didn't load the diabetes data just to create some simple charts. I found some "Data Science" functionality in the UI.
Aha, there's some actual code! This marks the end of my UI-only experience. I'm not surprised that a more advanced use case like Machine Learning isn't fully captured by a user interface (yet). Also happy that Fabric still lets me do what I like the most: writing code. Another delight: Fabric's live Spark pools really work...no waiting for a pool to start!
I had to change the code a little to load the diabetes data from the lakehouse into Spark, run an experiment, and train and register a model. Microsoft uses the open-source MLflow framework to manage the ML lifecycle within Fabric. The results can be inspected interactively.
The last step was to load my newly created model and use it to make predictions against the diabetes data.
Conclusion
This was my first interaction with Fabric and it was pretty good. I was able to easily ingest data, visualize it, train an ML model on it, and use that model to make predictions...all within a single tool! I can't say the experience was completely flawless; there are definitely some glitches here and there. But Fabric is still in preview, so I call that acceptable. I don't think general availability is near, but I'm excited to keep track of the developments!