Control OneLake with Storage Explorer – Data Engineering with Fabric
What is a Lakehouse in Azure Fabric? Borrowing a quote from MS Learn - "Microsoft Fabric Lakehouse is a data architecture platform for storing, managing, and analyzing structured and unstructured data in a single location". I think of it as a shared Apache Spark virtual space. I can upload, download, and manage files. I can run Spark notebooks to read, transform and write data. I can create tables and/or views on top of the files. Of course, the delta file format is preferred since it has ACID properties that we Data Architects love.
Business Problem
Our manager at adventure works has a team of data engineers that already know how to use Azure Storage Explorer. How can we manage files in the one lake using this tool?
领英推荐
Technical Solution
Previously, I talked about how to manage files with OneLake Explorer. The problem with this tool is that data is manipulated either on the client machine or in Microsoft Fabric. At a given point, the data is synched between the two. There is no granular control on what files or folders are manipulated. The same steps can't be easily repeated.
Today, we are going to investigate two tools. First, Azure Storage Explorer is a graphical user interface that reminds me of the Cute FTP tool that I used early in my career. Second, the AzCopy executable is the worker that Azure Storage Explorer (manager) calls to get the job done. This program reminds me of the command line version of FTP. The executable is available on both Windows and Linux systems. This opens the door for different operating systems to upload files into Microsoft Fabric.
Please see my full article on SQL Server Central for all the details.