Control OneLake with Storage Explorer – Data Engineering with Fabric


What is a Lakehouse in Microsoft Fabric? Borrowing a quote from MS Learn: "Microsoft Fabric Lakehouse is a data architecture platform for storing, managing, and analyzing structured and unstructured data in a single location". I think of it as a shared Apache Spark virtual space. I can upload, download, and manage files. I can run Spark notebooks to read, transform, and write data. I can create tables and/or views on top of the files. Of course, the Delta file format is preferred since it has the ACID properties that we Data Architects love.

Business Problem

Our manager at Adventure Works has a team of data engineers who already know how to use Azure Storage Explorer. How can we manage files in OneLake using this tool?

Technical Solution

Previously, I talked about how to manage files with OneLake Explorer. The problem with this tool is that data is manipulated either on the client machine or in Microsoft Fabric, and at some later point the two copies are synced. There is no granular control over which files or folders are manipulated, and the same steps can't be easily repeated.

Today, we are going to investigate two tools. First, Azure Storage Explorer is a graphical user interface that reminds me of the CuteFTP tool that I used early in my career. Second, the AzCopy executable is the worker that Azure Storage Explorer (the manager) calls to get the job done. This program reminds me of the command line version of FTP. The executable is available on both Windows and Linux systems, which opens the door for different operating systems to upload files into Microsoft Fabric.
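Because AzCopy is a command line program, an upload can be scripted and repeated. Here is a minimal sketch in Python that only builds the command; it does not run it. The workspace name (ws-adventure-works), lakehouse name (lh_bronze), and file paths are hypothetical, and the helper functions are my own illustration rather than part of any library. I am assuming the OneLake DFS endpoint, that the workspace and lakehouse can be addressed by name, and that you have already signed in with azcopy login.

```python
from urllib.parse import quote

# OneLake DFS endpoint (assumption: the standard global endpoint is used).
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_url(workspace: str, lakehouse: str, path: str = "") -> str:
    """Build the https URL of a path under the Files area of a lakehouse."""
    parts = [workspace, f"{lakehouse}.Lakehouse", "Files"]
    parts += [p for p in path.split("/") if p]
    # Percent-encode each segment so names with spaces remain valid URLs.
    return f"https://{ONELAKE_HOST}/" + "/".join(quote(p) for p in parts)


def azcopy_upload_command(local_path: str, workspace: str, lakehouse: str,
                          remote_path: str = "", recursive: bool = False) -> list[str]:
    """Return the AzCopy argument list that uploads local_path into OneLake."""
    cmd = [
        "azcopy", "copy",
        local_path,
        onelake_url(workspace, lakehouse, remote_path),
        # AzCopy must be told to trust the Fabric domain for Microsoft Entra sign-in.
        "--trusted-microsoft-suffixes=fabric.microsoft.com",
    ]
    if recursive:
        cmd.append("--recursive")  # needed when local_path is a folder
    return cmd


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(azcopy_upload_command(
        "./data/sales.csv", "ws-adventure-works", "lh_bronze", "raw/sales.csv")))
```

The list that comes back can be handed to subprocess.run on either Windows or Linux, so the same script gives you the repeatable steps that OneLake Explorer lacks.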

Please see my full article on SQL Server Central for all the details.
