
Virtualizing AWS data by using Fabric Shortcuts

John Miner

Data Architect at Insight

Published: November 1, 2024

Technical Problem

Before shortcuts were introduced in Microsoft Fabric, big data engineers had to create pipelines to read data from external sources such as Amazon Web Services (AWS) S3 buckets and write it into Azure Data Lake Storage. The duplicated data risks becoming stale over time. Additionally, computing power might be wasted on copying data that is used only once. Because many of today's companies have grown through mergers and acquisitions, your company's data landscape might span multiple cloud vendors. How can we virtualize the data stored in S3 buckets in our Microsoft Fabric Lakehouse design?

Business Problem

Our manager has asked us to create a virtualized data lake using AWS S3 buckets and a Microsoft Fabric Lakehouse. The new shortcut feature will be used to link to both files and delta tables. Most of the article centers on setting up an AWS trial account, loading data into S3 buckets, and creating a service account to access those buckets. Once the data is linked, a little work is needed to either create a managed delta table or test the linked delta tables. By the end of this article, the big data engineer will be comfortable using Microsoft Fabric Shortcuts with AWS S3 buckets as a source.
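As a rough illustration of the service account step, the sketch below builds a read-only IAM policy document that could be attached to such an account. The function name, the statement IDs, and the bucket name `example-shortcut-bucket` are hypothetical, and the permission set (list the bucket, read its location, read its objects) is an assumption about what a read-only shortcut typically needs; it is not taken from the full article.

```python
import json


def build_readonly_s3_policy(bucket_name: str) -> dict:
    """Return a least-privilege, read-only IAM policy for one S3 bucket.

    Assumption: the shortcut only needs to list the bucket, read its
    location, and read objects. Adjust the actions to your own setup.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Sid": "ListBucketForShortcut",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions apply to the keys under the bucket.
                "Sid": "ReadObjectsForShortcut",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, used only for illustration.
    print(json.dumps(build_readonly_s3_policy("example-shortcut-bucket"), indent=2))
```

The printed JSON can be pasted into the AWS console when creating a policy for the service account. Keeping the policy read-only fits the virtualization goal, since the Lakehouse only reads the data in place.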

Learn More

Please see the recent article on SQL Server Central for all the details and examples.

Dhruv Mahajan

4 months ago

Do you have any idea if AWS Glue catalog shortcuts are also on the roadmap, like Fabric has for the Unity Catalog?


