Querying Lakehouse data from Warehouse in Microsoft Fabric

Querying Lakehouse data from Warehouse in Microsoft Fabric

Microsoft Fabric is a cloud data platform that provides both Lakehouse and Warehouse artifacts for data storage and analysis. Lakehouse enables you to manage your data with Spark notebooks, while Warehouse enables you to analyze and manage data using T-SQL queries.

If you have a table in Fabric Lakehouse and you need to query it from the Fabric Warehouse, you can do it without copying data from Lakehouse to Warehouse.

How do you query Lakehouse table from Warehouse?

Microsoft Fabric enables you to easily cross reference Lakehouse and Warehouse data. To query Lakehouse data from Warehouse, you need to have both artifacts in the same Fabric workspace.

No alt text provided for this image
Lakehouses and warehouses can be placed in the same workspace

Fabric workspaces have a cross-database query capability that enables you to query Lakehouse data from your Warehouse using T-SQL. You do not need to use COPY INTO to load data into Warehouse if your data is already in the same workspace. If your data is in a table in Lakehouse and in Delta Lake format, you can query it directly from your Warehouse.

To query Lakehouse data from Warehouse, you need to open a SQL editor that you are using to query Warehouse. Then, you need to use a 3-part name reference to access your Lakehouse table. The 3-part name reference consists of the Lakehouse name, the dbo schema name and the Lakehouse table name, separated by dots.

For example, if your Lakehouse table is called salesorders and placed in the Lakehouse artifact called MyLakehouse:

No alt text provided for this image
salesorders table in the lakehouse artifact

You can use the following 3-part name to reference the table if you are qureying data in your Warehouse: MyLakehouse.dbo.salesorders. On the picture below you can see how to get first 10 rows from the lakehouse table while you are in the Warehouse context:

No alt text provided for this image
You can query lakehouse table from the warehouse context using 3-part-name reference

You can use this 3-part name reference in any SQL statement that you would use to query Warehouse tables, such as SELECT, JOIN, GROUP BY, etc. For example, if you want to join your Warehouse customers table with your Lakehouse salesorders table and get the total amount of orders per customer, you can use the following T-SQL statement:

SELECT c.customer_id, c.name, SUM(o.amount) AS total_amount
FROM customers c
     JOIN MyLakehouse.dbo.salesorders o
          ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_amount DESC;        

Cross-database querying in the Fabric workspaces enables you to work with the tables in Lakehoues and Warehouse like they are placed in the same database.

Conclusion

Fabric workspace enables you to share data between Lakehouse and Warehouse without copying data. You can query Lakehouse data from Warehouse using 3-part name references and SQL statements. This way, you can leverage the advantages of both services and perform complex data analysis with ease and efficiency.

Ankit Kotak

Lead Software Engineer at Microsoft

1 年

Can I query lakehouse data from Lakehouse only without going to Warehouse ?

回复
Noah Kluge

VP Strategic Software and Technology at PROtect LLC

1 年

3-part name worked for me from the web UI and also SSMS. Thanks for the post.

回复
Mou Rakshit

Data, Analytics and AI Solutions Architecture | MS, MHSA | Data, AI, and Analytics Leader

1 年

Lakehouse, warehouse, onelake, too many technical jargons of data houses nowadays :). Jokes apart, Thanks for the Key point that data sharing is easy in fabric just like one drive! :)

要查看或添加评论,请登录

Jovan Popovic的更多文章

社区洞察

其他会员也浏览了