Querying Lakehouse data from Warehouse in Microsoft Fabric
Jovan Popovic
Principal Program Manager at Microsoft, working on Microsoft Fabric Warehouse. Worked on Azure Synapse, Azure SQL Azure SQL Managed Instance, and SQL Server.
Microsoft Fabric is a cloud data platform that provides both Lakehouse and Warehouse artifacts for data storage and analysis. Lakehouse enables you to manage your data with Spark notebooks, while Warehouse enables you to analyze and manage data using T-SQL queries.
If you have a table in Fabric Lakehouse and you need to query it from the Fabric Warehouse, you can do it without copying data from Lakehouse to Warehouse.
How do you query Lakehouse table from Warehouse?
Microsoft Fabric enables you to easily cross reference Lakehouse and Warehouse data. To query Lakehouse data from Warehouse, you need to have both artifacts in the same Fabric workspace.
Fabric workspaces have a cross-database query capability that enables you to query Lakehouse data from your Warehouse using T-SQL. You do not need to use COPY INTO to load data into Warehouse if your data is already in the same workspace. If your data is in a table in Lakehouse and in Delta Lake format, you can query it directly from your Warehouse.
To query Lakehouse data from Warehouse, you need to open a SQL editor that you are using to query Warehouse. Then, you need to use a 3-part name reference to access your Lakehouse table. The 3-part name reference consists of the Lakehouse name, the dbo schema name and the Lakehouse table name, separated by dots.
For example, if your Lakehouse table is called salesorders and placed in the Lakehouse artifact called MyLakehouse:
领英推荐
You can use the following 3-part name to reference the table if you are qureying data in your Warehouse: MyLakehouse.dbo.salesorders. On the picture below you can see how to get first 10 rows from the lakehouse table while you are in the Warehouse context:
You can use this 3-part name reference in any SQL statement that you would use to query Warehouse tables, such as SELECT, JOIN, GROUP BY, etc. For example, if you want to join your Warehouse customers table with your Lakehouse salesorders table and get the total amount of orders per customer, you can use the following T-SQL statement:
SELECT c.customer_id, c.name, SUM(o.amount) AS total_amount
FROM customers c
JOIN MyLakehouse.dbo.salesorders o
ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_amount DESC;
Cross-database querying in the Fabric workspaces enables you to work with the tables in Lakehoues and Warehouse like they are placed in the same database.
Conclusion
Fabric workspace enables you to share data between Lakehouse and Warehouse without copying data. You can query Lakehouse data from Warehouse using 3-part name references and SQL statements. This way, you can leverage the advantages of both services and perform complex data analysis with ease and efficiency.
Lead Software Engineer at Microsoft
1 年Can I query lakehouse data from Lakehouse only without going to Warehouse ?
VP Strategic Software and Technology at PROtect LLC
1 年3-part name worked for me from the web UI and also SSMS. Thanks for the post.
Data, Analytics and AI Solutions Architecture | MS, MHSA | Data, AI, and Analytics Leader
1 年Lakehouse, warehouse, onelake, too many technical jargons of data houses nowadays :). Jokes apart, Thanks for the Key point that data sharing is easy in fabric just like one drive! :)