Learn How to Query S3 Table Buckets (Managed Iceberg) with Trino | Hands-on Labs
This hands-on lab shows how to query Amazon S3 table buckets (managed Iceberg) with Trino. It covers creating a table bucket, writing data with PyIceberg, setting up a Trino environment with Docker Compose, and running queries against the Iceberg tables. By the end, you will be able to use Trino's distributed SQL query engine to analyze Iceberg data stored in S3 as part of your data lake.
Step 1: Create Table Buckets
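A minimal sketch of this step in Python, assuming your boto3 version ships the `s3tables` client and your credentials are allowed to create table buckets. The helper name `create_bucket_and_namespace`, the bucket name `my-table-bucket`, and the namespace `demo_ns` are hypothetical and chosen only for illustration; the same step can also be done from the AWS console or CLI.

```python
def create_bucket_and_namespace(client, bucket_name, namespace):
    """Create an S3 table bucket and one namespace in it; return the bucket ARN.

    `client` is assumed to be a boto3 "s3tables" client.
    """
    # create_table_bucket returns the ARN of the new table bucket.
    response = client.create_table_bucket(name=bucket_name)
    bucket_arn = response["arn"]
    # A namespace is passed as a list of name parts.
    client.create_namespace(tableBucketARN=bucket_arn, namespace=[namespace])
    return bucket_arn


# Example usage (needs AWS credentials; names and region are placeholders):
#   import boto3
#   s3tables = boto3.client("s3tables", region_name="us-east-1")
#   print(create_bucket_and_namespace(s3tables, "my-table-bucket", "demo_ns"))
```

Keep the returned ARN: both PyIceberg and Trino need it to locate the table bucket.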
Step 2: Write Data Using PyIceberg
Python Script to Write Data
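The script could look roughly like the sketch below. It assumes PyIceberg talks to the table bucket through the S3 Tables Iceberg REST endpoint with SigV4 signing, that `pyiceberg` and `pyarrow` are installed, and that the namespace already exists. The function names, the sample columns, and the table identifier are hypothetical, not the lab's actual script.

```python
def s3tables_catalog_properties(table_bucket_arn, region):
    """Build PyIceberg REST catalog properties for an S3 table bucket."""
    return {
        "type": "rest",
        "warehouse": table_bucket_arn,
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        # Requests to the endpoint are signed with SigV4 for the s3tables service.
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


def write_sample_rows(table_bucket_arn, region, namespace, table_name):
    """Append three sample rows to an Iceberg table, creating it if needed."""
    # Imported lazily so the properties helper works without these packages.
    import pyarrow as pa
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog(
        "s3tables", **s3tables_catalog_properties(table_bucket_arn, region)
    )
    rows = pa.table(
        {"id": pa.array([1, 2, 3], type=pa.int64()), "name": ["a", "b", "c"]}
    )
    # The namespace is assumed to exist already (illustrative identifier).
    table = catalog.create_table_if_not_exists(
        f"{namespace}.{table_name}", schema=rows.schema
    )
    table.append(rows)
    return table
```

Calling `write_sample_rows(arn, "us-east-1", "demo_ns", "demo_table")` with your own table bucket ARN would create the table on first run and append the same three rows on every run after that.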
Step 3: Set Up Trino to Query Iceberg Tables
We will use Trino to query our Iceberg tables.
Docker Compose Setup for Trino
Configure Iceberg Properties
Create a trino/etc/catalog/iceberg.properties file:
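As a hedged illustration, the sketch below renders the core of such a file from Python. Only the four properties shown are ones Trino's Iceberg connector is known to use for a REST catalog; the SigV4 signing and S3 file-system properties differ between Trino releases, so they are passed in through the hypothetical `extra` argument and should be taken from the documentation for your Trino version. The function name and argument names are assumptions made for this example.

```python
def render_iceberg_properties(rest_uri, warehouse, extra=None):
    """Return the text of a Trino Iceberg catalog file that uses a REST catalog."""
    properties = {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "rest",
        "iceberg.rest-catalog.uri": rest_uri,
        "iceberg.rest-catalog.warehouse": warehouse,
    }
    # Signing and S3 file-system options vary by Trino release, so the caller
    # supplies them instead of this sketch hard-coding guesses.
    properties.update(extra or {})
    return "\n".join(f"{key}={value}" for key, value in properties.items()) + "\n"
```

Writing the returned text to `trino/etc/catalog/iceberg.properties` makes the catalog available in Trino under the name `iceberg`, because Trino names a catalog after its properties file.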
Start Trino
Start Trino from the directory that contains the Docker Compose file, for example with `docker compose up -d`.
Step 4: Query Iceberg Tables with Trino
We can now use a Jupyter notebook to query the Iceberg table with Trino.
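A notebook cell for this step might look like the sketch below, assuming the `trino` Python client is installed (`pip install trino`) and the Trino container is reachable on `localhost:8080`. The host, port, user, and the table name `demo_ns.demo_table` are placeholders, and the helper names are hypothetical.

```python
def fetch_rows(connection, sql):
    """Run one SQL statement on a DB-API connection and return all rows."""
    cursor = connection.cursor()
    cursor.execute(sql)
    return cursor.fetchall()


def connect_to_trino(host="localhost", port=8080, user="admin", catalog="iceberg"):
    """Open a DB-API connection to Trino using the `trino` Python client."""
    # Imported lazily; only needed when talking to a real Trino server.
    from trino.dbapi import connect

    return connect(host=host, port=port, user=user, catalog=catalog)


# Example usage in a notebook cell (table name is a placeholder):
#   conn = connect_to_trino()
#   for row in fetch_rows(conn, "SELECT * FROM demo_ns.demo_table LIMIT 10"):
#       print(row)
```

Keeping `fetch_rows` separate from the connection setup means the same helper works with any DB-API connection, which makes it easy to try the query logic without a running cluster.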
Output
Conclusion
In this hands-on lab, we created an S3 table bucket, wrote data to an Iceberg table with PyIceberg, set up Trino, and queried the table. This setup provides SQL-based querying of managed Iceberg tables stored in S3.