Learn How to Query S3Table Buckets (Managed Iceberg) with Trino | Hands-on Labs


This hands-on lab shows how to query Amazon S3 table buckets (managed Iceberg) with Trino. It covers creating a table bucket, writing data with PyIceberg, setting up a Trino environment, and running queries against the resulting Iceberg tables. By the end, you will have used Trino's distributed SQL query engine to analyze Iceberg data stored in S3, which is the basis for SQL analytics on a data lake.


Step 1: Create Table Buckets
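A minimal sketch of this step, assuming boto3 with its `s3tables` client and AWS credentials already configured. The function names, the bucket name, and the region are illustrative placeholders, not values taken from the author's repository.

```python
def make_s3tables_client(region):
    """Build a boto3 client for the S3 Tables API (assumes boto3 is installed)."""
    import boto3  # imported lazily so the helper below can be used without boto3

    return boto3.client("s3tables", region_name=region)


def create_table_bucket(client, name):
    """Create a table bucket and return its ARN.

    `client` is expected to behave like boto3's `s3tables` client, whose
    `create_table_bucket` call returns a dict containing the key "arn".
    """
    response = client.create_table_bucket(name=name)
    return response["arn"]
```

Calling `create_table_bucket(make_s3tables_client("us-east-1"), "my-table-bucket")` would return the bucket ARN, which the later steps use as the Iceberg warehouse identifier.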



Step 2: Write Data Using PyIceberg

Python Script to Write Data

https://github.com/soumilshah1995/s3tablebuckets-trino/blob/main/create_table_buckets.py
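The linked script is the authoritative version. As a hedged sketch of the general approach, the code below assumes PyIceberg talks to the table bucket through the S3 Tables Iceberg REST endpoint with SigV4 signing, and that `pyiceberg` and `pyarrow` are installed. The helper names, namespace, table name, and sample rows are hypothetical.

```python
def s3tables_catalog_properties(table_bucket_arn, region):
    """Catalog properties for PyIceberg's REST catalog pointed at S3 Tables.

    Assumption: the table bucket is reached via the regional S3 Tables
    Iceberg REST endpoint, with requests signed for the "s3tables" service.
    """
    return {
        "type": "rest",
        "warehouse": table_bucket_arn,
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


def write_rows(table_bucket_arn, region, namespace, table_name, rows):
    """Create namespace.table_name and append `rows` (a list of dicts)."""
    import pyarrow as pa
    from pyiceberg.catalog import load_catalog
    from pyiceberg.exceptions import NamespaceAlreadyExistsError

    catalog = load_catalog(
        "s3tables", **s3tables_catalog_properties(table_bucket_arn, region)
    )
    try:
        catalog.create_namespace(namespace)
    except NamespaceAlreadyExistsError:
        pass  # the namespace is already there, which is fine for a rerun

    data = pa.Table.from_pylist(rows)
    # create_table raises if the table already exists; this sketch does not handle reruns
    table = catalog.create_table(f"{namespace}.{table_name}", schema=data.schema)
    table.append(data)
    return table
```

For example, `write_rows(bucket_arn, "us-east-1", "demo", "orders", [{"id": 1, "item": "book"}])` would create a two-column table and append one row, where `bucket_arn` is the ARN returned when the table bucket was created.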


Step 3: Set Up Trino to Query Iceberg Tables

We will use Trino to query our Iceberg tables.

Docker Compose Setup for Trino

Configure Iceberg Properties

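Create a trino/etc/catalog/iceberg.properties file to register the Iceberg catalog with Trino. Trino derives the catalog name from the file name, so this catalog is called iceberg.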
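As a rough illustration of what such a file holds, the sketch below renders only the core REST-catalog keys of Trino's Iceberg connector. It assumes the catalog points at the S3 Tables Iceberg REST endpoint. It deliberately omits the SigV4 signing and S3 file-system settings that a working setup also needs; take those from the author's repository or the Trino documentation for your Trino release. The helper names are hypothetical.

```python
from pathlib import Path


def core_catalog_properties(table_bucket_arn, region):
    """Core Iceberg REST catalog keys only; signing and S3 settings are omitted."""
    return {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "rest",
        "iceberg.rest-catalog.uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "iceberg.rest-catalog.warehouse": table_bucket_arn,
    }


def render_properties(properties):
    """Render a dict as key=value lines, one property per line."""
    return "".join(f"{key}={value}\n" for key, value in properties.items())


def write_catalog_file(path, properties):
    """Write the rendered properties to `path`, creating parent directories."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_properties(properties))
```

Calling `write_catalog_file("trino/etc/catalog/iceberg.properties", core_catalog_properties(bucket_arn, "us-east-1"))` produces a starting point to which the remaining settings can be added by hand.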

Start Trino

Start the Trino container from the directory that holds the Docker Compose file, for example with `docker compose up -d`.

Step 4: Query Iceberg Tables with Trino

We can now use Jupyter Notebook to query the Iceberg table with Trino.
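A minimal notebook-style sketch, assuming the `trino` Python client is installed and Trino is reachable on localhost at port 8080 (Trino's default HTTP port). The user name, the helper names, and the namespace and table in the usage note are placeholders. The catalog name `iceberg` follows from the iceberg.properties file name.

```python
def connect_to_trino(host="localhost", port=8080, user="admin",
                     catalog="iceberg", schema=None):
    """Open a DB-API connection to Trino (host, port, and user are assumptions)."""
    from trino.dbapi import connect  # imported lazily; requires the trino package

    return connect(host=host, port=port, user=user, catalog=catalog, schema=schema)


def run_query(connection, sql):
    """Run `sql` on any DB-API connection and return (column_names, rows)."""
    cursor = connection.cursor()
    cursor.execute(sql)
    rows = cursor.fetchall()
    # read the column names after fetching, once the result metadata is populated
    columns = [column[0] for column in cursor.description]
    return columns, rows
```

In a notebook cell, `run_query(connect_to_trino(schema="demo"), "SELECT * FROM orders LIMIT 10")` would return the column names and up to ten rows, assuming a `demo.orders` table exists in the table bucket.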

Output

Code: https://github.com/soumilshah1995/s3tablebuckets-trino/blob/main/README.md

Conclusion

In this hands-on lab, we demonstrated how to create an Iceberg table in Amazon S3, write data using PyIceberg, set up Trino, and query the table. This setup enables scalable, SQL-based querying of managed Iceberg tables stored in S3.

Read Blog on Medium

https://medium.com/@shahsoumil519/learn-how-to-query-s3table-buckets-managed-iceberg-with-trino-hands-on-labs-20fe55f850a8


Arif R.

Data Engineering, Cloud migration and infrastructure as code

1 day ago

I was just thinking of this use case the other day. Thanks for the lab.

Aritra Gupta

Product at Amazon

1 day ago

Awesome Soumil S.! The configs worked :)
