DuckDB Now Supports Querying New S3 Table Buckets via Glue IRCC Endpoints

DuckDB continues to push the boundaries of fast, in-process analytics: it can now query the new S3 table buckets through AWS Glue IRCC endpoints, that is, Glue's Iceberg REST Catalog endpoints. This integrates DuckDB with Iceberg tables stored on S3, with Glue serving as the metadata catalog.

Why This Matters

With this feature, DuckDB users can directly interact with Iceberg tables in an AWS Glue-backed catalog without needing to manage metadata manually. This streamlines access to large-scale datasets stored in S3 while benefiting from DuckDB's high-performance querying capabilities.

Setting Up DuckDB with AWS Glue IRCC

To start using this new capability, follow these steps to install the required extensions and configure the connection to AWS Glue IRCC.

1. Install and Load Required Extensions
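
A minimal sketch of this step, assuming the standard DuckDB extension names `aws` (credential handling) and `httpfs` (S3 access). The exact set of extensions the author used is an assumption here; the Iceberg extension is left to step 2, which installs it from the nightly repository.

```sql
-- Assumed extension set; adjust to your environment.
INSTALL aws;     -- AWS credential providers
LOAD aws;
INSTALL httpfs;  -- HTTP(S) and S3 file access
LOAD httpfs;
-- The iceberg extension is installed in step 2 from core_nightly.
```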

These commands ensure that DuckDB is set up to interact with AWS services, including S3 and Glue-based Iceberg catalogs.

2. Force Install the Iceberg Extension from Core Nightly
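
A sketch of what this step might look like, using DuckDB's `FORCE INSTALL ... FROM core_nightly` syntax. If an older build of the Iceberg extension is already loaded in the session, a fresh session may be needed before the nightly build takes effect.

```sql
-- Overwrite any previously installed iceberg build with the nightly one.
FORCE INSTALL iceberg FROM core_nightly;
LOAD iceberg;
```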


This ensures you are using the latest version of the Iceberg extension, which includes Glue IRCC support.

3. Configure Secure Access to AWS Glue
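
One way to set this up, assuming credentials are available through the standard AWS credential chain (environment variables, shared config files, or an instance role). The choice of the `credential_chain` provider is an assumption; a secret with explicit keys would also work.

```sql
-- Assumes the aws extension is loaded and AWS credentials are
-- discoverable locally; nothing is written into the SQL itself.
CREATE SECRET (
    TYPE s3,
    PROVIDER credential_chain
);
```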


Secrets management ensures secure access to AWS resources without exposing credentials in plain text.

4. Attach the Glue-Backed Iceberg Catalog
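
A hedged sketch of the attach command. The alias `my_iceberg_catalog` matches the query in step 5; `<account_id>` and `<table_bucket_name>` are hypothetical placeholders to replace with your own values, and the option names follow the preview syntax of the Iceberg extension, which may change between nightly builds.

```sql
-- <account_id> and <table_bucket_name> are placeholders, not real values.
ATTACH '<account_id>:s3tablescatalog/<table_bucket_name>' AS my_iceberg_catalog (
    TYPE iceberg,
    ENDPOINT_TYPE glue
);
```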


This command connects DuckDB to the Iceberg catalog in AWS Glue, enabling queries against Iceberg tables stored in S3.

5. Query the Iceberg Tables

SHOW ALL TABLES;
SELECT count(*) FROM my_iceberg_catalog.myblognamespace.customers;

After successful attachment, you can list available tables and run SQL queries as you would with any other database.

References & Further Reading

Official GitHub PR

With these latest enhancements, DuckDB continues to expand its reach in the data lake ecosystem, making it easier than ever to perform analytics on S3-backed Iceberg tables via AWS Glue IRCC.

Yadunandan Batchu

Building PYOR, Ex-Coinswitch, Ex-Unocoin, Ex-CommonFloor

3 days ago

does it only work with s3 tables? what about iceberg tables setup via glue catalog?

Sam Ansmink

Software Engineer

1 week ago

Really cool Soumil S.! We also just posted a blogpost here https://duckdb.org/2025/03/14/preview-amazon-s3-tables which focusses on the new S3 Tables Iceberg Rest Catalog endpoint

Brandon Jackson, MBA

Analytics Executive | Solution Architecture | Data Engineering | ML | FP&A

1 week ago

I literally did this two days ago, but running iceberg_scan on the json metadata files in S3. Does this make that and time travel transparent?
