How Amazon DynamoDB Streams Batch Processing Works
Uriel Bitton
AWS Cloud Engineer | The DynamoDB guy | AWS Certified | I help you supercharge your DynamoDB database
I remember when I first started using DynamoDB streams, the data I processed was a mess.
I didn’t understand how Streams worked and it affected the data on my database.
Here was the problem:
My first project that involved streams was about processing user-uploaded data in Amazon S3. Every time a user uploaded a file, a Lambda function would run some processing on that file, write the file’s metadata to DynamoDB, and then upload a modified version of the file back to S3.
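To make the setup concrete, here is a rough sketch of what that pipeline can look like. The bucket and table names and the “processing” step are placeholders, not the exact code from my project:

```python
import boto3

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("file-metadata")  # hypothetical table name

def handler(event, context):
    # Triggered by S3 "object created" events
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = record["s3"]["object"]["size"]

        # Download the uploaded file (the real project ran a transformation here)
        obj = s3.get_object(Bucket=bucket, Key=key)
        processed_body = obj["Body"].read()

        # Write the file's metadata to DynamoDB
        table.put_item(Item={"fileKey": key, "sizeBytes": size})

        # Upload the modified version to a separate bucket (placeholder name)
        s3.put_object(Bucket="processed-uploads", Key=key, Body=processed_body)
```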
The issue was that these user-uploaded files ranged from a few megabytes to several hundred megabytes, with an upload limit of 1 GB.
That required me to raise the Lambda function’s memory from the default 128 MB to 256 MB (peak memory usage in testing stayed below that).
But in production the Lambda function would often time out, even though it never timed out in my tests, including with 1 GB file uploads.
What was going on?
The Solution: Streams Batch Processing
The answer to my problem lay in DynamoDB Streams and how they work.
When you enable Streams on your DynamoDB table, you have to create a trigger. This trigger is a Lambda function that will be executed when items are added to, modified in, or deleted from your table.
When you add a trigger, you first choose the Lambda function and then you can set a batch size.
Both of these settings live on the console’s “add a trigger” page.
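If you prefer to wire this up outside the console, here is a rough boto3 equivalent; the stream ARN, function name, and batch size below are placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

# Connect the table's stream to the Lambda function and set the batch size.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:dynamodb:us-east-1:123456789012:table/files/stream/2024-01-01T00:00:00.000",
    FunctionName="process-file-metadata",
    StartingPosition="LATEST",  # required for stream event sources
    BatchSize=100,              # max number of stream records per invocation
    Enabled=True,
)
```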
Batch Size
What is batch size, and what does it do?
When items change on your DynamoDB table, a stream record is created for each change, and those records can invoke a Lambda function to perform further downstream processing.
At a batch size of 1, the Lambda function is invoked as soon as a single item appears on the stream.
This is where it gets interesting.
For scalability, you want to use a higher number so that many items can be processed with one Lambda function invocation.
But the more items the function processes per invocation, the longer it takes and the more memory it needs.
So finding the right batch size is key.
Too few items and your function is invoked too often, creating potential bottlenecks; too many items and the function may time out or spend more on memory than it needs to.
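To see what the function actually receives, here is a minimal handler sketch: each invocation gets up to the configured batch size of stream records in event["Records"] (the processing step is a placeholder):

```python
def handler(event, context):
    # One invocation can receive up to BatchSize records from the stream.
    for record in event["Records"]:
        event_name = record["eventName"]                 # INSERT, MODIFY, or REMOVE
        keys = record["dynamodb"]["Keys"]                # the changed item's primary key
        new_image = record["dynamodb"].get("NewImage")   # may be absent, depending on event type and stream view type

        # Downstream processing for each changed item goes here.
        print(event_name, keys, new_image)
```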
As you can guess, the problem in my project was that I had set the batch size to 1,000, and the function was timing out.
The solution was simple: lower the batch size and increase the Lambda function’s memory and timeout settings.
The result was no more timeouts. With smaller batches and the extra memory and timeout headroom, the function could comfortably handle the user-uploaded files.
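In code, that fix boils down to two calls like these; the mapping UUID, batch size, memory, and timeout values are illustrative, not the exact numbers from my project:

```python
import boto3

lambda_client = boto3.client("lambda")

# Lower the batch size on the existing event source mapping
# (placeholder UUID; list_event_source_mappings returns the real one).
lambda_client.update_event_source_mapping(
    UUID="00000000-0000-0000-0000-000000000000",
    BatchSize=100,
)

# Give the function more headroom for each (smaller) batch.
lambda_client.update_function_configuration(
    FunctionName="process-file-metadata",
    MemorySize=512,  # MB
    Timeout=300,     # seconds
)
```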
Conclusion
Understanding how DynamoDB Streams works is essential to using them at scale in production.
By adjusting the batch size and increasing the memory and timeout settings, I was able to optimize the Lambda function for processing DynamoDB Streams.
This allowed me to avoid performance bottlenecks and make sure my Lambda triggers were able to keep up with DynamoDB’s stream data.
My name is Uriel Bitton and I hope you learned something in this edition of Excelling With DynamoDB.
You can share the article with your network to help others learn as well.
If you're looking for help with DynamoDB, let's have a quick chat.
I hope to see you in next week's edition!