AMAZON SQS USE CASE
AMAZON SQS:
Amazon Simple Queue Service (SQS) is a managed message queue service offered by Amazon Web Services (AWS). It provides an HTTP API over which applications can submit items into and read items out of a queue. The queue itself is fully managed by AWS, which makes SQS an easy solution for passing messages between different parts of software systems that run in the cloud.
HOW AMAZON SQS WORKS:
Architecture:
There are three main parts in a distributed messaging system: the components of your distributed system, your queue (distributed on Amazon SQS servers), and the messages in the queue.
In the following scenario, your system has several producers (components that send messages to the queue) and consumers (components that receive messages from the queue). The queue (which holds messages A through E) redundantly stores the messages across multiple Amazon SQS servers.
Message lifecycle
The following scenario describes the lifecycle of an Amazon SQS message in a queue, from creation to deletion.
A producer (component 1) sends message A to a queue, and the message is distributed across the Amazon SQS servers redundantly.
When a consumer (component 2) is ready to process messages, it consumes messages from the queue, and message A is returned. While message A is being processed, it remains in the queue and isn't returned to subsequent receive requests for the duration of the visibility timeout.
The consumer (component 2) deletes message A from the queue to prevent the message from being received and processed again when the visibility timeout expires.
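The lifecycle above can be illustrated with a small in-memory model. This is only a sketch of the visibility-timeout idea, not the SQS API: the class name ToyQueue, its methods, and the injectable clock are hypothetical, and deletion is keyed by message ID for simplicity, whereas SQS itself uses a receipt handle.

```python
import time
import uuid


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock
        # message id -> [body, time at which the message becomes visible again]
        self._messages = {}

    def send(self, body):
        """Store a message; it is visible to consumers immediately."""
        message_id = str(uuid.uuid4())
        self._messages[message_id] = [body, float("-inf")]
        return message_id

    def receive(self):
        """Return (message_id, body) for one visible message, or None."""
        now = self.clock()
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                # The message stays in the queue but is hidden for the timeout.
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        """Remove a processed message so it is never returned again."""
        self._messages.pop(message_id, None)
```

A consumer that receives message A and never deletes it would see A returned again once the timeout has passed, which is why the final delete step matters.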
How does SQS integrate with other AWS services?
Most interesting for Serverless developers is SQS's integration with AWS Lambda: an SQS queue can act as a Lambda event source. When this is configured, Lambda polls the queue and invokes your function with a batch of SQS messages to process.
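A Lambda function that consumes SQS messages might look roughly like the sketch below. The process function and the "task" field are hypothetical placeholders for your own logic. Returning batchItemFailures only has an effect if partial batch responses (ReportBatchItemFailures) are enabled on the event source mapping; that configuration is assumed here.

```python
import json


def process(payload):
    """Hypothetical business logic; raises on payloads it cannot handle."""
    if "task" not in payload:
        raise ValueError("missing 'task'")


def handler(event, context):
    """Entry point that Lambda invokes with a batch of SQS messages."""
    failures = []
    for record in event.get("Records", []):
        try:
            # Each record carries the message text in its "body" field.
            process(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed, so the successfully
            # processed messages in the batch are not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```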
Another useful integration is with Amazon SNS: an SQS queue can subscribe to an SNS topic, so that messages published to the topic are delivered to the queue.
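Subscribing a queue to a topic could be sketched as follows with the AWS SDK for Python. The helper name subscribe_queue_to_topic is hypothetical, sns is assumed to be a boto3 SNS client (for example boto3.client("sns")), and the ARNs are supplied by the caller. The queue also needs an access policy that allows the topic to send messages to it, which is not shown.

```python
def subscribe_queue_to_topic(sns, topic_arn, queue_arn, raw_delivery=True):
    """Subscribe an SQS queue to an SNS topic and return the subscription ARN."""
    response = sns.subscribe(
        TopicArn=topic_arn,
        Protocol="sqs",          # deliver topic messages to an SQS queue
        Endpoint=queue_arn,      # for the sqs protocol the endpoint is the queue ARN
        # Raw delivery passes the published text through without the SNS JSON envelope.
        Attributes={"RawMessageDelivery": "true" if raw_delivery else "false"},
        ReturnSubscriptionArn=True,
    )
    return response["SubscriptionArn"]
```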
SQS also provides standard integrations for monitoring and debugging SQS queues using Amazon CloudWatch and AWS X-Ray.
Serverless developers can manually integrate an SQS queue with any other AWS service (or a third-party service) by writing code that uses the AWS SDK to submit messages to SQS and read them from there, or by using the SQS API directly.
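As a rough illustration of the SDK route, the sketch below wraps the send_message, receive_message, and delete_message calls of a boto3 SQS client. The helper names send_text and drain_once are hypothetical, and the client is passed in as a parameter so the functions stay independent of how credentials and regions are configured.

```python
def send_text(sqs, queue_url, text):
    """Submit one message and return the MessageId assigned by SQS."""
    response = sqs.send_message(QueueUrl=queue_url, MessageBody=text)
    return response["MessageId"]


def drain_once(sqs, queue_url, handle, wait_seconds=10):
    """Receive up to 10 messages, call handle(body) on each, then delete it.

    If handle raises, the exception propagates and the message is not
    deleted, so it becomes visible again after the visibility timeout.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # largest batch a single call can return
        WaitTimeSeconds=wait_seconds,  # long polling instead of returning at once
    )
    processed = 0
    # The "Messages" key is absent when the queue had nothing to return.
    for message in response.get("Messages", []):
        handle(message["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        processed += 1
    return processed
```

With boto3 installed and credentials configured, sqs would typically be boto3.client("sqs"), and queue_url would be the URL of an existing queue.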
The benefits of using SQS:
For Serverless developers, using SQS generally provides a wealth of benefits, which you can read about below.
Scalability: Your SQS queues scale to the volume of messages you’re writing and reading. You don’t need to scale the queues yourself; AWS takes care of all scaling and performance-at-scale concerns.
Pay for what you use: When using SQS, you are charged only for the messages you read and write. There aren’t any recurring or base fees.
Ease of setup: Since SQS is a managed service, you don’t need to set up any infrastructure to start using it. You can simply use the API to read and write messages, or use the SQS-to-Lambda integration.
Options for Standard and FIFO queues: When creating an SQS queue, you can choose between a standard queue and a FIFO queue out of the box. Standard queues offer at-least-once delivery with best-effort ordering, while FIFO queues preserve the order in which messages are sent, so the two types suit different purposes.
Automatic deduplication for FIFO queues: Deduplication is important when using queues, and for FIFO queues SQS will do the work of removing duplicate messages for you. This makes FIFO queues on SQS suitable for tasks where it’s critical to have each task done exactly once.
A separate queue for unprocessed messages: This feature of SQS is useful for debugging. Once a "dead-letter" queue is configured for a source queue, messages that repeatedly fail to be processed are moved into it, where you can inspect them. The dead-letter queue has all the usual integrations enabled, so you can, for example, attach an AWS Lambda trigger to it to send a notification when an item can’t be processed.
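The FIFO, deduplication, and dead-letter options are set through queue attributes. The sketch below only builds the values that would be passed to a boto3 SQS client's create_queue call; the helper names, the example queue name, and the default of 5 receive attempts are illustrative assumptions.

```python
import json


def fifo_queue_params(base_name, content_based_dedup=True):
    """Build keyword arguments for create_queue for a FIFO queue."""
    return {
        "QueueName": base_name + ".fifo",  # FIFO queue names must end in ".fifo"
        "Attributes": {
            # Attribute values are passed as strings.
            "FifoQueue": "true",
            "ContentBasedDeduplication": "true" if content_based_dedup else "false",
        },
    }


def redrive_policy(dead_letter_queue_arn, max_receive_count=5):
    """Build the RedrivePolicy attribute value pointing at a dead-letter queue."""
    return json.dumps({
        "deadLetterTargetArn": dead_letter_queue_arn,
        # Number of receives after which a message is moved to the dead-letter queue.
        "maxReceiveCount": max_receive_count,
    })
```

For example, sqs.create_queue(**fifo_queue_params("orders")) would request a FIFO queue named orders.fifo, and a source queue could be given Attributes={"RedrivePolicy": redrive_policy(dlq_arn)}, where dlq_arn is the ARN of an existing queue.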
Disadvantages of using SQS:
Using SQS can also create challenges for Serverless developers, as described below.
High cost at scale: With pay-per-use pricing, if the number of messages you send is very high, your SQS bill can be quite significant. Part of SQS pricing is data transfer charges, and those can add up if you send larger messages, or if you process messages from outside the AWS Region in which the queue is located. In some cases, when running at scale with millions of messages processed every day, the cost of using SQS might be higher than the cost of operating your own queue system, even including the overhead of managing your own solution.
Lack of support for broadcast messages: SQS is designed for each message to be processed by a single consumer and then deleted. It doesn’t provide a way for multiple entities to each retrieve the same message, which makes SQS a poor fit for one-to-many broadcasts.
Reduced control over performance: When running a message queue system at scale, you may well end up wanting to fine-tune its performance to suit your needs. With SQS this isn’t an option: the service is fully managed, and you don’t get to look under the hood.
Oyster Case Study:
New York-based Oyster.com shares unvarnished reviews of hotels in nearly 200 destinations worldwide. The company's own investigators visit each location to assess cleanliness, amenities, service and overall quality. What sets Oyster apart from similar sites is its extensive collection of photographs. Oyster takes hundreds of photos for each property, and every review includes dozens of untouched images (submitted by guests as well as investigators) that allow potential visitors to compare a hotel’s marketing material with reality.
"It took less time to rewrite the code and do a full processing job with AWS than it took to do a single run with the old method."
Eytan Seidman,
Cofounder and Vice President of Product, Oyster
The Challenge
Since its 2009 launch, Oyster has published more than one million high-quality digital images. When this massive volume of images became too cumbersome to handle in-house, the company decided to offload the content to a central repository on Amazon Simple Storage Service (Amazon S3). “We migrated to Amazon S3 in 2010,” says Eytan Seidman, Co-Founder and Vice President of Product. “We chose moving to the cloud and Amazon S3 because storing images in our data center would have been too costly. Amazon S3 was a more economical solution.”
Oyster reprocesses its entire collection of photographic images a few times each year to update the copyright year and, if necessary, to change the watermarks. Using its previous solution, reprocessing the entire collection of photographs required about 400 hours to complete. In addition, Oyster often recreated existing images in new formats and sizes for mobile and tablet devices. Resizing existing images and adding new ones was slowing down the rate at which the company was able to process the collection. “Our processes were slowing down,” says Seidman. “When the iPad with Retina display came out, for example, it took us more than a week to create new sizes specifically for that resolution.” Oyster considered purchasing additional hardware, but found the cost of new hardware and routine maintenance was too high, especially when the machines would sit idle most of the time.
Moreover, there were numerous software bugs in the multiprocessing solution that the company used, but since the solution didn’t scale, Oyster didn’t bother to fix them.
Why Amazon Web Services
"We were already using Amazon S3 to store the images, so using Amazon Elastic Compute Cloud (Amazon EC2) to process the images was a natural choice,” Seidman says. Chris McBride, a software engineer at Oyster, adds, “We wanted a cloud environment that could be ramped up for the large processing jobs and downsized for the smaller daily jobs.”
While the company is still running one local server, the bulk of the processing work now takes place on the AWS Cloud. Oyster is using a customized Amazon Linux AMI within Amazon EC2. Within this new environment, the company connects to Amazon S3 and Amazon Simple Queue Service (Amazon SQS) using boto, a Python interface to AWS. The images themselves are processed with the ImageMagick software available in the AMI package.
Oyster uses Amazon EC2 instances and Amazon SQS in an integrated workflow to generate the sizes they need for each photo. The team processes a few thousand photos each night, using Amazon EC2 Spot Instances. When Oyster processes the entire collection, it can use up to 100 Amazon EC2 instances. The team uses Amazon SQS to communicate the photos that need to be processed and the status of the jobs.
The Benefits
Oyster's old system needed approximately 400 hours to process one million photos. By using AWS, the company can process the same number of photos in about 20 hours—a 95 percent improvement. "It took less time to rewrite the code and do a full processing job with AWS than it took to do a single run with the old method," says Seidman. “It used to take close to a week to produce photos specifically for the iPad. With AWS, we can create the photos in just a few hours. The documentation is straightforward and the dashboards are incredibly helpful.”
Oyster has also been able to reduce in-house hardware expenses by repurposing two of its old servers, which were sitting idle more than 80 percent of the time. “We estimate that we saved roughly $10,000 in capital expenditures by moving to AWS, and reduced our operating expenses by an additional $10,000,” Seidman says. He believes that AWS is a perfect match for any company performing similar batch processing. “AWS lets us move faster without worrying about machine expenditures or maintenance, which frees us to focus on other things.”
THANK YOU FOR READING!