AWS Case Study 5 - Image recognition powered by Serverless AI

I have a confession to make.

I always wanted to create an app that would use artificial intelligence to perform fancy technological tasks.

What held me back was my lack of expertise in AI/ML algorithms.

Now in 2021, you and I don't need to know anything about AI anymore to use AI in our apps.

All we need to know is which AI/ML cloud service to outsource the work to and how to integrate it with our application.

Amazon Rekognition - image recognition @AWS

Amazon Rekognition is one of several AWS AI services that you can use to add AI capabilities to your applications.

Amazon Rekognition makes it easy for you to add image & video analysis to your app using deep learning.

With it, you can identify objects, people, text, and activities in both images and videos, as well as detect any inappropriate content.

It provides highly accurate facial analysis and facial search capabilities that you can use to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases.

What is more, with Amazon Rekognition Custom Labels you can build your own recognition models to satisfy your custom image recognition business needs.

It takes care of the heavy lifting of model development for you, meaning no machine learning experience is required on your side.

You simply need to supply images of objects you want to identify and the service is going to handle the rest.

PhotoDetective - photo analysis tool powered by AWS

PhotoDetective is a tiny web app I created that runs on a 100% serverless AWS architecture and uses Amazon Rekognition to let its users analyze their photos.

When a photo is uploaded, PhotoDetective analyzes it to identify the objects and faces in it, and also extracts its EXIF data to provide additional details.

PhotoDetective Serverless Architecture

The following diagram shows which AWS services are behind PhotoDetective:

AWS Serverless Architecture - PhotoDetective

As depicted in the architecture,

  • each uploaded photo goes from the user's web browser to Amazon CloudFront,
  • then to Amazon API Gateway,
  • and then to the AWS Lambda function, which passes the photo to Amazon Rekognition and receives the image recognition metadata in return,
  • after this processing, Lambda generates HTML code containing the results of the photo analysis and sends it back through Amazon API Gateway to Amazon CloudFront, which delivers it to the user's web browser.
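The request flow above can be sketched as a single Lambda handler. This is a minimal illustration in Python rather than the Node.js code the app actually uses. It assumes the photo arrives as the raw, base64-encoded request body of an API Gateway proxy integration; the `render_results_html` helper and the optional `rekognition` parameter (there so the handler can be exercised with a stand-in client) are hypothetical.

```python
import base64
import html


def render_results_html(labels, face_count):
    """Build a small HTML fragment from Rekognition-style results."""
    items = "".join(
        f"<li>{html.escape(label['Name'])} ({label['Confidence']:.1f}%)</li>"
        for label in labels
    )
    return f"<h1>Photo analysis</h1><p>Faces found: {face_count}</p><ul>{items}</ul>"


def handler(event, context, rekognition=None):
    """Lambda entry point behind Amazon API Gateway (proxy integration)."""
    if rekognition is None:
        import boto3  # imported lazily so the sketch can be tried without AWS access
        rekognition = boto3.client("rekognition")

    # Assumption: the request body is the photo itself, base64-encoded by API Gateway.
    body = event["body"]
    photo = base64.b64decode(body) if event.get("isBase64Encoded") else body.encode()

    # Label detection and face detection are two separate Rekognition calls.
    labels = rekognition.detect_labels(Image={"Bytes": photo}, MaxLabels=10)["Labels"]
    faces = rekognition.detect_faces(Image={"Bytes": photo})["FaceDetails"]

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/html; charset=utf-8"},
        "body": render_results_html(labels, len(faces)),
    }
```

Keeping the HTML rendering in its own function makes it easy to check the output format without calling AWS at all.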

Do you recommend using this architecture in production?

The architecture, as it is designed now, is an example and proof of concept that demonstrates how easily image recognition use cases can be handled by AWS serverless services.

The entire PhotoDetective solution (architecture + Lambda function code + configuration settings) was created in just one weekend and as such was not tuned for production use (in terms of scalability, performance and security).

Which architectural changes need to be made to make this architecture production ready?

1. Decoupling of heavyweight processing

Due to a limitation of Amazon API Gateway, the entire processing of an uploaded photo (described above) must finish within 29 seconds. Otherwise the request to the Amazon API Gateway REST endpoint times out (at the time of writing, this limit cannot be raised).

Because photos are usually quite large files and not all of our users have a fast internet connection (some of them may be on a mobile phone in a location not covered by 4G/5G), it is very likely that some of them will see the application fail because this limit is exceeded.

This problem can be solved by decoupling the user's photo upload request from the follow-up image recognition processing.

One way to do that is to add two more AWS services to our architecture - Amazon S3 & Amazon DynamoDB.

S3 is built to store objects as large as 5 TB (five terabytes), so uploading a few megabytes to S3 is a walk in the park.

Once the upload of a photo to S3 is complete, S3 can fire an S3 event notification and asynchronously trigger an AWS Lambda function with the details of the newly uploaded photo.

Since Amazon API Gateway is not involved here, the 29-second limit does not apply (instead we now have 900 seconds /15 minutes/, which is the maximum duration of a Lambda function execution).

After the Lambda function processes the photo, it saves the results in an Amazon DynamoDB table.

Our frontend of course needs to account for this asynchronous processing: it periodically calls, via Amazon API Gateway, another AWS Lambda function that tries to fetch the photo analysis results from Amazon DynamoDB.
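A minimal Python sketch of this decoupled design might look as follows. The table name `PhotoAnalysis`, the `photo_key` attribute and query parameter, the `DONE`/`PENDING` status values and the injectable `rekognition`/`table` parameters are all assumptions made for illustration; none of them come from the actual PhotoDetective code.

```python
import json
from urllib.parse import unquote_plus


def photos_from_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    return [
        # Object keys arrive URL-encoded in S3 events (e.g. spaces as '+').
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event.get("Records", [])
    ]


def analyze_upload(event, context, rekognition=None, table=None):
    """Triggered asynchronously by S3; stores results for later polling."""
    if rekognition is None:
        import boto3
        rekognition = boto3.client("rekognition")
    if table is None:
        import boto3
        table = boto3.resource("dynamodb").Table("PhotoAnalysis")  # hypothetical name

    for bucket, key in photos_from_s3_event(event):
        image = {"S3Object": {"Bucket": bucket, "Name": key}}
        labels = rekognition.detect_labels(Image=image, MaxLabels=10)["Labels"]
        # Stored as a JSON string because the boto3 DynamoDB resource
        # does not accept Python floats such as the confidence values.
        table.put_item(
            Item={"photo_key": key, "status": "DONE", "labels": json.dumps(labels)}
        )


def fetch_result(event, context, table=None):
    """Called by the frontend's polling requests through API Gateway."""
    if table is None:
        import boto3
        table = boto3.resource("dynamodb").Table("PhotoAnalysis")  # hypothetical name

    key = event["queryStringParameters"]["photo_key"]
    item = table.get_item(Key={"photo_key": key}).get("Item")
    if item is None:
        # Not processed yet: the frontend should retry after a short delay.
        return {"statusCode": 202, "body": json.dumps({"status": "PENDING"})}
    body = {"status": item["status"], "labels": json.loads(item["labels"])}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Returning a distinct "pending" response lets the frontend tell "not ready yet" apart from a real error and keep polling.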

2. Downsizing of user's photo before it is sent to Amazon Rekognition

Currently, when an uploaded photo arrives at AWS Lambda, Lambda sends it to Amazon Rekognition in its original resolution.

If the photo has more than 2 MPx (>1920x1080 px), which is the optimal resolution recommended by Amazon Rekognition, two unnecessary delays are likely to be introduced:

one from transferring the larger file and the other from the longer processing required on the Amazon Rekognition side.

The solution is to resize the photo in the AWS Lambda function before calling Amazon Rekognition. In Node.js, for example, this can be done with the resize functions of image libraries such as ImageMagick, GraphicsMagick or sharp.
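Whichever image library does the actual resizing, it first needs the target dimensions. The following Python sketch shows only that calculation. It assumes the 1920x1080 limit applies to the long and short side regardless of orientation, which is an interpretation rather than something stated by the service; `target_size` is a hypothetical helper.

```python
def target_size(width, height, long_side=1920, short_side=1080):
    """Return (width, height) scaled down to fit the limit, keeping aspect ratio."""
    if width >= height:
        max_w, max_h = long_side, short_side  # landscape or square
    else:
        max_w, max_h = short_side, long_side  # portrait
    # A scale factor above 1.0 would upscale, which adds data without adding detail.
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 12 MPx landscape photo shrinks; a small photo is left untouched.
print(target_size(4000, 3000))  # (1440, 1080)
print(target_size(800, 600))    # (800, 600)
```

The resulting pair can then be passed to the resize function of the chosen library.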

3. Mechanism to prevent service abuse

Image recognition is a very fancy thing, and chances are that our PhotoDetective service may be misused by individuals and also by other websites. To keep PhotoDetective from becoming "a free public API", the following measures should be implemented:

a) a check of the HTTP Referer request header to terminate requests not originating from the PhotoDetective homepage (can be easily implemented with Lambda@Edge),

b) mandatory user registration and login before performing photo analysis / or making our users pay for the service,

c) limiting the total number of photo analysis requests per user / per IP address / per day,

d) limiting the number of allowed parallel requests from the same IP address (to prevent DoS attacks),

e) deployment of AWS Shield (to prevent Distributed Denial of Service - DDoS attacks).
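Measures a) and c) can be sketched in a few lines of Python. The hostname, the quota of 20 requests and the in-memory counter are assumptions made for illustration. Lambda instances do not share memory, so a real deployment would keep the counters in a shared store such as DynamoDB.

```python
from collections import defaultdict
from datetime import date
from urllib.parse import urlparse

ALLOWED_HOST = "photodetective.example.com"  # hypothetical hostname
DAILY_LIMIT = 20                             # hypothetical per-IP quota


def referer_allowed(headers):
    """Measure a): accept only requests whose Referer points at our own site."""
    # Header names are case-insensitive, so look the header up without regard to case.
    referer = next((v for k, v in headers.items() if k.lower() == "referer"), "")
    return urlparse(referer).hostname == ALLOWED_HOST


class DailyQuota:
    """Measure c): count requests per IP address per calendar day."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, ip, today=None):
        key = (ip, today or date.today())
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

Because a non-browser client can send any Referer value it likes, this check mainly stops other websites from embedding the service, which is why it is combined with the quota and login measures.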

Frequently asked questions

1. What is EXIF data and how is EXIF data extracted from the uploaded photo?

Every time you take a photo with your smartphone or digital camera, your device usually adds some extra metadata to the image file. This includes, for example, the date, time and geographic location at which the photo was taken.

It may also include advanced camera settings such as: Camera model, Aperture, Shutter speed, Focal length, Metering mode, ISO speed, Image orientation and many others.

To extract EXIF data, our AWS Lambda function uses a Node.js library called piexifjs, which resides in the function's subfolder.

As EXIF data can often provide much value to any Photo Detective, I decided to include its extraction in the app.
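The geographic location deserves a short illustration, because EXIF does not store it as a plain decimal number. Latitude and longitude are each stored as three rational values (degrees, minutes, seconds) plus a reference letter (N/S or E/W). The Python sketch below converts such a value to decimal degrees. It assumes each rational is available as a (numerator, denominator) pair, and `gps_to_decimal` is a hypothetical helper rather than part of piexifjs.

```python
def gps_to_decimal(dms, ref):
    """Convert EXIF GPS (degrees, minutes, seconds) rationals to decimal degrees."""
    degrees, minutes, seconds = (num / den for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value


# 48 degrees 8 minutes 30 seconds north
latitude = gps_to_decimal(((48, 1), (8, 1), (30, 1)), "N")
```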

2. I noticed that there are colorful rectangles drawn around faces on the uploaded photo. Who does them and why?

Besides Label Detection (identifying which objects are shown in the photo), the PhotoDetective app also uses Amazon Rekognition for Face Detection.

Amazon Rekognition gives us not just the number of faces in the uploaded photo, but also the coordinates of where these faces are located within the photo. Once this data is collected in AWS Lambda, client-side JavaScript draws the corresponding rectangles using the HTML5 canvas element.
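Rekognition reports each face's `BoundingBox` as ratios of the overall image size (`Left`, `Top`, `Width`, `Height`), so the values have to be converted to pixels before a rectangle can be drawn. The app does this in client-side JavaScript; the Python sketch below shows only the arithmetic, with `box_to_pixels` as a hypothetical helper. The box is clamped to the image because a face that is partly out of frame can produce ratios below 0 or above 1.

```python
def box_to_pixels(bounding_box, image_width, image_height):
    """Convert a ratio-based BoundingBox to (x, y, width, height) in pixels."""
    left = max(0.0, bounding_box["Left"])
    top = max(0.0, bounding_box["Top"])
    right = min(1.0, bounding_box["Left"] + bounding_box["Width"])
    bottom = min(1.0, bounding_box["Top"] + bounding_box["Height"])
    x, y = round(left * image_width), round(top * image_height)
    return x, y, round(right * image_width) - x, round(bottom * image_height) - y


face = {"Left": 0.25, "Top": 0.1, "Width": 0.5, "Height": 0.2}
print(box_to_pixels(face, 800, 600))  # (200, 60, 400, 120)
```

The four resulting numbers match the argument order of the canvas `strokeRect(x, y, width, height)` call.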

3. Is Amazon Rekognition an expensive webservice?

Within the first 12 months after creating your AWS account, you can process up to 5,000 images per month for free.

If you perform both label and face detection, please note that these are counted as individual requests. So for the first year, you can provide 2,500 photo analyses per month using PhotoDetective without being charged.

Outside this Free Tier, the first 1 million images processed per month are charged at $0.001 per image.
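The figures above translate into a simple monthly estimate. This Python sketch uses only the numbers quoted in this answer (5,000 free images, $0.001 per image, two Rekognition requests per analyzed photo) and is valid only within the first pricing tier. `monthly_cost` is a hypothetical helper, and current AWS pricing should be checked before relying on it.

```python
FREE_IMAGES_PER_MONTH = 5000     # Free Tier figure quoted above
PRICE_PER_IMAGE = 0.001          # USD, first 1 million images per month
FIRST_TIER_LIMIT = 1_000_000


def monthly_cost(photos, calls_per_photo=2, free_tier=True):
    """Estimate the monthly Rekognition bill in USD for PhotoDetective."""
    images = photos * calls_per_photo  # label + face detection are billed separately
    if images > FIRST_TIER_LIMIT:
        raise ValueError("estimate only covers the first pricing tier")
    free = FREE_IMAGES_PER_MONTH if free_tier else 0
    billable = max(0, images - free)
    return round(billable * PRICE_PER_IMAGE, 2)


print(monthly_cost(2500))    # 0.0  -> exactly uses up the Free Tier
print(monthly_cost(10000))   # 15.0 -> 20,000 requests, 15,000 of them billable
```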

4. There is an image displayed at the top of the PhotoDetective homepage. In addition, the results page displays the original uploaded photo with rectangles drawn around the faces. Where are these images stored? The web browser needs to download them from somewhere.

As you can see from the architecture above, our app uses no persistent storage, be it a database or a filesystem. All of the images that we display are embedded directly in the HTML code using the data URI scheme.
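Building such a data URI takes only a couple of lines. The Python sketch below is an illustration, with `to_data_uri` as a hypothetical helper; the app itself does this in its Node.js Lambda function.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The returned string goes straight into the generated HTML.
img_tag = f'<img src="{to_data_uri(b"abc", "image/png")}" alt="Uploaded photo">'
```

Base64 encoding makes the embedded image roughly one third larger than the original file, which is one more reason to downsize photos before returning them in the HTML response.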

5. Can you provide full instructions on how to set up and configure the entire PhotoDetective app?

Yes, feel free to download my instructions.

6. Can I contact you in case I have questions or get stuck with the solution implementation?

Don't hesitate and contact me. You can reach me via my LinkedIn profile - I accept all requests to connect and will be glad to discuss AWS architectures with you.
