AWS AI Service: Amazon Rekognition
Amazon Rekognition is an AI service that enables users to effortlessly incorporate image and video analysis into their applications. It provides a set of tools for analyzing visual content: image classification, which identifies objects and scenes within an image; object detection, which pinpoints and labels specific items within both images and videos; and text extraction, which identifies and retrieves written content within visual media.
Additionally, Amazon Rekognition offers advanced features like facial recognition, enabling the detection, analysis, and comparison of faces across images and videos. It also supports activity detection in videos, which can identify and track specific actions or movements, making it particularly valuable in surveillance and public safety scenarios. This service is designed to streamline the implementation of complex visual analysis tasks, making it accessible and scalable for a wide range of applications.
It is important to understand the basics of image and video processing in deep learning, particularly convolutional neural networks (CNNs). CNNs are specialized deep learning models designed to recognize patterns in images by breaking them down into smaller parts. They use layers of filters to identify features like edges or textures, gradually combining these to understand more complex aspects of the image, such as identifying objects. CNNs consist of convolutional layers (applying filters) and pooling layers (reducing parameters and spatial size). Initial layers capture low-level features (edges, curves); later layers capture high-level features (object identification).
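To make the convolution-and-pooling idea concrete, here is a minimal numpy sketch (the image, kernel, and sizes are all illustrative). It applies a hand-written vertical-edge filter to a toy image, then max-pools the result, mirroring what a single convolutional layer followed by a pooling layer computes:

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid (no-padding) sliding-window filter, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """2x2 max pooling: keeps the strongest activation in each window,
    reducing the spatial size of the feature map."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    return feature_map[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

# A toy 6x6 "image": dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A vertical-edge filter, similar to what early CNN layers learn.
edge_kernel = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]])

features = convolve2d(image, edge_kernel)  # strong response where the edge is
pooled = max_pool(features)                # smaller map, edge response preserved
```

In a real CNN the filter values are learned from data, and many such filters are stacked in each layer; later layers combine these low-level responses into high-level object features.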
Popular CNN architectures include ResNet and Inception v4 for image classification, and YOLO and SSD for object detection.
Transfer learning involves using a pretrained model, freezing initial layers, and retraining the last few layers on a new dataset. Commonly used in image classification, object detection, and NLP.
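The idea can be sketched in a few lines of numpy. This is a toy illustration, not a real pipeline: the "pretrained backbone" is stood in for by a fixed random projection, and the data and sizes are invented. Only the new output head is updated, exactly as when you freeze a pretrained network's initial layers and retrain the last ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained backbone: a frozen projection "learned" elsewhere.
# In a real setup this would be, e.g., a ResNet with loaded, frozen weights.
W_frozen = rng.normal(size=(16, 8)) / 4.0

def extract_features(x):
    """Frozen layers: forward pass only, no weight updates."""
    return np.maximum(x @ W_frozen, 0.0)  # ReLU activations

# Toy "new task" data: 100 samples, 16 inputs, binary labels (all illustrative).
X = rng.normal(size=(100, 16))
y = (X[:, 0] > 0).astype(float)

# The only trainable part: a fresh output head for the new task.
w, b = np.zeros(8), 0.0
feats = extract_features(X)

losses = []
for _ in range(300):  # retrain just the head with gradient descent
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # sigmoid output
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    w -= 0.1 * feats.T @ (p - y) / len(y)       # gradient step on the head only
    b -= 0.1 * np.mean(p - y)                   # W_frozen is never touched
```

Because the expensive feature extractor is reused as-is, only a small head needs training, which is why transfer learning works well with limited data.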
With Amazon Rekognition, developers can use pretrained models or train custom machine learning models without writing algorithm code or setting up and managing the infrastructure needed to train and deploy a deep learning model.
Amazon Rekognition Custom Labels allows you to train custom models tailored to your specific needs, but this feature is currently limited to image-based tasks only. It does not support custom training for video data or other types of media.
Amazon Rekognition processes both static images and stored videos. Image operations are synchronous, meaning you receive the results immediately. In contrast, video operations are asynchronous. When you request video processing, Amazon Rekognition will notify you upon job completion by publishing a message to an Amazon SNS topic. You will then need to call a Get* API to retrieve the outputs.
Here is a practical example of detecting objects in an image or video.
For object detection in an image, you provide either the location of the image (in JPEG or PNG format) stored in Amazon S3 or the raw image bytes as input.
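A minimal boto3 sketch of the image case might look like the following. The bucket, key, and region are placeholders; the `bounding_boxes` helper is an illustrative utility that works against the response shape shown in the sample output:

```python
def detect_labels_s3(bucket, key, max_labels=10, min_confidence=80.0):
    """Synchronous DetectLabels call on an image stored in Amazon S3."""
    import boto3  # imported here so the offline helper below runs without the SDK
    client = boto3.client("rekognition", region_name="us-east-1")  # region is illustrative
    return client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,          # caps the number of labels returned
        MinConfidence=min_confidence,  # drops low-confidence labels
    )

def bounding_boxes(response, label_name):
    """Pull the bounding boxes for one label out of a DetectLabels response."""
    for label in response.get("Labels", []):
        if label["Name"] == label_name:
            return [inst["BoundingBox"] for inst in label.get("Instances", [])]
    return []

# Offline demo against a hand-written response in the DetectLabels shape.
sample = {"Labels": [{"Name": "Car", "Confidence": 99.15,
                      "Instances": [{"BoundingBox": {"Width": 0.106, "Height": 0.185,
                                                     "Left": 0.004, "Top": 0.504},
                                     "Confidence": 99.15}],
                      "Parents": [{"Name": "Vehicle"}]}]}
boxes = bounding_boxes(sample, "Car")
```

To send image bytes instead of an S3 reference, you would pass `Image={"Bytes": image_bytes}`.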
Sample output may look like this:
{
    "Labels": [
        {
            "Name": "Vehicle",
            "Confidence": 99.15271759033203,
            "Instances": [],
            "Parents": [
                { "Name": "Transportation" }
            ]
        },
        {
            "Name": "Transportation",
            "Confidence": 99.15271759033203,
            "Instances": [],
            "Parents": []
        },
        {
            "Name": "Automobile",
            "Confidence": 99.15271759033203,
            "Instances": [],
            "Parents": [
                { "Name": "Vehicle" },
                { "Name": "Transportation" }
            ]
        },
        {
            "Name": "Car",
            "Confidence": 99.15271759033203,
            "Instances": [
                {
                    "BoundingBox": {
                        "Width": 0.10616336017847061,
                        "Height": 0.18528179824352264,
                        "Left": 0.0037978808395564556,
                        "Top": 0.5039216876029968
                    },
                    "Confidence": 99.15271759033203
                },
                {
                    "BoundingBox": {
                        "Width": 0.2429988533258438,
                        "Height": 0.21577216684818268,
                        "Left": 0.7309805154800415,
                        "Top": 0.5251884460449219
                    },
                    "Confidence": 99.1286392211914
                }
            ],
            "Parents": [
                { "Name": "Vehicle" },
                { "Name": "Transportation" }
            ]
        }
    ],
    "LabelModelVersion": "2.0"
}
By specifying MaxLabels, the number of responses can be limited, and Amazon Rekognition will synchronously return a response displaying the bounding boxes and confidence scores of the various objects detected in the image, as illustrated in the previous example. The confidence score can then be utilized for downstream actions.
In contrast, for a video job, it is not possible to pass in bytes; instead, the location of a video stored in Amazon S3 must be provided. The API used is StartLabelDetection, and it is also necessary to pass in an SNS topic for Amazon Rekognition to send a notification once the video labeling task is completed. The outputs can then be accessed by calling the GetLabelDetection API.
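A hedged sketch of that asynchronous flow follows. The ARNs and region are placeholders; `fetch_all_labels` shows the pagination loop needed because GetLabelDetection returns results in pages via `NextToken`:

```python
def start_video_label_job(bucket, key, sns_topic_arn, role_arn):
    """Kick off asynchronous label detection on a video stored in Amazon S3."""
    import boto3  # lazy import: the pagination helper below is usable offline
    client = boto3.client("rekognition", region_name="us-east-1")  # illustrative region
    resp = client.start_label_detection(
        Video={"S3Object": {"Bucket": bucket, "Name": key}},
        NotificationChannel={"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn},
    )
    return resp["JobId"]  # Rekognition notifies the SNS topic when the job finishes

def fetch_all_labels(client, job_id):
    """Page through GetLabelDetection results after the SNS notification arrives."""
    labels, token = [], None
    while True:
        kwargs = {"JobId": job_id}
        if token:
            kwargs["NextToken"] = token
        page = client.get_label_detection(**kwargs)
        labels.extend(page["Labels"])
        token = page.get("NextToken")
        if not token:
            return labels
```

In practice the SNS notification would typically trigger a Lambda function that calls `fetch_all_labels` and stores the results.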
A key benefit of Amazon Rekognition Video is that you can work with streaming videos. Amazon Rekognition can ingest streaming video directly from Amazon Kinesis Video Streams, process it, and publish the outputs to Amazon Kinesis Data Streams for stream processing.
Real-World Example:
Consider a scenario where an IT manager at a large retail chain is tasked with monitoring in-store security footage in real-time to identify shoplifters based on a database of known individuals. The company’s leadership, however, is concerned about the time and complexity involved in building, training, and maintaining these machine learning models due to the advanced expertise required. The challenge lies in designing a solution that addresses these needs while minimizing costs. The primary concern is that the lack of deep learning expertise within the organization may hinder the development of this solution.
Amazon Rekognition Video offers an effective solution for this scenario. The process can be broken down into the following steps:
Create a Face Collection: Start by building a collection of known faces using Rekognition Image or Video, detecting faces from an existing database of images or archived footage.
Ingest Live Security Feed: Use a service like Kinesis Video Streams to ingest the live security feed into the system.
Manage Output Data Stream: Set up Kinesis Data Streams to handle the output data stream from the video feed.
Process Video Feed: Utilize the CreateStreamProcessor API in Rekognition Video to process the incoming video feed, with the Kinesis Video stream as input.
Publish Analysis Results: The analysis results will be published to Kinesis Data Streams.
Store Outputs: AWS Lambda can then consume the data from Kinesis Data Streams and store the outputs in S3 or a key-value store like Amazon DynamoDB.
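The stream-processor step above can be sketched with boto3 as follows. All names, ARNs, the region, and the 85% match threshold are placeholder assumptions; the request-building helper is split out purely for illustration:

```python
def build_stream_processor_request(name, kvs_arn, kds_arn, role_arn, collection_id,
                                   threshold=85.0):
    """Assemble the CreateStreamProcessor request: Kinesis Video stream in,
    face-search results against a known-faces collection out to Kinesis Data Streams."""
    return {
        "Name": name,
        "Input": {"KinesisVideoStream": {"Arn": kvs_arn}},
        "Output": {"KinesisDataStream": {"Arn": kds_arn}},
        "RoleArn": role_arn,
        "Settings": {"FaceSearch": {"CollectionId": collection_id,
                                    "FaceMatchThreshold": threshold}},
    }

def create_and_start(name, kvs_arn, kds_arn, role_arn, collection_id):
    import boto3  # lazy import so the builder above works without the SDK
    client = boto3.client("rekognition", region_name="us-east-1")  # illustrative
    client.create_stream_processor(
        **build_stream_processor_request(name, kvs_arn, kds_arn, role_arn, collection_id))
    client.start_stream_processor(Name=name)  # begin consuming the live feed
```

The face collection referenced by `collection_id` would be populated beforehand with CreateCollection and IndexFaces calls against the database of known individuals.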
The following graphic illustrates the high-level architectural flow: