A comparison of pre-trained image recognition APIs across GCP, AWS and Azure
Machine Learning has arguably been one of the most in-demand technologies across industries in recent times. The surge in demand, propelled by the growing amount of data that industries hold, has led software giants like Google, Microsoft and Amazon to offer pre-trained Machine Learning APIs. These APIs are readily available, come with a free tier as well as a paid tier, and can be applied to both structured and unstructured data. Recently I experimented with the pre-trained image analysis APIs from Google (GCP), AWS and Azure, the three leading Cloud Service Providers, and found some astonishing results. I took a sample set of 35 images and ran them through all three APIs so that I could compare their performance against each other. Here are the results across some major parameters –
- Text Detection – GCP comes out ahead of Azure and AWS in this one. Google’s pre-trained API can detect even very small text in images, something Azure was not able to do. AWS, on the other hand, did pick up the smaller text, but its output was often garbled and incomprehensible. Therefore, as far as text detection is concerned, GCP > AWS = Azure
- Object and Label detection – The results were mixed here. Google’s API did well at detecting a large number of objects in an image, but sometimes gave generalized results. For example, it described a toothpaste as “Product”. AWS, on the other hand, detected fewer objects than Google but labelled them more specifically. For the same image where Google returned “Product”, AWS returned “Toothpaste”. Azure lagged behind here as well: it detected fewer objects and sometimes none at all. Therefore, in object detection, AWS = GCP > Azure
- Brand/Logo detection – This feature is not readily available in AWS, hence the comparison narrows down to GCP and Azure. GCP beats its Cloud counterpart here as well. Out of 25 images in which some brand was present, GCP correctly identified brands in 22 (88%), whereas Azure identified brands in only 10 (40%). In cases where both GCP and Azure found a brand in the same image, GCP also reported much higher confidence scores than Azure (an average of 98% compared to 77%). GCP prevailed again here. GCP > Azure > AWS
- Color Detection – Surprisingly, AWS does not provide color detection for images as of now (23rd November 2020), although I expect they will add it sometime in the future. The comparison again narrows down to GCP and Azure, where GCP arguably scores over Azure again. GCP gives you its output as RGB values, which you can use to work out the specific colors later. Azure, however, has a set of 12 predefined colors and will always give you results from that set. The comparison here depends on what you need. If you need specific colors, go for GCP; if you are okay with the 12 predefined colors provided by Azure, you can go for that as well. Personally, I prefer more specific colors, hence again GCP > Azure > AWS
- Safe Search – Brands and industries are sometimes concerned about whether the content being posted on their websites is appropriate and safe (there can be other uses as well). This is where safe search comes into play. GCP provides safe search across as many as 5 categories: Medical, Violence, Spoof, Adult and Racy content. Each category gets a likelihood score from 1 to 5 (very unlikely to very likely) depending on the content in the image. Azure, on the other hand, provides safe search across 2 categories only: Adult and Racy content. I also found that in some cases Azure returned confidence levels > 100%, which they probably need to correct in their APIs. Lastly, in my tests AWS only gave a binary output stating whether the image passed the safe search test or not. And hence again, GCP > Azure > AWS
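To make the color trade-off above a bit more concrete: an RGB triple like the ones GCP returns can be reduced to a named color by picking the closest entry in a small fixed palette. The snippet below is only a minimal sketch of that idea and is not code from either API. The twelve names are meant to mirror a small predefined palette like Azure’s, but the RGB reference values and the `nearest_color_name` helper are assumptions made purely for illustration.

```python
from typing import Dict, Tuple

RGB = Tuple[int, int, int]

# Assumed reference values for a 12-name palette (illustrative only,
# not the official values used by any cloud provider).
PALETTE: Dict[str, RGB] = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "gray": (128, 128, 128),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "orange": (255, 165, 0),
    "pink": (255, 192, 203),
    "purple": (128, 0, 128),
    "brown": (165, 42, 42),
    "teal": (0, 128, 128),
}


def squared_distance(a: RGB, b: RGB) -> int:
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_color_name(rgb: RGB, palette: Dict[str, RGB] = PALETTE) -> str:
    """Return the palette name whose reference value is closest to rgb."""
    return min(palette, key=lambda name: squared_distance(rgb, palette[name]))


if __name__ == "__main__":
    # Hypothetical dominant colors, as an RGB-based API might report them.
    for sample in [(250, 10, 10), (10, 10, 10), (0, 120, 130)]:
        print(sample, "->", nearest_color_name(sample))
```

Going from RGB to a name is always possible, while going from a name back to the exact RGB is not, which is why the RGB output is the more flexible of the two. Note that plain distance in RGB space is only a rough proxy for how people perceive color, so borderline shades may land on an unexpected name.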
Upon seeing the results, you might assume that I am a Google fanboy and that my results are biased towards Google but, in reality, this is exactly what I saw when I compared the outputs across the APIs. There are some other features which I have not included here, such as celebrity detection, face detection, adding captions to images and cost comparison. I have compared them as well, and if you are curious to know those results, feel free to reach out to me. I’ll be happy to talk about it. For now, as far as pre-trained Machine Learning services for image analysis are concerned, I believe GCP is far ahead of its counterparts. I am also planning to explore other pre-trained Machine Learning APIs, but that is something I’ll pick up a bit later.
You can also re-use the code which I used in this comparison. I have put all the methods into a single snippet, each triggered by a function call, which can be found on my GitHub here. I hope you liked my comparison across the 3 leading Cloud Service Providers, and in case you have any ideas for exploring Cloud technologies, I’ll be happy to help or collaborate.
Till then, sayōnara!