Speech Recognition- The Future

Heena Khan

Solutions Architect at Amazon Web Services (Still like a child; permanently failing, learning & testing)

Published: March 23, 2018

Speech recognition is an interesting domain with many important applications. Businesses can use it to build effective, dynamic solutions for production. Here is a brief comparison of the speech recognition APIs offered by two leading cloud providers, Amazon and Google.

Language Support

Amazon Transcribe API can automatically transcribe US English and Spanish speech.

Google Speech API supports more than 110 languages.

Input Limit

Amazon Transcribe API takes input no longer than 2 hours.

Google Speech API can return results for asynchronous requests on audio of any duration up to 180 minutes.

Asynchronous request: Because speech recognition takes longer for longer audio, you can submit a request to the Speech API and carry on with other work. The Speech API processes the audio in the background, and you can fetch the result once it is ready.
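Here is a minimal sketch of that submit-then-collect pattern, together with a check against the duration limits quoted above. It does not call either cloud service: `fake_transcribe`, `submit_job`, and `fits_limit` are hypothetical names made up for illustration, and the limits are simply the figures from this article, so check the current documentation before relying on them.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import time

# Maximum audio length in minutes, as quoted in this article (assumed, may be outdated).
MAX_MINUTES = {"amazon_transcribe": 120, "google_speech": 180}


def fits_limit(service: str, audio_minutes: float) -> bool:
    """Return True if the audio is short enough for the given service."""
    return audio_minutes <= MAX_MINUTES[service]


def fake_transcribe(audio_name: str) -> str:
    """Hypothetical stand-in for a slow recognition call (not a real API)."""
    time.sleep(0.1)  # pretend the service is busy processing the audio
    return f"transcript of {audio_name}"


def submit_job(pool: ThreadPoolExecutor, service: str,
               audio_name: str, audio_minutes: float) -> Future:
    """Reject over-long audio, otherwise start the job in the background."""
    if not fits_limit(service, audio_minutes):
        raise ValueError(f"{audio_name} is too long for {service}")
    return pool.submit(fake_transcribe, audio_name)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        job = submit_job(pool, "google_speech", "meeting.flac", 150)
        print("doing other work while the audio is processed...")
        print(job.result())  # blocks only when the result is actually needed
```

With a real service you would typically get back a job or operation handle rather than a local `Future`, but the flow of submit, do other work, then fetch is the same idea.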

Input Format

Amazon Transcribe API supports FLAC, MP3, MP4, and WAV file formats.

Google Speech API supports FLAC, LINEAR16, MULAW, AMR, AMR_WB, OGG_OPUS, and SPEEX_WITH_HEADER_BYTE encodings, as well as WAV files with LINEAR16 or MULAW encoded audio. Using MP3, MP4, M4A, mu-law, a-law, or other lossy codecs during recording or transmission may reduce accuracy, so prefer a lossless encoding where you can to avoid empty or poor results.
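The two format lists can be turned into a small lookup, for example to decide which service a given recording could be sent to. This is only a sketch: the `SUPPORTED` table copies the lists from this article as they stood at the time of writing, and `services_accepting` is a hypothetical helper, not part of either API.

```python
# Formats and encodings as listed in this article (assumed, may be outdated).
SUPPORTED = {
    "Amazon Transcribe": {"FLAC", "MP3", "MP4", "WAV"},
    "Google Speech": {
        "FLAC", "LINEAR16", "MULAW", "AMR", "AMR_WB",
        "OGG_OPUS", "SPEEX_WITH_HEADER_BYTE",
        "WAV",  # only WAV files carrying LINEAR16 or MULAW audio
    },
}


def services_accepting(fmt: str) -> list[str]:
    """Return the services (sorted by name) that list the given format."""
    key = fmt.strip().upper()
    return sorted(name for name, formats in SUPPORTED.items() if key in formats)


if __name__ == "__main__":
    for fmt in ("flac", "mp3", "ogg_opus"):
        print(fmt, "->", services_accepting(fmt))
```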

Streaming Recognition

Amazon Transcribe API does not support Streaming Recognition.

Google Speech API supports streaming recognition. You can stream audio to the Cloud Speech API and receive a stream of speech recognition results in real time as the audio is processed.
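To show the shape of streaming recognition, here is a rough sketch using plain Python generators. Nothing here talks to the Cloud Speech API: `audio_chunks` and `fake_streaming_recognize` are hypothetical names, and the "recognizer" only reports how many bytes it has seen. The point is that results come back while audio is still being sent, instead of after the whole file is uploaded.

```python
from typing import Iterable, Iterator


def audio_chunks(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split raw audio bytes into fixed-size chunks, as a microphone feed would."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def fake_streaming_recognize(chunks: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical stand-in for a streaming recognizer (not a real API).

    Yields one interim message per chunk received, so the caller sees
    output before the stream has finished.
    """
    received = 0
    for chunk in chunks:
        received += len(chunk)
        yield f"partial result after {received} bytes"


if __name__ == "__main__":
    audio = bytes(10_000)  # placeholder for real audio data
    for message in fake_streaming_recognize(audio_chunks(audio, 4_096)):
        print(message)
```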

Noisy Environment Support

Amazon Transcribe API: It is not mentioned anywhere that it supports noisy environments, so you would need to try it and check.

Google Speech API can handle noisy audio from a variety of environments. There is no need to filter the noise before sending it to the Speech API.

Inappropriate Content Filtering

Amazon Transcribe API does not support inappropriate content filtering.

Google Speech API can filter inappropriate content in text results.
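If you use a service without this option, you can apply a similar filter to the transcript yourself. The sketch below masks every letter after the first in any word found on a blocklist. Both the masking style and the tiny `BLOCKLIST` are assumptions chosen for illustration, not a description of how either service does it.

```python
import re

# Hypothetical blocklist; a real one would be much longer.
BLOCKLIST = {"darn", "heck"}


def mask_word(word: str) -> str:
    """Keep the first character and replace the rest with asterisks."""
    return word[0] + "*" * (len(word) - 1)


def filter_text(text: str, blocklist: set[str] = BLOCKLIST) -> str:
    """Mask every blocklisted word in the text, ignoring case."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        return mask_word(word) if word.lower() in blocklist else word

    return re.sub(r"[A-Za-z']+", replace, text)


if __name__ == "__main__":
    print(filter_text("Well heck, that is a darn shame"))
```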

Rashmi S.

Operations & Customer Success | Training & Delivery Support | Edtech | Strategic Growth | Ex- upGrad

5 years ago

Article written by you Heena? :)
