Speech Recognition: The Future

Speech recognition is an interesting domain with many important applications. Businesses can use it to build effective, dynamic solutions in production. Here is a brief comparison of the speech recognition APIs offered by two leading cloud providers, Amazon and Google.

Language Support

Amazon Transcribe API can automatically transcribe US English and Spanish speech.

Google Speech API supports more than 110 languages.

Input Limit

Amazon Transcribe API accepts audio input no longer than 2 hours.

Google Speech API can return results for asynchronous requests on audio of any duration up to 180 minutes.
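The two limits stated above can be checked locally before uploading anything. The following is a minimal sketch, assuming the audio is a PCM WAV file that Python's standard `wave` module can read; the constant and function names are illustrative and do not come from either API.

```python
import wave

AMAZON_MAX_SECONDS = 2 * 60 * 60   # 2 hours, as stated above
GOOGLE_MAX_SECONDS = 180 * 60      # 180 minutes, as stated above


def wav_duration_seconds(path: str) -> float:
    """Return the playing time of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def within_limits(seconds: float) -> dict:
    """Report which service's stated duration limit the audio fits under."""
    return {
        "amazon_transcribe": seconds <= AMAZON_MAX_SECONDS,
        "google_speech_async": seconds <= GOOGLE_MAX_SECONDS,
    }
```

For example, `within_limits(wav_duration_seconds("meeting.wav"))` tells you whether a recording (the file name here is a placeholder) fits under each limit.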

Asynchronous request: speech recognition takes longer for longer audio, so you can send a request to the Speech API and carry on with other work. The Speech API processes the audio in the background, and you can fetch the result once it is ready.
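The submit-then-collect pattern described above can be sketched with nothing but the Python standard library. This is only an illustration of the pattern: `fake_transcribe` is a hypothetical stand-in for a slow speech-to-text call, not a function from either API, and the sleep times are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def fake_transcribe(audio_name: str) -> str:
    """Hypothetical stand-in for a slow speech-to-text request."""
    time.sleep(0.2)  # pretend the service is processing the audio
    return f"transcript of {audio_name}"


def run_async_demo(audio_name: str) -> Tuple[List[str], str]:
    """Submit a job, keep working while it runs, then collect the result."""
    log = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        job = pool.submit(fake_transcribe, audio_name)  # returns immediately
        log.append("submitted")
        while not job.done():
            log.append("doing other work")  # the caller is not blocked
            time.sleep(0.05)
        transcript = job.result()  # the job has finished, so this is instant
    return log, transcript
```

With a real service, the submit step would be the API request and the polling step would ask the service whether the job has finished.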

Input Format

Amazon Transcribe API supports FLAC, MP3, MP4, and WAV file formats.

Google Speech API supports the FLAC, LINEAR16, MULAW, AMR, AMR_WB, OGG_OPUS, and SPEEX_WITH_HEADER_BYTE encodings, as well as WAV files containing LINEAR16 or MULAW encoded audio. Using mp3, mp4, m4a, mu-law, a-law, or other lossy codecs during recording or transmission may reduce accuracy, so avoid these formats to prevent empty or poor results.
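A quick way to apply the lists above is to look at the file extension before choosing a service. This is a rough sketch: mapping extensions to formats is an assumption (an extension says nothing certain about the encoding inside the file), only the FLAC and WAV cases are covered for Google, and all names are illustrative.

```python
from pathlib import Path

# Formats Amazon Transcribe accepts, per the list above.
AMAZON_FORMATS = {"flac", "mp3", "mp4", "wav"}
# Assumed extension mapping for Google: FLAC files, and WAV files
# (which must still hold LINEAR16 or MULAW audio inside).
GOOGLE_FORMATS = {"flac", "wav"}
# Lossy formats the text above advises against for Google Speech API.
GOOGLE_DISCOURAGED = {"mp3", "mp4", "m4a"}


def check_format(filename: str) -> dict:
    """Classify a file by extension against the format lists above."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return {
        "amazon_transcribe": ext in AMAZON_FORMATS,
        "google_speech": ext in GOOGLE_FORMATS,
        "google_accuracy_warning": ext in GOOGLE_DISCOURAGED,
    }
```

For instance, `check_format("call.MP3")` marks the file as acceptable for Amazon Transcribe but flags it with an accuracy warning for Google.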

Streaming Recognition

Amazon Transcribe API does not support Streaming Recognition.

Google Speech API supports streaming recognition. You can stream audio to the Cloud Speech API and receive a stream of speech recognition results in real time as the audio is processed.
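Streaming means the audio is sent in small pieces rather than as one file. The sketch below shows only that client-side step, splitting a byte stream into fixed-size chunks; the function name and the 4096-byte chunk size are assumptions, and the network call that would send each chunk to the service is left out.

```python
import io
from typing import BinaryIO, Iterator


def audio_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks of raw audio until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        yield chunk


# Example: 10,000 bytes of silence split into chunks of at most 4096 bytes.
sizes = [len(c) for c in audio_chunks(io.BytesIO(b"\x00" * 10000))]
```

Because the function is a generator, each chunk can be handed to the service as soon as it is read, which is what allows results to come back while audio is still being sent.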

Noisy Environment Support

Amazon Transcribe API: it is not mentioned anywhere that it supports noisy environments, so you will have to try it on your own audio and check.

Google Speech API can handle noisy audio from a variety of environments. There is no need to filter out the noise before sending the audio to the Speech API.

Inappropriate Content Filtering

Amazon Transcribe API does not support inappropriate content filtering.

With Google Speech API, you can filter inappropriate content in the text results.
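As a rough illustration of how the filter is switched on, the sketch below builds a JSON request body with the `profanityFilter` flag set. The field names are assumed from the v1 REST API of Google Cloud Speech and should be checked against the current documentation; the function name, the FLAC encoding, the 16000 Hz sample rate, and the `gs://` URI are placeholders. Nothing is sent over the network.

```python
import json


def build_recognize_body(gcs_uri: str, language_code: str = "en-US") -> str:
    """Build a JSON request body asking for profanity-filtered results."""
    body = {
        "config": {
            "encoding": "FLAC",          # assumed input encoding
            "sampleRateHertz": 16000,    # assumed sample rate
            "languageCode": language_code,
            "profanityFilter": True,     # request filtering of inappropriate words
        },
        "audio": {"uri": gcs_uri},       # placeholder Cloud Storage location
    }
    return json.dumps(body, indent=2)


print(build_recognize_body("gs://my-bucket/audio.flac"))
```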


