The Stripe of Automatic Speech Recognition (ASR)

Michael Spencer

A.I. Writer, researcher and curator - full-time Newsletter publication manager.

Published: July 28, 2022

What is AssemblyAI?

I have a special summer discount going for my AI Supremacy Newsletter, help me get to 100 paid subscribers.

Get 25% off for 1 year

If you enjoy articles about A.I. at the intersection of breaking news, join AiSupremacy here. I cannot continue to write without community support (follow the link below). For the price of a cup of coffee, join 84 other paying subscribers.

https://aisupremacy.substack.com/subscribe

AI-as-a-Service in an audio friendly world - powering Automatic Speech Recognition (ASR)

Can I just say, I really like this startup? But there are so many A.I. startups now where I feel that way. This article first appeared on my Artificial Intelligence Survey.

JULY 14TH, 2022 4:15 PM MONTREAL, CANADA

The speech-to-text AI startup AssemblyAI has now raised $64 million, according to Crunchbase.

AssemblyAI today announced that it raised $30 million in a Series B round led by Insight Partners with participation from Y Combinator and Accel. To date, AssemblyAI has raised $64 million, which founder and CEO Dylan Fox tells TechCrunch is being invested in growing the company’s research and engineering teams and its data center capacity for AI model training.

On LinkedIn, Dylan says he talks about #ai, #startups, #deeplearning, #speechtotext, and #speechrecognition, and that sounds about right.

You are reading AI Supremacy, the top machine learning Newsletter on Substack covering AI’s impact on business, society and technology. If you can share it with colleagues, friends, and family, or on Reddit, Hacker News or LinkedIn, I would be grateful.

Origin Story

Fox founded AssemblyAI after a 2-year stint at Cisco, where he worked on machine learning for collaboration products. Prior to that, he started YouGive1, an organization that worked with companies to reward customers with product offers in exchange for nonprofit donations.

Here is all they achieved in 2021.

AssemblyAI has a great future with audio.

AssemblyAI is all about leveraging the same AI technology used to create popular AI models like DALL-E 2, GPT-3, and Google’s LaMDA (Transformers, Large Language Models, massive GPU clusters, and large datasets) to create state-of-the-art AI models for transcribing, understanding, and analyzing audio and video data.

First-of-their-kind audio-first social networks, like Twitter Spaces and Clubhouse, have started popping up everywhere, including Substack’s app and LinkedIn Audio Events. There’s an API for that.

Smaller companies struggle to keep up, which is why many turn to “AI-as-a-service” vendors that handle the challenging work of creating models and charge for access to them through an API. One such vendor is AssemblyAI, which focuses specifically on speech-to-text and text analysis services.

Automatically convert audio and video files and live audio streams to text with AssemblyAI's Speech-to-Text APIs. Do more with Audio Intelligence - summarization, content moderation, topic detection, and more. Powered by cutting-edge AI models.

The Stripe for Speech to Text

Four months ago, they announced their $28M Series A led by Accel, with participation from Y Combinator, the Stripe founders John and Patrick Collison, Nat Friedman, and Daniel Gross.

Now there’s yet another round. They are moving fast.

Democratizing Speech to Text

Their goal is to expose this progress to every developer and product team on the internet via a simple set of APIs. As they continue to research and train state-of-the-art AI models for ASR and NLP tasks (like speech recognition, summarization, language identification, and many other tasks), they will continue to expose these AI models to developers and product teams via simple APIs that are available for free.

In the first six months of 2022 alone, they launched ASR support for 15 new languages (including Spanish, German, French, Italian, Hindi, and Japanese) and released major improvements to their Auto Chapters and Summarization models, Real-Time ASR models, and Content Moderation models, along with countless other product updates.

They were founded in 2017.

“I was looking for speech recognition and natural language processing (NLP) APIs for past projects, and started AssemblyAI after seeing how limited, and low-accuracy, the available options were back in 2017,” Fox told TechCrunch in an email interview. “The company’s goal is to research and deploy cutting-edge AI models for NLP and speech recognition, and expose those models to developers in very simple software development kits and APIs that are free and easy to integrate.” - Dylan Fox

ASR stands for automatic speech recognition.

I consider AssemblyAI a rather promising AI startup.

With this new funding, they will be able to accelerate their product roadmap, build out better AI infrastructure to speed up their AI research and inference engines, and grow their AI research team, which today includes researchers from DeepMind, Google Brain, Meta AI, BMW, and Cisco.

The API for ASR

AssemblyAI has the go-to solution for analyzing speech, offering ultra-simple API access for transcribing, summarizing and otherwise figuring out what’s going on in thousands of audio streams at a time.

Their value proposition strikes me as really strong. AssemblyAI offers AI-powered, API-based services in over 80 languages for automatic transcription, topic detection, and content moderation, as well as “auto chapters,” which breaks down audio and video files into “chapters” with summaries for each. Using the platform, developers can call various APIs to perform tasks like “identify the speakers in this conversation” or “check this podcast for prohibited content” at a relatively low cost, starting at $0.00025 per audio-second.

Automatic transcription
Topic detection
Content moderation
Auto Chapters
Speaker identification
Super cheap: $0.00025 per audio-second. That’s $0.90 for 1 hour of audio.
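
The per-hour figure follows directly from the per-second rate. Here is a small sketch of that arithmetic, assuming the $0.00025 starting rate quoted above applies uniformly; the function name `transcription_cost_usd` is hypothetical and only for illustration, and actual billing may round or tier differently.

```python
from decimal import Decimal

# Starting rate quoted in the article, in US dollars per second of audio.
PRICE_PER_AUDIO_SECOND_USD = Decimal("0.00025")


def transcription_cost_usd(duration_seconds: int) -> Decimal:
    """Estimate the cost of transcribing `duration_seconds` of audio.

    Hypothetical helper: assumes a flat per-second rate with no rounding,
    minimums, or volume discounts.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return PRICE_PER_AUDIO_SECOND_USD * duration_seconds


one_hour = transcription_cost_usd(60 * 60)        # 3,600 seconds of audio
hundred_hours = transcription_cost_usd(100 * 60 * 60)
print(f"1 hour: ${one_hour:.2f}, 100 hours: ${hundred_hours:.2f}")
```

Decimal is used instead of float so that a price with five decimal places multiplies out exactly.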

Super Minimalistic APIs

AssemblyAI offers a handful of different APIs that you can call with a line or two of code to perform tasks like “check this podcast for prohibited content,” or “identify the speakers in this conversation,” or “summarize this meeting into less than 100 words.”
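
To give a feel for how small such a call can be, here is a minimal sketch that builds (but does not send) a transcription request using only the Python standard library. The endpoint URL and the field names `audio_url`, `speaker_labels`, `content_safety`, and `auto_chapters` are assumptions based on AssemblyAI's public v2 REST documentation, not something stated in this article, so check the current API reference before relying on them. The helper name `build_transcript_request`, the placeholder API key, and the example audio URL are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; verify against AssemblyAI's current API reference.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(api_key, audio_url, *, speaker_labels=False,
                             content_safety=False, auto_chapters=False):
    """Build a POST request asking for a transcript of `audio_url`.

    Hypothetical helper: the JSON field names are assumptions (see above).
    Nothing is sent over the network here.
    """
    payload = {
        "audio_url": audio_url,
        "speaker_labels": speaker_labels,   # "identify the speakers"
        "content_safety": content_safety,   # "check for prohibited content"
        "auto_chapters": auto_chapters,     # chapter-level summaries
    }
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_transcript_request(
    "YOUR_API_KEY",                      # placeholder, not a real key
    "https://example.com/podcast.mp3",   # placeholder audio location
    content_safety=True,
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would submit the job. Transcription runs asynchronously, so a real client would then poll for the result; that part is left out here because its exact response shape is not covered by the article.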

So Many Use Cases

Fox says AssemblyAI continues to grow at a fast clip, fueled by the pandemic and, by extension, the rise of remote work. Audio and video are being incorporated into an expanding number of products, he notes, like videoconferencing and even dating apps. That’s led product teams to look for ways to build additive, high-value features on top of audio and video data.

Thanks for reading!

What do you think of their unique value proposition, product market fit and future potential?

Respond in a comment below.

Takahide Maruoka

2 years ago

Japanese startups are raising funds through crowdfunding. In Japan, audio clubhouses were popular last year. This year, the clubhouse boom has cooled. You are right to focus on audio. I wanted to be scientifically more advanced in voice recognition than in image recognition technology. Image recognition is being researched and developed more by large and small companies. There is a need, especially in the medical field. If we can develop a service specializing in audio relations with speech recognition, it will create business opportunities. I think start-up companies are better suited for this.

Mitch Austin

CEO at Spirare Center for Airway and Sinus

2 years ago

This is exciting tech. I’d like to see it used translationally for speech analytics as well. This looks like a route to diagnostics for neurological issues in speech and therapy. Educational speech and learning development may also benefit from these algorithms.

POOJA JAIN

2 years ago

Insightful share, Michael Spencer

Netra Hirani

Analyst at Bain & Co. | AI specialist | Author

2 years ago

It's brilliant! A great tool for audio analysis and conversational AI!

Michael Spencer

A.I. Writer, researcher and curator - full-time Newsletter publication manager.

2 years ago

Such a promising startup. Joseph Zaghloul
