STJ, YAWT, and other LNPs (Late Night Projects)

Yaniv Golan

Building companies @ lool ventures

Published: October 21, 2024

It’s the holiday season, and with some free time, I decided to go ahead and create the transcription tool I always wanted. I'm not quite there yet, but as a first step, I created YAWT, which stands for “Yet Another Whisper Tool” (yes, I know, a bit on the nose).

YAWT transcribes audio or video (e.g. Zoom calls) into text, recognizes different speakers, and handles multiple languages seamlessly. That's super useful for anyone mixing Hebrew and English (aka Hibrish), which is pretty much everyone in Israeli tech.

It uses OpenAI’s Whisper large-v3 for transcription and PyAnnote for speaker diarization. The output can be generated in multiple useful text formats like SRT, VTT, and a rich JSON format designed for downstream apps (hello, LLMs!).
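
To illustrate how transcription and speaker recognition might be combined, here is a minimal sketch. It is not YAWT's actual code. It assumes the transcription step yields timed text segments and the speaker-recognition step yields timed speaker turns; the `Turn` and `Segment` types and the `assign_speakers` function are hypothetical names made up for this example. Each segment simply gets the speaker whose turn overlaps it the most.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    """Hypothetical speaker-recognition result: who spoke, and when (seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """Hypothetical transcription result: what was said, and when (seconds)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker whose turn overlaps it the most.

    Segments that overlap no turn keep speaker=None.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append(Segment(seg.start, seg.end, seg.text, best_speaker))
    return labeled


# Made-up sample data, for illustration only.
segments = [
    Segment(0.0, 2.5, "Shalom, can everyone hear me?"),
    Segment(2.5, 4.0, "Yes, loud and clear."),
    Segment(10.0, 11.0, "(background noise)"),
]
turns = [
    Turn(0.0, 2.4, "SPEAKER_00"),
    Turn(2.4, 4.2, "SPEAKER_01"),
]

labeled = assign_speakers(segments, turns)
for seg in labeled:
    print(f"[{seg.start:.1f}-{seg.end:.1f}] {seg.speaker}: {seg.text}")
```

Picking the speaker by maximum overlap is only one possible design choice. It is simple, but it labels a whole segment with a single speaker even when two people talk within it.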

While YAWT is specifically designed to solve some of my own pains, hopefully it’s flexible enough to solve other people’s pains as well. You can check it out here: YAWT on GitHub.

But here is the thing: as I was playing with the JSON output from YAWT, I thought, “Surely, there must be a standard JSON format for transcriptions that I can leverage and extend, right?” Nope. I found… nothing.

It was late (12am, to be exact) and I probably should’ve gone to bed. Instead, three hours later, the Standard Transcription JSON (STJ) Format was born:

A comprehensive JSON-based spec for transcription, subtitles, translations, and more.
It’s a superset, pulling features from SRT, VTT, TTML, SSA/ASS, and beyond.
It includes tools to convert STJ into these formats.
Comes with validators, sample code — you name it.
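
To make the idea of converting a JSON transcript into subtitle formats concrete, here is a minimal sketch. The JSON structure below (a `segments` list with `start`, `end`, `text`, and an optional `speaker`) is a simplified, hypothetical stand-in and should not be read as the actual STJ schema. The function names are also made up for this example and are not the STJ tools.

```python
import json


def format_timestamp(seconds, decimal_mark):
    """Render seconds as HH:MM:SS<mark>mmm (SRT uses ',', WebVTT uses '.')."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_mark}{millis:03d}"


def to_srt(segments):
    """Build SRT text: numbered cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = (f"{format_timestamp(seg['start'], ',')} --> "
                  f"{format_timestamp(seg['end'], ',')}")
        speaker = seg.get("speaker")
        # SRT has no speaker field, so prefix the name to the text.
        text = f"{speaker}: {seg['text']}" if speaker else seg["text"]
        cues.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(cues)


def to_vtt(segments):
    """Build WebVTT text: a WEBVTT header, then cues separated by blank lines."""
    cues = ["WEBVTT\n"]
    for seg in segments:
        timing = (f"{format_timestamp(seg['start'], '.')} --> "
                  f"{format_timestamp(seg['end'], '.')}")
        speaker = seg.get("speaker")
        # WebVTT can carry the speaker in a voice tag.
        text = f"<v {speaker}>{seg['text']}" if speaker else seg["text"]
        cues.append(f"{timing}\n{text}\n")
    return "\n".join(cues)


# Hypothetical, simplified transcript document (NOT the real STJ schema).
doc = json.loads("""
{
  "segments": [
    {"start": 0.0, "end": 2.5, "speaker": "Dana", "text": "Boker tov, ready to start?"},
    {"start": 2.5, "end": 4.0, "text": "Sounds good."}
  ]
}
""")

print(to_srt(doc["segments"]))
print(to_vtt(doc["segments"]))
```

Keeping the timed segments in one neutral structure and rendering each target format from it is what lets a single transcript feed several outputs. Details that a target format cannot express, such as the speaker in SRT, have to be folded into the text or dropped.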

Huge shoutout to OpenAI o1-preview, Claude 3.5 Sonnet, Cursor, and Perplexity. These tools made the process a breeze. Truly a brave new world we’re living in.

This is not actually my desk; my real one is a huge mess.

If you’re interested in STJ, Getting Started is a good place to, well, start. I hope someone out there will find it useful — and even if not, it was a fun way to spend a few late-night hours. Let me know what you think!

Next, I’ll be feeding YAWT’s STJ output into another tool that extracts relevant context from related documents and uses it to enhance transcriptions. But that’s for another late-night session.

Luke M.

Machine Learning Engineer

2 weeks ago

Wow, I'm surprised that there was not a standard JSON format for transcriptions. That's great that you were able to put this together to create a comprehensive JSON-based spec for transcription, subtitles, translations and other stuff!

Hervé （埃尔韦） Kabla

Former head of a digital company

3 weeks ago

LNPs often are the best ones

Maya Azoulay

Partner at lool ventures

1 month ago

KING! Problem to opportunity to value ??

习移

Founder & CEO MindLi - PRIME Your AI Thinking || Visiting Prof. MBA Technion | Harvard Alum

1 month ago

?? ?? ????? ???? ????? :-)

Michael Lugassy

Principal Engineer at Forter

1 month ago

very nice, you seem to have covered most use cases. i wish many consumers and producers would use this. many of the word-by-word caption tools expect a word timestamp, will that work within the same transcript segments?
