Hugging Face reposted this
CrisperWhisper is an advanced variant of OpenAI's Whisper, designed for fast, precise, and verbatim STT with crisp word-level timestamps! Unlike the original Whisper, nyra health's CrisperWhisper aims to transcribe every spoken word exactly as it is, including fillers, pauses, stutters and false starts.

CrisperWhisper is at the top of the Open ASR Leaderboard on Hugging Face Spaces: https://lnkd.in/gSwuKvZB

Key Features
Accurate Word-Level Timestamps: Provides precise timestamps, even around disfluencies and pauses, by utilizing an adjusted tokenizer and a custom attention loss during training.
Verbatim Transcription: Transcribes every spoken word exactly as it is, including and differentiating fillers like "um" and "uh".
Filler Detection: Detects and accurately transcribes fillers.
Hallucination Mitigation: Minimizes transcription hallucinations to enhance accuracy.
A drop-in enhancement for Whisper, especially if you care about accurate timestamps. Thank you, nyra health!
This is great
Thanks for amplifying our research.
Official repo: https://github.com/nyrahealth/CrisperWhisper
Research Paper: https://arxiv.org/pdf/2408.16589
Try the Model: https://huggingface.co/nyrahealth/CrisperWhisper
We're actively working on addressing the remaining flaws and inconsistencies that we know about, as well as developing a turbo variant of the model. Stay tuned for upcoming releases! If you have any questions or encounter any issues, feel free to share them in the repo; we're always looking to improve.
Yeah! Interesting. But you all need to release the new features a little more slowly. I'm losing track of everything else I want to try out.
Gonna use it in my next project.