Ray Tracing for sound, the holy grail for data generation?

Weiming Li

Machine Learning Signal Processing | MLSP.ai

Published: February 25, 2025

Ray tracing (RT) is a familiar term in 3D gaming. Less well known is its application to rendering sound, and how that could solve the data problem in model development.

Light and sound traveling through air share some physical behavior, most importantly in what happens when they hit a surface. It is therefore logical to think a ray tracing engine could do the same job for sound. VRWorks from Nvidia includes exactly that: a path tracing engine for audio. Just as in RT, the engine calculates how the sound wave changes along many different paths between the source and the listener, then combines the results to give a realistic rendering of the sound at the listener's location.

from "CREATING IMMERSIVE AUDIO EFFECTS IN GAMES AND APPLICATION USING VRWORKS AUDIO"
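The idea of combining many paths into one result can be sketched in a few lines. This is a minimal illustration and not VRWorks code: the function name `impulse_response`, the path format, the 1/distance spreading loss and the frequency-independent reflection coefficients are all assumptions made for the example. Each path contributes a delayed, attenuated copy of the source, and the sum of those copies is the impulse response at the listener.

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def impulse_response(paths, sample_rate=16000):
    """Combine per-path contributions into one impulse response.

    Each path is (length_in_metres, [reflection coefficient per surface hit]).
    Hypothetical format chosen for this sketch only.
    """
    longest = max(length for length, _ in paths)
    n = int(round(longest / SPEED_OF_SOUND * sample_rate)) + 1
    ir = [0.0] * n
    for length, reflections in paths:
        gain = 1.0 / length            # assumed spherical spreading loss
        for coeff in reflections:      # each bounce scales the amplitude
            gain *= coeff
        delay = int(round(length / SPEED_OF_SOUND * sample_rate))
        ir[delay] += gain              # paths arriving together add up
    return ir


paths = [
    (3.43, []),            # direct path, no reflections
    (6.86, [0.8]),         # one bounce off a hard surface
    (10.29, [0.8, 0.5]),   # two bounces, the second off a softer surface
]
ir = impulse_response(paths)
```

A real engine traces far more paths and models frequency-dependent effects, but the final step is the same in spirit: per-path results are accumulated at the listener.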

Before the demo, I would like to quickly share a project I did many years ago, in which a custom wake-word was needed as part of the product. This is what happened:

50 employees carried a recording setup and recorded the wake-word in various environments: office, canteen, etc. [2 weeks]
Data cleaning [2 days]
Model training [1 day]
Model verification [1 day]

The point is that data collection is a big effort in audio model development, and within that effort, having to be in the required physical environment causes much of the headache. This is exactly where RT for sound comes in handy: it renders realistic sound in a 3D environment.
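One common way to turn a rendered room response into training data is to convolve a clean ("dry") recording with it, so a single recording can be placed in many simulated rooms. The sketch below assumes the rendering step has already produced one impulse response per environment; the names `convolve` and `augment`, and the toy sample values, are hypothetical and chosen only for illustration.

```python
def convolve(dry, ir):
    """Direct-form convolution of a dry signal with a room impulse response."""
    out = [0.0] * (len(dry) + len(ir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(ir):
            out[i + j] += x * h
    return out


def augment(dry, room_irs):
    """Return one 'wet' version of the dry recording per simulated room."""
    return {room: convolve(dry, ir) for room, ir in room_irs.items()}


# Toy data: a short dry signal and two made-up room responses.
dry = [1.0, 0.5, 0.0, 0.0]
room_irs = {
    "office": [1.0, 0.0, 0.3],         # one weak, early reflection
    "canteen": [1.0, 0.0, 0.0, 0.6],   # one stronger, later reflection
}
dataset = augment(dry, room_irs)
wet = dataset["office"]
```

In practice an FFT-based convolution would be used for real recordings, since the direct form above is quadratic in signal length.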

Let's have a listen to a demo clip. Using headphones is highly recommended. The purple sphere represents the sound source.

Demo video clip

The VRWorks engine supports customization of:

The 3D environment (by giving it a mesh file)
Reflection, absorption and transmission coefficients of surfaces
Sound source(s)
Listening location(s)
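For data generation, the four items above amount to a scene description that can be varied systematically. The sketch below is not the VRWorks API: `Material`, `Scene`, `render_jobs`, the file name `office.obj` and all coefficient values are hypothetical. It also assumes the three coefficients are energy fractions that sum to 1, which may not match how a given engine defines them.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Material:
    reflection: float
    absorption: float
    transmission: float

    def __post_init__(self):
        # Assumption: energy fractions, so the three parts must sum to 1.
        total = self.reflection + self.absorption + self.transmission
        if abs(total - 1.0) > 1e-6:
            raise ValueError(f"coefficients sum to {total}, expected 1.0")


@dataclass
class Scene:
    mesh_file: str     # path to the 3D environment
    materials: dict    # surface name -> Material
    sources: list      # (x, y, z) positions in metres
    listeners: list    # (x, y, z) positions in metres

    def render_jobs(self):
        """Every source/listener pair is one rendering to generate."""
        return list(product(self.sources, self.listeners))


office = Scene(
    mesh_file="office.obj",
    materials={
        "carpet": Material(0.3, 0.7, 0.0),
        "glass": Material(0.8, 0.05, 0.15),
    },
    sources=[(1.0, 1.5, 2.0)],
    listeners=[(3.0, 1.2, 0.5), (4.0, 1.2, 3.0)],
)
jobs = office.render_jobs()
```

Sweeping source and listener positions, or swapping materials, multiplies the number of distinct acoustic conditions obtained from a single mesh.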

With these features and assets from the game development world, it's not hard to imagine that scenes like an office, a canteen and a church can be simulated to a very high degree. So, could this be the holy grail for acoustic data generation?
