Please Explain 3D Audio
PART 1
While this may seem like a simple request, the subject is much bigger than you might think. So we'll touch on only the basics here. Let's jump right in.
With all the interest in things like VR these days, you might think 3D Audio is something new, when in fact, it's been around for quite a while - mostly because it's relatively simple to do.
While I do appreciate the occasional technical inquiry like this, answering in a non-technical way is challenging - especially while keeping it relevant to your interest in film. But I'll try - so stick with me.
Let's start by clarifying the question. Do you mean, how I do 3D Audio, or how 'convention' does it?
Let's look at conventional first.
The current efforts appealing to psychoacoustic spatial perception are an interesting attempt at audio's "virtual reality" - but they still leave something to be desired - as we'll see in a moment.
So bring your best ears with you as we pay a visit to the playful fun-house of 3-D sonics, and Binaural vs Transaural.
For simplicity, we'll start with two-channel stereo format.
First, in contrast, Monaural playback pays no particular respect to left/right, nor three-dimensional aspects of perception, whereas, Binaural and Transaural do - with some big differences.
Binaural recording techniques intend left channel information to be heard only by the left ear, and vice versa. Binaural sound is generally recorded by the "dummy-head" method - where small microphone modules are placed within the ear canals of a life-size dummy head. The way the head and outer ears shape the sound arriving at each ear is known as the Head-Related Transfer Function (HRTF) - and capturing it this way is very effective.
For playback however, the discrete left/right information must remain separated for the effect to be properly realized by the listener. This means that loudspeaker playback is incompatible in this case because both speakers are heard by both ears - nullifying the effect.
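To put a number on what each ear receives separately, here is a minimal Python sketch of one binaural cue: the difference in arrival time between the two ears. It uses the classic spherical-head approximation (often credited to Woodworth). It is not a description of how any particular dummy head works. The head radius and the function name `itd_seconds` are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C
HEAD_RADIUS = 0.0875    # metres; an assumed "average" head, not a measured one


def itd_seconds(azimuth_deg, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Interaural time difference for a distant source, spherical-head model.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The approximation is intended for angles between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (radius / c) * (theta + math.sin(theta))


for angle in (0, 30, 60, 90):
    print(f"{angle:>2} degrees: {itd_seconds(angle) * 1e6:.0f} microseconds")
```

With these assumed values, a sound directly to one side reaches the far ear roughly two-thirds of a millisecond late. Cues that small are why the left and right signals have to stay separate all the way to the ears.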
For binaural recordings to render properly, the listener must agree to wear some sort of listening apparatus for keeping left and right sounds separate from each ear (stereo headphones). But turn your head and the whole perceived venue also turns, or when you take a few steps forward, you haven't moved any closer to the source. This ain't real!
Although advancements in head-tracking-movement are underway, you may wonder why reality simulation requires the observer to be dependent on a decoding device such as headphones, or polarized or 2-color glasses for 3D visuals.
And certainly for VR with that goofy-looking headgear we're supposed to pretend isn't there. Experiencing true reality in the real world requires no such 'appliance' attached, and is consistently absolute. Yet such simulations are routinely hailed as "reality". - I don't think so.
When considering playback in open-space without being attached to some appliance, let's look at Transaural.
Assuming 2-speaker playback, small amounts of the left and right signals are fed into each other's channel out of phase. This technique is known as crosstalk cancellation, sometimes called cross-phasing. It is employed to artificially expand the perceived width of the soundstage by psychoacoustic misdirection.
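For readers who like to see the arithmetic, here is a minimal Python sketch of the simplest form of this idea. It assumes a plain polarity inversion and a single cross-feed amount `k`. Real products use frequency-dependent phase shifts, so treat this as an illustration only. The names `widen` and `mid_side` are made up for this example.

```python
def widen(left, right, k=0.3):
    """Subtract a fraction k of the opposite channel from each channel."""
    out_left = [l - k * r for l, r in zip(left, right)]
    out_right = [r - k * l for l, r in zip(left, right)]
    return out_left, out_right


def mid_side(left, right):
    """Split a stereo pair into what is common (mid) and what differs (side)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


# A few arbitrary sample values standing in for a stereo signal.
left = [1.0, 0.5, -0.25]
right = [0.5, -0.5, -0.25]

mid_in, side_in = mid_side(left, right)
wide_left, wide_right = widen(left, right, k=0.3)
mid_out, side_out = mid_side(wide_left, wide_right)

print("mid before/after: ", mid_in, mid_out)
print("side before/after:", side_in, side_out)
```

Working through the algebra, the common part of the signal is scaled by (1 - k) and the difference part by (1 + k). That shift in balance toward the difference signal is what reads to the ear as extra width.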
Continuing beyond simply the widened panorama, are schemes which propose virtual surround from only two speakers. This approach applies a considerable amount of fore/aft phase manipulation within a controlled frequency range to suggest source positions residing in mid-air in an O-shaped, or U-shaped configuration before the listener.
Although a certain amount of head rotation is permitted, the severe phase angles necessary to produce this orbital pattern eventually play themselves out and become predictable and tiring (listening fatigue).
For those with a Home Theater set-up, this should be familiar to you since most AV Receivers contain pre-sets for creating these phantom modes as an alternative to discrete surround such as 5.1 or other.
The simulation of phantom images displayed by this effect can sound quite amazing at first, as long as the listener is willing to remain fixed within a relative "sweet-spot" between the speakers. However, the illusion will be greatly compromised or lost completely if the listener moves freely about the room as one might do in a natural setting - leaving a stark emptiness of dis-correlation, or, nonsense to the ear.
Odd, this parallax breakdown doesn't happen when listening to Real sounds in open space. Besides, your own ears affirm that the exploited phase-angle distortion of this mirage is not a product of nature - therefore, not real at all.
The owner of an electronics repair shop told me he used to have a certain portable stereo unit that had a 'magic button' which produced this orbital 3D pattern. He said he really liked that feature and now wished he had that stereo back. So I told him if he really liked that effect, and really missed it, I'd build him one of those circuits over the weekend from scratch - walk in here Monday morning and hand it to him. It's yours. He could then hard-wire it into another unit and have his magic button back.
I told him, that's how elementary that is for me to do, since it's just a common phase rotation exploit that's been around forever - and not high tech at all.
I can show you an even simpler spatial expansion trick you can do yourself, which costs only pocket change for parts, and takes just a few minutes to do. The only problem is, because of the stress it places on your amplifier, there's a risk of consequential damage in doing so.
An early memory of noticing this type of signal processing applied to a popular recording was on Led Zeppelin's "Whole Lotta Love" from the late '60s. Yes, it's cool, but that's certainly not new.
So why is 'phase' such a dirty word though?
In this context, it involves imposing two dissimilar signals upon themselves to create a dis-correlated freak exposit - causing a displacement warp in apparent position. But to get this apparition to perform just right, you must be positioned proximate to each speaker in that sweet-spot.
This also includes being not too close, and not too far away either. Otherwise, being off-focus only produces vague sonic confusion - plus ear burn-out over time due to the unnatural pressure it creates on the ears.
Another downfall you discover along the way is that if you listen in mono (both channels combined) you lose certain portions of the mix, and are rewarded with a hollow, 'pinched' sound instead. Not cool. This is because the discorrelations are being forced into a common composite, where the out-of-phase parts cancel each other.
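Here is a small Python sketch of that mono fold-down, using two test tones in place of a real mix. The "vocal" is identical in both channels and the "effect" is placed with opposite polarity between them. The tone frequencies, the names, and the helper functions are all assumptions picked for illustration.

```python
import math


def tone(freq_hz, n=480, rate=48000):
    """Generate n samples of a sine tone at the given sample rate."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


vocal = tone(1000)   # identical in both channels
effect = tone(3000)  # opposite polarity between channels

left = [v + e for v, e in zip(vocal, effect)]
right = [v - e for v, e in zip(vocal, effect)]

# Mono fold-down: both channels combined into one.
mono = [(l + r) / 2 for l, r in zip(left, right)]

# Whatever is in the mono signal beyond the vocal is what survived of the effect.
leftover_effect = [m - v for m, v in zip(mono, vocal)]

print("effect level in the left channel:", rms(effect))
print("effect level left in mono:       ", rms(leftover_effect))
```

In this idealized case the effect disappears from the mono sum entirely. With real program material the cancellation is partial and varies with frequency, which is where the hollow sound comes from.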
This effect isn't that bad when used sparingly - but that's not always the case.
I remember a story going around where Madonna got herself in trouble with the radio stations for submitting a song for airplay with too much phase warp. Doing so can potentially interfere with the FM stereo broadcast multiplexing, cause confusion in de-modulation, and produce unwanted artifacts at the receiver end. (Not good)
Yes, you can put all the phase angles you want on a CD and it will 'stick', but stereophonic broadcast for FM is not discrete L/R. Rather, it is derived from a precise phase relationship between the sum and difference of the source 'L/R'.
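As a rough Python sketch of that sum-and-difference arrangement: the sum rides in the main channel, the difference is placed on a 38 kHz subcarrier, and a 19 kHz pilot tone tells the receiver how to put them back together. The 0.9 and 0.1 scaling below is a commonly cited textbook form and is an assumption here, as are the function names. Actual broadcast processing chains differ.

```python
import math

PILOT_HZ = 19_000
SUBCARRIER_HZ = 2 * PILOT_HZ  # 38 kHz, locked to the pilot


def sum_difference(l, r):
    """Matrix left/right samples into sum (mono) and difference parts."""
    return (l + r) / 2, (l - r) / 2


def composite_sample(l, r, t):
    """One sample of an idealized stereo multiplex signal at time t (seconds)."""
    total, diff = sum_difference(l, r)
    subcarrier = math.sin(2 * math.pi * SUBCARRIER_HZ * t)
    pilot = math.sin(2 * math.pi * PILOT_HZ * t)
    return 0.9 * (total + diff * subcarrier) + 0.1 * pilot


# The same level, first in phase and then out of phase between channels.
print("in phase:     sum, difference =", sum_difference(0.8, 0.8))
print("out of phase: sum, difference =", sum_difference(0.8, -0.8))
```

In-phase material lands entirely in the sum, which is all a mono radio hears. Fully out-of-phase material lands entirely in the difference, so it lives only on the subcarrier. That is the part of the system a heavily phase-manipulated mix leans on hardest.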
The story goes that they asked her to submit another mix, toned down - and more suitable for air play.
While it's certainly plausible, I can't prove this story - it's just what I'd heard, but you can hear an example of heavy phase application on her song, "Vogue". With a little looking, you will also find multiple mixes of this song - which still doesn't prove or disprove the story. But again, while this effect is cool to listen to, it is still very much listener-position-dependent - and unnatural to the ears.
It's similar for vinyl. There is only so much phase modulation you can get away with in a record groove without violating certain mastering parameters. And the term 'sum' should really be called "absence of separation", and 'difference' should be called "anti-mono" (in opposition to), which is really what it is. So, the two competing vector angles can potentially create the perfect storm within the groove with high-displacement modulation. And expecting even a high-compliance pick-up to track that much violence without jumping out of the groove is asking a lot. I've seen it happen. So, while a little is okay - a lot is not.
The unfortunate shortcoming of these types of attempts aimed at expanding the usual stereophonic experience, is obviously their conditional playback requirements and often speculative mimicry.
Again, the naturally occurring 3-dimensional depth observed when listening to a live occurrence involves no such contingencies or restrictions in order to be fully enjoyed and recognized as authentic. In other words, real is real because it behaves real without trickery - and it endures.
Are you ready for more?
Let's take it up a notch and bring all this into the Movie Theater and see what happens next for mass audience on a grand scale.
Here we have another set of challenges to overcome - such as, there are many more speakers and channels of amplification at play - AND those speakers are much farther away from us.
And what about any sweet-spot requirements in the room? Well, that would be designated to only a small cluster of seats somewhere in the center of the theater. So, what about all the other seating locations not included? Well - tough luck.
But what about claims to the contrary of "no sweet-spot"?
You can believe it if you want, but at high volume you can't ignore the ringing in your ears and your splitting headache as a result of exposure to this foreign assault.
Any optics expert knows that a telephoto lens can make a distant object look bigger.
Bigger yes, but not actually closer. Why not? Because, while it may be bigger, it appears flatter - like looking at a cardboard paper doll with no depth or texture to it. In other words, it's lost its proximity perspective, and therefore, is truly lifeless. Flat. The same sonic disadvantage exists with speakers that are far away.
Then, there are other approaches such as multi-speaker arrays causing intentional mid-air convergence as another attempt - and line-arrays which are completely inept at providing accurate point-source definition.
I'm suggesting that 3D Audio in this setting is disadvantaged and completely out of its league. And flying sounds around the room from fixed speaker locations gets old really quick.
In a similar way, how long are you amused by your warped reflection from a fun-house mirror? Only for a while until the novelty wears off - which is pretty quick. Then, you're ready to move on to something else.
Because of this gross handicap on a grand scale, such practices fail to deliver for these three reasons:
1. Insult intrusion (not friendly)
2. It's alien (foreign / artificial)
3. The whole attempt fails to satisfy.
These days in film, the eyes are treated to spectacular CG, VFX and camera work like never before, but sonic advancements remain far behind that progress curve - with the ears receiving the same mis-treatment as always - just more of it.
While the tools to do this type of signal processing today look and operate differently than in the past, they're still based on the same basic principle as always.
To make matters worse, there are a number of competing and incompatible systems, all with their own take on what 'immersive' means, which is enough to cause confusion and disgust with the 'speaker wars' across the whole sound-for-film world.
While all that provides some new tools for the audio engineers to work with, of course they're going to support, and root for, what they've invested heavily in, but who is actually serving the audience in all that??
As a gentle reminder, in most cases, sound production services are in business to serve the Movie Studios, and the audience is merely the unwitting recipient of whatever happens up-stream.
The fact is, the audience is mostly indifferent to whoever has the greatest sonic circus act - which is why the collective mutterings continue with, "Why can't you guys get it together? Who knows, we might go to the movies more often if the experience were actually enjoyable". But their voice is not heard.
My advocacy is for that audience - and the audience is the filmmakers' customer! That's where the responsibility lies to provide something better - something more consummate.
But a solution to overcome the dissatisfaction of the conventional way of doing things must involve something beyond just adding more lipstick to it. It must be fundamentally different by design - and not a re-hash of the past, 'competing' with what we've already heard before.
Now THAT is an exceedingly high challenge for anyone!
So let's leave this noise and I'll show you something much more to your liking that doesn't sound like "mechanized 3D" or an exaggerated caricature, but more resembling real life as we know it.
And to those who feel the subject of sound is boring, for perhaps the first time in your life, it's actually FUN from this point on!
As each of us goes about our everyday lives, we are already hearing an active 3-dimensional sound field as part of our reality. But no one says, "Wow, I'm hearing 3D today!" - because it's completely natural to us.
In the world of audio reproduction, the ear is hungry for something resembling that world we know - something we can identify with and relate to. That's why some parlor trick doesn't satisfy.
So, creating a convincing illusion of such, requires it NOT to appear artificial, gimmicky or 'affected', but as natural as can be - like real life.
That is exactly what I set out to do - in a whole new way.
This is accomplished by a special post treatment up-grade, which is based on an entirely new mentality from the inside out - and immune to the usual insurmountables.
And as a bonus from this being a comprehensive solution, there is also a significant improvement in point-source details, and dialog intelligibility - and with an organic glow & allure people really like when they hear it!
This is possible because this treatment was originally developed as an audiophile-grade up-conversion for music mastering. (Where sound quality REALLY matters) But movies are in desperate need of improvement and there is much more on the line for the content creators than there is for somebody's music CD. So I'm happy to share this with the film community.
Because of the proprietary status of the technology involved, I must preserve the trade secrets of how this is done, and I'm sure you can respect that - even though I'm aware it's frustrating for some, being faced with a puzzle you can't solve. But don't feel bad, I've had some of the toughest cookies try to guess how this is done. So relax, it's like fellow magicians trying to figure out a perfectly executed stage illusion. Just enjoy the show!
The whole point is to provide a better customer-centric human interface where the means of influence are not readily apparent - as the audience isn't listening to the 'sound', but more captivated by, and emotionally lost in, the story being told. So this is not about showing off sound for sound's sake.
This is much more refreshing than what we currently hear at the movies. It's a way to draw the audience's attention toward what's going on - instead of shouting 'at' them.
While studies like Color Science and Aroma Therapy are well known, who is doing anything like that for hearing? Or is that even possible??
For sound aesthetics to have a persuasive influence requires an elusive X-factor unknown.
If the film-sound community could reach into outer space and harness such an X-factor they would. I know this from talking with enough of them about it.
But don't rely on your favorite sound company to provide anything like that for you, because they don't even know what it is, and are preoccupied with other endeavors appealing to the audio community - THAT'S their client, NOT the audience themselves. And no conference room committee could produce such a thing for you anyway. The real test of validity is if it resonates with the audience.
So let's take it apart and look at each of the various improvements one by one - starting with . . .
--TRANS-MODE COMPATIBILITY
I was invited to one of the largest Sound-for-Film companies in Hollywood, as they were curious to see what this is all about because of a suggestion by a mutual friend. While listening to examples of my work, and immediately upon identifying the spatial qualities, their engineer hit the mono switch on impulse - expecting to hear that familiar 'pinched' sound as always. But instead, he announced to all of us present, "HEY. There's no phase problem!". (attribution available)
And a long-established Post House, just blocks away said, "This is almost TOO good". "Because of our alliance with [some other persuasion] there's just no room [around here] for THIS much improvement". (attribution withheld)
That's okay guys, you're not my client anyway. The filmmakers are.
--HUMAN RECEPTIVITY
Due to a car vs pedestrian accident years ago, my father lived out the second half of his life with hearing in only one ear. That's unfortunate, especially for a music lover. And yet, he told me he could still hear every bit of the benefit my system delivers.
Two of my business partners were interviewed by a local FM radio show playing example music recordings live over the air. After the show, we received phone calls and emails from all over the L.A. reception area.
One listener told us he is completely deaf in one ear, and yet, said remarkably, he could clearly hear the 3D effect over his radio! Yes, we already knew that. (Compare that to stereoscopic visuals which require sight in both eyes)
--PROXIMITY UNRESTRICTIONS
"Somehow, the sound is larger than the speakers" --Film Sound Engineer
"It's everywhere within the room - and even out into the hallway" --Recording Studio Owner
"You can be standing sideways to the speakers - even behind them and never lose the imaging" - Record Producer.
--POINT-SOURCE DEFINITION
"I can't get over the DETAIL" --Feature Film Director
"I can point to everything I'm hearing" --Audio Manufacturer
"I can't believe how you can pull sounds from the background like that" --Audiophile
--HUMAN ASSIMILATION
"I heard the difference immediately!" --Recording Artist
"It gave me goosebumps" --Industry Pro
"It makes everything else sound bad." --Music Pro
--3D RENDITIONING / DEPTH PERCEPTION
"Without it, everything sounds 2-dimensional by comparison" --Scoring Engineer
"I've been looking for this sound for 10 years and will never go back to flat again" --Grammy Winner
"It's like, I'M THERE" --Too numerous to list.
--EARGONOMIC AESTHETICS
"Something extra is added that is very pleasant to listen to" --Pro Audio VP
"The Digital 'Ear Bleed' is virtually eliminated" --Film Sound Editor
"We know Directors who would want this Full Throttle!" --Sound Designer
--AUDIENCE SATISFACTION / TAKE-AWAY
"YOUR sound held my attention, but without it I found myself dozing off" --Music Ind. Pro
"You've accomplished what all the others have been TRYING to do" --Recording Industry Pro
"Whatever you guys are charging for this, it's not enough" --Music Lover
Perhaps you've noticed something in common among these diverse listeners - they're not complaining anymore, being fully engaged in what's going on - and 'audience engagement' is the unspoken desire of many filmmakers. All this from an additional step added to the usual workflow.
And if any of this seems hard to believe, it's because you're still thinking conventionally.
But being that I'm an independent and NOT prone to hop aboard any particular brandwagon or political methodology, I have the freedom to think and create un-conventionally - and up-grade YesterSound to the 21st century for you. I'd say it's about time!
So here's today's rhetorical question:
If you had a magic wand to make your film sound however you wish, what would it be like?
---
How to hear this for yourself is found in my article, "Why Can't Movies Sound Better".
Director at Studio PB&J, LLC
3 years ago: Great article Benny! Even for those of us who lack technical knowledge. Thanks!
IT Project Manager - Security, Compliance & Infrastructure
7 years ago: Yes, just imagine filming a 1st-person perspective movie while capturing audio with the mics from Hooke Audio. Video games such as Hellblade: Senua's Sacrifice are doing this now. In my spare time, I manually render 5.1 films into binaural for fun using object-based software. What a difference!