[Csnd] Dolby Atmos and other software for spatialization in computer music?

Over the past year or so I have noticed an increasing use of Dolby Atmos as a commercial standard for spatialization.

I would very much like to hear from any computer music or electroacoustic composers, or software developers, about their experience with Atmos or their comparisons of Atmos with other software/methodologies for spatializing audio.

For those who don’t know, Dolby Atmos spatializes sound using a combination of “beds” (mixes of sounds whose locations do not change over time) and “objects” (mono or stereo stems that can move over time, including in real time). Furthermore, the Atmos format is abstract in somewhat the same sense as the Ambisonic format, and the Atmos Renderer can render an Atmos mix on any speaker rig (within limits), including stereo and binaural.

My informal impression based purely on binaural mixes rendered for headphones by Amazon Music is that it works very well, but I would really like a sense of the response of the electroacoustic community to this software and the situation that it creates.

Best,
Mike

Csound mailing list Csound@listserv.heanet.ie https://listserv.heanet.ie/cgi-bin/wa?A0=CSOUND Send bug reports to https://github.com/csound/csound/issues Discussions of bugs and features can be posted here

Hi!

A friend of mine that works a lot with spatial audio said:

"It's a bit silly, to be honest; it's just VBAP with some extra bus features. Something easily produced in Max, Csound, etc.

They built it, I think, to try to be the main source of spatialization in the film industry. If they are the standard, then everyone uses it and no one will really know why."

I don’t know myself, but it sounds believable…

tarmo

Michael Gogins (<michael.gogins@gmail.com>) wrote on Thu, 24 March 2022 at 22:43:



Michael Gogins
Irreducible Productions
http://michaelgogins.tumblr.com
Michael dot Gogins at gmail dot com



I am sure you are right about marketing, but that is only part of what I want to know.

The main thing I want to know is this: supposing Atmos becomes built into every usable mixing board and every commercial recording studio (and this seems to be well under way), what happens to people who make electroacoustic art music or run their own studios? Do they need to get Atmos software? Is there some way to translate existing Ambisonic or VBAP mixes to Atmos, and vice versa?

My interest was sparked by the article “An Introduction to Immersive Audio” in the January 2022 issue of Sound on Sound magazine, which discusses Atmos (and other formats) and explains some of the technicalities, as well as by Dolby’s online marketing and technical documentation. Atmos is a hybrid format combining “object” elements (one or two channels of audio with metadata) with “bed” elements (i.e. 7.1.2 surround sound mixes). A nice feature of Atmos is the renderer, which can take an Atmos mix and adapt it to various speaker rigs without much work by the producer. The final product of an Atmos session is an ADM file, which is a Broadcast WAV file with a lot of metadata. The renderer can play this file on a number of speaker setups.
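For anyone who wants to poke at an ADM file programmatically: the ADM metadata is XML carried in an “axml” chunk of the RIFF/BW64 container, so a few lines of chunk-walking are enough to pull it out. A minimal sketch (the function name and the tiny synthetic file are my own; a real file would of course also carry fmt and data chunks):

```python
import struct

def find_axml(data):
    """Scan RIFF/BW64 chunks for the 'axml' chunk that carries ADM XML."""
    assert data[0:4] in (b"RIFF", b"BW64") and data[8:12] == b"WAVE"
    pos = 12
    while pos + 8 <= len(data):
        cid = data[pos:pos + 4]
        size = struct.unpack("<I", data[pos + 4:pos + 8])[0]
        if cid == b"axml":
            return data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are word-aligned
    return None

# Tiny synthetic file containing only an axml chunk, for illustration.
xml = b"<audioFormatExtended/>"
body = b"WAVE" + b"axml" + struct.pack("<I", len(xml)) + xml
fake = b"RIFF" + struct.pack("<I", len(body)) + body
print(find_axml(fake).decode())
```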

Dolby publishes this to help people write software compatible with Atmos: https://professionalsupport.dolby.com/s/article/Dolby-Atmos-ADM-Profile-specification?language=en_US.

I searched through this and didn’t see anything specifically about VBAP or Ambisonics, but because the specification does allow Cartesian coordinates and the output speakers are channel-based, I infer that it is indeed VBAP. The implication is that panning an Atmos object can be done either by sending it to a specific speaker or by using Cartesian coordinates.
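If it is VBAP underneath, the core computation is small. A sketch of the two-dimensional (no-height) case: solve for the pair of speaker gains that reconstruct the source direction, then normalize for constant power (the function name is mine; Dolby’s actual implementation is unknown):

```python
import math

def vbap2d(src_az, spk1_az, spk2_az):
    """Gains for one loudspeaker pair (2-D VBAP), all angles in degrees.
    Solves g1*l1 + g2*l2 = p for source direction p, then normalizes
    so that g1^2 + g2^2 = 1 (constant power)."""
    def unit(az_deg):
        a = math.radians(az_deg)
        return math.cos(a), math.sin(a)
    px, py = unit(src_az)
    x1, y1 = unit(spk1_az)
    x2, y2 = unit(spk2_az)
    det = x1 * y2 - x2 * y1
    g1 = (px * y2 - x2 * py) / det   # Cramer's rule for the 2x2 system
    g2 = (x1 * py - px * y1) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

# A centered source between speakers at +/-30 degrees gets equal gains.
print(vbap2d(0, 30, -30))
```

A source sitting exactly on one speaker collapses to that speaker alone, which is the pairwise-panning behavior described later in the thread.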

Although this specification is public, the renderers are closed source and proprietary, and the kind used to make films is not cheap either. There are Atmos panners for the leading commercial DAWs, but not, e.g., for Reaper except through a bridge of some sort.

Regards,
Mike

BlueRippleSound.com has some decoders, including Atmos, which you might have a look at. I’m sure they’re hard at work making the conversion between Ambisonics and various other forms of surround sound available at reasonable cost.

No affiliation, just a happy user.

Robert

Most of my work is in film, so some of this might be helpful:

The actual panning algorithm Dolby uses is proprietary, but I suspect it is indeed VBAP or something very similar; for panning without height information, simple pairwise amplitude panning is used.

The distinction between beds and objects is particularly useful for film-style work. The former are identical to channel-based 2.0/5.1/7.1/etc. formats and will play back in a movie theater in the same way. For example, that means any surround information will be played through the entire speaker array (just like a regular 5.1 mix would) instead of individual distinct speakers. Sounds in beds can still move around, but any movement or panning will be baked into the bed instead of handled by the Atmos renderer (similar to how any panning information is baked into a regular stereo mix).

Objects, on the other hand, are discrete mono or stereo channels with attached panning information. The renderer is fed the configuration of the room (i.e. the number of speakers for front/side/back/height) and uses the panning metadata to play the sounds back in real-time. The more speakers, the better spatial accuracy, obviously. For surrounds in particular that can make a big difference since objects can be panned between individual surround speakers (instead of using the entire array as the beds do). As I mentioned, panning between two speakers is just pairwise amplitude panning.
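The real-time behavior described here (time-varying position metadata driving gain computation at playback) can be sketched in miniature. This toy renders one mono object to stereo with a constant-power pan interpolated from keyframed metadata; all names and the keyframe format are illustrative, not Atmos’s actual metadata schema:

```python
import math

def interpolate(keyframes, t):
    """Linear interpolation over sorted (time, pan) keyframes, times in 0..1."""
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return p0 + w * (p1 - p0)
    return keyframes[-1][1]

def render_object(samples, pan_keyframes, block=64):
    """Render one mono object to stereo, re-reading the pan metadata
    once per block (pan 0.0 = hard left, 1.0 = hard right)."""
    out_l, out_r = [], []
    n = len(samples)
    for start in range(0, n, block):
        t = start / max(n - 1, 1)                  # normalized block time
        theta = interpolate(pan_keyframes, t) * math.pi / 2
        gl, gr = math.cos(theta), math.sin(theta)  # constant-power pair
        for s in samples[start:start + block]:
            out_l.append(gl * s)
            out_r.append(gr * s)
    return out_l, out_r
```

With more speakers, the same metadata would instead drive gains for whichever speaker pair (or triplet, with height) encloses the object’s position.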

In addition to panning metadata, each object can also be assigned binaural information: it can be “near”, “mid”, “far”, or “off”. The Atmos binaural renderer uses this together with the panning metadata to render objects binaurally. There is some controversy about this right now, as some Atmos encodings drop this additional metadata (Apple Music does, for example).

For Dolby Atmos mixes that have to be encoded (e.g. for streaming), a final step called Joint Object Coding lumps spatially close objects together to save bandwidth.
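Dolby’s actual Joint Object Coding algorithm is proprietary, but the general idea of lumping spatially close objects can be illustrated with a naive greedy clustering pass (purely a sketch of the concept, not JOC itself; names and the radius parameter are mine):

```python
def dist(a, b):
    """Euclidean distance between two position triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_objects(objects, radius=0.2):
    """Greedily merge objects whose positions fall within `radius`.
    objects: list of (name, (x, y, z)) with coordinates in 0..1."""
    clusters = []  # each: {"centroid": (x, y, z), "members": [names]}
    for name, pos in objects:
        for c in clusters:
            if dist(c["centroid"], pos) <= radius:
                c["members"].append(name)
                n = len(c["members"])
                # move the centroid toward the running mean of its members
                c["centroid"] = tuple(
                    (cc * (n - 1) + pc) / n for cc, pc in zip(c["centroid"], pos)
                )
                break
        else:
            clusters.append({"centroid": pos, "members": [name]})
    return clusters
```

Each resulting cluster would then be transmitted as a single object at the centroid position, trading spatial accuracy for bandwidth.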

The Atmos renderer can render to Ambisonics, but I don’t know of a way to convert from B-format to Atmos. If the B-format mix was created through individual sounds with panning metadata fed to an Ambisonics encoder, theoretically I suppose you could convert to Atmos. But it would sound very different due to the different panning algorithms used.
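That last scenario (an Ambisonic mix assembled from individually panned sources) is easy to see in code: the same per-source direction metadata that would feed an Atmos panner can feed a first-order B-format encoder instead. A sketch using the standard first-order encoding equations in the FuMa convention, with the customary -3 dB on W (function name mine):

```python
import math

def encode_foa(sample, az_deg, el_deg):
    """Encode one mono sample into first-order B-format (FuMa W, X, Y, Z).
    Azimuth is counter-clockwise from front, elevation upward, in degrees."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    w = sample / math.sqrt(2)                 # conventional -3 dB on W
    x = sample * math.cos(az) * math.cos(el)  # front-back
    y = sample * math.sin(az) * math.cos(el)  # left-right
    z = sample * math.sin(el)                 # up-down
    return w, x, y, z
```

The point about the two sounding different stands, though: these spherical-harmonic gains spread energy across the whole array at decode time, whereas VBAP-style panning concentrates it in the nearest speaker pair or triplet.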

Thanks for the information!