Hi Thomas, I think that’s a great point. I went through BS.2051-3. I honestly wish there were a clearer standard when it comes to immersive audio. There are quite a few different formats out there and they all work slightly differently.
When it comes down to it, you have two ways of describing immersive panning: either in a shoebox format (X, Y, Z coordinates) or on a sphere (azimuth, elevation). VBAP works quite well, but it would be great if there were a standard that could translate as accurately as possible between the formats, so that you could decide on your intent once and have it translated automatically into the different formats, without having to author them separately.
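To make the two descriptions concrete, here is a rough Python sketch of the plain trigonometric conversion between them. The axis convention is my assumption (azimuth 0° = front and positive to the left, elevation positive upward, X = right, Y = front, Z = up), and the function names are just for illustration. It maps onto a sphere, not a true shoebox, so it is not the mapping any particular format or renderer uses. That mismatch is really the translation problem I mean.

```python
import math

def spherical_to_cartesian(azimuth_deg, elevation_deg, distance=1.0):
    """Convert (azimuth, elevation, distance) to (x, y, z).

    Assumed convention: azimuth 0 = front, positive = left;
    elevation positive = up; x = right, y = front, z = up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = -distance * math.sin(az) * math.cos(el)
    y = distance * math.cos(az) * math.cos(el)
    z = distance * math.sin(el)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """Inverse of spherical_to_cartesian under the same assumed convention."""
    distance = math.sqrt(x * x + y * y + z * z)
    if distance == 0.0:
        # Direction is undefined at the origin; return a neutral front position.
        return 0.0, 0.0, 0.0
    azimuth = math.degrees(math.atan2(-x, y))
    # Clamp to guard against rounding pushing the ratio just outside [-1, 1].
    ratio = max(-1.0, min(1.0, z / distance))
    elevation = math.degrees(math.asin(ratio))
    return azimuth, elevation, distance

if __name__ == "__main__":
    # A source 30 degrees to the left at ear height.
    print(spherical_to_cartesian(30.0, 0.0))
```

The round trip is lossless on the sphere, but a position authored in the corner of a shoebox has no exact equivalent on the unit sphere, so a real translation standard would have to define how that case is handled.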
This would be similar to what AES67 did for Ravenna and Dante. Right now, I feel you either mix for the different formats separately, use JJ’s approach, or deal with generic up-mixers, and I’m not sure any of them is 100% ideal. Also, a standard way of describing immersive audio would fulfil the BS.2051 goal, which is to make sure it will be possible to read a “master” many years from now. We can only do that with an open format, don’t you think?