Capturing a room response at one point involves 4 variables: dx, dy, dz and pressure. This might be the only use I would personally wish to put a basic ambisonic microphone to.
The ear detects pressure via the eardrum, but volume velocity interacts with the head at frequencies where the head is wider than maybe 1/8 of a wavelength or so, so it can create pressure at a spot where there would otherwise be no pressure component.
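To put a rough number on that "1/8 wave" figure, here is a minimal sketch of the arithmetic. The speed of sound (343 m/s) and the head widths (0.15 m and 0.18 m) are my assumptions for illustration, not measured values, and the function name is made up for this example.

```python
# Rough arithmetic only: the frequency at which an obstacle of a given
# width spans one eighth of a wavelength.
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 C


def eighth_wave_frequency(width_m, c=SPEED_OF_SOUND_M_S):
    """Return the frequency (Hz) where width_m equals 1/8 of a wavelength."""
    # wavelength = c / f, and width = wavelength / 8, so f = c / (8 * width)
    return c / (8.0 * width_m)


# Assumed head widths in metres, for illustration only.
for width in (0.15, 0.18):
    print(f"{width:.2f} m -> {eighth_wave_frequency(width):.0f} Hz")
```

With those assumed widths the threshold lands somewhere in the low hundreds of Hz, so the interaction would matter over most of the audible range.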
This is part of what Bob O has pointed out, but there's more to the issue of high frequencies in rooms with microphones, speakers, etc. Typically speakers do not radiate the same pattern at all frequencies. This affects the timbre of the room reflections substantially. If you measure frequency response with a long window at high frequencies, the high frequencies are going to be "off", usually in the "too little" direction, even if the direct sound is flat. So measurements must be made at a variety of frequencies. Getting this wrong can lead to several mistakes, including turning up the treble until your earlobes start to bleed, but also "dark" masters because the studio was too "hot". Even if your speakers are flat in both direct sound and power response, the room response can fool you here, and most speakers "aren't even close" to being both direct flat and power flat.
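The long-window effect can be sketched with a synthetic impulse response. Everything here is an assumption for illustration: the 48 kHz sample rate, the decaying-noise "reflections", the crude low-pass standing in for a speaker that radiates less treble off axis, and the two frequency bands. It is a toy, not a model of any real room or speaker.

```python
import numpy as np

FS = 48_000  # assumed sample rate in Hz


def make_room_ir(n=9600, seed=0):
    """Toy impulse response: a flat direct sound plus darker reflections."""
    rng = np.random.default_rng(seed)
    direct = np.zeros(n)
    direct[0] = 1.0  # a unit impulse has a perfectly flat spectrum

    # Assumed reflections: decaying noise that starts 5 ms after the direct sound.
    start = int(0.005 * FS)
    t = np.arange(n - start) / FS
    refl = np.zeros(n)
    refl[start:] = rng.standard_normal(n - start) * 0.1 * np.exp(-t / 0.05)

    # Truncated one-pole low-pass, standing in for reduced off-axis treble.
    a = 0.8
    kernel = (1.0 - a) * a ** np.arange(64)
    refl = np.convolve(refl, kernel)[:n]
    return direct + refl


def band_level_db(ir, window_s, f_lo, f_hi):
    """Mean power (dB) in [f_lo, f_hi) using only the first window_s seconds."""
    seg = ir[: int(window_s * FS)]
    spec = np.abs(np.fft.rfft(seg, n=len(ir))) ** 2  # zero-padded to full length
    freqs = np.fft.rfftfreq(len(ir), 1.0 / FS)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return 10.0 * np.log10(spec[band].mean())


def treble_tilt_db(ir, window_s):
    """High band (8-16 kHz) level minus low band (200 Hz-2 kHz) level."""
    return (band_level_db(ir, window_s, 8000, 16000)
            - band_level_db(ir, window_s, 200, 2000))


ir = make_room_ir()
short_tilt = treble_tilt_db(ir, 0.002)  # 2 ms window: direct sound only
long_tilt = treble_tilt_db(ir, 0.2)     # 200 ms window: direct plus reflections
print(f"short window tilt: {short_tilt:+.1f} dB")
print(f"long window tilt:  {long_tilt:+.1f} dB")
```

In this toy the short window sees only the flat direct sound, while the long window reads as treble-shy because the reflections carry less high-frequency energy. That is the reading that tempts you to turn the treble up.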
There are a lot of interactions to cope with, and it’s easy to oversimplify.