A video conference endpoint includes a microphone that captures sound from a participant in a room, and the endpoint transmits the captured sound to a conference server or another endpoint. A controller in the endpoint may attempt to track a position of the participant based on an angle of arrival of the sound from the participant at the microphone. The controller computes the angle of arrival based on the speed of sound in air (i.e., air speed), which is a function of air temperature. Typically, the controller assumes a fixed air temperature, e.g., 20° C., and computes the angle of arrival based on that assumption. Often, the actual air temperature differs substantially from the assumed air temperature, so the assumed speed of sound differs from the actual speed of sound. As a result, the computed angle of arrival differs from the actual angle of arrival. This angle error can disrupt participant position tracking in the endpoint.
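The dependence of the angle error on the assumed temperature can be illustrated with a minimal sketch. The sketch rests on assumptions not stated above: a two-microphone, far-field model in which the sine of the angle of arrival equals the speed of sound times the inter-microphone delay divided by the microphone spacing, and the common ideal-gas approximation for the speed of sound in dry air. The function names, the 0.1 m spacing, and the 30° C. room temperature are hypothetical values chosen for illustration only.

```python
import math


def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius.

    Uses the ideal-gas approximation c = 331.3 * sqrt(1 + T / 273.15).
    """
    return 331.3 * math.sqrt(1.0 + temp_c / 273.15)


def angle_of_arrival(delay_s: float, spacing_m: float, temp_c: float) -> float:
    """Angle of arrival in degrees from broadside for a two-microphone pair.

    Assumed far-field model: sin(theta) = c * delay / spacing.
    """
    s = speed_of_sound(temp_c) * delay_s / spacing_m
    s = max(-1.0, min(1.0, s))  # clamp against rounding outside asin's domain
    return math.degrees(math.asin(s))


# Hypothetical scenario: the room is at 30 C, but the controller assumes 20 C.
spacing = 0.1        # microphone spacing in meters (assumed)
actual_temp = 30.0   # actual air temperature in Celsius (assumed)
assumed_temp = 20.0  # temperature assumed by the controller
actual_angle = 60.0  # actual angle of arrival in degrees (assumed)

# Delay that would physically be observed at the actual temperature.
delay = spacing * math.sin(math.radians(actual_angle)) / speed_of_sound(actual_temp)

computed_angle = angle_of_arrival(delay, spacing, assumed_temp)
angle_error = computed_angle - actual_angle
print(f"computed angle: {computed_angle:.2f} deg, error: {angle_error:.2f} deg")
```

In this scenario the controller underestimates the speed of sound and therefore underestimates the angle. Under this model the error grows as the source moves away from broadside, because the arcsine becomes steeper near ±90°.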