US 12,170,097 B2
Detection of audio communication signals present in a high noise environment
Scott Sommerfeldt, Mapleton, UT (US); Curtis Lynn Garner, Provo, UT (US); Jonathan Daren Blotter, Heber City, UT (US); and David Charles Copley, Peoria, IL (US)
Assigned to Caterpillar Inc., Peoria, IL (US)
Filed by Caterpillar Inc., Deerfield, IL (US); and Brigham Young University, Provo, UT (US)
Filed on Aug. 17, 2022, as Appl. No. 17/889,871.
Prior Publication US 2024/0062774 A1, Feb. 22, 2024
Int. Cl. G10L 25/84 (2013.01); G10L 21/0216 (2013.01)
CPC G10L 25/84 (2013.01) [G10L 21/0216 (2013.01); G10L 2021/02166 (2013.01)]
20 Claims
 
1. A system, comprising:
a plurality of error microphones disposed in a predetermined pattern, each error microphone of the plurality of error microphones operational to:
capture a respective audio signal including a speech, and
generate a respective captured audio signal;
one or more reference sensors operational to capture a reference noise signal from a noise source;
a processor communicatively coupled to the plurality of error microphones and the one or more reference sensors; and
memory communicatively coupled to the processor, the memory storing computer-executable instructions that, when executed by the processor, cause the processor to perform operations, comprising:
generating a plurality of partially processed audio signals by removing at least a portion of the reference noise signal from each captured audio signal;
generating a plurality of signal pairs by pairing partially processed audio signals;
for each signal pair of the plurality of signal pairs, generating a respective rotated angular domain cross-correlation vector based, at least in part, on a physical angle associated with locations of a pair of error microphones associated with the signal pair;
generating a summed angular domain cross-correlation vector by summing the rotated angular domain cross-correlation vectors;
generating a weighted angular domain vector by applying a weighting vector to the summed angular domain cross-correlation vector; and
identifying directional information of a desired audio signal associated with the speech from the weighted angular domain vector.
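
The operations recited in claim 1 can be illustrated with a minimal sketch in Python with NumPy. It is not the patented implementation, and everything below is an assumption made for illustration: the function names (remove_reference, rotated_angular_xcorr, estimate_direction), a two-dimensional microphone layout, a far-field source, a speed of sound of 343 m/s, a one-coefficient least-squares subtraction standing in for whatever noise removal the system actually uses, a 360-bin angular grid, and a uniform default weighting vector.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the claim


def remove_reference(captured, reference):
    """Subtract the least-squares projection of the reference noise from each channel.

    Assumption: a single gain per channel stands in for the claimed noise removal.
    """
    captured = np.asarray(captured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    gains = captured @ reference / (reference @ reference)
    return captured - np.outer(gains, reference)


def rotated_angular_xcorr(sig_a, sig_b, pos_a, pos_b, fs, n_angles=360):
    """Cross-correlate one pair, map lag to angle, then rotate by the pair's axis angle."""
    xcorr = np.correlate(sig_a, sig_b, mode="full")
    lags = np.arange(-(len(sig_b) - 1), len(sig_a))  # lag in samples for each xcorr entry

    axis = np.asarray(pos_b, dtype=float) - np.asarray(pos_a, dtype=float)
    spacing = np.hypot(axis[0], axis[1])

    # Angular domain: for each angle measured from the pair axis, look up the
    # cross-correlation at the lag a far-field source at that angle would produce.
    local_angles = 2.0 * np.pi * np.arange(n_angles) / n_angles
    expected_lags = spacing * np.cos(local_angles) * fs / SPEED_OF_SOUND
    local = np.interp(expected_lags, lags, xcorr)

    # Rotation: shift by the physical angle of the pair axis so that every
    # pair shares one global angular frame.
    pair_angle = np.arctan2(axis[1], axis[0])
    shift = int(round(pair_angle / (2.0 * np.pi) * n_angles))
    return np.roll(local, shift)


def estimate_direction(captured, reference, mic_positions, fs, weights=None, n_angles=360):
    """Return (estimated source angle in degrees, weighted angular domain vector)."""
    cleaned = remove_reference(captured, reference)  # partially processed signals
    summed = np.zeros(n_angles)
    for i, j in itertools.combinations(range(len(cleaned)), 2):  # signal pairs
        summed += rotated_angular_xcorr(
            cleaned[i], cleaned[j], mic_positions[i], mic_positions[j], fs, n_angles
        )
    if weights is None:
        weights = np.ones(n_angles)  # assumed uniform weighting vector
    weighted = summed * weights
    return 360.0 * np.argmax(weighted) / n_angles, weighted
```

In this sketch a single microphone pair cannot tell a source from its mirror image about the pair axis, because both give the same lag. Summing the rotated vectors from pairs with different axes is what leaves one dominant peak. The weighting vector could, for example, suppress angular sectors where no talker is expected; the claim does not say how the weights are chosen, so the uniform default here is only a placeholder.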