1. Technical Field of the Invention
The present invention relates to the field of mobile electronic systems, and more particularly to indoor positioning using the sounds of a music piece or human speech.
2. Prior Art
Sound has been suggested as a medium for positioning. For positioning in a large venue like a shopping mall or a supermarket, an important metric is the range of the sound. To limit the number of sound sources (e.g., loudspeakers), the sound used for positioning preferably has a long range.
Ultrasound, although widely used for ranging, falls short in this respect because it suffers from severe attenuation when transmitted in air. Over a distance of 100 meters, the transmission loss for a 40 kHz ultrasound signal is ~140 dB (FIG. 1). In practice, ultrasound can only be projected to a range of ~15 meters in air. As a result, ultrasound is not suitable for positioning in a large venue.
On the other hand, audible sounds attenuate much less in air. For example, the transmission loss for a 1 kHz audible sound is only ~40 dB over 100 meters (FIG. 1). To be projected to a long range, audible sounds are preferably the sounds of a music piece or human speech, whose volume can be turned up without causing annoyance to humans in the immediate vicinity. Furthermore, large venues are generally equipped with public address (PA) systems, whose loudspeakers are required to provide good acoustic coverage. It would be very attractive to leverage the existing PA systems and use the sounds of a music piece or human speech for positioning in a large venue. Hereinafter, music is used as the primary example for indoor positioning. This concept can be easily extended to human speech.
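The two transmission-loss figures quoted above can be roughly reproduced with a simple model. The following is a minimal sketch, assuming spherical spreading from a 1 m reference distance plus a constant atmospheric absorption coefficient. Neither the model nor the coefficients come from FIG. 1; the function name `transmission_loss_db` and both absorption values are illustrative assumptions, chosen so the results land near the quoted figures. Real absorption varies with temperature and humidity.

```python
import math

def transmission_loss_db(distance_m, absorption_db_per_m, ref_distance_m=1.0):
    """Spherical spreading loss plus linear atmospheric absorption, in dB.

    Assumed model (not taken from the source figure):
      loss = 20*log10(d / d_ref) + alpha * (d - d_ref)
    """
    spreading = 20.0 * math.log10(distance_m / ref_distance_m)
    absorption = absorption_db_per_m * (distance_m - ref_distance_m)
    return spreading + absorption

# Assumed absorption coefficients in dB/m (illustrative only).
ALPHA_1KHZ = 0.005
ALPHA_40KHZ = 1.0

loss_1k = transmission_loss_db(100.0, ALPHA_1KHZ)
loss_40k = transmission_loss_db(100.0, ALPHA_40KHZ)

print(f"1 kHz over 100 m:  {loss_1k:.1f} dB")
print(f"40 kHz over 100 m: {loss_40k:.1f} dB")
```

Under these assumptions, spreading alone accounts for 40 dB at 100 m for both signals, so the gap between the two comes almost entirely from absorption, which grows linearly with distance.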
Although it has many advantages, music-based positioning (MP) faces a difficult challenge. A large venue is filled with background noise and multi-path reflections. Clearly, not every portion of a music piece can be used for positioning. For example, a portion of the music piece that is barely distinguishable from background noise cannot be used. To be suitable for positioning, a musical segment (i.e., a burst) should possess enough uniquely identifiable properties. A figure of merit is its correlativity, which represents the relative strength of its auto-correlation vs. its correlation with other signals. A burst with a large correlativity is relatively uncorrelated with its lagged replica or with background noise. In a music piece or human speech, a burst suitable for positioning is referred to as a signature burst. The auto-correlation function of a signature burst exhibits a distinct central peak with quickly diminishing side-lobe tails.
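The effect of correlativity can be illustrated numerically. The following is a minimal sketch, assuming correlativity is measured as the peak-to-rms ratio of the burst's auto-correlation function (the definition this section attributes to Litwhiler et al.). The function names and the two synthetic test bursts are hypothetical and stand in for real audio samples.

```python
import math
import random

def autocorrelation(burst):
    """Full two-sided auto-correlation; lag 0 sits at index len(burst) - 1."""
    n = len(burst)
    acf = []
    for lag in range(-(n - 1), n):
        k = abs(lag)
        acf.append(sum(burst[i] * burst[i + k] for i in range(n - k)))
    return acf

def correlativity(burst):
    """Peak-to-rms ratio of the auto-correlation function (assumed metric)."""
    acf = autocorrelation(burst)
    peak = max(abs(v) for v in acf)
    rms = math.sqrt(sum(v * v for v in acf) / len(acf))
    return peak / rms if rms > 0 else 0.0

# Two synthetic bursts (hypothetical stand-ins for audio samples).
rng = random.Random(0)
n = 256
noise_like = [rng.uniform(-1.0, 1.0) for _ in range(n)]            # broadband
tone_like = [math.sin(2 * math.pi * 8 * i / n) for i in range(n)]  # pure tone

print("noise-like burst:", round(correlativity(noise_like), 1))
print("tone-like burst: ", round(correlativity(tone_like), 1))
```

The broadband burst has a sharp central peak and small side lobes, so it scores high. The pure tone correlates strongly with its own lagged replicas, so its side lobes stay large and its score stays low.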
Litwhiler et al. (“A simple method for evaluating audible signals for acoustic measurements”, The Technology Interface, Vol. 7, No. 2, Spring 2007) taught a music-based positioning (MP) method. A music file is first sliced into bursts, each 1 s long. The correlativity of each burst is then calculated as the ratio between the peak value and the root-mean-square (rms) value of its auto-correlation function. A burst with a correlativity higher than a pre-determined threshold (e.g., 20) is a signature burst, while the interval between two successive signature bursts is a non-signature interval. In the example illustrated in FIG. 2, there are 19 signature bursts (shown as cross-hatched bars) among the 60 bursts evaluated. The temporal coverage (i.e., the percentage of bursts within a period that are suitable for positioning) is ~30%. The longest non-signature interval is 10 s, which is substantially longer than any signature burst. During a non-signature interval, no positioning can be performed using the musical sounds. Because it provides positioning service only sporadically, music-based positioning (MP) has not been suitable for indoor positioning.
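The classification and coverage statistics described above can be sketched as follows. This is a minimal illustration, assuming the per-burst correlativity values have already been computed. The function name `signature_stats` and the ten example scores are hypothetical and are not the data of FIG. 2.

```python
def signature_stats(scores, threshold=20.0, burst_s=1.0):
    """Classify bursts and summarize how often positioning is available.

    scores    -- one correlativity value per burst, in time order
    threshold -- a burst scoring above this is a signature burst
    burst_s   -- burst duration in seconds
    Returns (flags, temporal_coverage, longest_non_signature_s).
    Runs of non-signature bursts at the start or end of the list are
    counted as gaps too (an assumption of this sketch).
    """
    flags = [s > threshold for s in scores]
    coverage = sum(flags) / len(flags) if flags else 0.0
    longest = run = 0
    for is_signature in flags:
        run = 0 if is_signature else run + 1
        longest = max(longest, run)
    return flags, coverage, longest * burst_s

# Hypothetical correlativity values for ten consecutive 1 s bursts.
example_scores = [25, 5, 5, 30, 22, 4, 3, 2, 21, 6]
flags, coverage, longest_gap_s = signature_stats(example_scores)

print(f"signature bursts: {sum(flags)} of {len(flags)}")
print(f"temporal coverage: {coverage:.0%}")
print(f"longest non-signature interval: {longest_gap_s:.0f} s")
```

Applying the same coverage formula to the counts reported for FIG. 2 (19 signature bursts out of 60) gives 19/60, or about 32%, consistent with the ~30% stated above.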