Conventional analogue loudspeakers rely for their operation on the motion of a diaphragm which is driven by some type of electromechanical motor, moving coil being the most common, though electrostatic, piezoelectric and ionisation devices have all been tried and used. The analogue loudspeaker as a whole attempts to reproduce the desired sound by moving all or part of the diaphragm closely in synchronism with a smoothly varying analogue electrical signal which is usually interpreted as representing the instantaneous sound pressure that a listener to the loudspeaker device should hear. The inherent limitations of such analogue loudspeakers are related to: the stiffness of the diaphragms used, the mass of the diaphragms, the linearity and efficiency of and power available from electromechanical motors with adequate bandwidth and limitations on the throw of the diaphragm. These and other factors combine to cause the analogue loudspeaker to operate with low efficiency and relatively high distortion levels.
With the current prevalence of high quality digital audio material available, frequently in 16-bit binary format with an inherent distortion level of close to 0.002%, it is clear that analogue hi-fidelity loudspeaker systems operating close to the 1% distortion level (500 times worse) are now the limiting factor in audio quality when listening to reproduced sound (including radio, television, compact discs (CD), and digital tape). Recent trends in electronic equipment have also been to minimise power consumption, not only to reduce power wastage, but also to reduce equipment operating temperatures thus allowing miniaturisation and high reliability, as well as portability, and allowing operation from small batteries. Again, the linear analogue power amplifier/loudspeaker combination operating at the 0.3% to 1% efficiency level is out of step with these trends. Lastly, even though digital audio source material is now commonplace and becoming increasingly so with the advent of digital radio and television, all conventional hi-fidelity systems for the reproduction of digital source material need to contain a digital to analogue converter (DAC) at some point in the system, to produce analogue signals for application to the analogue loudspeaker. The DACs themselves produce further noise and distortion that adds to that already present in the system, and also add extra cost.
Attempts have been made to develop a digital loudspeaker design that overcomes some or all of the limitations of analogue loudspeakers mentioned above. These fall into several categories: pseudo-digital loudspeakers comprising a digital signal processor driving a standard analogue loudspeaker; moving coil digital loudspeakers with tapped “voice-coils”; and piezoelectric and electrostatic drivers, where the area of the diaphragm is divided into separate regions with binary-related areas.
Pulse-width modulation (PWM) has also been used in the context of “digital loudspeakers”. Here an analogue or digital input signal is converted into a two-level (binary in some sense) digital waveform whose instantaneous mark-space ratio is proportional to the instantaneous value of the input signal, with 50% mark-space ratio corresponding to zero input signal. The frequency of the PWM waveform may or may not be constant, but needs to be much higher than the highest input frequency, and for audio applications this implies it must in practice be greater than about 40 kHz. So long as that criterion is satisfied, the actual frequency is not critical. With a digital input signal, it is possible to produce a PWM waveform entirely digitally. However, when it comes to producing sound output, the PWM signal is applied to conventional linear transducers (e.g. moving coil loudspeakers). The inertia of the transducer causes it to respond to the average value of the PWM waveform (which instantaneously is the same as the mark-space ratio), which in turn is equal to the instantaneous value of the input signal. Sound is then produced corresponding to the input audio signal. As the device relies on the linearity of the transducer, this system has all of the disadvantages of analogue loudspeakers plus some extra ones related to the PWM conversion process, and so is really a digital amplifier technology, not a digital loudspeaker technology. It does have the virtue of higher efficiency than linear amplification.
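The relationship between mark-space ratio and average value described above can be illustrated with a minimal Python sketch. The function names, the choice of 64 time slots per PWM period, and the normalisation of the input to the range -1 to +1 are all assumptions made for illustration; they do not come from any particular PWM amplifier design.

```python
def pwm_encode(sample, steps=64):
    """Return one PWM period as a list of +1/-1 levels.

    `sample` is assumed to be normalised to [-1, 1]; `steps` is a
    hypothetical number of time slots per PWM period.
    """
    if not -1.0 <= sample <= 1.0:
        raise ValueError("sample must lie in [-1, 1]")
    # Zero input maps to half the slots high, i.e. a 50% mark-space ratio.
    high = round((sample + 1.0) / 2.0 * steps)
    return [1] * high + [-1] * (steps - high)


def period_average(period):
    """Average level over one period, which an inertial transducer follows."""
    return sum(period) / len(period)


if __name__ == "__main__":
    for s in (-1.0, 0.0, 0.5):
        print(s, period_average(pwm_encode(s)))
```

In this sketch the average over one period recovers the input sample (to within the resolution set by `steps`), which is the averaging a linear transducer is assumed to perform.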
Most previous attempts at building a digital loudspeaker system have assumed that binary digital code was the digital signal medium, not only at the input of the device but also right through to the output transducers. This causes serious technical problems in practice.
In an n-bit system, the transducer used for the least significant bit (LSB) of the output operates at a power level 2^(n-2) times lower than that of the most significant bit (MSB) transducer (discounting an assumed sign bit included in the n bits). In an 8-bit system (the least that is useful for reasonable sound reproduction) there is thus a 64 times power ratio between MSB and LSB transducers. Because of the necessarily mechanical nature of sound producing devices (sound is a mechanical movement of air) this wide dynamic range imposes serious design constraints on the types of devices used for LSB and MSB transducers, and thus makes matching of the devices very difficult. The problems are much worse when one considers a more realistic 10 or 12 bit system, where the ratios in power levels between the MSB and LSB transducers are then of the order of 250 to 1000 times, and for a 16-bit system the ratio rises to greater than 16000.
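The power ratios quoted above follow from raising 2 to the power (n - 2), which reproduces the 64-fold figure for 8 bits. A short Python sketch of that arithmetic is given below; the function name is hypothetical and the formula is taken directly from the passage, including its assumption that one of the n bits is a sign bit.

```python
def msb_to_lsb_power_ratio(n_bits):
    """Power ratio between MSB and LSB transducers, as 2**(n - 2).

    Assumes, as the passage does, that one of the n bits is a sign bit.
    """
    if n_bits < 2:
        raise ValueError("need at least a sign bit and one magnitude bit")
    return 2 ** (n_bits - 2)


if __name__ == "__main__":
    for n in (8, 10, 12, 16):
        print(f"{n:2d}-bit system: ratio {msb_to_lsb_power_ratio(n)}")
```

For 8, 10, 12 and 16 bits this gives 64, 256, 1024 and 16384 respectively, consistent with the rounded figures in the text.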
In a binary-weighted transducer (or transducer-array) system, there are serious transient problems caused at points where the code changes from a value with many consecutive low order zeroes or ones to the next level (up or down) where there are many consecutive low order ones or zeroes. For example, consider a 9-bit binary code where the signal level changes from decimal 255 (binary 011111111) to decimal 256 (binary 100000000). At this transition, the signal itself has changed by one least significant bit, i.e. a very small change. The binary code representation has changed from a zero plus all-ones code to a one-and-all-zeroes code. The effect of this on a system where the code bits each drive binary weighted transducers (and also binary weighted transducer-arrays as described in U.S. Pat. No. 4,515,997 which does not address this problem) is that in the first state all transducers except the most significant will be on, and in the second state all will be off except for the most significant. Thus two acoustic transitions, each of about half full power, occur at this code change, which will inevitably produce considerable sound energy, even though the code change represents only a least significant bit change in signal amplitude which normally would be expected to be nearly inaudible. Other such ones-to-zeroes and zeroes-to-ones transitions occur throughout the signal amplitude range, and become more of a problem as the total number of bits increases, as the power of the transient increases relative to the system's least significant bit power level. Thus, increasing the resolution of the system by adding more bits makes the problem worse, not better.
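The 255-to-256 example can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and each code bit is assumed to drive exactly one transducer whose weight equals the bit's binary weight.

```python
def switching_transducers(code_a, code_b):
    """Number of bit-driven transducers that change state between two codes."""
    return bin(code_a ^ code_b).count("1")


def switched_weight(code_a, code_b):
    """Total binary weight (in LSB units) of the transducers that change state.

    The XOR of the two codes has a 1 in every changed bit position, so its
    integer value is the sum of the weights of the changed bits.
    """
    return code_a ^ code_b


if __name__ == "__main__":
    for a, b in ((254, 255), (255, 256)):
        print(a, "->", b,
              "transducers switching:", switching_transducers(a, b),
              "weight switched:", switched_weight(a, b))
```

Under these assumptions the step from 254 to 255 switches one transducer of weight 1, whereas the step from 255 to 256 switches all nine transducers with a combined weight of 511 LSB, although the signal itself changes by only one LSB in both cases.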
In addition to the switching transient problem outlined above, there is also a level error associated with such zeroes-to-ones and ones-to-zeroes code changes. This is because in a real system the transducers cannot easily be matched precisely enough that the most significant bit transducer is precisely one least significant bit greater in effective power or amplitude than the sum of all the lesser-bit transducers acting in concert. The same is also true to a smaller extent for the next most significant bit transducer and its lesser-bit transducers acting in concert. Such unavoidable errors can in practice easily dominate the accuracy of the system and quite independently of the transient effects described above can lead to large distortion components. In a binary weighted transducer or transducer-array system (as described in U.S. Pat. No. 4,515,997 which does not address this problem either), only extreme mechanical precision can eliminate this problem even in principle, which will inevitably lead to high manufacturing costs even if the precision required is achievable. In practice, the transducers will necessarily be spatially separated and the matching problems at such transition points then become intractable. In a 16-bit system, compatible with current digital audio standards, it is highly unlikely that the necessary precision could be achieved at any cost.
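The sensitivity of the level error to transducer mismatch can be illustrated with a small Python sketch. The model is an assumption made purely for illustration: an unsigned n-bit code, ideal binary weights for every transducer except the most significant one, and a single fractional error `msb_error` applied to that transducer. The function names are hypothetical.

```python
def output_level(code, weights):
    """Sum the weights of all transducers whose code bit is set."""
    return sum(w for i, w in enumerate(weights) if (code >> i) & 1)


def step_at_major_carry(n_bits, msb_error):
    """Output step (in LSB units) when the code crosses the MSB boundary.

    `msb_error` is an assumed fractional gain error of the MSB transducer;
    all other transducers are taken to be ideal.
    """
    weights = [float(2 ** i) for i in range(n_bits)]
    weights[-1] *= 1.0 + msb_error
    half = 2 ** (n_bits - 1)
    return output_level(half, weights) - output_level(half - 1, weights)


if __name__ == "__main__":
    print(step_at_major_carry(9, 0.0))      # ideal case: one LSB
    print(step_at_major_carry(9, -0.01))    # MSB transducer 1% low
    print(step_at_major_carry(16, 0.001))   # MSB transducer 0.1% high
```

In this simplified model a 1% shortfall in the 9-bit MSB transducer turns the intended +1 LSB step into a step of about -1.56 LSB, so the output falls when the code rises, and a 0.1% excess in a 16-bit system gives a step of roughly 34 LSB instead of 1.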
Another problem not adequately addressed by existing digital loudspeaker designs is that of transducer dynamics and appropriate drive waveforms for producing the desired acoustic sound output waveform. All previous designs appear to make the assumption that the application of a square drive pulse (of voltage or current as appropriate) to the output transducers will produce a square acoustic output pulse. This is almost never the case in practice and leads to serious distortion in the generated acoustic waveform. For example, in the common case where the transducer moving mass is the dominant factor, and the principal forces to be overcome are inertial, then the application of a square drive pulse to such a transducer will produce approximately constant acceleration of the diaphragm which in turn will produce, to first approximation, a triangular or ramped acoustic output pulse, which will continue after the end of the input drive pulse at approximately constant amplitude as the diaphragm continues to “coast” due to its inertia. For the other common case where the diaphragm restoring spring forces are the dominant factor, then the application of a square drive pulse to such a transducer will produce a very rapid initial acceleration of the diaphragm causing it to move quickly to the point where the spring restoring force equals the driving force after which it will overshoot (depending on the damping of the system), and then settle around that point of equilibrium, after which the end of the drive pulse will produce a similar velocity profile in reverse. This motion will produce, to first approximation, a pair of narrow impulsive spike acoustic output pulses of opposite sign, separated by a time interval approximately equal to the input driving pulse length. Only in the case where the dominant forces on the moving mass of the transducer are resistive (e.g. due to friction or viscosity of the air being moved by the diaphragm) will its motion be of approximately constant velocity when driven by a square drive pulse, and only in this case will the output acoustic pulse be approximately of square pulse waveform. What this means in practice is that electrostatic transducers with exceptionally light diaphragms (which constitute the only moving mass in this type of transducer) are the only devices where a square drive pulse might be expected to produce an approximately square acoustic output pulse.
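The three regimes described above can be explored with a simple numerical sketch in Python. It is an illustrative model only: the diaphragm is treated as a single mass with viscous damping and a linear spring, driven by a rectangular force pulse, and the diaphragm velocity is taken as a stand-in for the acoustic output, as the passage implies. The function name, the parameter values and the arbitrary units are all assumptions.

```python
def diaphragm_velocity(mass, damping, stiffness,
                       pulse_steps=200, total_steps=400,
                       dt=1e-4, force=1.0):
    """Velocity of a mass-damper-spring diaphragm driven by a square pulse.

    Integrates mass*x'' + damping*x' + stiffness*x = F(t) with a
    semi-implicit Euler step; F(t) equals `force` for the first
    `pulse_steps` steps and zero afterwards. Units are arbitrary.
    """
    x = v = 0.0
    velocities = []
    for step in range(total_steps):
        f = force if step < pulse_steps else 0.0
        a = (f - damping * v - stiffness * x) / mass
        v += a * dt          # update velocity first
        x += v * dt          # then position, using the new velocity
        velocities.append(v)
    return velocities


if __name__ == "__main__":
    cases = {
        "mass-dominated": diaphragm_velocity(1.0, 0.0, 0.0),
        "spring-dominated": diaphragm_velocity(1e-3, 3.0, 1e4),
        "resistance-dominated": diaphragm_velocity(1e-3, 1.0, 0.0),
    }
    for name, v in cases.items():
        # Sample mid-pulse, end of pulse, just after the pulse, and the end.
        print(name, [round(v[i], 4) for i in (100, 199, 210, 399)])
```

With these assumed parameters the mass-dominated case ramps up during the pulse and then coasts at constant velocity, the spring-dominated case shows a brief positive velocity spike at the start of the pulse and a negative one at its end, and only the resistance-dominated case gives a velocity that is approximately square.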