Digital filtering is employed in many areas, for example, in measurement systems, source coding, echo cancellation etc. A relatively common filter function is decimation.
A decimator is a structure that combines several samples into a single sample. It is typically implemented as an electronic hardware structure (ASICs, FPGAs and the like) or in software (for example on DSPs), and can be used in any environment where sampling is possible. The decimation function typically has two objectives, namely a reduction in the number of samples and an increase in the accuracy of the samples.
Sometimes a side effect of decimation is the most useful property, namely its low pass characteristic. Fast variations between samples ‘disappear’, or better, average out. Although the low pass characteristic really is a side effect, it is possible to make a low pass function without reducing the number of samples.
The way that the decimation function performs this operation is relatively straightforward. The samples are combined and averaged. The increase in accuracy results from averaging out the spread of the samples.
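The averaging operation described above can be illustrated with a minimal Python sketch. The function name `decimate_by_averaging` and the choice to discard an incomplete final block are assumptions made for illustration only; they are not taken from any particular implementation.

```python
def decimate_by_averaging(samples, factor):
    """Combine each run of `factor` consecutive samples into their average.

    Hypothetical helper: an incomplete final block is simply dropped.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# Fast variations between neighbouring samples average out.
print(decimate_by_averaging([10.5, 9.5, 10.5, 9.5], 2))  # [10.0, 10.0]
```

The output contains fewer samples than the input, and the alternating deviation of 0.5 around the value 10 no longer appears in it.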
In electronic and software environments such decimation is a common function, used in many applications. In these fields a few factors typically constrain the decimator: the input sample rate, the output sample rate, and the allowable chip real estate (in hardware) or processing time (in software).
Sometimes these factors are hard to satisfy. For example, the input rate may have a large dynamic range, whereas the output rate does not scale along with it. Such an example can be seen in PLLs, where the reference frequency may lie anywhere from the Hz range up to 10 GHz. Full sample processing at 10 GHz is not feasible with current technologies; only limited processing, such as counting, is feasible at that rate. Up to about 1 GHz, current technologies can properly handle the processing, albeit at the cost of power and complexity.
A large dynamic range in the sampling rate implies that the decimator function may need to be flexible. Normally, flexibility in a decimator requires extra hardware.
A conventional decimator is a structure with a group of memory buffers. The simplest form of decimator decimates by two. A single memory stores a first sample, which is added to a second sample to yield one combined sample. The second sampler runs at half the sampling rate of the first. Such a structure is shown in FIG. 1.
If the circuit needs to be expanded so that three samples are combined, an extra memory and an extra adder operation are added. Such an arrangement is shown in FIG. 2, which illustrates decimation by 3. The adder operation can be shared in the time domain, but that in turn requires multiplexing hardware, which adds overhead.
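The growth in memory and adder operations with the decimation factor can be modelled in software. The following Python sketch is only a behavioural model of the structures described for FIG. 1 and FIG. 2; the class name `DirectDecimator` and its `push` method are hypothetical, and the output is the plain sum, with any division left to a later step.

```python
class DirectDecimator:
    """Behavioural model: n - 1 memories and n - 1 additions per output."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.memory = []  # holds up to n - 1 earlier samples

    def push(self, sample):
        """Accept one input sample; return a combined sample every n-th call."""
        if len(self.memory) < self.n - 1:
            self.memory.append(sample)
            return None  # no output at the slower rate yet
        total = sample
        for stored in self.memory:  # one addition per stored sample
            total += stored
        self.memory.clear()
        return total


decimator = DirectDecimator(3)
print([decimator.push(x) for x in [1, 2, 3, 4, 5, 6]])
# [None, None, 6, None, None, 15]
```

With `n = 2` the model uses a single memory and a single addition; with `n = 3` it uses two of each, which is the scaling that makes large decimation factors costly in this form.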
FIG. 3 shows a decimator that does not need to change its sample rate. In such cases it is typically difficult to share adders, unless the decimator runs at much lower speeds than the system clock.
A decimator that averages, for instance, 128 samples requires a lot of hardware with the above structures. It is possible to change the structure slightly, so that at least the number of adder stages is limited, as is shown in FIG. 4. This structure uses an intermediate integrated value, to which each sample is added and from which, after a delay, it is subtracted again. Thus the contribution of a single sample is only temporary, and at the same time the number of adders is limited to two. If this structure is expanded to 128 memory locations, the number of adders does not increase. The structure allows for a high outgoing sample rate. This can be important in certain applications, although there are probably few applications that do not use sample rate reduction. The structure, however, has a potential flaw if the digital parts are not 100% reliable. For instance, an alpha particle might change the contents of a memory location, and thus create a difference between the contents of the memory delay line and the extra integrator. This is unavoidable, and can only be repaired at high extra hardware or software cost. The structure can easily be extended to any number of stages.
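Both the two-adder property and the reliability flaw can be seen in a small behavioural model. The Python sketch below is an illustration under stated assumptions: the class name `RunningSumFilter`, the zero-initialised delay line, and the simulated bit flip are all hypothetical and are not taken from the figures.

```python
from collections import deque


class RunningSumFilter:
    """Delay line plus integrator: add the new sample, subtract the oldest."""

    def __init__(self, length):
        if length < 1:
            raise ValueError("length must be at least 1")
        # Assumption: the delay line starts out filled with zeros.
        self.delay_line = deque([0] * length, maxlen=length)
        self.integrator = 0

    def push(self, sample):
        oldest = self.delay_line[0]
        self.delay_line.append(sample)       # the oldest entry drops out
        self.integrator += sample - oldest   # one add and one subtract
        return self.integrator


# Normal operation: the integrator tracks the sum of the last 4 samples.
f = RunningSumFilter(4)
print([f.push(x) for x in [1, 2, 3, 4, 5]])  # [1, 3, 6, 10, 14]

# Simulated upset: flip one bit in a single memory location.
g = RunningSumFilter(4)
for x in [1, 2, 3, 4]:
    g.push(x)
g.delay_line[1] ^= 0b1000                    # stored 2 becomes 10
for _ in range(4):
    g.push(0)
print(sum(g.delay_line), g.integrator)       # 0 -8
```

After the corrupted value has left the delay line, the memory holds only zeros, yet the integrator keeps an offset of -8. Nothing in the structure itself removes that offset, which is why repairing it requires extra hardware or software.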
There is another problem with this structure, namely the output divider. Division is simple only when the divisor fits the number representation. For example, a ternary coding scheme allows simple division by 3. However, most digital hardware is based on binary coding, and thus is only simple to use with divisors that are powers of two; in that case the division is a simple shift, which does not cost any hardware. Most applications combine division by powers of two with rate reduction. This can be achieved by using the circuit of FIG. 1 as a repeated module, as shown in FIG. 5. This module can be repeated to reduce the sample rate in binary steps.
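The repeated divide-by-two module can be sketched as follows. This is a minimal Python model assuming integer samples; the function names `halve_rate` and `cascade` are hypothetical. Note that the right shift truncates, so each stage may lose a fraction of a least significant bit.

```python
def halve_rate(samples):
    """One stage: add neighbouring pairs, then divide by two with a shift."""
    return [
        (samples[i] + samples[i + 1]) >> 1
        for i in range(0, len(samples) - 1, 2)
    ]


def cascade(samples, stages):
    """Repeat the stage to reduce the sample rate in binary steps."""
    for _ in range(stages):
        samples = halve_rate(samples)
    return samples


# Three stages reduce the rate by 2 * 2 * 2 = 8.
print(cascade([4, 8, 12, 16, 20, 24, 28, 32], 3))  # [18]
```

In this example the result equals the exact average of the eight inputs (144 / 8 = 18) because no intermediate sum is odd; in general the truncation in each stage introduces a small rounding error.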
Which of the above prior art structures is most attractive for a particular application depends on many factors, and is not easily established. Typical design factors are the process for which the design is intended, the sample rate and the sample size (word size). Microcontrollers and DSPs handle a large block of memory organized in a round-robin structure efficiently, and memory is low cost. Thus the structure shown in FIG. 4 is quite often the most attractive in software. In hardware the cost of memory is typically not negligible, and the structure shown in FIG. 5 will quite often be more attractive. If the word size is very small, hardware in the form of the structure shown in FIG. 1 may in fact be quite attractive, since the total hardware size is small, even for large decimation factors.
For all three structures, however, it is not simple to introduce flexibility. Existing flexible structures normally use a mixed approach, as shown in FIG. 6. This module will normally be designed such that N can only take on powers of two; this somewhat limits the complexity of the module. The structure is of the same type as discussed above, except that the rate change per unit is not 2 but some other number N1, N2, N3, etc.
Relatively common structures have two or three modules, which illustrates the attractiveness of this approach. Even with this structure, however, the flexibility that is typically required demands a considerable amount of programming of the constituent parts. In general this is highly unattractive.
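The mixed approach can be modelled as a chain of stages, each with its own power-of-two factor. The Python sketch below is illustrative only: the function names `decimate_stage` and `mixed_cascade` are hypothetical, integer samples are assumed, and the figure itself is not reproduced here.

```python
def decimate_stage(samples, n):
    """Sum each block of n samples and divide by n with a right shift.

    Assumption: n is restricted to powers of two so the division is a shift.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    shift = n.bit_length() - 1               # log2(n) for a power of two
    usable = len(samples) - len(samples) % n
    return [
        sum(samples[i:i + n]) >> shift
        for i in range(0, usable, n)
    ]


def mixed_cascade(samples, factors):
    """Apply stages with factors N1, N2, N3, ... one after another."""
    for n in factors:
        samples = decimate_stage(samples, n)
    return samples


# N1 = 4, N2 = 2, N3 = 2 gives an overall rate reduction of 16.
print(mixed_cascade(list(range(16)), (4, 2, 2)))  # [7]
```

Changing the overall decimation factor means choosing a new value for every stage, which gives an impression of the programming effort involved even in this simplified model. The exact average of the inputs here is 7.5; the shifts truncate it to 7.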