A particular application of the invention lies in detecting speech, for example in apparatus for concentrating telephone signals by speech interpolation, or for generating background noise intended to reduce subscriber awareness of interruptions in a telephone network where transmission paths between subscribers in communication with each other may be temporarily interrupted, e.g. for echo suppression or for concentrating channels.
Known devices for concentrating telephone signals by speech interpolation as applied to some multichannel telephone links serve to increase the capacity of said links by not transmitting the silences in conversations. These devices thus include speech detectors for identifying speech periods and periods of silence on each of the channels to be concentrated. The performance of such concentrator devices is directly related to the quality of speech/silence discrimination as performed by said speech detectors. In particular, it is desirable that these speech detectors should be as insensitive as possible to noise in order to obtain a high concentration rate, while at the same time being highly sensitive to speech signals in order to avoid chopping speech.
One known method of detecting a speech signal on a telephone transmission channel consists essentially in monitoring the level of the signal on the channel in question. Thus, any signal on the monitored channel whose level over a given time interval is greater than a threshold set above the noise level on the channel is assumed to be a speech signal, while everything else is assumed to be silence. In order to perform such speech signal detection on a channel, it is known to generate a signal which is representative of the average level of the signal present on the channel over a unit time interval and to compare said generated signal with an upper threshold level for noise on the channel. Another known way of performing such speech signal detection on a channel consists in sampling the channel at 8 kHz, in comparing the amplitude of each sample of the signal present on the monitored channel with an upper threshold for noise on the channel, in giving each sample an algebraic "mark" which is a function of the comparison, and in accumulating the marks given to successive samples in order to discriminate between speech and silence on the basis of the accumulated "marks".
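The two known detection approaches described above can be sketched as follows. This is only an illustrative sketch: the function names, the mark values (+1 and -1), the bounded accumulator and the trigger level are assumptions chosen for the example and are not taken from the prior-art devices themselves.

```python
def detect_by_average(samples, threshold, window):
    """Average-level method: one decision per full window of samples.

    The mean absolute amplitude over each window stands in for the
    'average level over a unit time interval' (assumed measure).
    """
    decisions = []
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        level = sum(abs(s) for s in block) / window
        decisions.append(level > threshold)
    return decisions


def detect_by_marks(samples, threshold, up=1, down=-1, limit=8, trigger=4):
    """Mark-accumulation method: one decision per sample.

    Each sample receives an algebraic mark depending on the comparison
    with the noise threshold; the marks are accumulated in a counter
    clamped to [0, limit] (clamping and trigger are assumptions).
    """
    score = 0
    decisions = []
    for s in samples:
        mark = up if abs(s) > threshold else down
        score = max(0, min(limit, score + mark))
        decisions.append(score >= trigger)
    return decisions


if __name__ == "__main__":
    # 8 quiet samples followed by 8 loud ones (arbitrary units).
    signal = [0] * 8 + [10] * 8
    print(detect_by_average(signal, threshold=5, window=4))
    print(detect_by_marks(signal, threshold=5))
```

With the mark-accumulation variant, the decision lags the onset of the loud samples by a few samples, which illustrates why the choice of marks and trigger level affects the risk of chopping the start of speech.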
In order to determine the comparison threshold for use in such speech signal detection, the maximum noise level allowed by the CCITT for a telephone channel, i.e. -40 dBm0, is sometimes taken as the reference noise level. However, the noise level on a telephone channel may be substantially different from this value, and better speech/silence discrimination may be obtained by evaluating the noise level on the channel being monitored and by varying the comparison threshold as a function of the evaluated noise level. Proposals for this technique are also known.
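By way of illustration only, the difference between a fixed reference and a threshold that follows the evaluated noise level may be sketched as below. The function name and the 3 dB margin are hypothetical; the text above does not specify how the threshold is derived from the noise level.

```python
# Maximum noise level allowed by the CCITT, used as the fixed reference.
CCITT_MAX_NOISE_DBM0 = -40.0


def comparison_threshold(noise_estimate_dbm0=None, margin_db=3.0):
    """Return a comparison threshold in dBm0.

    If no noise estimate is available, fall back to the fixed CCITT
    reference; otherwise place the threshold a margin above the
    evaluated noise level (the margin value is an assumption).
    """
    if noise_estimate_dbm0 is None:
        reference = CCITT_MAX_NOISE_DBM0
    else:
        reference = noise_estimate_dbm0
    return reference + margin_db


if __name__ == "__main__":
    print(comparison_threshold())        # fixed reference plus margin
    print(comparison_threshold(-55.0))   # follows a quieter channel
```

On a channel much quieter than the CCITT limit, the adaptive threshold sits lower, so low-level speech that a fixed threshold would classify as silence can still be detected.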
Evaluating the noise level on a telephone channel is also useful for generating background noise in some kinds of telephone network. Thus, when a telephone call uses a four-wire circuit fitted with echo suppressors, and when only one of the subscribers on the call is speaking, the telephone channel in the four-wire circuit for transmitting from the silent listening subscriber towards the speaking subscriber is interrupted in order to prevent echo signals from propagating.
Downstream from the interruption, background noise is injected into the interrupted channel in order to compensate for the background noise normally heard by the currently speaking subscriber and generated by the equipment in the network from which the currently speaking subscriber has been disconnected by the interruption. Evaluating the level of the genuine background noise and matching the level of the injected background noise thereto ensures that the noise level is properly compensated, and thereby substantially reduces the subscribers' perception of said interruptions, which perception interferes with smooth conversation.
The problem of reducing subscribers' perception of temporary interruptions in transmission channels during a call is also present in telephone networks which are equipped with speech interpolator concentration devices. In such devices, an evaluation of the noise level on a channel may be used both to detect speech and to generate background noise.
The essential problem to be solved when the noise level on a telephone channel is to be evaluated on the basis of the signals present on the channel while it is in service is to prevent the speech signals conveyed by said channel, some of which may be at very low level, from interfering with the evaluation. One solution to this problem is proposed in published French patent document No. 2 451 680, corresponding to U.S. Pat. No. 4,331,837, which describes a noise level evaluation circuit for use on a telephone channel, the circuit being integrated in a speech detector. This solution uses two averaging filters, the first of which has a fairly narrow band while the second has a very narrow band. Both filters process the signal present on the channel being monitored during periods of silence as determined by the speech detector. The first filter also processes the signal present on the channel during periods of speech activity as detected by the speech detector, while the second filter is continuously initialised to twice the value of the output signal from the first filter. The noise level of the channel is evaluated as being equal to the last common convergence value of both filters. Such a method of proceeding is based on the fact that the background noise on a telephone channel is generally fairly constant, unlike speech signals which have a highly variable level. Under such conditions, the two filters only converge towards the same value when the background noise has been present on the channel on its own for a relatively long period of time, where the equivalent time constants of the filters are respectively approximately 8 ms and 0.128 ms. The resulting common convergence value is then representative of the average noise level.
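The two-filter principle described above may be sketched as follows. This is a loose illustration and not the circuit of the cited documents: the filters are modelled as first-order recursive averagers, the smoothing coefficients and the convergence tolerance are hypothetical, and it is assumed that the re-initialisation of the second filter to twice the output of the first takes place during detected speech.

```python
def two_filter_noise_estimate(levels, speech_flags,
                              a_fast=0.25, a_slow=0.02, tol=0.05):
    """Return the last value on which both filters converged, or None.

    levels       -- signal level measurements (linear units)
    speech_flags -- True where the speech detector reports speech
    a_fast/a_slow -- smoothing coefficients (assumed values)
    tol          -- relative difference treated as convergence (assumed)
    """
    fast = None      # first filter: runs during speech and silence
    slow = None      # second filter: runs during silence only
    estimate = None
    for level, is_speech in zip(levels, speech_flags):
        fast = level if fast is None else fast + a_fast * (level - fast)
        if is_speech or slow is None:
            # Assumed behaviour: hold the second filter at twice the
            # first filter's output so it must settle from above.
            slow = 2.0 * fast
        else:
            slow += a_slow * (level - slow)
        if not is_speech and abs(fast - slow) <= tol * max(fast, 1e-12):
            estimate = fast
    return estimate


if __name__ == "__main__":
    long_silence = two_filter_noise_estimate([1.0] * 400, [False] * 400)
    short_silence = two_filter_noise_estimate(
        [5.0] * 50 + [1.0] * 20, [True] * 50 + [False] * 20)
    print(long_silence, short_silence)
```

The second call shows the drawback discussed below: after a short pause the slow filter has not yet come down to the level of the fast one, so no estimate is produced.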
Another solution to the same problem is also based on these different characteristics of speech signals and background noise on a telephone channel and is described in published French patent document No. 2 482 389. In this case, a circuit for calculating an average generates a representation of the average level of the signal present on the channel being monitored over a unit time interval. Two sample-and-hold circuits extract and retain the maximum and the minimum values respectively of said representation, said sample-and-hold circuits being re-initialised at the same rate which has a long period relative to said unit time interval. A comparator compares the minimum and maximum values provided by the two sample-and-hold circuits in order to detect moments when the two values satisfy a given relationship determining a small difference between the values. The noise level is then obtained from the minimum values of the representation for which said relationship is satisfied. To perform these operations, the equivalent time constant of the average calculating circuit is 16 ms and the two sample-and-hold circuits are re-initialised at a higher rate while the channel is assumed to be inactive than while it is assumed to be active, e.g. every 256 ms and every 2 s respectively. Said assumed states of the channel correspond to states of a sequential machine which is sequenced under the control of parameters which are a function of the representation of the average signal level present on the channel, with the evaluated noise level playing its part in determining said parameters.
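The minimum/maximum principle described above may be sketched as follows. The sketch is a simplification under stated assumptions: it uses a single fixed re-initialisation period rather than the two rates controlled by the sequential machine, and the "given relationship" between the held values is assumed here to be a bound on their ratio. The function and parameter names are hypothetical.

```python
def minmax_noise_estimate(avg_levels, reset_period, max_ratio=1.5):
    """Return the last held minimum for which max and min were close.

    avg_levels   -- successive outputs of the averaging circuit
    reset_period -- number of averages between re-initialisations
    max_ratio    -- largest max/min ratio accepted as 'small
                    difference' (assumed form of the relationship)
    """
    estimate = None
    last_start = len(avg_levels) - reset_period
    for start in range(0, last_start + 1, reset_period):
        window = avg_levels[start:start + reset_period]
        held_min = min(window)   # minimum sample-and-hold
        held_max = max(window)   # maximum sample-and-hold
        if held_max <= max_ratio * held_min:
            estimate = held_min
    return estimate


if __name__ == "__main__":
    averages = ([1.0, 1.1, 1.05, 1.0]     # steady: noise only
                + [1.0, 6.0, 9.0, 2.0]    # variable: speech present
                + [1.2, 1.3, 1.25, 1.2])  # steady again
    print(minmax_noise_estimate(averages, reset_period=4))
```

Windows in which the level varies widely are rejected, so only intervals where the signal stays nearly constant for a whole re-initialisation period contribute to the estimate.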
With both of these solutions, the noise level is evaluated only during intervals of silence which are fairly long. This can lead to excessive acquisition times, particularly in the case where the signals on a channel include not only speech signals from the subscriber transmitted over the channel, but also echo signals of the speech signals from the other subscriber. In such a case, the noise level will only be evaluated when both subscribers stop talking simultaneously for a considerable length of time. Further, both of the above solutions require periods of speech and periods of silence to be recognised, which means that complicated equipment is necessary, and such equipment may be excessive in contexts where speech detection per se is not called for.
Preferred embodiments of the present invention remedy these drawbacks and enable the noise level on a telephone channel to be evaluated rapidly and accurately by means of simple equipment.