Editing is commonly done in text-oriented systems by reference to certain words, e.g., "delete all after word `A` and before word `B`." In sound-oriented systems, such as those dealing with voice messages, it is hard for a user to remember specific word locations, and there is no convenient way in the present state of the art to provide a display of the voice-entered message as an aid to the user. In particular, it is often difficult for a user to achieve a satisfactory time fit of edited signal segments.
The need for an editing capability for voice messages has long been evident in dictation equipment, magnetic sound recording equipment, and sound motion picture equipment. Efforts to meet that need have usually been limited to putting marks on the same, or an associated, record. The marks serve either to allow a manual cut-and-splice operation at a later time, as taught, for example, in the G. R. Glenn U.S. Pat. No. 2,852,616, or to warn a person later transcribing the message to text form where to expect changes, as taught, for example, in the V. Stuzzi U.S. Pat. No. 3,916,121. One laboratory system is known to contemplate extraction of recorded message segments to construct a new segment sequence separately. This is taught by L. H. Nakatani in a paper entitled "Computer-aided Signal Handling for Speech Research," appearing in the Journal of the Acoustical Society of America, Vol. 61, No. 4, April 1977, pp. 1056-1062. In that system, a skilled operator can observe an analog signal wave display of a sound recording, manually position cursors on the display to designate so-called "tokens" for removal, order the marking of cursor positions, and then order the tokens to be extracted and played back in a subsequently selected order. Although the system is useful for speech perception and synthesis research, it is not useful to the unskilled user for message editing.