The present invention relates to the field of speech communication. In particular, it relates to speech processing for speech communication so that a low data transmission rate and substantially intelligible speech communication is achieved.
The recent development of digital communication has significantly impacted the way in which people around the globe communicate. For example, the Internet explosion has changed the lives of many people, both in businesses and as consumers. To some, the Internet is a source of information. To others, the Internet is a medium for communicating sound and/or images of communicating parties. Video conferencing among multiple parties via the Internet is available. Internet telephony is also fast becoming popular.
Whichever of the above modes of communication a business or a consumer chooses to use via the Internet, voice or speech communication inevitably forms a vital component of that mode of communication. One advantage of speech communication is that it is an efficient mode of communication. That is, the communicating parties do not need to write or type to consolidate their thoughts for communication. Another advantage of speech communication is that the "voice personality" of the communicating parties can be communicated. The intonation, pitch, accent, and like qualities of a speaking party can be transmitted to a listening party to evoke a more personal ambience during the communication.
Conventional speech communication schemes are not, however, without their shortfalls. These speech communication implementations via the Internet, whether in conjunction with visual communication such as video conferencing or on their own such as Internet telephony, are based on frame synchronization of communicated speech data. Speech data is obtained by processing speech into a form suitable for communication. The Internet, however, does not provide for synchronized data communication. Hence, frames of speech data that need to be communicated synchronously are not so communicated on the Internet, thereby rendering the speaking party's speech discontinuous when the speech reaches the listening party. Discontinuously communicated speech typically contains interruptions that occur inconsistently and have varying durations. Hence, the effect of the communicated speech on the listening party is at best bothersome and at worst unintelligible. The Internet's inconsistent data transmission rates further compound this problem. At times, the data transmission rate can be lower than the threshold required for reasonably intelligible speech communication. When both problems occur, speech communication fails or becomes unacceptable.
The above adverse effects on speech communication differ in degree for different languages. In languages comprising ideograms, for example the Chinese, Japanese and Korean languages, each spoken ideogram is monosyllabic or consists of a single phoneme. Hence, when conventional speech communication schemes are used for these languages, the listening party is particularly sensitive to the resultant discontinuously communicated speech. The intelligibility of the communicated speech for any of these languages depends heavily on the continuity of the single syllable or phoneme of each spoken monosyllabic ideogram in that language. Clearly, there exists a need for a low data transmission rate, intelligible speech communication scheme for use on an asynchronous communication channel having an inconsistent transmission rate.
Various aspects of the invention are directed to ameliorating or overcoming one or more disadvantages of conventional speech communication schemes. In particular, the aspects of the invention are directed to addressing the disadvantages associated with conventional speech communication schemes for use on an asynchronous communication channel having inconsistent data transmission rates. Furthermore, the aspects of the invention are directed to improving speech communication for languages consisting of ideograms.
In accordance with a first aspect of the invention, there is disclosed a method of processing speech representative of ideograms for speech communication using an asynchronous communication channel. The method includes the step of processing speech units of a speech and data indicative of the speech units. Each speech unit is representative of an ideogram or a plurality of semantically related ideograms, and the data indicative of the speech units is discretely communicable on the asynchronous communication channel for providing substantially low data transmission rate and intelligible speech communication.
In accordance with a second aspect of the invention, there is disclosed a method of processing speech representative of ideograms for speech communication using an asynchronous communication channel. The method includes the steps of: processing meaning groups of a speech and data representing the meaning groups, wherein each meaning group is formed from at least one ideogram identifiable by a meaning and the data representing the meaning group is discretely communicable on the asynchronous communication channel; and processing data dependent on the speech pattern of the speech in relation to one or both of the time and frequency domains, the dependent data communicable on the asynchronous communication channel, whereby substantially low data transmission rate and intelligible speech communication is provided.
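By way of illustration only, the discrete communication of meaning groups together with speech-pattern data may be sketched as follows. The sketch is not the disclosed implementation. The names `MeaningGroupPacket`, `packetize` and `reassemble`, the fields `seq`, `unit_code`, `duration_ms` and `pitch_hz`, and the use of a sequence number to restore order are all assumptions introduced for this example. It shows only that data representing each meaning group can travel as a self-contained item, so that arrival order on an asynchronous channel need not affect the reconstructed utterance.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class MeaningGroupPacket:
    """Hypothetical self-contained record for one meaning group."""
    seq: int          # position of the meaning group in the utterance (assumed)
    unit_code: int    # assumed index identifying the meaning group
    duration_ms: int  # assumed time-domain speech-pattern data
    pitch_hz: float   # assumed frequency-domain speech-pattern data


def packetize(groups: Iterable[Tuple[int, int, float]]) -> List[MeaningGroupPacket]:
    """Turn (unit_code, duration_ms, pitch_hz) tuples into numbered packets."""
    return [
        MeaningGroupPacket(seq, code, duration, pitch)
        for seq, (code, duration, pitch) in enumerate(groups)
    ]


def reassemble(received: Iterable[MeaningGroupPacket]) -> List[MeaningGroupPacket]:
    """Restore utterance order from packets that arrived in any order."""
    return sorted(received, key=lambda packet: packet.seq)


# Three meaning groups sent, then received out of order.
sent = packetize([(101, 250, 180.0), (57, 400, 165.5), (9, 300, 172.0)])
arrived = [sent[2], sent[0], sent[1]]
restored = reassemble(arrived)
```

In this sketch each packet carries everything needed to render its own meaning group, so a late packet delays only that group rather than interrupting a syllable partway through.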
In accordance with a third aspect of the invention, there is disclosed a speech processing device, including: a speech digitizer for processing a speech in an ideographic language and digitized speech thereof, and a semantic processor for processing the digitized speech by processing speech units representative of an ideogram in the speech or a plurality of semantically related ideograms and data indicative of the speech units which are discretely communicable on an asynchronous communication channel for providing substantially low data transmission rate and intelligible speech communication.
In accordance with a fourth aspect of the invention, there is disclosed a speech communication system for an asynchronous communication channel, including: a speech processing device for processing a speech in an ideographic language and digitized speech thereof by processing speech units representative of an ideogram in the speech or a plurality of semantically related ideograms and data indicative of the speech units which are discretely communicable; and a communication controller for communicating the speech information on the asynchronous communication channel for providing substantially low data transmission rate and intelligible speech communication.
In accordance with a fifth aspect of the invention, there is disclosed a computer program product for processing speech for communication on an asynchronous communication channel, including: a computer usable medium having computer readable program code means embodied in the medium for causing the processing of speech representative of ideograms for speech communication, the computer program product having: computer readable program code means for processing speech units of a speech and data indicative of the speech units, wherein each speech unit is representative of an ideogram or a plurality of semantically related ideograms and the data indicative of the speech units is discretely communicable on the asynchronous communication channel for providing substantially low data transmission rate and intelligible speech communication.