The present invention relates to a natural language processing system, and more particularly to a boundary extracting system and method for extracting the boundary of a clause or phrase, or the boundary between a subject and its predicate, as a breaking point for defining the meaning of a sentence or the pronunciation of its words.
Recently, with the development of voice-synthesizing techniques, reading systems and voice-response systems have been realized. However, conventional systems can generate only mechanical-sounding speech rather than a human-like voice; reading and responses should instead be delivered in a more natural-sounding voice.
To produce a natural-sounding voice, it is necessary to generate an intonation pattern which correctly represents the semantic delimitations of a sentence.
The boundaries representing the semantic delimitations of a sentence can be extracted by describing the grammatical rules of a language, analyzing the syntax of the sentence, and then reading the boundaries off the resulting syntax tree. This method is not widely used in voice synthesis because it requires the whole sentence to be treated as a single syntax tree. A more commonly used method in this field is to extract the boundaries within a sentence by analyzing its syntax phrase by phrase; that is, by building up the analysis of the sentence from smaller units.
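As an illustration of the first approach, the following is a minimal sketch of reading boundaries off a syntax tree once a whole sentence has been parsed. The tree representation (nested tuples), the example tree, and the function names `leaves`, `top_level_boundaries` and `mark_boundaries` are assumptions made for this example only; they are not taken from any particular system.

```python
from typing import List, Tuple, Union

# Hypothetical tree encoding: a node is (label, children); a leaf is a word.
Tree = Union[str, Tuple[str, list]]


def leaves(tree: Tree) -> List[str]:
    """Return the words under a node, in sentence order."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    words: List[str] = []
    for child in children:
        words.extend(leaves(child))
    return words


def top_level_boundaries(tree: Tree) -> List[int]:
    """Return word offsets at which one child of the root ends and the next begins."""
    if isinstance(tree, str):
        return []
    _, children = tree
    boundaries: List[int] = []
    offset = 0
    for child in children[:-1]:  # no boundary after the last child
        offset += len(leaves(child))
        boundaries.append(offset)
    return boundaries


def mark_boundaries(words: List[str], boundaries: List[int]) -> str:
    """Render the sentence with '|' inserted at each boundary offset."""
    out: List[str] = []
    for i, word in enumerate(words):
        if i in boundaries:
            out.append("|")
        out.append(word)
    return " ".join(out)


# Assumed example parse: subject noun phrase followed by its predicate.
SENTENCE: Tree = (
    "S",
    [
        ("NP", ["the", "old", "man"]),
        ("VP", ["reads", ("NP", ["a", "book"])]),
    ],
)

if __name__ == "__main__":
    cuts = top_level_boundaries(SENTENCE)
    print(mark_boundaries(leaves(SENTENCE), cuts))
```

In this sketch the boundary between the subject and its predicate is available only after the complete tree for the sentence has been built, which is the dependency on whole-sentence analysis noted above.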
In the method where a sentence is analyzed from smaller units to extract a phrase boundary, an input sentence is analyzed according to grammatical rules established from linguistic knowledge. Therefore, a great many grammatical rules must be established beforehand, and for a sentence containing an element that is not covered by those rules, no result may be output. Also, for languages that have not been widely studied, it is very difficult to establish systematic grammatical rules. Accordingly, there is a problem in that the clause/phrase boundaries in a sentence cannot always be extracted successfully.
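The limitation described above can be illustrated with a minimal sketch of a rule-based analyzer that groups words into phrases from smaller units. The lexicon, the phrase rules, and the function name `chunk` are toy assumptions chosen for this example; a practical system would need a far larger rule set, which is precisely the difficulty at issue.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon mapping words to part-of-speech tags.
LEXICON: Dict[str, str] = {
    "the": "DET", "a": "DET", "old": "ADJ",
    "man": "N", "book": "N", "reads": "V",
}

# Hypothetical phrase rules: a tag sequence and the phrase label it forms.
# Longer patterns are listed first so that they are tried first.
RULES: List[Tuple[Tuple[str, ...], str]] = [
    (("DET", "ADJ", "N"), "NP"),
    (("DET", "N"), "NP"),
    (("V",), "VP"),
]

Chunk = Tuple[str, List[str]]


def chunk(words: List[str]) -> Optional[List[Chunk]]:
    """Group words into phrases left to right; return None if the rules fail."""
    tags: List[str] = []
    for word in words:
        if word not in LEXICON:
            return None  # element not defined by the rules: no output
        tags.append(LEXICON[word])

    chunks: List[Chunk] = []
    i = 0
    while i < len(words):
        for pattern, label in RULES:
            n = len(pattern)
            if tuple(tags[i:i + n]) == pattern:
                chunks.append((label, words[i:i + n]))
                i += n
                break
        else:
            return None  # no rule covers the tags at this position
    return chunks


if __name__ == "__main__":
    print(chunk("the old man reads a book".split()))
    print(chunk("the man quickly reads a book".split()))
```

Under these assumptions, the first sentence is split into phrases whose edges serve as boundaries, while the second yields no result at all because "quickly" is not covered by the lexicon, even though the rest of the sentence is analyzable.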