One example of a conventional voice browser system having a voice input/output function is the voice-controllable computer proposed in Japanese Patent Laid-Open No. 10-124293, in which the client itself performs voice synthesis and voice recognition. Unfortunately, a voice browser system having this configuration has a problem: when the client is implemented by hardware with limited computational resources, such as a portable terminal, the processing load on the client is too large for those resources.
Accordingly, voice browser systems have been devised that perform voice synthesis and voice recognition on hardware separate from the hardware implementing the client. An example is the browser system or voice proxy server proposed in Japanese Patent Laid-Open No. 11-110186.
In the above conventional voice browser system, however, the browser process for displaying data described in a markup language such as HTML is separated from the process for outputting and inputting voice by voice synthesis and voice recognition. Therefore, the hardware performing voice synthesis and voice recognition and the hardware implementing the client must exchange voice output data and voice input data over a separate communication, in addition to the HTTP or similar communication used to exchange data described in HTML or the like.
This requires complicated communication control, as well as control for synchronizing the individual processes, and hence makes a voice browser system difficult to construct. In addition, a firewall that prohibits all communication other than HTTP is often installed between a client and a server. Since the separate communication for voice data is blocked in this case, a voice browser system of this kind is difficult to construct.