This invention relates to a media communication system, and to a terminal apparatus and a signal conversion apparatus used in this system. In particular, the invention relates to a media communication system for communication of media such as voice and images via an IP network between terminals constructed so as to be capable of IP communication, as well as to a terminal apparatus and signal conversion apparatus used in this system.
VoIP (Voice over IP) technology, through which telephone communication is implemented by IP communication, has been developed and has reached the product phase in recent years. What characterizes VoIP technology is the fact that a voice signal is transmitted via IP communication. By using IP communication to control a terminal that has been connected to an IP communication network, the provision of more flexible services can be expected. In order to realize VoIP communication, a protocol stipulated by ITU-T Recommendation H.323 has been developed and is now in wide use. In recent years, moreover, consideration has been given to controlling the connection of calls by SIP (Session Initiation Protocol), and this protocol is being studied with a view to implementation.
In order to maintain the quality of voice when voice is transmitted using IP communication, variations in transmission delay of the IP packets that transport voice must not be allowed to exceed a certain fixed value, and techniques for maintaining the network conditions of IP communication for this purpose are being studied widely. Voice communication at a problem-free quality is feasible even at the present time if the communication path for carrying out IP communication has sufficient bandwidth. If this is not the case, however, the state of the art is such that voice quality cannot be maintained satisfactorily. Investigations and research in this area are being conducted aggressively at the present time.
In communication systems in which wireless communication is used up to the terminal, as in cellular telephone systems, the data transfer bandwidth over such wireless segments of the communication path is not large. As a consequence, if communication traffic by way of IP is increased for the purpose of achieving more economical communication, the IP communication traffic in the wireless segments increases and so does delay. In instances where IP communication is performed using an ordinary telephone line, an increase in traffic will make it difficult to carry out high-quality voice communication unless sufficient bandwidth is provided for the IP network.
Thus, there is keen demand for a method or system that will assure voice quality by reducing delays in the transmission of voice IP packets even when there is an increase in IP communication traffic.
The present invention seeks to solve the above-mentioned problem by performing media communication such as voice communication through conventional communication techniques without relying upon IP communication over segments of the transmission path where sufficient bandwidth cannot be acquired. As a result, control of a terminal can be carried out by IP communication while assuring the quality of voice, and it is possible to realize a media communication system that provides the flexible service that is the characterizing feature of VoIP communication. It should be noted that although the present invention is applied to general media communication inclusive of voice communication, the invention will be described with regard to voice communication because limiting the discussion to voice communication will better facilitate an understanding of the invention. However, since voice communication is but one form of media communication, it will readily be understood that the invention can be expanded to cover other media communication. In other words, the present invention is not limited to voice communication.
FIG. 16 is a block diagram showing the configuration of a network for implementing ordinary VoIP communication according to the prior art. Here a VoIP terminal 101 (VoIP terminal A) and a VoIP terminal 102 (VoIP terminal B) are connected to an IP network 130. The VoIP terminal 101 has a control unit 120 for performing connection control and media control, a voice/signal converter 122 for performing a conversion between voice and an electric signal, and an IP packetizing unit 124 having a function for placing a voice signal in an IP packet. The VoIP terminal 101 further includes an IP interface 126 for transmitting a control-signal IP packet to the IP network 130 and receiving a control-signal IP packet from the IP network 130 under the control of the control unit 120, and an IP interface 128 for transmitting a voice-signal IP packet to the IP network 130 and receiving a voice-signal IP packet from the IP network 130. The VoIP terminal 102 has a structure similar to that of the VoIP terminal 101.
A procedure for connecting the call of a VoIP terminal generally is carried out in phases as shown in FIG. 17. The procedure, which is described in Chapter 8 of ITU-T Recommendation H.323 stipulating the H.323 procedure, can be divided into five phases, namely Phase A, Phase B, . . . , Phase E. Each phase will now be described with regard to a case where voice is communicated upon connecting terminals A and B together.
1. Phase A: Call Setup Phase
This phase is a procedure through which agreement is obtained for the purpose of setting up a call between the two terminals. If the VoIP terminal 101, which is the originating terminal, is operated by a user to issue a call, then the VoIP terminal 101 sends a Setup message, which is for setting up the call, to the VoIP terminal 102, and the latter responds to receipt of the Setup message by deciding whether or not to set up the call. If the call is set up, the VoIP terminal 102 notifies the VoIP terminal 101 of call set-up by a Connect message and also reports the address (contact address) that will be necessary in the ensuing Phase B. The details of the procedure of Phase A will now be described with reference to FIG. 18, which is an example that makes use of the H.323 protocol.
First, a connection controller 141 in the control unit 120 of VoIP terminal 101 determines the destination using the IP address of an IP interface 127 of a control unit 121 in the VoIP terminal 102, edits an IP packet that contains a message (Setup message 301) for requesting call set-up and requests the IP interface 126 to transmit this IP packet. The IP interface 126 transmits the IP packet to the IP interface 127 of VoIP terminal 102. In this case, it is required that the control unit 120 know the IP address of the VoIP terminal 102. In order to simplify the description, however, it will be assumed that the control unit 120 already knows this IP address. A method of acquiring an IP address when the IP address is unknown is well known in the art and is described also in the H.323 documentation.
Upon receiving the Setup message 301 from the VoIP terminal 101, the IP interface 127 of VoIP terminal 102 delivers this request message to a connection controller 143 in the control unit 121. The connection controller 143 determines whether a call can be connected in response to the Setup message 301 and, if it determines that the call can be connected, sends back an answer message (Connect message 302) to the VoIP terminal 101. The Connect message 302 reports the IP address and port number to be contacted by the VoIP terminal 101 in Phase B. In the example of FIG. 16, the contact address is constituted by the IP address and port number necessary to communicate with a media controller 144 that exercises control in Phase B. Accordingly, the IP address of the IP interface 127 and the port number for selecting the media controller 144 are reported to the VoIP terminal 101 as the contact address. Upon receiving the above-mentioned answer message, the VoIP terminal 101 in effect agrees with the other terminal to connect the call and delivers the privilege for subsequent control to a media controller 142 (Start 303 in FIG. 18) together with the IP address and port number of the VoIP terminal 102 to be contacted.
2. Phase B: Initial Communication and Capability Exchange
The media controller 142 of VoIP terminal 101 edits information necessary for voice communication, the information including (1) the voice encoding scheme of the VoIP terminal 101, (2) the IP address of the IP interface 128 that sends and receives voice packets, and (3) the port number of the IP packetizing unit 124, and sends this information (Open Logical Channel message 304) to the destination indicated by the IP address and port number of the VoIP terminal 102 that were reported through the procedure of Phase A. Upon receiving this message, the media controller 144 of control unit 121 in VoIP terminal 102 determines whether the voice encoding scheme on the side of VoIP terminal 102 matches the requested encoding scheme. If the schemes match and voice communication is possible, the media controller 144 edits information necessary for voice communication, the information including the IP address of an IP interface 129 of VoIP terminal 102 that sends and receives voice packets, and a port number for selecting a voice IP packetizing unit 125, and sends this information to the VoIP terminal 101 (Open Logical Channel Ack 306). As a result of these operations, information for communicating voice between the VoIP terminals 101, 102 is obtained on both sides.
In a case where SIP (Session Initiation Protocol) is used to make the connection instead of the H.323 protocol, the sequence becomes as shown in FIG. 19. In this case, Phases A and B are consolidated and expressed by a single message. Specifically, the connection controller 141 of VoIP terminal 101 queries the media controller 142 regarding the conditions usable in media communication (401) and, as a result, information necessary for media communication, namely the voice encoding scheme of the VoIP terminal 101, the IP address of the IP packetizing unit 124 and the port number, etc., is obtained. The connection controller 141 sends the call set-up request, which is inclusive of this information, to the connection controller 143 of VoIP terminal 102 (Invite message 402). Upon receiving the Invite message 402, the VoIP terminal 102 determines whether the connection can be established and, if the connection can be established, reports the conditions of VoIP terminal 101 to the media controller 144 of VoIP terminal 102, acquires the conditions (403) on the side of VoIP terminal 102 from the media controller 144 and sends this information to the VoIP terminal 101 by an OK message (404). In order to verify receipt of the OK message, the VoIP terminal 101 transmits an ACK message to the VoIP terminal 102 (416).
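The consolidation of Phases A and B into a single exchange can be sketched as follows. This is a simplified illustration under stated assumptions and not an implementation of SIP: the function name, the dictionary keys, the address values and the rule that the call is refused when the encoding schemes differ are all hypothetical. The sketch shows only that the Invite carries the conditions of the calling side and the OK carries the conditions of the called side.

```python
from typing import Dict, List, Optional

Conditions = Dict[str, object]  # hypothetical container for media conditions


def sip_call_setup(caller: Conditions, callee: Conditions,
                   trace: List[str]) -> Optional[Conditions]:
    """One Invite/OK/ACK exchange; returns the callee's conditions on success."""
    trace.append("Invite 402")   # carries the caller's conditions
    if caller["codec"] != callee["codec"]:
        return None              # assumed rejection rule, not taken from SIP
    trace.append("OK 404")       # carries the callee's conditions
    trace.append("ACK 416")      # caller confirms receipt of the OK message
    return callee


trace: List[str] = []
conditions_a = {"codec": "G.711", "ip": "192.0.2.11", "port": 6000}
conditions_b = {"codec": "G.711", "ip": "192.0.2.12", "port": 6002}
answer = sip_call_setup(conditions_a, conditions_b, trace)
```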
The connection controller 141 of VoIP terminal 101 delivers the information of the OK message to the media controller 142. As a result of these operations, information for communicating media between the VoIP terminals 101, 102 is obtained on both sides.
3. Phase C: Establishment of Audiovisual Communication
The media controllers 142, 144 of the control units in both VoIP terminals notify the IP packetizing units 124, 125 of the destination IP addresses and port numbers, which are for sending and receiving voice packets, acquired through the above-described procedure, and the IP packetizing units 124, 125 start sending the voice signal using the reported IP addresses and port numbers as the destinations. Start messages 305, 307 in FIG. 18 and Start messages 407, 408 in FIG. 19 correspond to the parts of the procedure set forth above.
The voice packet arrives at the IP interface 129 having the set IP address of the destination and the packet is input to the receiving-side IP packetizing unit 125 selected by the specified port number. Next, the IP packetizing unit 125 converts the IP packet to a voice signal and a voice/signal converter 123 converts the voice signal to voice and outputs the same. The voice signal in the opposite direction is transmitted in a similar manner, whereby voice communication becomes possible (308 in FIG. 18 and 409 in FIG. 19).
4. Phase D: Call Service
By changing the IP address of the communicating party to another IP address during a call, it is possible with the communication established in Phase C to change the destination of the connection. Services such as three-party conversation and call transfer are implemented using this function.
5. Phase E: Call Termination
In order to release a connected call, the connection controller 141 on the calling side sends a release request message (Release message 309 in FIG. 18 and Bye message 410 in FIG. 19) to the VoIP terminal 102 on the called side and instructs the IP packetizing unit 124 to terminate the sending of voice (313, 314 in FIG. 18 and 414, 415 in FIG. 19). Upon receiving the release request message, the connection controller 143 of the VoIP terminal 102 instructs the IP packetizing unit 125 to halt the sending of voice (311, 312 in FIG. 18 and 412, 413 in FIG. 19) and sends a message to answer the release request (Release Ack 310 in FIG. 18 and OK 411 in FIG. 19). As a result, the resources that were being used in the connection of the call are released and the call can be disconnected.
If the VoIP terminal 101 that requested release does not receive Release Ack 310 within a fixed period of time, this terminal resends the release request message. This makes it possible to release the call connection reliably even in cases where the message has been lost.
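The resending of the release request can be sketched as a simple retry loop. The fragment below is a minimal illustration and not the procedure of any particular protocol: the function name, the callable that stands in for sending one request and waiting for the fixed period, and the upper limit on the number of attempts are all assumptions introduced for the example.

```python
from typing import Callable


def release_call(send_release: Callable[[], bool], max_attempts: int = 3) -> int:
    """Resend the release request until it is answered; return the attempts used.

    send_release is a hypothetical callable that sends one release request and
    returns True if the answer arrived within the fixed waiting period.
    max_attempts is an assumed limit; the description above states none.
    """
    for attempt in range(1, max_attempts + 1):
        if send_release():
            return attempt
    raise TimeoutError("no answer after %d attempts" % max_attempts)


# Simulated lossy path: the first two requests go unanswered, the third is answered.
outcomes = iter([False, False, True])
attempts_used = release_call(lambda: next(outcomes))
```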
FIG. 20 is a diagram showing the network configuration of a conventional All-IP architecture inclusive of a wireless transmission segment. Here a mobile network inclusive of a wireless segment is constructed by a cellular telephone terminal 51, which is a mobile station, a wireless base station 52, an SGSN (Serving GPRS Support Node) 53 and a GGSN (Gateway GPRS Support Node) 54. GPRS (General Packet Radio Service) is a function of the 3GPP architecture that provides the mobile subscriber with a packet-data service. SGSN 53 and GGSN 54 are both nodes having a gateway function for a 3GPP core network furnished with a packet service. SGSN 53 is provided on the side of the base station, GGSN 54 is provided on the side of an IP network 55 and both send and receive packets in accordance with GTP (GPRS Tunnelling Protocol).
Connected to the IP network 55 in addition to the GGSN 54 are an SIP proxy server 56, which performs connection control, and an IP telephone (IP Tel) 57. Furthermore, a PSTN (Public Switched Telephone Network) 59 is connected to the IP network 55 via a media gateway (MG) 58.
In response to a request from a user to originate a call, the cellular telephone terminal 51 creates an IP packet inclusive of an Invite message in accordance with SIP/TCP/IP and sends the packet to the SIP proxy server 56 via the wireless base station 52, SGSN 53, GGSN 54 and IP network 55 in the order mentioned. The SIP proxy server 56 obtains the IP address of the communication destination based upon information concerning the communicating party contained in the Invite message and sends the Invite message to this communication destination. If connection to the cellular telephone terminal 51 is possible, the communication destination sends an IP packet, which includes an OK message, to the cellular telephone terminal 51 by way of the SIP proxy server 56. The cellular telephone terminal 51 thenceforth places voice in an IP packet in accordance with RTP/UDP/IP and transmits this IP packet to the communicating party via the wireless base station 52, SGSN 53, GGSN 54 and IP network 55 in the order mentioned. A voice packet from the communicating party is received, converted back to a voice signal and output. It should be noted that RTP stands for Real-time Transport Protocol.
Thus, in conventional VoIP communication, the VoIP terminal is connected to the server (the SIP proxy server in FIG. 20) and control signals are exchanged by the server and terminal to effect the connection between them. In this case, the server need only be connected to the IP network and therefore connection control can be carried out utilizing any server that does not depend upon a telephone company. This is advantageous in that flexible service can be provided.
In summation, the following advantages (1) to (3) are obtained in accordance with VoIP communication of the conventional All-IP architecture:
(1) End-to-end control is possible. Service can be implemented by the functionality of a terminal or by the functionality of a node (server) that is independent of a network.
(2) Because end-to-end control can be carried out, a mechanism for service implementation can be constructed independently of an IP network. Further, functions for implementing service can be utilized in common by various communication networks and, hence, the cost of service implementation can be reduced.
(3) If IP data communication increases, so does data traffic, and a strategy is instituted to increase the capacity of the IP network to cope with this. As the capacity of the IP network is enlarged, the communication resources needed for special communication such as voice communication become small in comparison with the capacity possessed by the IP network, and it becomes unnecessary to set aside resources for such communication.
Nevertheless, VoIP communication of the conventional All-IP architecture has certain problems, which are as follows:
(1) Human beings are sensitive to voice quality. It is necessary, therefore, to provide a high IP-communication quality in order to avoid a decline in voice quality caused by delay of IP packets. With the prior art, however, measures for dealing with delay of IP packets are unsatisfactory and high-quality communication of voice cannot be achieved. The decline in quality due to delay is especially great when there is a segment in the communication path that does not possess sufficient bandwidth for data transmission.
(2) An IP packet is composed of a header and payload, and overhead resulting from the header is large. The problem that arises is that efficient communication cannot be carried out in the case of voice communication, where the amount of data in one IP packet is small. More specifically, with voice communication, it is necessary to send IP packets at short intervals (e.g., 20 ms), so that each packet carries little voice data, and sophisticated functionality is therefore needed to compress the header. This is not easy to furnish.
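The size of this overhead can be estimated with a short calculation. The sketch below assumes voice carried by RTP/UDP/IP using IPv4 without options (20 bytes), UDP (8 bytes) and the fixed RTP header without CSRC entries or extensions (12 bytes); the two codec bit rates are illustrative assumptions, link-layer headers are ignored, and the function names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes, fixed RTP header without CSRC entries or extensions


def payload_bytes(codec_bit_rate: int, interval_ms: int) -> int:
    """Voice bytes produced by the codec during one packet interval."""
    return codec_bit_rate * interval_ms // (8 * 1000)


def header_share(codec_bit_rate: int, interval_ms: int = 20) -> float:
    """Fraction of each voice IP packet that is occupied by headers."""
    headers = IPV4_HEADER + UDP_HEADER + RTP_HEADER
    payload = payload_bytes(codec_bit_rate, interval_ms)
    return headers / (headers + payload)


share_8k = header_share(8_000)    # assumed 8 kbit/s codec: 20-byte payload
share_64k = header_share(64_000)  # assumed 64 kbit/s codec: 160-byte payload
```

Under these assumptions the 40 bytes of headers account for two thirds of each packet at 8 kbit/s and for one fifth at 64 kbit/s, which illustrates why the overhead weighs most heavily on low-bit-rate voice.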
(3) In instances where radio communication is used, as in the case of a cellular telephone, the data transmission bandwidth over the wireless segment of the transmission path is small. Delay over such a segment is large and degrades voice quality. In networks that have wireless segments, therefore, sophisticated techniques are required to implement IP communication with high quality. This is not easy.
(4) In order to raise the voice quality of a cellular telephone, the signal is transmitted upon being separated into a portion that is important for voice and a portion of lesser importance. If the same method is employed with communication of IP packets, however, the amount of data contained in one IP packet diminishes even further and efficient transmission is difficult to accomplish.
(5) As mentioned above, it is necessary to send IP packets at small intervals (e.g., 20 ms) in order to transmit voice using IP packets. As a consequence, the number of IP packets sent over a fixed period of time is large and the routers that transfer the IP packets require a high processing capability. In particular, when a firewall is used to maintain security, processing for verifying all voice IP packets is required. The result is an increase in amount of processing, making a high-performance firewall necessary. This raises cost.
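The packet rate implied by the 20 ms interval can be worked out directly. In the sketch below the function names and the figure of 1,000 simultaneous calls are hypothetical, and it is assumed that each call consists of two voice streams, one per direction, both of which pass through the same firewall.

```python
def packets_per_second(interval_ms: int = 20) -> float:
    """Voice packets sent per second in one direction of one call."""
    return 1000 / interval_ms


def firewall_load(concurrent_calls: int, interval_ms: int = 20) -> float:
    """Voice packets per second to be verified, assuming two streams per call."""
    return 2 * concurrent_calls * packets_per_second(interval_ms)


per_stream = packets_per_second()      # packets per second for one stream
load_1000_calls = firewall_load(1000)  # packets per second for 1,000 calls
```

With these assumptions a single stream produces 50 packets per second, and 1,000 simultaneous calls present the firewall with 100,000 voice packets per second regardless of how little voice data each packet carries.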
(6) Though VoIP devices have proliferated, conventional voice telephone equipment is still far more prevalent. This means that it is necessary to utilize conventional voice telephone facilities efficiently. However, such facilities cannot be exploited satisfactorily with the conventional All-IP architecture.