1. Technical Field
The present disclosure relates to a speech communication terminal and a connection method used for applications such as Voice over IP (VoIP), and more particularly to technology that switches the mode of a speech codec.
2. Description of the Related Art
In a communication system according to a standard such as 3GPP, call control is conducted when user equipment (hereinafter, UE) conducts communication based on the Internet Protocol (IP). With call control, an IP address and port number to use for communication are exchanged with the communication peer, the speech codec to use for communication is negotiated, and a data pathway is secured, for example. Call control according to 3GPP is conducted over an IP Multimedia Subsystem (IMS). An IMS network is a network for managing information for the purpose of call control, routing call control signal messages (Session Initiation Protocol (SIP) messages), and interconnecting with 3GPP legacy networks and networks other than 3GPP (see, for example, 3GPP TS 23.237 V12.5.0 “IP Multimedia Subsystem (IMS) Service Continuity”).
FIG. 8 is a flowchart illustrating an example of a procedure leading up to VoIP telephony using the 3GPP IMS, for the case of a UE 100 calling a UE 102. As illustrated in FIG. 8, a SIP INVITE message is transmitted over the IMS network from the UE 100 to the UE 102 (ST11), and a SIP 183 Session Progress message is transmitted over the IMS network from the UE 102 to the UE 100 (ST12). In this way, the SIP INVITE message and the SIP 183 Session Progress message are exchanged between the UEs, and a negotiation related to communication is conducted.
A Session Description Protocol (SDP) offer is attached to the SIP INVITE message. The SDP offer states information needed to conduct VoIP communication, such as supported speech codecs and candidates related to the payload format, for example. The UE 102, upon receiving the SIP INVITE message in ST11, selects appropriate media information such as a speech codec from among the multiple candidates stated in the SDP offer, and states the selected media information in an SDP answer. The UE 102 attaches the SDP answer to the SIP 183 Session Progress message and transmits it to the UE 100 in ST12.
The media information selected by the UE 102 is analyzed by the IMS network, and instructions to allocate resources corresponding to the analysis result to the current communication session are output to the IP core network. If there is a communication pathway (General Packet Radio Service (GPRS)) corresponding to the requested resources, a PDP context (or an EPS bearer in the case of the Evolved Packet System (EPS)) is established. Following the instructions from the IMS network, a resource allocation process is conducted on the IP core network and the radio access network (ST13). After the resource allocation process is completed, the UE 102 calls its user, that is, alerts the user to the incoming call (ST14). If the user responds, a 200 OK message is transmitted to the UE 100 (ST15), and telephony is initiated between the UE 100 and the UE 102 (ST16).
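The sequence ST11 to ST16 described above can be summarized as an ordered list of steps. The following is a minimal sketch for illustration only; the names `Step`, `CALL_SETUP`, and `next_step` are hypothetical and do not correspond to any 3GPP-defined interface, and the sender and receiver labels are a simplification of the description.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    label: str     # step label as used in FIG. 8
    sender: str    # entity that initiates the step (simplified)
    receiver: str  # entity that the step is directed at (simplified)
    action: str    # message or process carried out in the step


# Ordered call setup procedure for the case of the UE 100 calling the UE 102.
CALL_SETUP = [
    Step("ST11", "UE 100", "UE 102", "SIP INVITE with SDP offer"),
    Step("ST12", "UE 102", "UE 100", "SIP 183 Session Progress with SDP answer"),
    Step("ST13", "IMS network", "IP core and radio access networks", "resource allocation"),
    Step("ST14", "UE 102", "user of UE 102", "user call"),
    Step("ST15", "UE 102", "UE 100", "200 OK"),
    Step("ST16", "UE 100", "UE 102", "telephony"),
]


def next_step(current_label: str) -> Optional[Step]:
    """Return the step that follows current_label, or None after the last step."""
    labels = [step.label for step in CALL_SETUP]
    index = labels.index(current_label)
    return CALL_SETUP[index + 1] if index + 1 < len(CALL_SETUP) else None
```

For example, `next_step("ST12")` returns the resource allocation step ST13, reflecting that resources are allocated only after the SDP answer has been received and analyzed.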
FIG. 9 illustrates an example of an SDP offer and an SDP answer. In FIG. 9, with the SDP offer, the UE 100 is offering the four modes of (the payload format of) the Adaptive Multi-Rate Wideband (AMR-WB) bandwidth-efficiency mode, (the payload format of) the AMR-WB octet-align mode, the AMR bandwidth-efficiency mode, and the AMR octet-align mode. The UE 102 has selected the AMR-WB bandwidth-efficiency mode.
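The selection made by the UE 102 can be sketched as follows. This is an illustrative example only: the SDP text is not taken from FIG. 9, the port and payload type numbers are assumptions, and the function names `parse_offer` and `select_answer` are hypothetical. The sketch assumes the usual convention for the AMR and AMR-WB payload format, in which the parameter `octet-align=1` selects the octet-align mode and its absence selects the bandwidth-efficiency mode.

```python
import re

# Hypothetical SDP offer (media section only) listing four payload formats.
SDP_OFFER = """\
m=audio 49152 RTP/AVP 97 98 99 100
a=rtpmap:97 AMR-WB/16000/1
a=rtpmap:98 AMR-WB/16000/1
a=fmtp:98 octet-align=1
a=rtpmap:99 AMR/8000/1
a=rtpmap:100 AMR/8000/1
a=fmtp:100 octet-align=1
"""


def parse_offer(sdp):
    """Return (payload type, codec, octet_align) tuples in the offered order."""
    order, codecs, octet_align = [], {}, {}
    for line in sdp.splitlines():
        if line.startswith("m=audio"):
            # Fields after "m=audio <port> <proto>" are the payload type numbers.
            order = [int(pt) for pt in line.split()[3:]]
            continue
        rtpmap = re.match(r"a=rtpmap:(\d+) ([^/]+)/", line)
        fmtp = re.match(r"a=fmtp:(\d+) (.*)", line)
        if rtpmap:
            codecs[int(rtpmap.group(1))] = rtpmap.group(2)
        elif fmtp:
            params = dict(p.strip().split("=", 1) for p in fmtp.group(2).split(";"))
            octet_align[int(fmtp.group(1))] = params.get("octet-align") == "1"
    return [(pt, codecs[pt], octet_align.get(pt, False)) for pt in order]


def select_answer(offer, supported):
    """Return the first offered format that the answering UE supports, or None."""
    for pt, codec, aligned in offer:
        if (codec, aligned) in supported:
            return pt, codec, aligned
    return None
```

With an answering UE that supports the AMR-WB and AMR bandwidth-efficiency modes, `select_answer(parse_offer(SDP_OFFER), {("AMR-WB", False), ("AMR", False)})` picks the first offered entry, which corresponds to the AMR-WB bandwidth-efficiency mode selected by the UE 102 in the description.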
Additionally, changing the speech codec or its mode during telephony requires exchanging IMS signaling messages again, in other words an SDP offer and an SDP answer, to renegotiate the session.
Some speech codecs have multiple modes. The Enhanced Voice Service (EVS) standardized by 3GPP is one such example. According to literature of the related art (S4-130778: EVS-4 Design Constraints), EVS has AMR-WB interoperable modes (hereinafter, interoperable modes), as well as AMR-WB non-interoperable modes (hereinafter, non-interoperable modes or native modes), which include a narrowband (NB) mode, a wideband (WB) mode, a super wideband (SWB) mode, and a full band (FB) mode.
In some cases, these modes may need to be switched in the middle of a session. For example, in some cases, the speech codec changes for one of the UEs during communication due to Single Radio Voice Call Continuity (SRVCC) or SRVCC with ATCF enhancement described in 3GPP TS 23.216 V12.0.0 “Single Radio Voice Call Continuity (SRVCC)”. The case of SRVCC will be described using FIG. 10.
For example, suppose that while two terminals UE 100 and UE 102 are communicating using the EVS SWB mode, the UE 100 performs an SRVCC handover from an LTE coverage area (PS) to a circuit-switching network (CS). Since EVS is not supported on the circuit-switching network, the speech codec used by the UE 100 changes to a speech codec supported by the circuit-switching network (such as AMR or AMR-WB). At this point, it is necessary either to conduct IMS session renegotiation so that the speech codec on the UE 102 side also changes to AMR or AMR-WB, or to skip the session renegotiation and instead perform transcoding at an intermediate gateway (see Japanese Unexamined Patent Application Publication No. 2013-12855 and Japanese Unexamined Patent Application Publication No. 2013-12856). In the case of transcoding, it is desirable from a quality perspective for the bandwidth of both terminals to be aligned. Thus, for example, when the speech codec of the UE 100 switches to AMR, it is desirable to switch to EVS NB mode on the UE 102 side without a session renegotiation. Likewise, when the speech codec of the UE 100 switches to AMR-WB, it is desirable to switch to EVS WB or AMR-WB interoperable mode on the UE 102 side without a session renegotiation.
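The bandwidth alignment described above amounts to a lookup from the codec used on the circuit-switching side to the EVS modes with a matching bandwidth on the UE 102 side. The following is a minimal sketch of that lookup; the mode labels and the names `ALIGNED_EVS_MODES` and `choose_aligned_mode` are hypothetical and chosen only for illustration.

```python
# EVS modes on the packet-switched side whose bandwidth matches the codec
# used on the circuit-switching side after the SRVCC handover.
ALIGNED_EVS_MODES = {
    "AMR": ["EVS NB"],
    "AMR-WB": ["EVS WB", "EVS AMR-WB interoperable"],
}


def choose_aligned_mode(cs_codec, available_modes):
    """Return the first bandwidth-aligned EVS mode available in the session.

    Returns None when no aligned mode is available, in which case a session
    renegotiation or transcoding across different bandwidths would be needed.
    """
    for mode in ALIGNED_EVS_MODES.get(cs_codec, []):
        if mode in available_modes:
            return mode
    return None
```

For instance, if the UE 100 falls back to AMR and the session allows the EVS NB, WB, and SWB modes, `choose_aligned_mode("AMR", {"EVS NB", "EVS WB", "EVS SWB"})` yields `"EVS NB"`.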
Next, a mode switching method will be described. As one method of switching among these modes in the middle of a session without renegotiating by IMS signaling messages, there is a method of including all mode-related information in the RTP payload.
FIG. 11 is a diagram illustrating the structure of an RTP packet. An RTP packet is made up of an IP header, a UDP header, an RTP header, and an RTP payload. The RTP payload is the data portion (payload) carried by the RTP packet. In other words, according to the above method, information about modes may be obtained by checking the content of the RTP payload.
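To make the boundary between the RTP header and the RTP payload concrete, the following sketch splits an RTP packet (with the IP and UDP headers already removed) into its fixed header fields and its payload. The field layout follows the standard 12-byte RTP fixed header; the function name `parse_rtp` is hypothetical, and padding is not handled in this simplified example.

```python
import struct


def parse_rtp(packet: bytes) -> dict:
    """Split an RTP packet into header fields and payload (padding ignored)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP fixed header")
    # Network byte order: 2 flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F
    offset = 12 + 4 * csrc_count  # skip the optional CSRC list
    if byte0 & 0x10:
        # Header extension present: 16-bit profile id, 16-bit length in words.
        _, ext_words = struct.unpack("!HH", packet[offset:offset + 4])
        offset += 4 + 4 * ext_words
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,  # the PT field of the RTP header
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[offset:],    # data portion carried by the packet
    }
```

A receiver following the method described above would inspect the `payload` bytes to learn the mode, since in that method the mode-related information is carried in the payload rather than in the header fields.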
However, AHEVS-272 3GPPSA4-EVS SWG Conference Call #29 (Aug. 29, 2013) proposes a method of putting information in the RTP header rather than putting information in the RTP payload. In other words, as in FIG. 11, first, separate payload type (PT) numbers (97, 98, 99, 100) are assigned to the NB, WB, and SWB AMR-WB non-interoperable modes as well as the AMR-WB interoperable mode and stated in the SDP offer. Next, all modes are selected as a group in the SDP answer. Subsequently, when switching modes becomes necessary, the payload type number corresponding to the mode after the switch is stated in the payload type (PT) field of the RTP header, thereby switching the mode. This method is called payload type (PT) switching.
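Payload type switching can be sketched as follows. This is an illustration under stated assumptions, not the proposal's actual implementation: the pairing of each payload type number with a mode is assumed to follow the order in which the numbers and modes are listed above, and the names `PT_TO_MODE`, `build_rtp`, and `mode_of` are hypothetical.

```python
import struct

# Assumed pairing of payload type numbers and modes, agreed as a group
# through the SDP offer and SDP answer.
PT_TO_MODE = {
    97: "EVS NB",
    98: "EVS WB",
    99: "EVS SWB",
    100: "EVS AMR-WB interoperable",
}
MODE_TO_PT = {mode: pt for pt, mode in PT_TO_MODE.items()}


def build_rtp(mode, seq, timestamp, ssrc, payload):
    """Build an RTP packet whose PT field signals the mode of the payload."""
    pt = MODE_TO_PT[mode]
    # 0x80: RTP version 2, no padding, no extension, no CSRC entries.
    header = struct.pack("!BBHII", 0x80, pt, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def mode_of(packet):
    """Read the mode from the PT field (low 7 bits of the second header byte)."""
    return PT_TO_MODE[packet[1] & 0x7F]
```

A sender switching from the SWB mode to the NB mode would simply start calling `build_rtp("EVS NB", ...)` for subsequent packets; the receiver detects the change from `mode_of(packet)` without any exchange of IMS signaling messages.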