The network entities defined in the H.323 recommendation considered here are endpoints (EP) and gatekeepers (GK). Endpoints comprise terminals, gateways (GW), multipoint control units (MCU), or more generally any entity capable of generating or receiving calls and processing the associated information streams.
The procedures for admitting and setting up calls used in the networks and entities defined in the ITU-T H.323 and H.225 recommendations are particularly sensitive because they use signaling necessary for sending or receiving an audiovisual call.
The current versions of the standards governing the call admission and call set-up functions cover different call set-up models and different organic architectures.
Where call set-up models are concerned, the messages defined in the H.225 and H.323 recommendations are segmented into two functional groups: the Registration, Admission, and Status (RAS) group and the Call Signaling (CS) group. To each of these groups there corresponds a network communication channel. The messages concerned are:
a) For the RAS functions:
        Admission ReQuest (ARQ), Admission ConFirm (ACF), Admission ReJect (ARJ)
        Location ReQuest (LRQ), Location ConFirm (LCF), Location ReJect (LRJ)
b) For the CS functions:
        Setup.
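By way of illustration only, the grouping of messages listed above can be written as a small lookup table. The names Group, MESSAGE_GROUP and group_of are hypothetical and chosen for this sketch; they do not come from the H.225 or H.323 recommendations.

```python
from enum import Enum


class Group(Enum):
    """The two functional groups; each corresponds to its own channel."""
    RAS = "Registration, Admission, and Status"
    CS = "Call Signaling"


# Message abbreviations as listed in the text. The table layout itself is
# an illustrative assumption, not part of the recommendations.
MESSAGE_GROUP = {
    "ARQ": Group.RAS, "ACF": Group.RAS, "ARJ": Group.RAS,
    "LRQ": Group.RAS, "LCF": Group.RAS, "LRJ": Group.RAS,
    "Setup": Group.CS,
}


def group_of(message: str) -> Group:
    """Return the functional group to which a message belongs."""
    return MESSAGE_GROUP[message]
```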
The H.323 recommendation defines two main call set-up modes according to the routing of the messages of the CS group.
A first mode is the Gatekeeper Routed (GKR) mode. Referring to FIG. 1, to call an endpoint EP2, a calling endpoint EP1 sends (step 1) an ARQ message to a gatekeeper GK1 with which it is registered. The gatekeeper GK1 sends (step 2) LRQ messages to and receives LCF responses from the gatekeeper GK2 with which the endpoint EP2 is registered, possibly via other intermediate gatekeepers GKi. Following reception of the LCF response, GK1 sends (step 3) an ACF message to EP1. On reception of the ACF message, EP1 sends (step 4) a Setup message to GK1. The Setup message is then relayed (step 5) between the various gatekeepers until it reaches GK2, which forwards it (step 6) to EP2. On reception of the Setup message, EP2 initiates an admission phase (step 7) with the gatekeeper GK2 with which it is registered. The final phase (step 8) informs EP2 that it is authorized to accept the call.
A second mode is the Direct (DIR) mode. Referring to FIG. 2, the first three phases (steps 1, 2, and 3) are identical to those of the GKR mode. The difference is in step 4 in which the Setup message is sent directly to the called endpoint EP2, which is made possible by the destination address information returned in step 3 in the ACF message. The last two phases (steps 7 and 8) are identical to steps 7 and 8 of the GKR mode.
In both the GKR mode and the DIR mode:
        The choice of the GKR or DIR mode is the responsibility of the gatekeeper and not of the endpoint. The endpoint can express a preference, however.
        The gatekeepers GK1 and GK2 can optionally be interconnected via other gatekeepers (GKi) according to the organic architecture. In the simplest case, EP1 and EP2 are registered with the same gatekeeper GK1 with the result that GK2 coincides with GK1.
        Sending LRQ messages is a dynamic mechanism for locating the destination gatekeeper; this mechanism is optional and can be replaced by management plane functions. Moreover, it is of use only if the two endpoints are not registered with the same gatekeeper.
        Although defined in the standard, the case where an endpoint is not registered with a gatekeeper is not taken into account because it does not correspond to an operational reality for the audiovisual services deployed at present.
        A third call set-up mode is defined in the standard: the hybrid mode. In hybrid mode, certain gatekeepers of the network apply the GKR mode while others apply the DIR mode. Because at least one network element is involved in call set-up, the hybrid mode is functionally similar to the GKR mode.
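The two step sequences described for FIG. 1 and FIG. 2 can be compared side by side in a short sketch. The function name call_setup_steps and the tuple layout are hypothetical. The labels given to steps 7 and 8 are an assumption, since the text describes them only as an admission phase and an authorization. Intermediate gatekeepers and the hybrid mode are not modeled.

```python
def call_setup_steps(mode: str) -> list[tuple[int, str, str, str]]:
    """Return (step, sender, receiver, message) tuples for one call set-up.

    Simplified model: GK1 and GK2 are distinct and directly connected.
    """
    # Steps 1 to 3 are common to both modes.
    admission = [
        (1, "EP1", "GK1", "ARQ"),
        (2, "GK1", "GK2", "LRQ/LCF"),
        (3, "GK1", "EP1", "ACF"),
    ]
    if mode == "GKR":
        # The Setup message is routed through the gatekeepers.
        setup = [
            (4, "EP1", "GK1", "Setup"),
            (5, "GK1", "GK2", "Setup"),
            (6, "GK2", "EP2", "Setup"),
        ]
    elif mode == "DIR":
        # The Setup message goes straight to EP2, using the destination
        # address returned in the ACF message of step 3.
        setup = [(4, "EP1", "EP2", "Setup")]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Steps 7 and 8 (labels assumed): admission of the called endpoint.
    called_side = [
        (7, "EP2", "GK2", "admission request"),
        (8, "GK2", "EP2", "admission confirmation"),
    ]
    return admission + setup + called_side
```

Calling call_setup_steps("DIR") yields a sequence without steps 5 and 6, which reflects the fact that in DIR mode no gatekeeper sees the Setup message.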
Depending on the architecture choices and operational constraints, the functions that process messages of the RAS group and those that process Setup messages can be distributed between different network elements or equipments. As this distribution potentially affects the security of the audiovisual service, three organic architectures are distinguished.
Referring to FIG. 3, in a first architecture AI the RAS and CS-processing functions are integrated into the same equipment 1 with the possibility of complete correlation between fields of RAS and/or CS messages. In this architecture, the equipment 1 is generally a gatekeeper GK.
Referring to FIG. 4, in a second architecture AR the RAS processing functions are implemented in the equipment 1 and the CS processing functions are implemented in an equipment 2. CS messages are sent directly to the equipment 2 or optionally pass in transit via the equipment 1. When they pass in transit via the equipment 1, CS messages can optionally be processed by a CS pre-module in the equipment 1. That pre-module typically provides filtering, address translation, syntax verification or partial correlation with the RAS functions.
Referring to FIG. 5, in a third architecture AF the RAS processing functions are implemented in the equipment 2 and the CS processing functions are implemented in an equipment 3. The equipment 1 serves as a front-end vis-à-vis the endpoint by relaying RAS messages to the equipment 2 and CS messages to the equipment 3. Processing by an RAS pre-module or a CS pre-module can optionally be performed in the equipment 1, such as filtering, address translation, syntax verification or partial RAS/CS correlation. The main role of the equipment 1 is still to protect the equipments 2 and 3 by preventing direct access to them from the endpoint, i.e. the client.
Conforming with the H.323 and H.225 recommendations in their current state is insufficient to immunize H.323 networks at call set-up time against certain attacks seeking to misappropriate an identity and obtain free access to the service. These attacks have been demonstrated by the inventors in a laboratory environment on certain systems conforming to the recommendations from the security point of view.
Where identity theft is concerned, for a given subscriber to an audiovisual service, the attack consists in sending a call with the identity modified, typically by changing the call number assigned by the operator to another number, which may optionally be assigned to a third-party client. In addition to the integrity problem faced by the audiovisual service, this attack can also have consequences on billing if the attacker uses a source number that is the number of an existing client.
By way of illustration, two attack modes A1 and A2 may be mentioned. The A1 mode modifies the calling number in the ARQ message. This is possible in systems in which the RAS function does not correlate the [endpointIdentifier] and [srcInfo] fields of the ARQ message, which enables an endpoint to use a caller number that does not correspond to its registration number. This misappropriated number must then be transferred by the attacker into the [sourceAddress] field of the Setup message. The A2 mode sends an ARQ message that is correct insofar as the [srcInfo] field is concerned and then changes this number to a misappropriated number when sending the Setup message, since this message is the basis for call set-up. This attack is possible if the CS function does not correlate the [sourceAddress] field of the Setup message with the [srcInfo] field of the ARQ message. Although easier to implement in the AR and AF architectures, this attack is also encountered in the AI architecture.
With regard to free access to the service, for an Internet user who does not subscribe to an audiovisual service, the attack sends a call on the network of the operator without being billed for that call. If successful in this, the attacker can generally use any caller number, which can be either an unassigned number or a number assigned to a regular client. This attack generates a loss of revenue for the operator and poses a billing problem if it uses a calling number that has already been assigned.
By way of illustration, two modes A3 and A4 of this attack may be mentioned. In the A3 mode, an attacker sends a Setup message directly to the CS function of the network that is responsible for call set-up. This is the simplest case and generally enables the attacker to set the caller number and the called number to any value. This attack assumes that the CS function does not verify if a previous ARQ message has been sent and that the Internet user actually has the right to send a Setup message. Although easy to implement in the AR and AF architectures, this attack is also encountered in the AI architecture. In the A4 mode, the attacker sends a Setup message following an ARQ/ACF exchange initiated by a corrupted client, i.e. a client that subscribes to the service but collaborates with one or more attackers. Thus the corrupted client sends an ARQ message and then, on reception of the ACF message, forwards to the attacker the information necessary for the attacker to be able to send the Setup message. The network correlates the Setup message with the ARQ message as belonging to the same dialogue, whereas these messages come from two different entities. Moreover the Setup message can contain a misappropriated calling number with the same consequences already referred to. Although easier to implement in the AF architecture, this attack is also encountered in the AR and AI architectures.
There are five main known protection mechanisms, called M1 to M5 below, for combating attacks of type A1 to A4.
The mechanism M1 uses a firewall which, placed on the upstream side of the equipments providing the H.323 RAS and CS functions, primarily filters packets on the basis of information on addresses and ports. In certain cases, this mechanism applies state filtering to H.323 messages, for example authorizing a Setup message only if it is preceded by an ARQ/ACF exchange. This mechanism therefore counters the A3 type of direct attack, but is generally ineffective against more sophisticated attacks that exploit the insufficient correlation between certain fields of ARQ and Setup messages.
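The state filtering described for M1 can be pictured with the following minimal sketch. The class StatefulH323Filter and its methods are hypothetical, and keying the state on the source IP address is an assumption made for illustration; real firewalls may track state differently.

```python
class StatefulH323Filter:
    """Toy state filter: a Setup message is authorized only if an
    ARQ/ACF exchange was previously seen for the same source address."""

    def __init__(self) -> None:
        # Source addresses for which an ARQ/ACF exchange completed.
        self._admitted: set[str] = set()

    def on_acf(self, source_ip: str) -> None:
        """Record that the network confirmed an ARQ from this source."""
        self._admitted.add(source_ip)

    def allow_setup(self, source_ip: str, setup_fields: dict) -> bool:
        """Decide on address state alone.

        setup_fields is deliberately ignored: this filter does not
        correlate any field of the Setup message with the ARQ message.
        """
        return source_ip in self._admitted
```

A direct Setup message from an unknown address (attack A3) is refused, but a registered endpoint that completed the ARQ/ACF exchange can place any value in the Setup fields (attack A2) and still pass, which is the limitation stated above.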
The mechanism M2 is based on using dynamic ports. Because the attack A3 is often made possible by the use of a fixed listening port (usually port 1720) for the CS function of the network, the idea here is to use a different port, possibly one that changes dynamically. That new port is communicated to the regular clients via the ACF message, which assumes that those terminals are registered for the service, unlike an illegitimate Internet user, who would send the Setup message directly to port 1720 of the H.323 CS equipment. This mechanism is effective against the A3 attack but ineffective against more sophisticated attacks. Moreover, it is easy for a corrupted client to send the new value of the port used to one or more attackers, even if that port is dynamic and changes frequently.
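A minimal sketch of the dynamic-port idea follows. The class DynamicPortCS, its method names and the port range 20000 to 30000 are assumptions chosen for illustration; only port 1720 comes from the text.

```python
import random


class DynamicPortCS:
    """Toy CS function that listens on a changing port instead of 1720."""

    WELL_KNOWN_PORT = 1720

    def __init__(self, low: int = 20000, high: int = 30000) -> None:
        # Hypothetical range, chosen so that it never contains port 1720.
        self._low, self._high = low, high
        self._rng = random.SystemRandom()
        self.current_port = self._pick()

    def _pick(self) -> int:
        return self._rng.randint(self._low, self._high)

    def rotate(self) -> int:
        """Switch to a new listening port and return it."""
        self.current_port = self._pick()
        return self.current_port

    def port_for_acf(self) -> int:
        """Port communicated to a registered client in the ACF message."""
        return self.current_port

    def accept_setup(self, dest_port: int) -> bool:
        """Accept a Setup message only on the current listening port."""
        return dest_port == self.current_port
```

The check in accept_setup depends only on the destination port, so anyone who learns the value returned by port_for_acf, for example from a corrupted client, is accepted exactly as a regular client would be.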
The mechanism M3 analyses the degree of correlation between ARQ and Setup messages. The idea is to verify strictly certain fields detected as sensitive within ARQ and Setup messages. A first level of rules verifies intramessage (ARQ and Setup) fields while a second level of rules verifies intermessage (between ARQ and Setup) fields. Thus patent application FR05/02110 proposes improvements to the standard on the security plane by proposing verification rules for ARQ and Setup messages.
Those rules combat attacks A1 to A4 in the context of an AI architecture. However, they are not easy to implement in the AR and AF architectures, in which the RAS and CS functions of the network are not co-located, which makes the intermessage verification level much more complex.
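The two levels of verification can be sketched as follows. This sketch does not reproduce the rules of patent application FR05/02110; it only encodes the two correlations whose absence was described above as enabling attacks A1 and A2. The classes Arq and Setup, the function names and the registration table are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Arq:
    endpoint_identifier: str  # models the [endpointIdentifier] field
    src_info: str             # models the [srcInfo] field


@dataclass
class Setup:
    source_address: str       # models the [sourceAddress] field


def intramessage_ok(arq: Arq, registered_numbers: dict[str, str]) -> bool:
    """First level: within the ARQ message, [srcInfo] must be the number
    registered for [endpointIdentifier] (counters A1)."""
    return registered_numbers.get(arq.endpoint_identifier) == arq.src_info


def intermessage_ok(arq: Arq, setup: Setup) -> bool:
    """Second level: [sourceAddress] of the Setup message must equal
    [srcInfo] of the preceding ARQ message (counters A2)."""
    return setup.source_address == arq.src_info
```

The second function needs both messages at hand, which is straightforward when the RAS and CS functions share one equipment but requires some exchange of state between equipments when they do not.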
The M4 mechanism is based on a witness token that consists essentially of an information element sent and/or relayed between entities. According to this mechanism, the RAS entity that receives the ARQ message associates with the ACF message a random token that has a limited validity period and then verifies that the endpoint reproduces this token correctly in the Setup message. This mechanism is beneficial for combating the A3 attack. However, in the present state of the art, it has a substantial weakness, namely that a corrupted client can communicate this token to one or more attackers, who thereby gain the right to send Setup messages. Moreover, because this token is generally random, or even fixed, in the prior art, it is not satisfactory for correlating the sensitive fields identified by the mechanism M3.
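The prior-art token described above can be sketched as follows. The class TokenIssuer, the 30-second default validity and the 16-byte token length are assumptions for illustration. The current time is passed in explicitly so that the behavior is easy to follow.

```python
import secrets


class TokenIssuer:
    """Toy RAS-side issuer of random tokens with a limited validity."""

    def __init__(self, validity_seconds: float = 30.0) -> None:
        self._validity = validity_seconds
        self._expiry: dict[str, float] = {}  # token -> expiry time

    def issue(self, now: float) -> str:
        """Create the token attached to an ACF message."""
        token = secrets.token_hex(16)  # purely random, carries no meaning
        self._expiry[token] = now + self._validity
        return token

    def verify(self, token: str, now: float) -> bool:
        """Check the token presented in a Setup message.

        Only existence and freshness are checked: the token is bound
        neither to the sender nor to any field of the ARQ message.
        """
        expiry = self._expiry.get(token)
        return expiry is not None and now <= expiry
```

Because verify takes nothing but the token and the time, a token forwarded by a corrupted client is accepted from whoever presents it, and a random value cannot express any relationship between ARQ and Setup fields.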
The mechanism M5 uses an authentication process. Sensitive messages such as ARQ and Setup messages must include authentication of the client. There are many and varied authentication processes, generally based on an identifier and a password specific to each client. The operational drawback of this mechanism is that implementing it requires authentication servers. Moreover, this mechanism must be able to support an increased load if the number of clients is high. In practice, and for performance reasons, it is difficult if not impossible to authenticate all messages. Another drawback, of a more theoretical kind, is that authentication is not linked to the sensitive parameters of the messages, and so it is always possible, even for an authenticated client, to modify certain parameters of the dialogue and thereby create vulnerabilities in the network. Finally, in an extension of the A4 attack, a corrupted client could divulge its authentication parameters to Internet users who do not subscribe to the service.