Generally, an information device such as a PC (Personal Computer) or WS (Work Station) has a control unit such as a CPU (Central Processing Unit), which performs various types of data processing by sequentially executing program codes loaded into a working area of a storage unit such as a RAM (Random Access Memory). In recent years, illegal accesses to information devices have been occurring through unauthorized execution of program codes created by malicious users.
Data managed by the information devices are generally assigned access rights. Accordingly, illegal access will not occur as long as the user has no access right. However, illegal access becomes possible if a general program code that has been granted access rights to the data is made to perform an unauthorized operation. One known technique of this sort uses the so-called buffer overflow, in which data written during execution of the program codes overflows a predetermined area allocated in the RAM or a similar storage.
The illegal access using the buffer overflow will now be described in detail by comparing the case where the program codes are properly run with the case where the program codes are run with an illegal action.
First, a model case where the program codes are properly run will be explained with reference to FIG. 9 and FIG. 10.
FIG. 9 is an illustration of an example source code of a program code. The source code in FIG. 9 starts with main( ) function, wherein strcpy_helloworld( ) function is called in main( ) function. In the source code, a character string “Hello!” is prepared in strcpy_helloworld( ) function, and is displayed on a monitor, computer screen and the like with the aid of printf( ) function in main( ) function. Note that the function herein means a subroutine or a subprogram having predetermined functionalities gathered into a module, and can be called in the main program as required.
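FIG. 9 is not reproduced in this text. As a rough sketch of what the description above implies, the following C fragment shows a strcpy_helloworld( ) function that copies “Hello!” one byte at a time. The pointer parameter and the loop body are assumptions made for illustration, and the actual listing in FIG. 9 may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction: copies "Hello!" plus its null terminator
 * (7 bytes in total) into the caller's buffer, one byte per iteration.
 * The caller must supply at least 7 bytes; char buf[8] is sufficient. */
void strcpy_helloworld(char *buf)
{
    static const char src[] = "Hello!";
    int i;

    assert(buf != NULL);
    for (i = 0; src[i] != '\0'; i++) {
        buf[i] = src[i];      /* i is incremented once per copied byte */
    }
    buf[i] = '\0';            /* terminate the copied string */
}
```

Under these assumptions, main( ) would declare char buf[8], call strcpy_helloworld(buf), and pass buf to printf( ).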
The information about running the program code illustrated in FIG. 9 is stored in a call stack area allocated in the RAM. In the call stack area, the data are stored according to a LIFO (Last In First Out) or FILO (First In Last Out) structure.
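The LIFO behavior can be sketched with a small array-backed stack. The type CallStack and the functions push_frame and pop_frame are hypothetical names used only for this illustration. A real call stack is managed by the CPU and the compiled code, not by a data structure like this one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16

typedef struct {
    const char *frames[MAX_FRAMES];  /* label of each stacked frame */
    size_t depth;                    /* number of frames currently stored */
} CallStack;

/* Pushes a frame label; returns 0 on success, -1 if the stack is full. */
int push_frame(CallStack *s, const char *label)
{
    assert(s != NULL);
    if (s->depth == MAX_FRAMES) {
        return -1;
    }
    s->frames[s->depth++] = label;
    return 0;
}

/* Pops the most recently pushed label; returns NULL if the stack is empty. */
const char *pop_frame(CallStack *s)
{
    assert(s != NULL);
    if (s->depth == 0) {
        return NULL;
    }
    return s->frames[--s->depth];
}
```

If a frame for main( ) is pushed first and a frame for strcpy_helloworld( ) second, the first pop yields the strcpy_helloworld( ) frame and the second yields the main( ) frame, which is the last-in, first-out order.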
FIG. 10 illustrates an example outline of the call stack area while the program code is running. As illustrated in FIG. 10, information about running the program code is stored downward into sequentially allocated areas.
For example, in the program code based on the source code illustrated in FIG. 9, ReturnAddress1, ebp backup1, char buf[8] and so forth are stored as a single stack frame when main( ) function is processed. Next, when strcpy_helloworld( ) function is called in main( ) function, ReturnAddress2, ebp backup2, int i and so forth are likewise stored as a single stack frame.
ReturnAddress (return address) indicates an address value to which the process is to return upon the completion of a program. Return address is also an address value to which the process is to return upon completion of the subprogram or a function being called. ReturnAddress is automatically stored in the stack area by the CPU immediately after the execution of the program or immediately after the subprogram, i.e. function, is called.
Ebp is one type of CPU register, and indicates an address which is located just before the address of a temporary memory area currently being used. In the example illustrated in FIG. 10, ebp indicates addresses before char buf[8] and int i, and ReturnAddresses in the stack frame correspond to ebp. Accordingly, in the stack frame, by backing up the register value as ebp backup, an area in the stack frame can be easily used as a memory area for storing temporary data (variables and arrays).
In char buf[8], the array buf used in main( ) function is stored. The array buf can hold eight char-type (1-byte) variables, that is, 8 bytes of data. Int i is an int-type variable used in strcpy_helloworld( ) function.
Accordingly, by executing the above-mentioned program code, a stack frame for main( ) function is allocated in the stack area and a stack frame for strcpy_helloworld( ) function is allocated in the stack area. Then, the value of int i is sequentially incremented and the 6-byte character string “Hello!” is to be stored in char buf[8]. Next, upon completion of strcpy_helloworld( ) function, ReturnAddress2 is read out to return to main( ) function, and upon completion of main( ) function, ReturnAddress1 is read out and the program is terminated normally.
Next, a case where an illegal act takes place when the program code is executed will be described with reference to FIG. 11 and FIG. 12.
FIG. 11 is a schematic diagram illustrating an example of source code of the program code. The program code illustrated in FIG. 11 differs from the program code illustrated in FIG. 9 in that 12-byte data (11 characters plus the null terminator of the character string) “Hello World” is written into the array buf.
FIG. 12 is a schematic diagram illustrating an outline of the stack area when the program codes are executed. As illustrated in FIG. 12, an 8-byte area is allocated in the stack area for storing the buf array. Accordingly, when strcpy_helloworld( ) function is executed, data larger than the allocated size is written. A buffer overflow thus occurs when data larger than the allocated size of a memory area is written into that area. Note that a buffer overflow which occurs in the stack area is referred to as a stack-based buffer overflow.
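The size mismatch described above can be expressed as a simple check performed before copying. The following is a minimal sketch; overflow_bytes and checked_copy are hypothetical helper names and do not appear in FIG. 11.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns how many bytes of a data_size-byte write would fall outside a
 * buffer of buf_size bytes (0 if the data fits). */
size_t overflow_bytes(size_t data_size, size_t buf_size)
{
    return data_size > buf_size ? data_size - buf_size : 0;
}

/* Copies src into dst only if it fits. Returns 0 on success, or -1
 * (leaving dst untouched) if the copy would overflow dst. */
int checked_copy(char *dst, size_t dst_size, const char *src, size_t src_size)
{
    assert(dst != NULL && src != NULL);
    if (overflow_bytes(src_size, dst_size) > 0) {
        return -1;
    }
    memcpy(dst, src, src_size);
    return 0;
}
```

With the values in the text, overflow_bytes(sizeof "Hello World", 8) is 4, because the string occupies 12 bytes including its terminator, while overflow_bytes(sizeof "Hello!", 8) is 0.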
In the stack area illustrated in FIG. 12, areas are allocated sequentially downward, from the top to the bottom, whereas data is written into an allocated area upward, from the bottom to the top. Accordingly, if strcpy_helloworld( ) function is executed, ReturnAddress1 and ebp backup are illegally overwritten with the 12-byte data “Hello World”. When this overwriting replaces the content of ReturnAddress1 with an address where executable program code is located, ReturnAddress1 is read out upon completion of main( ) function and the program code at that address is executed.
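The upward write can be simulated safely with a flat byte array standing in for one stack frame. This is only a sketch: the frame layout (8 bytes for buf followed directly by 8 bytes of saved values), the constant names, and the function simulate_unchecked_write are assumptions for illustration. Which saved value a given overflowing byte lands on depends on the real layout shown in FIG. 12.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE   8   /* models char buf[8] at the lowest offsets */
#define SAVED_SIZE 8   /* assumed: saved values stored directly above buf */
#define FRAME_SIZE (BUF_SIZE + SAVED_SIZE)

/* Writes len bytes of data starting at the first byte of the buf slot,
 * without checking len against BUF_SIZE (this models the unchecked copy).
 * len is clamped to FRAME_SIZE so the simulation itself stays in bounds.
 * Returns the number of bytes written past the end of the buf slot. */
size_t simulate_unchecked_write(unsigned char frame[FRAME_SIZE],
                                const char *data, size_t len)
{
    assert(frame != NULL && data != NULL);
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(frame, data, len);
    return len > BUF_SIZE ? len - BUF_SIZE : 0;
}
```

Writing the 12 bytes of “Hello World” into a simulated frame returns 4, meaning that the four bytes immediately above the buf slot are replaced by data from the string. A longer input would reach further into the saved values.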
According to the source code illustrated in FIG. 11, the overflow is caused by the character string “Hello World”, which is defined in advance. In practice, however, there are cases where data received through a network port of a mail server or Web server, or data entered through a console or files, is stored into the buf array. In such cases, an arbitrary program code may be executed by a malicious user through a network, through console entry, or through file entry, and illegal access such as data theft and falsification may thereby occur.
One possible method of detecting such illegal access is to inspect the return addresses stored in the stack area and compare their arrangement pattern during execution of the normal program with their arrangement pattern during operation of an attack code. For example, the program may be judged as normal if the memory pointed to by the return address has a program code attribute, or if that memory area has a non-writable attribute.
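The attribute-based judgment can be sketched as follows. The Region structure, the region table, and both function names are hypothetical. An actual implementation would obtain memory attributes from the operating system rather than from a hand-built table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t start;   /* first address of the region */
    uintptr_t end;     /* one past the last address of the region */
    int is_code;       /* nonzero if the region has a program code attribute */
    int is_writable;   /* nonzero if the region is writable */
} Region;

/* Returns the region containing addr, or NULL if no region contains it. */
const Region *find_region(const Region *map, size_t count, uintptr_t addr)
{
    size_t k;

    assert(map != NULL || count == 0);
    for (k = 0; k < count; k++) {
        if (addr >= map[k].start && addr < map[k].end) {
            return &map[k];
        }
    }
    return NULL;
}

/* Judges a return address as normal if its destination has a program code
 * attribute or is non-writable. Treating an unmapped destination as
 * abnormal is an assumption of this sketch. */
int return_address_looks_normal(const Region *map, size_t count, uintptr_t ret)
{
    const Region *r = find_region(map, count, ret);

    if (r == NULL) {
        return 0;
    }
    return r->is_code || !r->is_writable;
}
```

With a table that contains one non-writable code region and one writable stack region, a return address inside the code region is judged normal, while a return address pointing into the stack region is not.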
However, a return-to-libc attack (see Non-Patent Document 1) may not be detected by the above-described method of detecting illegal access. The return-to-libc attack refers to a technique of making an illegal access by calling a function stored in a computer in advance, without adding any malicious code to the program code.
FIG. 13 is a schematic diagram illustrating an example of an outline of the stack area under the return-to-libc attack, when executing the program codes illustrated in FIG. 11. As illustrated in FIG. 13, in the return-to-libc attack, the value held in the above-described return address, that is, the content of ReturnAddress1, is set to the top address of a normal function such as an OS (Operating System) standard function. If normality is judged by whether the return addresses in the stack area indicate memory areas for program code, or by whether those memory areas have a non-writable attribute, a program under the return-to-libc attack may not be distinguishable from the normal program.
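Why the attribute check fails here can be shown with a toy model. The enumerations and function names are hypothetical, and the assumption that injected attack code resides in writable memory is made only for this illustration.

```c
#include <assert.h>

typedef enum { MEM_CODE, MEM_READONLY_DATA, MEM_WRITABLE_DATA } MemAttr;

typedef enum {
    RET_TO_CALLER,           /* normal return to the calling function */
    RET_TO_INJECTED_CODE,    /* attack code placed in writable memory (assumed) */
    RET_TO_LIBRARY_FUNCTION  /* return-to-libc: top address of a standard function */
} ReturnTarget;

/* Memory attribute of the destination for each kind of return target. */
MemAttr attribute_of(ReturnTarget t)
{
    switch (t) {
    case RET_TO_CALLER:           return MEM_CODE;
    case RET_TO_LIBRARY_FUNCTION: return MEM_CODE;
    case RET_TO_INJECTED_CODE:    return MEM_WRITABLE_DATA;
    }
    assert(!"unknown return target");
    return MEM_WRITABLE_DATA;
}

/* Applies the attribute-based check: a destination with a program code
 * attribute or a non-writable attribute is judged normal (not flagged). */
int flagged_as_attack(ReturnTarget t)
{
    MemAttr a = attribute_of(t);

    return !(a == MEM_CODE || a == MEM_READONLY_DATA);
}
```

In this model, flagged_as_attack returns 1 only for RET_TO_INJECTED_CODE. RET_TO_LIBRARY_FUNCTION yields the same result as RET_TO_CALLER, which is the indistinguishability described above.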
Another possible attack applies the return-to-libc technique: a predetermined program code is executed by combining pieces of program code of programs which are stored in advance in a storage unit of an information device. With this method of attack as well, the return addresses in the stack area show a pattern similar to that of the return-to-libc attack, so that a program under such an attack may not be distinguishable from the normal program.
Patent Document 1 discloses a technique that uses branch trace, which is one of the CPU functions, to detect an action of returning to the top address of an OS standard function when control returns from the function being executed to the process of the calling function.