In a current computer, a computer service is provided by loading programs and data into a main memory and executing data processing while inputting and outputting data from and to an external storage apparatus such as a hard disk. Here, a program describes an operating procedure by using commands in a format executable by the computer. In practice, a plurality of processings is executed by using the same program. That is, a plurality of operations is carried out along the same procedure. The unit in which processing is executed along this program is referred to as a process.
The program that controls the operations of the computer is an operating system (OS). On the OS, many processes operate and input/output data from/to the external storage apparatus. Here, reading the program body from the external storage apparatus into the main memory in order to run a process is itself an input of data (the program body) from the external storage apparatus, and is therefore also a part of the data input/output from/to the external storage apparatus.
The input/output processing of the data has great influence on the responsibility (that is, the responsiveness) of the process. The responsibility of the process means the performance measured from the time a request is inputted from the process until some kind of result or response to that request is returned. For example, in the case of a reading process of word processor software, the responsibility of the process is the time until a document file is opened. In the case of an input process, it is the time from a key input until one character is processed and displayed.
As another example, in the case of a page access on the Internet, the responsibility of the process is, from the viewpoint of a user, the responsibility of an access process, namely, the time from the input of a URL until the corresponding page is displayed on a web browser. However, from the viewpoint of the web browser, the page access is an access to a plurality of files or documents, each consisting of a request of “accessing each file or document” and a result of “completing a reception of transferred data corresponding to the request”. Thus, the page access is composed of a plurality of responses.
In this way, the responsibility of the process exists at various granularities.
In some cases, the data input/output requires input/output from/to the external storage apparatus. Such input/output takes a long time compared with calculation processing that the computer carries out by using only a processing unit and a main memory. For this reason, the input/output processing of the data tends to be dominant in process execution, and this has great influence on the responsibility. Thus, in order to improve the responsibility of one process, the input/output processing of the data, namely, the data transfer speed to the external storage apparatus, the storage speed and the like, must be improved.
Usually, the transfer speed improves when continuous data are inputted or outputted in a burst mode from or to the external storage apparatus. This is because overhead such as the seek time of a hard disk can be reduced and the bus can be fully used while the data are transferred. However, when one process inputs/outputs a vast quantity of data at a time and its data input/output is prioritized, another process cannot carry out its input/output and enters a waiting state, so that its processing is interrupted. As a result, the responsibility of the other process is decreased, which brings about a problem that the overall average responsibility is decreased. For this reason, the OS has an input/output scheduler, which carries out the input/output control while adjusting the tradeoff between the higher speed obtained from continuous data input/output and the improvement of the responsibility of each process.
For example, “I/O Scheduling Algorithms”, which is written by DANIEL P. BOVET, MARCO CESATI, Understanding the LINUX KERNEL, U.S., O'REILLY, November 2005, Pages 580 to 583 discloses an input/output scheduling mechanism of the Linux. That is, the Linux has an input/output scheduling mechanism referred to as the I/O elevator and can select the Noop scheduler, the Deadline scheduler, the Anticipatory scheduler, the CFQ (Complete Fairness Queueing) scheduler and the like, depending on the application.
The input/output scheduler distributes processing capacity among input/output requests on the basis of priority when carrying out the input/output control. At this time, the parameters used to determine the priority of the input/output include, for example, an ID of a process group, an ID of a process, a priority of input/output set for the process, an identifier of the external storage apparatus that is the input/output destination, and the like.
For an input/output request, with reference to the process ID, the input/output scheduler, for example, uniformly distributes the time for request processing, the number of requests processed and the like among the processes, while performing several times as much request processing for a particular process ID.
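As an illustrative sketch only, and not as the actual algorithm of any particular scheduler, the per-process distribution described above may be modeled as a weighted round-robin over process IDs. The function name `dispatch_order`, the `weights` parameter and the request labels are hypothetical names introduced for this example.

```python
from collections import deque


def dispatch_order(requests, weights, default_weight=1):
    """Return requests in dispatch order, using weighted round-robin by process ID.

    requests: list of (pid, label) tuples in arrival order.
    weights:  dict mapping pid -> requests served per round (hypothetical parameter).
              A pid that is absent from the dict gets default_weight.
    """
    # One FIFO queue per process ID, so each process keeps its own request order.
    queues = {}
    for pid, label in requests:
        queues.setdefault(pid, deque()).append((pid, label))

    order = []
    # Keep running rounds until every per-process queue is empty.
    while any(queues.values()):
        for pid in sorted(queues):
            share = weights.get(pid, default_weight)
            for _ in range(share):
                if queues[pid]:
                    order.append(queues[pid].popleft())
    return order


requests = [(1, "1-1"), (1, "1-2"), (1, "1-3"), (2, "2-1"), (2, "2-2"), (2, "2-3")]
uniform = [label for _, label in dispatch_order(requests, {})]
favoured = [label for _, label in dispatch_order(requests, {2: 2})]
```

With an empty `weights` dictionary the two processes are served alternately. With a weight of 2 for process ID 2, that process receives twice as many request slots per round, which corresponds to the several-fold request processing for a particular process ID.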
Also, the OS employs a method that apparently makes the speed of the input/output processing faster by reserving a temporary storage area in the main memory and holding the data of the external storage apparatus. This temporary storage area is referred to as a cache, a file cache, an I/O cache, a buffer, an I/O buffer and the like (hereafter, they are collectively referred to as the cache).
In the current computer, typically, the data transfer between the processing unit and the main memory is faster than the data transfer between the external storage apparatus and the main memory. For this reason, when inputting or outputting data from or to the external storage apparatus, the OS first inputs or outputs the data from or to the cache, and then carries out the data input/output between the cache and the external storage apparatus at a suitable timing. From the viewpoint of the process, the input/output processing is completed when the data has been inputted from or outputted to the cache. When a data area is reused frequently, the data input/output is repeated from/to the cache, which avoids the data input/output from/to the external storage apparatus. Thus, the input/output can be made faster.
In data reading, data is inputted (read) from the external storage apparatus at the time when the data does not exist in the cache area. The OS reads the data from the external storage apparatus into the cache at that timing. In data writing, a predetermined event serves as a trigger of the data output (writing) so that the data updated in the cache area is reflected to the external storage apparatus. The OS writes the data of the cache area to the external storage apparatus when that output trigger occurs. An operation of writing (outputting) the data updated in the cache area to the external storage apparatus at a certain timing in this way is referred to as a write-back.
Unless the write-back is executed, the update data held in the cache is not reflected to the external storage apparatus, which causes data inconsistency and the like. One example of the output trigger from the cache to the external storage apparatus is the elapse of a certain time interval. Another example is a time when the unused area in the main memory becomes small. Still another example is a time when the amount of the updated data reaches a certain quantity or more. Yet still another example is a time when no input/output processing from/to the external storage apparatus is being carried out.
A cache managing mechanism controls the cache. The cache managing mechanism manages allocation and deallocation of areas of the cache and the use situation of the cache (the relation between the cache and a target file, and the like). In many cases, the cache managing mechanism is integrated with a file system or a main memory managing system. The write-back of the cache is executed by a write-back process.
For example, “Writing Dirty Pages to Disk”, which is written by DANIEL P. BOVET, MARCO CESATI, Understanding the LINUX KERNEL, U.S., O'REILLY, November 2005, Pages 622 to 630 discloses a write-back process of the Linux. That is, in the Linux, the write-back process referred to as the bdflush or the pdflush executes the write-back. With the above-mentioned timing as a trigger, the write-back process searches for the cache area on which an update has been performed, namely, the cache area to which writing has been executed. Then, a data output request to output (write) the data in the updated cache area to the external storage apparatus is generated and issued to the input/output scheduling mechanism. At this time, in order to improve the search speed, the updated cache areas can also be managed in advance in an update list and the like.
When issuing the data output request, the write-back process selects a suitable updated cache area and issues the output request for it, instead of outputting all of the update data at a time. One example of a method of selecting the update data is an LRU (Least Recently Used) method that prioritizes the data having the oldest use time. Another example is a method that executes the search in file system order. Still another example is a method that selects only the cache areas for a particular external storage apparatus. Yet still another example is a method in which the above-mentioned methods of selecting the update data are combined. Also, examples of the unit of data amount in which the update data is selected are a file unit, a constant page size (including a single page) and the like.
The data output request to the external storage apparatus includes at least: an identifier (for example, a PCI device number and a device number, a major number and a minor number in the Linux, and the like) to specify the external storage apparatus of the output destination; and information (for example, an address of the data, a page number and the like) to indicate the selected cache area to be reflected into the external storage apparatus.
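The four example triggers listed above may be sketched as a simple check, given here only for illustration. The class `CacheState`, the function `write_back_triggers` and all threshold values are assumptions made for this example and are not taken from any actual OS.

```python
from dataclasses import dataclass


@dataclass
class CacheState:
    seconds_since_last_flush: float  # time elapsed since the previous write-back
    free_memory_pages: int           # unused area remaining in the main memory
    dirty_pages: int                 # amount of updated data held in the cache
    device_busy: bool                # True while storage input/output is in progress


# Hypothetical thresholds, chosen only to make the example concrete.
FLUSH_INTERVAL_S = 5.0
MIN_FREE_PAGES = 1024
MAX_DIRTY_PAGES = 4096


def write_back_triggers(state):
    """Return the names of the output triggers that currently apply."""
    triggers = []
    if state.dirty_pages == 0:
        return triggers  # nothing has been updated, so nothing needs reflecting
    if state.seconds_since_last_flush >= FLUSH_INTERVAL_S:
        triggers.append("interval")
    if state.free_memory_pages < MIN_FREE_PAGES:
        triggers.append("low_memory")
    if state.dirty_pages >= MAX_DIRTY_PAGES:
        triggers.append("dirty_threshold")
    if not state.device_busy:
        triggers.append("device_idle")
    return triggers


quiet = write_back_triggers(CacheState(1.0, 5000, 10, True))
urgent = write_back_triggers(CacheState(6.0, 100, 5000, False))
```

In this sketch a write-back would begin whenever the returned list is non-empty. An actual mechanism may weigh or combine the triggers differently.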
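The selection of update data may be sketched as follows, assuming a combination of the LRU method with the optional restriction to a particular external storage apparatus and a fixed number of pages per selection. The function `select_for_write_back` and the dictionary keys `page`, `device` and `last_used` are hypothetical names used only for this illustration.

```python
def select_for_write_back(dirty_pages, batch_size, device=None):
    """Pick up to batch_size updated cache areas, least recently used first.

    dirty_pages: list of dicts with the hypothetical keys
                 'page' (page number), 'device' (storage identifier)
                 and 'last_used' (time of last use; smaller means older).
    device:      if given, only areas belonging to that storage apparatus are selected.
    """
    # Optionally restrict the candidates to one external storage apparatus.
    candidates = [p for p in dirty_pages if device is None or p["device"] == device]
    # LRU: the area with the oldest use time comes first.
    candidates.sort(key=lambda p: p["last_used"])
    # Output only a bounded amount instead of all update data at a time.
    return candidates[:batch_size]


dirty = [
    {"page": 1, "device": "sda", "last_used": 30},
    {"page": 2, "device": "sdb", "last_used": 10},
    {"page": 3, "device": "sda", "last_used": 20},
]
oldest_two = [p["page"] for p in select_for_write_back(dirty, 2)]
sda_only = [p["page"] for p in select_for_write_back(dirty, 1, device="sda")]
```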
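The minimum content of such a data output request may be modeled as a small record, shown here purely as a sketch. The class names `DeviceId` and `OutputRequest`, the helper `build_output_requests` and the example numbers are assumptions. The major/minor pair follows the Linux example given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceId:
    major: int  # Linux-style major number of the output destination
    minor: int  # Linux-style minor number of the output destination


@dataclass(frozen=True)
class OutputRequest:
    device: DeviceId  # identifies the external storage apparatus
    page_number: int  # indicates the selected cache area to be reflected


def build_output_requests(device, page_numbers):
    """Create one output request per selected cache page."""
    return [OutputRequest(device, n) for n in page_numbers]


# Example values are arbitrary and chosen only for illustration.
reqs = build_output_requests(DeviceId(8, 0), [3, 7])
```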
FIG. 1 is a conceptual view describing input/output using a cache in the conventional technique. A process 1 and a process 2 repeat data input/output and consequently execute programs. The processes 1, 2 request an OS 1000 to input or output the data. Then, the OS 1000 uses a cache 1300 and makes the input/output from/to the processes 1, 2 fast. A cache managing mechanism 1320 allocates the cache 1300 in a main memory 1200. The cache managing mechanism 1320 allocates a cache area 1310 inside the cache 1300 for handling the input/output requests from the processes 1, 2. In FIG. 1, the numeral following “PROCESS” is defined as a process ID, and a y-th data input/output request of the process ID=x is represented as “x-y”, for the input/output request.
At first, an example of an input case will be described.
For example, for an input request “1-1” from the process 1, the cache managing mechanism 1320 allocates a cache area “1-1” in the cache area 1310. Since data does not yet exist in the cache area “1-1”, the OS 1000 issues a data input request to an input/output scheduler 1400 while remaining in the context of the process 1. The input/output scheduler 1400 issues the input request “1-1” to an external storage apparatus 1600 and transfers the data from the external storage apparatus 1600 to the cache area “1-1” in the cache area 1310. The OS 1000 returns the read data to the process, namely, copies or maps the data to the process space. Then, the read request is completed.
In the foregoing input case, when the OS 1000 issues the input request to the input/output scheduler 1400 while remaining in the context of the process 1, the OS 1000 holds the process information (e.g., the process ID) of the currently running process that issued the input request. Thus, the input/output scheduler 1400, on receiving the input request, can identify that process information. For example, in the foregoing example (input request “1-1”), it can identify that the process ID of the request source of the input request is 1. Accordingly, the input/output scheduler 1400 can execute the input/output control that uses the priority and the like based on the process (e.g., the process ID), as mentioned above. Thus, in the case shown in FIG. 1, the processing for the input request can be executed on the basis of the priority and the like of the process. Consequently, it is possible to improve the responsibility of the input processing executed by the input/output scheduler.
When the same input request (read request) “1-1” is issued again from the next time onward, the cache area “1-1” corresponding to the input request already exists in the cache 1300. Thus, the processing for the input request is completed when the data is copied or mapped from the cache area “1-1” to the process space. For this reason, the read speed can be improved.
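The input case described above, in which the first read goes to the external storage apparatus and a repeated read is served from the cache, may be sketched as follows. The class `ReadCache`, its attributes and the use of a dictionary as a stand-in for the external storage apparatus are assumptions made for this illustration.

```python
class ReadCache:
    """Toy model of the read path: cache miss goes to storage, cache hit does not."""

    def __init__(self, storage):
        self.storage = storage   # dict standing in for the external storage apparatus
        self.areas = {}          # allocated cache areas, keyed by request label
        self.storage_reads = 0   # number of inputs actually issued to the storage

    def read(self, label):
        if label not in self.areas:
            # Data does not exist in the cache area: input it from the storage.
            self.storage_reads += 1
            self.areas[label] = self.storage[label]
        # Return the data to the process from the cache area.
        return self.areas[label]


cache = ReadCache({"1-1": b"document"})
first = cache.read("1-1")    # miss: transferred from the storage to the cache
second = cache.read("1-1")   # hit: completed from the cache alone
```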
An example of an output case will be described below.
For an output request “1-3” from the process 1, the cache managing mechanism 1320 allocates a cache area “1-3” in the cache area 1310. The processing for the output request is completed when the write data is copied from the process space to the cache area “1-3” in the cache area 1310. From the viewpoint of the process, the output processing is completed by the output to the cache 1300 alone. Thus, the output processing can be executed at a high speed. On the other hand, a write-back process 1500 actually reflects the output result, namely, the update data, to the external storage apparatus 1600.
In FIG. 1, a data output request issued to the external storage apparatus 1600 by the write-back process in response to a data output request “x-y” from the process is represented as “RB(x-y)”. The write-back process 1500 is activated with the foregoing timing as the trigger and begins the write-back. The write-back process 1500 finds the updated cache area “1-3” in the cache area 1310 in order to carry out the output to the external storage apparatus 1600. That is, the write-back process 1500 issues, to the input/output scheduler 1400, the output request to the external storage apparatus 1600 for the updated cache area “1-3” in the cache area 1310. At this time, the output request is issued in the context of the write-back process 1500. For this reason, the input/output scheduler 1400 treats it as the output request “RB(1-3)” to the external storage apparatus 1600. The foregoing write-back attains consistency between the update data of the cache area 1310 and the data of the external storage apparatus 1600.
However, improving the input/output performance by using the cache brings about a problem that the performance and responsibility of the computer are degraded. This is because the cache managing mechanism 1320 manages the cache 1300 as a single resource on the OS 1000 without regard to the individual processes 1100. The cache managing mechanism 1320 does not use the process information. Thus, for example, when a certain process 1100 begins to use the cache 1300 without limit and monopolizes the main memory 1200, a new cache area cannot be allocated in the main memory 1200 at the time when input/output of another process 1100 occurs. For this reason, in order to allocate a necessary cache area, a cache area currently being used must be deallocated, which leads to a significant drop of computer performance.
As one means to solve such a computer performance drop, a method of limiting cache capacity for each process is described in Japanese Laid-Open Patent Application JP-P 2006-350780A (corresponding to U.S. Patent Application US2006288159). A cache managing mechanism determines cache capacity for each process, based on an obtainment setting parameter set for cache management information. Then, the cache managing mechanism controls allocation and deallocation of a cache area so that the process does not exceed the determined cache capacity of the cache area whose allocation is newly requested. This prevents a cache area for a particular process from being depleted.
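The idea of limiting the cache capacity for each process may be sketched as follows. This is not the mechanism of the cited application. The class `PerProcessCache`, the single shared limit `capacity_per_process` and the choice of deallocating the least recently used area of the same process are all assumptions made for this illustration.

```python
from collections import OrderedDict


class PerProcessCache:
    """Toy cache in which each process may hold at most a fixed number of areas."""

    def __init__(self, capacity_per_process):
        self.capacity = capacity_per_process  # hypothetical per-process limit
        self.areas = {}                       # pid -> OrderedDict(block -> data)

    def allocate(self, pid, block, data):
        own = self.areas.setdefault(pid, OrderedDict())
        if block in own:
            own.move_to_end(block)            # reuse: mark as most recently used
        elif len(own) >= self.capacity:
            # At the limit: deallocate this process's own oldest area,
            # so areas held by other processes are never taken away.
            own.popitem(last=False)
        own[block] = data

    def usage(self, pid):
        return len(self.areas.get(pid, {}))


cache = PerProcessCache(capacity_per_process=2)
for block in ("a", "b", "c"):
    cache.allocate(1, block, b"x")   # process 1 requests three areas
cache.allocate(2, "z", b"y")         # process 2 can still allocate
```

Process 1 ends up holding only two areas even after requesting three, and the allocation for process 2 succeeds without deallocating anything that process 2 holds.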
However, even in the case of using the technique of Japanese Laid-Open Patent Application JP-P 2006-350780A (corresponding to U.S. Patent Application US2006288159), the fact that the processing for reflecting the update data to the external storage apparatus 1600 is executed by the write-back process is not changed.
In this output case, the write-back process 1500 issues an output request to the input/output scheduler 1400 in the context of the write-back process 1500. The OS 1000 holds the information of the write-back process as that of the currently running process that issued the output request. For this reason, the input/output scheduler 1400, on receiving the output request, identifies the process information as that of the write-back process. That is, it is impossible to identify the process information of the original source that issued the output request. For example, in the foregoing example (output request “RB(1-3)”), the process of the request source of the output request is identified as the write-back process, and the process ID=1 of the original source cannot be identified. Moreover, the data itself stored in the cache area 1310 does not include process information (e.g., a process ID) indicating which process the data belongs to. For this reason, the process ID=1 of the original source cannot be identified even from the data itself.
In this way, the output request RB(x-y) is identified as an output request executed by the write-back process. Thus, the input/output scheduler 1400 cannot execute the input/output control that uses the priority based on the process (e.g., the process ID) and the like as mentioned above. Thus, in the case shown in FIG. 1, the processing for the output request cannot be executed on the basis of the priority of the process and the like. That is, it is impossible to improve the responsibility of the output processing executed by the input/output scheduler.
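The loss of the request source in the output case may be sketched as follows. The class `WriteBackCache`, the dictionary key `issuer_pid` and the use of the reference numeral 1500 as a stand-in process ID for the write-back process are assumptions made for this illustration.

```python
WRITE_BACK_PID = 1500  # hypothetical ID standing in for the write-back process


class WriteBackCache:
    """Toy model in which updated data is held without any process information."""

    def __init__(self):
        self.dirty = {}  # label -> data; the writing process is NOT recorded

    def write(self, pid, label, data):
        # The output is complete for the process once the data is in the cache.
        # pid is accepted but deliberately dropped, mirroring the conventional cache.
        self.dirty[label] = data

    def write_back(self):
        # Requests are issued in the context of the write-back process,
        # so that is the only request source the scheduler can see.
        requests = [
            {"issuer_pid": WRITE_BACK_PID, "label": "RB(%s)" % label, "data": data}
            for label, data in self.dirty.items()
        ]
        self.dirty.clear()
        return requests


cache = WriteBackCache()
cache.write(1, "1-3", b"from process 1")
cache.write(2, "2-1", b"from process 2")
seen_by_scheduler = cache.write_back()
issuers = {r["issuer_pid"] for r in seen_by_scheduler}
```

Both requests arrive at the scheduler with the same request source, so a priority based on the original process IDs 1 and 2 cannot be applied.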
FIG. 2A and FIG. 2B are diagrammatic views showing an example of the concept of the processing for input and output in the conventional technique. FIG. 2A shows the input case, and FIG. 2B shows the output case. With reference to FIG. 2A, in the input case, the OS 1000 issues input requests from processes 1, 2 and 3 to the input/output scheduler 1400. At that time, the input/output scheduler 1400 can identify the process IDs of the processes requesting the input, as request source information of the input. For this reason, the input/output scheduler 1400 can execute input control using a priority based on the process ID and the like. In this drawing, as the input control, the input processings are uniformly assigned and executed.
On the other hand, with reference to FIG. 2B, in the output case, the write-back process 1500 unifies the output requests from the processes 1, 2 and 3 and issues them to the input/output scheduler 1400 as output requests from the write-back process 1500. At that time, the input/output scheduler 1400 identifies the process requesting the output as the write-back process, as the request source information. For this reason, the input/output scheduler 1400 cannot know the original process ID as the request source information and cannot execute output control using a priority based on the process ID and the like. In this drawing, for example, the output processing is executed in order starting from the oldest output request or in order starting from the nearest address.
In this way, the conventional cache managing mechanism can protect the performance from the drop caused by the monopolization of the cache, by controlling the cache capacity for each process. However, the problem that the input/output control using the priority, which is executed by the input/output scheduler and the like, becomes ineffective is not solved. This is because, as mentioned above, the cache managing mechanism does not explicitly hold the process information for the cache area, and further, the data output from the cache to the external storage apparatus is unified as data output by the write-back process. That is, the input/output scheduler recognizes all of the requests as data output requests from the write-back process. Thus, the parameter that should be transmitted in order to determine the input/output priority of the process that carries out the input/output using the cache is not transmitted to the input/output scheduler. In the input/output mechanism using the cache, a technique by which the input/output scheduler can suitably determine the priority of the process is desired. A technique that can suitably transmit the information required to determine the priority to the input/output scheduler is desired.
As a related technique, Japanese Laid-Open Patent Application JP-P 2005-293205A (corresponding to U.S. Patent Application US2005223168) discloses a storage control apparatus, a control method and a control program. This storage control apparatus controls a plurality of storage apparatuses. This storage control apparatus includes: an LRU write-back means for carrying out, by using an LRU method, a write-back to the plurality of storage apparatuses of data that are stored in a cache memory inside the storage control apparatus; and a write-back schedule processing means for selecting a storage apparatus for which the number of the write-backs executed by the LRU write-back means is small and then performing the write-back of the data on the selected storage apparatus.
WO99/40516 Gazette (corresponding to U.S. Pat. No. 6,748,487) discloses a disk cache control method, a disk array apparatus and a storage apparatus. This disk cache control method is a disk cache control method in the disk array apparatus that includes: a plurality of disk apparatuses in which data is divided and stored; and a disk cache, wherein a plurality of volumes are assigned to the plurality of disk apparatuses. Assignment of a new disk cache area to the data is carried out in an order starting with the disk cache area assigned to an area whose access frequency is lower, when an access frequency is determined for each area where the volume is divided at a certain fixed length.
Japanese Laid-Open Patent Application JP-P 2000-172562A (corresponding to U.S. Pat. No. 6,507,894) discloses an information processing apparatus. This information processing apparatus includes: a main memory; a cache memory holding a copy of the main memory; and a processor including a cache memory controller which manages the data inside the cache memory while referring to and updating control information and address information of the cache memory. This information processing apparatus includes a pre-fetching means for transferring the data of the main memory to the cache memory without referring to and updating the control information and the address information.
Japanese Laid-Open Patent Application JP-A-Heisei, 8-328956 discloses a memory managing method of a multi processor system. In the memory managing method of this multi processor system, the multi processor system includes a virtual storage mechanism composed of a plurality of processors. The multi processor system includes a memory managing means, a process managing means, an address conversion cache, a bind control means and an invalidation control means. The memory managing means manages a correlation setting between a virtual address and an actual address. The process managing means manages a process executed by the processor. The address conversion cache is provided inside each processor and holds linkage information between the virtual address and the actual address, which are correlated by the memory managing means. About an access request to an external data executed by the process currently being executed in one of the processors, an operating system on the processor asks the memory managing means, and in response to this asking, the memory managing means passes a predetermined virtual address and actual address. At that time, the bind control means reports an instruction for binding its process to its processor until the process finishes the use of the memory page indicated by this actual address, to the process managing means. The invalidation control means invalidates the actual address correlated to the virtual address of the address conversion cache included in the processor, and updates to the passed actual address. The memory managing method of the multi processor system carries out the memory management without requiring the simultaneous invalidation of the address conversion caches included in all of the other processors, even if the correlation setting between the virtual address and the actual address, which is managed by the memory managing means, is updated by the process currently being executed by one of the processors.