1) Field of the Invention
The present invention relates to a technique applicable to a large-scale distributed storage system including a plurality of storage units, and more particularly to a technique capable of generalizing a plurality of storage units having diverse characteristics, such as large, medium and small scales and high, medium and low speeds, for achieving efficient and stable control of operations.
2) Description of the Related Art
For example, Japanese Patent Laid-Open No. HEI 5-334006 (Patent No. 276624) discloses, in a distributed network storage system in which a plurality of storage units are distributively disposed on a network, the utilization of a virtual storage technique called “logical volume”. In such a system, the techniques for the storage control on a disk array of each of the storage units are disclosed, for example, in Japanese Patent Laid-Open Nos. HEI 9-274544, 2001-67187 and 2002-157091.
In Japanese Patent Laid-Open No. HEI 9-274544, the “index for relocation” is defined and all data are re-stored in succession after information on access to each of the logical disk units (logical volume) is collected as the index so that the relocation is made on a physical disk unit (storage unit) of the logical disk unit on the basis of the access information. This discloses the technique in which, for example, the access frequency to data is collected as the aforesaid index (access information) and a logical disk unit high in access frequency is relocated at a higher-speed physical disk unit and, of the data distributed on an array, the data high in access frequency are put together in one location to enhance the sequential access performance.
Japanese Patent Laid-Open No. 2001-67187 discloses the technique in which a plurality of storage units are classified (sorted) into a plurality of classes with attributes for the management so that a maintainer can easily determine the data transferring location and the data receiving location for the relocation on the basis of the using situation of each of the classes and the aforesaid attribute to simplify the operations for accomplishing the location optimization through the physical relocation in a storage area. In particular, Japanese Patent Laid-Open No. 2001-67187 discloses a method of determining the class on the destination in a logical storage area for the relocation so that the time of use of each storage unit (disk unit) per unit time does not exceed an upper limit value set as the aforesaid attribute on a class basis. That is, it discloses a method in which the data access concentration point on the array is detected to carry out the load distribution for apparently preventing the access performance degradation of the entire array.
For realizing the method disclosed in Japanese Patent Laid-Open No. 2001-67187, Japanese Patent Laid-Open No. 2002-157091 discloses a technique for correctly totalizing the “occupied times (service times) in a logical storage area of each physical storage unit (disk unit)” even if the cache processing is conducted in each physical storage unit. That is, as in the case of Japanese Patent Laid-Open No. 2001-67187, Japanese Patent Laid-Open No. 2002-157091 also discloses a method in which the data access concentration point on the array is detected to carry out the load distribution for apparently preventing the access performance degradation of the entire array.
Meanwhile, the conventional techniques, including those of the aforesaid patent documents, assume that, in handling the aforesaid distributed network storage system, the characteristics of the individual physical storage units, such as the maximum available (usable) total capacity and the speed performance, are equal or almost equal to each other and are sufficient for various types of requests. As long as this assumption holds, when a data access concentration point is found in the distributed network storage system, for example, the concentration point can be eliminated without problems by carrying out the common working capacity equalization or residual capacity equalization (which will be mentioned later) among a plurality of physical storage units.
However, a large-scale network storage system, for which further development is expected in the future, differs somewhat in situation from a system in which the aforesaid assumption holds, because the system extensibility can be regarded as effectively infinite and all maintenance operations, including system expansion, are conducted without interrupting the system service (guaranteeing continuous operation 24 hours a day, 365 days a year).
That is, if the distributed network storage system is operated under the above-mentioned situation, the maximum available total capacities of the individual physical storage units pertaining to this system are not uniform. The storage capacity of a newly added physical storage unit will probably exceed twice the storage capacity of a unit installed half a year to one year earlier, and it is clear that the aforesaid assumption does not hold.
In a system composed of physical storage units having different capacities, if the common working capacity equalization or residual capacity equalization is applied, a deviation in the data location state occurs among the plurality of physical storage units, as mentioned later. If the utilization factor of the entire system is low, the response performance of the system degrades, while if the utilization factor of the entire system is high, the stability of the system degrades.
Therefore, it is desired to achieve efficient and stable operation control of the system even if the utilization factor (usage rate or activity ratio) of the entire system varies in a situation in which, unlike the case of the conventional technique, the characteristics such as the maximum available total capacity and speed performance of the individual physical storage units pertaining to the system are not uniform.
A concrete description will be given herein below of the working capacity equalization, the residual capacity equalization and a situation when these equalizations are applied to a system including diverse storage units different in capacity from each other.
In the following description, Ti represents the total capacity of a storage node (physical storage unit) i, Ui represents the working capacity of the storage node i, and Ri represents the residual capacity of the storage node i. These variables Ti, Ui and Ri satisfy the following equations (1) and (2).
$$T_i = U_i + R_i \tag{1}$$
$$T_1 \ge T_2 \ge T_3 \ge T_4 \ge T_5 \ge T_6 \tag{2}$$
An examination/description will first be given of the working capacity equalization.
For the working capacity equalization, nodes whose total capacities Ti differ from each other are handled so that the working capacities Ui reach the same level U wherever possible. The system resource disposition satisfying the following equations (3) and (4) is the common “ideal” for this case, and no further “improving effort” is made.
$$U_i = \bar{U} \equiv U \tag{3}$$
$$R_i = T_i - U \tag{4}$$
where $\bar{U}$ signifies the average (mean) value of Ui.
In the case of a resource disposition other than the aforesaid “ideal”, the “improving effort” index value ΔUi for each node i can be defined by the following equation (5). In this case, the user data flows into the nodes showing ΔUi>0.
$$\Delta U_i = \bar{U} - U_i \tag{5}$$
In the case of the working capacity equalization, the data movement (equalization) is made according to ΔUi of the equation (5).
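As an illustration only, the index of the equation (5) can be sketched in a few lines of Python. The function name `working_capacity_deltas` and the six working capacity values are hypothetical and are not taken from the cited documents; the sketch merely evaluates ΔUi for each node and lists the nodes into which user data would flow.

```python
from statistics import fmean


def working_capacity_deltas(used):
    """Return the index dU_i = mean(U) - U_i of equation (5) for every node."""
    mean_used = fmean(used)
    return [mean_used - u for u in used]


# Hypothetical working capacities U_i (arbitrary units) for six nodes.
used = [50.0, 40.0, 30.0, 20.0, 10.0, 0.0]

deltas = working_capacity_deltas(used)

# Nodes with a positive index are below the mean and would receive user data.
receivers = [i for i, d in enumerate(deltas) if d > 0]

print(deltas)     # [-25.0, -15.0, -5.0, 5.0, 15.0, 25.0]
print(receivers)  # [3, 4, 5]
```

The indices always sum to zero, so the data leaving the nodes with negative ΔUi matches the data entering the nodes with positive ΔUi. The total capacities Ti play no part in this calculation.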
Secondly, an examination/description will be given of the residual capacity equalization.
For the residual capacity equalization, the total capacities Ti different from each other are handled so that the residual capacities Ri become the same level R wherever possible. The system resource disposition, satisfying the following equations (6) and (7), becomes a common “ideal” with respect to this case, and further “improving effort” is not done.
$$R_i = \bar{R} \equiv R \tag{6}$$
$$U_i = T_i - R \tag{7}$$
where $\bar{R}$ signifies an average value of Ri.
In the case of the resource disposition other than the aforesaid “ideal”, the “improving effort” index value ΔRi with respect to each node i can be defined by the following equation (8). In this case, the user data flows into the node showing ΔRi>0.
$$\Delta R_i = R_i - \bar{R} \tag{8}$$
From the above-mentioned equations, the following equation (9) is obtainable through simple arithmetic.
$$\Delta R_i = (\bar{U} - U_i) - (\bar{T} - T_i) \tag{9}$$
where $\bar{T}$ signifies an average value of Ti.
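The “simple arithmetic” leading to the equation (9) can be written out as follows. This is a sketch of the intermediate steps only; the symbol $n$ for the number of storage nodes is introduced here for illustration and does not appear in the original description.

```latex
\begin{align*}
\bar{R} &= \frac{1}{n}\sum_{i=1}^{n} R_i
         = \frac{1}{n}\sum_{i=1}^{n} (T_i - U_i)
         = \bar{T} - \bar{U}
         && \text{(averaging equation (1))} \\
\Delta R_i &= R_i - \bar{R}
            = (T_i - U_i) - (\bar{T} - \bar{U})
            && \text{(equation (8))} \\
           &= (\bar{U} - U_i) - (\bar{T} - T_i)
            && \text{(regrouping terms)}
\end{align*}
```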
In the case of the residual capacity equalization, the data movement (equalization) is made according to ΔRi of the equation (8) or (9).
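The following Python sketch, given for illustration only, evaluates ΔRi both from the definition (8) and from the rearranged form (9) and confirms that the two agree. The function names and the capacity figures are hypothetical.

```python
from statistics import fmean


def residual_deltas_eq8(total, used):
    """dR_i = R_i - mean(R), with R_i = T_i - U_i (equations (1) and (8))."""
    residual = [t - u for t, u in zip(total, used)]
    mean_residual = fmean(residual)
    return [r - mean_residual for r in residual]


def residual_deltas_eq9(total, used):
    """dR_i = (mean(U) - U_i) - (mean(T) - T_i) (equation (9))."""
    mean_total = fmean(total)
    mean_used = fmean(used)
    return [(mean_used - u) - (mean_total - t) for t, u in zip(total, used)]


# Hypothetical total capacities T_i and working capacities U_i for six nodes,
# ordered so that T_1 >= T_2 >= ... >= T_6 as in equation (2).
total = [900.0, 300.0, 150.0, 90.0, 30.0, 30.0]
used = [60.0, 30.0, 15.0, 9.0, 3.0, 3.0]

eq8 = residual_deltas_eq8(total, used)
eq9 = residual_deltas_eq9(total, used)

print(eq8)  # [610.0, 40.0, -95.0, -149.0, -203.0, -203.0]
print(eq9)  # [610.0, 40.0, -95.0, -149.0, -203.0, -203.0]
```

In this example every node is used to the same fraction of its own capacity, yet only the two largest nodes show ΔRi>0, because the size term of the equation (9) dominates.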
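Moreover, in a system in which the sizes Ti of the nodes are identical to each other, the “working capacity equalization” and the “residual capacity equalization” are based upon identical operations, because the second term in the right side of the equation (9) vanishes and ΔRi becomes equal to ΔUi.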
On the other hand, if the difference between the sizes Ti of the nodes reaches several times to several ten times, the following situations take place according to the usage rate (utilization factor) of the entire system.
In a case in which the usage rate of the entire system is relatively low, in the right side of the aforesaid equation (9), the absolute value of the second term is considerably larger than the absolute value of the first term. Therefore, for the “residual capacity equalization”, it is seen that a new storage area is selectively secured on a large-capacity storage while almost no new storage area is secured on a small-capacity storage. That is, the data is located on only the large-capacity storage and, hence, there is a possibility that the response performance degrades.
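The low-usage situation can be illustrated with a small Python sketch. It assumes a simple greedy placement in which each new unit of data goes to the node with the largest residual capacity Ri, which is the node with the largest ΔRi because the mean residual capacity is common to all nodes. This placement rule, the function name `place_by_residual` and the capacity figures are assumptions made for illustration and are not taken from the cited documents.

```python
def place_by_residual(total, used, chunks, chunk_size=1.0):
    """Greedily place `chunks` data units, each on the node with the largest
    residual capacity R_i = T_i - U_i, and return the resulting U_i."""
    used = list(used)
    for _ in range(chunks):
        residual = [t - u for t, u in zip(total, used)]
        target = max(range(len(total)), key=lambda i: residual[i])
        used[target] += chunk_size
    return used


# Hypothetical system: one large node and two nodes ten times smaller.
total = [1000.0, 100.0, 100.0]

# Store 100 units in an empty system (about 8% usage of the whole system).
placed = place_by_residual(total, [0.0, 0.0, 0.0], chunks=100)

print(placed)  # [100.0, 0.0, 0.0]
```

Every unit lands on the large node, so all accesses are served by a single storage unit while the two small nodes stay idle.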
In addition, in a case in which the usage rate of the entire system is relatively high (that is, when the capacity is tight), since the size Ti of the node does not appear in the aforesaid equation (5), it is seen that, in the case of the “working capacity equalization”, a portion of the system fails when the capacity of the system becomes tight (short). That is, the available total capacity of the small-capacity storage has already run out and the small-capacity storage completely loses a portion of its function.
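The high-usage situation can likewise be sketched in Python. The sketch checks, for each node, whether the common target level (the mean working capacity) exceeds the total capacity Ti of that node. The function name `overfull_nodes` and the capacity figures are hypothetical and serve only as an illustration.

```python
from statistics import fmean


def overfull_nodes(total, used):
    """Return the indices of nodes whose total capacity T_i is smaller than
    the common target level mean(U) of the working capacity equalization."""
    target = fmean(used)
    return [i for i, t in enumerate(total) if target > t]


# Hypothetical system: one large node and two nodes ten times smaller,
# with 900 of 1200 units in use (75% usage of the whole system).
total = [1000.0, 100.0, 100.0]
used = [800.0, 50.0, 50.0]

print(fmean(used))                  # 300.0
print(overfull_nodes(total, used))  # [1, 2]
```

The target level of 300 units is three times the capacity of each small node, so the equalization keeps directing data to nodes that cannot hold it, even though the large node still has 200 units free.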