1. Field of the Invention
The present invention relates to a computer system, and more specifically to a method and an apparatus for implementing extensible secondary storage suitable for application programs.
2. Description of the Related Art
Primary components in a modern computer system include computers (each comprising a processor, memory, and peripheral devices), a network, and secondary storage. Storage was heretofore considered a device attached to a computer; however, this situation has changed recently.
First, with the widespread use of networks, it has become common in recent years for a plurality of computers to share a single storage. The processing power of the computer to which the storage is attached may become a bottleneck, slowing down storage input/output (I/O) requested by other computers through the network.
Secondly, the storage capacity and the throughput required of storage increase from year to year. “Greg's law” anticipates that “the demand for storage capacity for a data warehouse doubles in nine months”. As a result, the number of storages attached to a single computer may grow until the computer becomes the bottleneck of storage I/O.
Thirdly, since the number of transistors integrated in a hard disk controller LSI has been increasing rapidly, the opportunity to realize high-function storage has grown.
With this situation in mind, the addition of new features to the storage controller LSI has been proposed. Candidates for these new features include a network interface and advanced functions for application programs.
By providing the storage with a network interface, the storage may be directly connected to the network. The storage will thereby be able to receive and process I/O requests from a plurality of computers without any hosting computer.
At present, the most popular interface between storage and a computer is block I/O. By providing the storage with advanced application-specific features, such as sorting, image processing, and basic operations of a database system (selection processing, projection processing, concatenation, aggregation processing, and so on), instead of conventional block I/O, the storage will be able to take over part of the processing performed by the processor in a computer.
Exemplary storage proposed to feature a network interface and part of a filesystem includes the system described in the paper by Garth A. Gibson et al., “A Cost-Effective, High-Bandwidth Storage Architecture” (Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998, published by ACM; hereinafter reference #1), and the system described in the paper by Steven R. Soltis et al., “The Global Filesystem” (Proceedings of the Fifth NASA Goddard Space Flight Center Conference on Mass Storage Systems and Technologies, 1996, published by NASA; hereinafter reference #2).
Exemplary high-function storage proposed on the assumption of plural applications includes the system described in the paper by Erik Riedel et al., “Active Disks: Remote Execution for Network-Attached Storage” (Technical Report CMU-CS-97-198, 1997, published by Carnegie Mellon University; hereinafter reference #3), the one described in the paper by Anurag Acharya, “Active Disks: Programming Model, Algorithms and Evaluation” (Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998, published by ACM; hereinafter reference #4), and the one described in the paper by Kimberly Keeton et al., “A Case for Intelligent Disks (IDISKs)” (SIGMOD Record, Volume 27, Number 3, 1998, published by ACM; hereinafter reference #5).
References #3 to #5 describe downloading, from a computer to a storage through the network, program modules that achieve advanced functions. Languages proposed as appropriate for writing such modules include the language described in the book by J. Gosling et al., “The Java Language Specification” (1996, Addison-Wesley; hereinafter reference #6).
A server-attached disk (SAD), the conventional example of storage, will now be described with reference to the accompanying FIG. 2.
A SAD 203 is usually connected to one computer 201 through an I/O cable 202, although it may occasionally be connected to a plurality of computers. The SAD 203 comprises a storage controller 204 and a disk 209, and the storage controller 204 comprises an interface control part 205, a buffer management part 206, buffer memory 207, and a disk controller 208.
The disk 209 is a storage medium (secondary storage) that retains data even after power is shut off. The interface control part 205 receives I/O requests and other transmissions sent from external devices through the I/O cable 202, and transmits responses to the requests and other transmissions to the I/O cable 202. The buffer management part 206 controls the buffer memory 207. The buffer memory 207 temporarily holds data obtained from the disk 209. The disk controller 208 controls the disk 209 to perform block reads from and block writes to it.
The interface 210 between the SAD 203 and the computer 201 provides block-based input/output.
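The block-based arrangement of FIG. 2 can be illustrated with a minimal sketch. The class names `Disk` and `StorageController`, the 512-byte block size, the write-through policy, and the unbounded buffer are all assumptions made for illustration; none of them is specified in the description above.

```python
BLOCK_SIZE = 512  # assumed block size in bytes (not given in the text)


class Disk:
    """Stand-in for disk 209: an array of fixed-size blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.reads = 0  # counts physical block reads, for illustration

    def read_block(self, lba):
        self.reads += 1
        return self.blocks[lba]

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.blocks[lba] = bytes(data)


class StorageController:
    """Stand-in for storage controller 204: buffer memory in front of the disk."""

    def __init__(self, disk):
        self.disk = disk
        self.buffer = {}  # plays the role of buffer memory 207: block number -> data

    def read(self, lba):
        # Serve from the buffer when possible; otherwise fetch from the disk.
        if lba not in self.buffer:
            self.buffer[lba] = self.disk.read_block(lba)
        return self.buffer[lba]

    def write(self, lba, data):
        # Assumed write-through policy: update the disk, then the buffer.
        self.disk.write_block(lba, data)
        self.buffer[lba] = bytes(data)
```

In this sketch the computer sees only `read` and `write` on block numbers, which is why any higher-level processing, such as sorting or selection, has to run on the computer's own processor.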
With reference to FIG. 3, the arrangement of a recently emerged type of storage, network-attached storage (NAS), will be described.
One or more NAS 303 may be connected through a network 302 to one or more computers 301, 301′, etc. The NAS 303 comprises a storage controller 304 and a disk 309, and the storage controller 304 comprises a network controller 305, a buffer management part 306, buffer memory 307, and a disk controller 308.
The network controller 305 receives I/O requests and other transmissions sent from external devices through the network 302, and transmits responses to the requests and other transmissions to the network 302. The disk 309, the buffer management part 306, the buffer memory 307, and the disk controller 308 have functions similar to those of the disk 209, the buffer management part 206, the buffer memory 207, and the disk controller 208, respectively.
The NAS interface 310, the interface between the NAS 303 and the computers 301, 301′, . . . , provides block-based input/output.
With reference to FIG. 4, the arrangement of Advanced SAD storage, which is an extended version of conventional SAD storage, will be described.
One or more Advanced SAD storages 403 are usually connected to one computer 401 through an I/O cable 402, although in some cases they may be connected to a plurality of computers. The Advanced SAD storage 403 comprises a storage controller 404 and a disk 409, and the storage controller 404 comprises an interface control part 405, a buffer management part 406, buffer memory 407, a disk controller 408, and an application-oriented function part 411.
The I/O cable 402, the interface control part 405, the buffer management part 406, the buffer memory 407, the disk controller 408, and the disk 409 provide the same functionality as the I/O cable 202, the interface control part 205, the buffer management part 206, the buffer memory 207, the disk controller 208, and the disk 209, respectively. The application-oriented function part 411 provides advanced functions for specific applications, such as sorting, image processing, and basic operations of a database system (selection processing, projection processing, concatenation, aggregation processing, and so on). The High-function SAD interface 410 may have, in addition to block I/O, an interface for making use of the advanced processing provided by the application-oriented function part 411.
With reference to FIG. 5, the arrangement of a recently proposed high-function NAS storage will be described.
One or more high-function NAS storages 503 may be connected to one or more computers 501, 501′, etc. through a network 502. The high-function NAS storage 503 comprises a storage controller 504 and a disk 509, and the storage controller 504 comprises a network controller 505, a buffer management part 506, buffer memory 507, a disk controller 508, and an application-oriented function part 511.
The network controller 505, the disk 509, the buffer management part 506, the buffer memory 507, and the disk controller 508 have the same functionality as the network controller 305, the disk 309, the buffer management part 306, the buffer memory 307, and the disk controller 308, respectively.
The application-oriented function part 511 provides advanced functions for specific applications, such as sorting, image processing, and basic operations of a database system (selection processing, projection processing, concatenation, aggregation processing, and so on). The high-function NAS interface 510 may have, in addition to block I/O, an interface for making use of the advanced processing provided by the application-oriented function part 511. The systems described in reference #4 and reference #5 may download the functions of the application-oriented function part 511 from external devices.
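As a rough illustration of how such an interface differs from block I/O alone, the following sketch models a storage that accepts downloaded functions and runs them next to the data. The class `HighFunctionStorage`, its methods `download` and `invoke`, the helper functions `select` and `aggregate_sum`, and the record layout are hypothetical names chosen for this example; they are not taken from references #3 to #5.

```python
class HighFunctionStorage:
    """Hypothetical model of a storage with an application-oriented function part."""

    def __init__(self, records):
        self._records = list(records)  # stand-in for data held on the disk
        self._functions = {}           # functions installed in the storage

    def read_all(self):
        """Conventional path: ship every record to the computer."""
        return list(self._records)

    def download(self, name, func):
        """Install a function supplied by an external device (assumed trusted here)."""
        self._functions[name] = func

    def invoke(self, name, *args):
        """Run an installed function inside the storage and return only its result."""
        return self._functions[name](self._records, *args)


def select(records, column, value):
    """Selection processing: keep the records whose column equals value."""
    return [r for r in records if r[column] == value]


def aggregate_sum(records, column):
    """Aggregation processing: sum one column over all records."""
    return sum(r[column] for r in records)
```

With `select` and `aggregate_sum` downloaded, a computer could call `storage.invoke("select", "dept", "A")` and receive only the matching records, instead of calling `read_all` and filtering on its own processor. The sketch deliberately ignores how a downloaded function is checked or isolated.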
In order to achieve storage that connects directly to a network and provides advanced functions, the most fundamental problem to be solved is that the storage must continuously provide functions effective for a vast range of applications. If the range of applications is sufficiently vast, the market will be larger, and a larger market may in turn lead to lower development cost and higher development speed. Although a number of machines dedicated to database use have been proposed, it was difficult for these machines to be competitive enough to survive among multi-purpose machines built from general-purpose devices, because the database-specific machines lacked versatility and therefore did not attract sufficient investment in development.
In order to achieve storage that continuously provides functionality effective for a vast range of applications, there are three keys: higher extensibility of functionality, lower development cost of functionality, and a higher cost-performance ratio in terms of Total Cost of Ownership (TCO). These are the problems to be solved by the present invention.
In the prior art, extensibility has been addressed by downloading advanced functions for each application to the storage as needed; however, development cost and TCO have not been sufficiently considered.
In the systems described in references #1 and #2, the filesystem is built on block access, so the range of applications is limited. Although the system described in reference #3 is intended to provide a plurality of advanced functions, the way in which these functions are provided is not made clear. Reference #4 proposes a plurality of advanced functions achieved on a software layer close to a conventional operating system (OS). However, the structure of the software corresponding to the conventional OS differs for every application. For example, a relational database management system (RDBMS) does not use the filesystem provided by the OS, and thus does not require a filesystem. This means that even if a conventional software layer were directly applied to storage, it would be difficult to address the vast range of applications to be covered by the storage. The system described in reference #5 is still at the design stage, and is intended for use with an RDBMS, so its range of applications is limited.
The requirements of development cost and TCO, which have not been sufficiently taken into account in the prior art, will be further considered hereinbelow.
Concerning development cost, if the advanced functions for each application are developed separately, the development cost will increase, resulting in weakened competitiveness. Thus program modules (referred to as “modules” hereinafter) that achieve the advanced functions designated for each application should be developed at lower cost. The modules for achieving advanced functions differ for every application. If the common part of the advanced functions is extracted and shared, duplicated development and debugging of that common part will no longer be required, so that lower development cost will be realized. In addition, it is anticipated that the development cost will be further reduced if a mechanism that runs a developed module at high speed is provided, since the development time required for tuning the module may be shortened.
With respect to the requirement of development cost, as can be seen, the following problems need to be solved:
    Providing storage with a common part of sophisticated functions for a plurality of applications;
    Achieving said common part with lower development cost;
    Using said common part for achieving advanced functions for a plurality of applications;
    Implementing protection when said common part is called by the advanced functions for a plurality of applications;
    Implementing mutual exclusion when said common part is called by the advanced functions for a plurality of applications; and
    Providing a mechanism for faster execution of modules.
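The protection and mutual exclusion problems listed above can be pictured with a small sketch. It assumes a hypothetical common part (`CommonPart`, here a shared counter table) that several modules call concurrently, and a hypothetical `ModuleHandle` that grants each module only a subset of operations. These names and the lock-based approach are illustrative assumptions, not the mechanism of the present invention.

```python
import threading


class CommonPart:
    """Hypothetical common part shared by the modules of several applications."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}

    def add(self, key, amount):
        # Mutual exclusion: only one caller updates the table at a time.
        with self._lock:
            self._counters[key] = self._counters.get(key, 0) + amount
            return self._counters[key]

    def get(self, key):
        with self._lock:
            return self._counters.get(key, 0)


class ModuleHandle:
    """Protection: a module may call only the operations granted to it."""

    def __init__(self, common, allowed):
        self._common = common
        self._allowed = frozenset(allowed)

    def call(self, op, *args):
        if op not in self._allowed:
            raise PermissionError(f"operation {op!r} not granted")
        return getattr(self._common, op)(*args)
```

A module holding `ModuleHandle(common, {"get"})` can read the shared state but is refused when it attempts `add`, while concurrent `add` calls from other modules are serialized by the lock. Python itself does not isolate modules from one another, so the handle only illustrates the idea of protection rather than enforcing it.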
With respect to the requirement of total cost of ownership, taking into account the fact that a plurality of storages may coexist on a network, the following problem needs to be solved:
    Distributing modules to plural storages if they exist.