FIG. 1 shows a typical storage area network 100 serving client computer 102 and computer file server system 104. Client 102 and server 104 are in communication via network 106.
Client computer 102 can include a processor 108 coupled via bus 110 to network port 112, fiber port 114 and memory 116. Processor 108 can be, for example, an Intel Pentium® 4 processor, manufactured by Intel Corp. of Santa Clara, Calif. As another example, processor 108 can be an Application Specific Integrated Circuit (ASIC). An example of bus 110 is a peripheral component interconnect (“PCI”) local bus, which is a high performance bus for interconnecting chips (e.g., motherboard chips, mainboard chips, etc.), expansion boards, processor/memory subsystems, and so on.
Network port 112 can be an Ethernet port, a serial port, a parallel port, a Universal Serial Bus (“USB”) port, an Institute of Electrical and Electronics Engineers, Inc. (“IEEE”) 1394 port, a Small Computer Systems Interface (“SCSI”) port, a Personal Computer Memory Card International Association (“PCMCIA”) port, and so on. Memory 116 of client computer 102 can store a plurality of instructions configured to be executed by processor 108. Memory 116 may be a random access memory (RAM), a dynamic RAM (DRAM), a static RAM (SRAM), a volatile memory, a non-volatile memory, a flash RAM, polymer ferroelectric RAM, Ovonics Unified Memory, magnetic RAM, a cache memory, a hard disk drive, a magnetic storage device, an optical storage device, a magneto-optical storage device, or a combination thereof.
Client computer 102 can be coupled to server computer 104 via network 106. Server 104 can be, for example, a Windows NT server from Hewlett-Packard Company of Palo Alto, Calif., a UNIX server from Sun Microsystems, Inc. of Palo Alto, Calif., and so on. Server 104 can include a processor 118 coupled via bus 120 to network port 122, fiber port 124 and memory 126. Examples of network 106 include a Wide Area Network (WAN), a Local Area Network (LAN), the Internet, a wireless network, a wired network, a connection-oriented network, a packet network, an Internet Protocol (IP) network, or a combination thereof.
As used to describe embodiments of the present invention, the terms “coupled” or “connected” encompass a direct connection, an indirect connection, or any combination thereof. Similarly, two devices that are coupled can engage in direct communications, in indirect communications, or any combination thereof. Moreover, two devices that are coupled need not be in continuous communication, but can be in communication typically, periodically, intermittently, sporadically, occasionally, and so on. Further, the term “communication” is not limited to direct communication, but also includes indirect communication.
Embodiments of the present invention relate to data communications via one or more networks. The data communications can be carried by one or more communications channels of the one or more networks. A network can include wired communication links (e.g., coaxial cable, copper wires, optical fibers, a combination thereof, and so on), wireless communication links (e.g., satellite communication links, terrestrial wireless communication links, satellite-to-terrestrial communication links, a combination thereof, and so on), or a combination thereof. A communications link can include one or more communications channels, where a communications channel carries communications. For example, a communications link can include multiplexed communications channels, such as time division multiplexing (“TDM”) channels, frequency division multiplexing (“FDM”) channels, code division multiplexing (“CDM”) channels, wave division multiplexing (“WDM”) channels, a combination thereof, and so on.
In accordance with an embodiment of the present invention, instructions configured to be executed by a processor to perform a method are stored on a computer-readable medium. The computer-readable medium can be a device that stores digital information. For example, a computer-readable medium includes a compact disc read-only memory (CD-ROM) as is known in the art for storing software. The computer-readable medium is accessed by a processor suitable for executing instructions configured to be executed. The terms “instructions configured to be executed” and “instructions to be executed” are meant to encompass any instructions that are ready to be executed in their present form (e.g., machine code) by a processor, or require further manipulation (e.g., compilation, decryption, or provided with an access code, etc.) to be ready to be executed by a processor.
Storage area network 100 includes a plurality of networked storage devices 128 accessible via fiber router 130. Networked storage devices 128 may include, for example, one or more hard disk drives 132, 134, and 136, optical storage device 138, removable storage device 140, or other such storage devices. Fiber router 130 may be, for example, Chaparal FVS113, Crossroads 4250, ATTO Fiber Bridge 3200. Information stored on storage devices 128 may be accessible to client computer 102 and server computer 104 as if the devices were directly attached to the computers. For example, storage area on disk 132 may be “mounted” on server 104 and storage area on disk 134 may be mounted on client 102. From the perspective of applications running on those computers, the storage areas will appear as if they are directly attached to the respective computer system.
In typical client-server environments, a client computer may need to read data stored on the server system or may need to write data to the server system. Conventional systems and methods for accomplishing such tasks have not been optimized to take advantage of storage area networks such as those shown in FIG. 1. For example, a conventional process for writing data from the client into a file on a server follows the communications flow shown in FIG. 2. In this example, client 102 has data stored on disk 132 that needs to be transferred for storage by server 104. In FIG. 2, the transactions that are represented by solid lines consist of messages or data that are sent between the client and server computers. The dashed lines represent the actual interactions of client 102 and server 104 with networked storage devices 128 accessed via router 130.
In step 201, client 102 initiates a data write request by informing server 104 that the client has data to be written to a file maintained by server 104. In step 202, server 104 creates a new empty file on one of the networked storage devices 128, such as hard disk 134. In step 203, server 104 sends a message to client 102 informing client 102 that a file has been created. In steps 204 and 205, client 102 retrieves data from hard disk 132. In step 206, client 102 sends the data to server 104 with instructions to write the data to the new file. In step 207, server 104 writes the data to the new file on hard disk 134. In steps 208-215, client 102 retrieves data and server 104 writes data as described until all of the data has been transferred from client 102 to server 104.
This conventional method of data transfer does not result in an efficient file transfer between the two systems. Particularly, as shown in FIG. 2, the communications flow is not optimized because data that only needs to be moved from one physical location to another physical location within a single storage area network 100 is instead transferred out of the storage area network. Specifically, the data flows from storage area network 100 to client 102 via router 130. Client 102 then transfers the data to server 104 via network 106. Server 104 finally transfers the data back to storage area network 100 via router 130.
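By way of illustration only, the conventional flow of FIG. 2 can be sketched as a short simulation that tallies how much data each link carries. The function name, the link labels, and the chunk size below are hypothetical and chosen for the example; the sketch models only the data path described above, not any particular protocol.

```python
from collections import Counter

CHUNK_SIZE = 4  # units of data per transfer; an arbitrary illustrative value


def conventional_write(source: bytes, chunk_size: int = CHUNK_SIZE):
    """Copy `source` (standing in for data on disk 132) to a new file
    (standing in for a file on disk 134) by way of the client and server.

    Returns the destination contents and a tally of bytes carried per link.
    """
    traffic = Counter()
    destination = bytearray()  # step 202: server creates a new, empty file
    for offset in range(0, len(source), chunk_size):
        chunk = source[offset:offset + chunk_size]
        # Steps 204-205: client reads the chunk from the storage area network.
        traffic["router 130 -> client 102"] += len(chunk)
        # Step 206: client sends the chunk to the server over network 106.
        traffic["client 102 -> server 104"] += len(chunk)
        # Step 207: server writes the chunk back to the storage area network.
        traffic["server 104 -> router 130"] += len(chunk)
        destination.extend(chunk)
    return bytes(destination), traffic


data = b"0123456789"
copied, traffic = conventional_write(data)
for link, count in traffic.items():
    print(f"{link}: {count} bytes")
```

In this sketch every byte is carried three times, once per link, even though the source and destination both reside within storage area network 100.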
Another inefficiency problem associated with conventional file transfer systems is that the server cannot optimize its storage of the data because it does not have enough information to manage the data transfer operation. This is applicable to storage area networks such as those shown in FIG. 1, as well as client-server systems wherein data is stored in locally-attached storage devices. Initially, the client requests that the server create a new, empty file. The server responds when it has done so. From that point onward, the client writes a subset of the file's total data in each of a sequence of write operations. The server may or may not acknowledge the receipt of the data, depending on the specifics of the protocol used. Similarly, when the client has written all the file's data to the file on the server, it may issue a final request or not, depending on the protocol used.
FIG. 3 illustrates the above-described inefficiency problem in more detail. In step 302, client 300 initiates a request to transfer data to server 301. In step 303, server 301 responds to the request by indicating that a new empty file has been created. In steps 304-305, client 300 sends one or more data packets until the entire file has been transferred from client 300 to server 301. Because server 301 does not have complete information about the data being transferred, the data is subsequently written to the new file in pieces of varying size. This may result in an inefficient utilization of available disk space. If multiple files are to be transferred, then steps 302-306 must be repeated, as shown in steps 307 and 308.
The conventional method as described is widely used for populating the data space of a file, and is effective when the amount and content of the data cannot be known in advance. However, because the server is only exposed to a subset of the total set of write data operations at any given time, the server's opportunities for optimization are limited. Particularly, the server cannot determine which available storage locations within a storage medium would best be suited for storage of the file, because the file's ultimate size is unknown. Further, the server cannot specify the order in which the client should send the data, or, in cases where the client will ultimately send more than one file to the server, the sequence of the files. This deficiency is particularly pronounced in storage area networks, where it is typical for a client to transfer numerous files having particular contents and sizes known only to the client. In such environments, the transfer of file contents on a piecemeal basis results in a diminished data transfer rate.
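As a simplified illustration of why an unknown file size limits placement, the following sketch compares two hypothetical allocation routines over a list of free extents. The extent sizes, the function names, and the fill-in-order and best-fit policies are assumptions made for the example and are not taken from any particular server.

```python
def place_piecemeal(free_extents, chunk_sizes):
    """Place chunks as they arrive, filling free extents in order.

    Returns the indices of the extents the file ends up occupying.
    """
    remaining = list(free_extents)
    used = []
    index = 0
    for size in chunk_sizes:
        while size > 0:
            if index >= len(remaining):
                raise ValueError("out of space")
            if remaining[index] == 0:
                index += 1  # current extent is full; spill into the next one
                continue
            take = min(size, remaining[index])
            remaining[index] -= take
            size -= take
            if index not in used:
                used.append(index)
    return used


def place_with_known_size(free_extents, total_size):
    """Pick the smallest single extent that holds the whole file (best fit)."""
    candidates = [(extent, i) for i, extent in enumerate(free_extents)
                  if extent >= total_size]
    if not candidates:
        raise ValueError("no single extent is large enough")
    return [min(candidates)[1]]


free_extents = [3, 8, 5]  # free space per extent, in blocks
chunks = [2, 2, 1]        # a 5-block file arriving in three writes

piecemeal = place_piecemeal(free_extents, chunks)
informed = place_with_known_size(free_extents, sum(chunks))
print("piecemeal placement uses extents:", piecemeal)
print("placement with known size uses extents:", informed)
```

Under these assumptions, the piecemeal routine splits the file across two extents, whereas a routine told the total size in advance can place it in the one extent that fits exactly.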
Another serious limitation in utilizing the conventional methods of transferring data as illustrated in FIGS. 2 and 3 arises when files are to be moved from a client to one or more removable-media devices on a server. In such systems, a server may manage a series of pieces of media, each of which has finite capacity. As data is placed on these media, each piece may have a different amount of space remaining. When these methods are employed, and data is written in a piecemeal manner, a server may store a file's data on a piece of media where it will ultimately not fit. In such a situation, it may be necessary to later move the partially-written file to a new location so that further write operations may take place.
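The relocation scenario described above can be sketched, under stated assumptions, as follows. The media names, capacities, and the policy of selecting the first medium with enough room are hypothetical and serve only to show how a partially-written file may have to be moved.

```python
def write_piecemeal(media_free, chunks):
    """Write chunks to removable media without knowing the file's total size.

    `media_free` maps a (hypothetical) medium name to its remaining capacity.
    Returns the medium holding the finished file and the amount of
    already-written data that had to be moved along the way.
    """
    free = dict(media_free)
    current = None  # medium currently holding the file
    written = 0     # amount of the file written so far
    moved = 0       # amount of data relocated between media
    for size in chunks:
        if current is None or free[current] < size:
            # Need a medium that holds everything written so far plus this chunk.
            needed = written + size
            target = next(
                (name for name, space in free.items()
                 if name != current and space >= needed),
                None,
            )
            if target is None:
                raise ValueError("no medium can hold the file")
            if current is not None:
                # Relocate the partially-written file to the new medium.
                free[current] += written
                free[target] -= written
                moved += written
            current = target
        free[current] -= size
        written += size
    return current, moved


media = {"medium A": 4, "medium B": 10}
final_medium, relocated = write_piecemeal(media, [3, 3])
print(f"file ends on {final_medium}; {relocated} units were relocated")
```

Here the first write lands on a medium with only four units free, so the second write forces the three units already stored to be moved before the transfer can continue.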
Accordingly, there is a need for a system and method for providing improved file transfer rates and efficient data placement on a data storage medium.
The general process for transferring data between client and server systems, described above, is also used in common network file sharing protocols, such as Network File System (NFS) and Common Internet File System (CIFS), wherein a client computer creates an empty file on a server, then writes data piecemeal to the file via the server.
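From an application's point of view, this create-then-write pattern is the same one used with ordinary file calls. The sketch below uses a local temporary directory as a stand-in for a mounted network share; the function name, file name, payload, and chunk size are illustrative assumptions, and no NFS or CIFS specifics are modeled.

```python
import os
import tempfile


def copy_in_chunks(path, data, chunk_size):
    """Create an empty file at `path`, then fill it one chunk at a time.

    Returns the number of write operations issued.
    """
    writes = 0
    # Opening in "wb" mode creates a new, empty file (or truncates an old one).
    with open(path, "wb") as handle:
        for offset in range(0, len(data), chunk_size):
            handle.write(data[offset:offset + chunk_size])
            writes += 1
    return writes


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.bin")
    payload = b"x" * 10_000
    write_count = copy_in_chunks(path, payload, 4096)
    size_on_disk = os.path.getsize(path)

print(f"{write_count} writes, {size_on_disk} bytes stored")
```

Nothing in this sequence tells the receiving side how large the file will be before the final write completes.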