1. Field of the Invention
This invention generally relates to digital data storage and, more particularly, to a system and method for directly copying file data into disk on NAS, without creating an intermediate copy of the data in kernel memory.
2. Description of the Related Art
OS: Operating System.
FS: File System.
TCP/IP: Transmission Control Protocol/Internet Protocol: The most widely used protocol suite for data networking.
NAS: Network Attached Storage: A system comprising hard disk drives and specialized software running the CIFS protocol. This system exports data storage over a network, through which multiple clients can share and store their data.
CIFS: Common Internet File System: The commonly used protocol for sharing data storage in NAS.
SMB/Samba: Server Message Block Protocol: A popular implementation of the CIFS protocol used in the data storage industry.
CIFS Server: The software module that runs on a NAS subsystem and exports File System Shares to be accessed by CIFS clients over a TCP/IP Network.
DRAM: Dynamic Random Access Memory.
DMA: Direct Memory Access: Data transfer performed by a separate hardware block without CPU intervention.
As noted in Wikipedia, a conventional computer operating system usually segregates virtual memory into kernel space and user space. Kernel space is strictly reserved for running the kernel, kernel extensions, and most device drivers. In contrast, user space is the memory area where all user mode applications work and this memory can be swapped out when necessary. Each user space process or application normally runs in its own virtual memory space, and, unless explicitly requested, cannot access the memory of other processes. This is the basis for memory protection in today's mainstream operating systems, and a building block for privilege separation.
In computing, the kernel is the central component of most computer operating systems; it is a bridge between applications and the actual data processing done at the hardware level. The kernel's responsibilities include managing the system's resources (the communication between hardware and software components). Usually as a basic component of an operating system, a kernel can provide the lowest-level abstraction layer for the resources (especially processors and I/O devices) that application software must control to perform its function. It typically makes these facilities available to application processes through inter-process communication mechanisms and system calls.
The kernel is also understood to be the part of the operating system that is mandatory and common to all other software applications. The existence of a kernel is a natural consequence of designing a computer system as a series of abstraction layers, each relying on the functions of layers beneath it. The kernel, from this viewpoint, is simply the name given to the lowest level of abstraction that is implemented in software. A kernel does not typically execute directly, but only in response to external events (e.g., via system calls used by applications to request services from the kernel, or via interrupts used by the hardware to notify the kernel of events).
The kernel's primary purpose is to manage the computer's resources and allow other programs to run and use these resources. Typically, the resources consist of a Central Processing Unit (CPU) or processor, which is responsible for running or executing programs. The kernel takes responsibility for deciding at any time which of the many running programs should be allocated to the processor or processors. Another resource managed by the kernel is the computer's memory. Memory is used to store both program instructions and data. The kernel is responsible for deciding which memory each process can use. Other managed resources are the Input/Output (I/O) devices present in the computer, such as keyboard, mouse, disk drives, printers, displays, and NAS. The kernel allocates requests from applications to perform I/O to an appropriate device (or subsection of a device, in the case of files on a disk or windows on a display) and provides convenient methods for using the device (typically abstracted to the point where the application does not need to know implementation details of the device).
A main task of a kernel is to allow the execution of applications and support them with features such as hardware abstractions. A process or application defines which memory portions the application can access. To run an application, a kernel typically sets up an address space for the application, loads the file containing the application's code into memory, sets up a stack for the program, and branches to a given location inside the program, thus starting its execution.
The kernel has full access to the system's memory and must allow processes to safely access this memory as they require it. Often the first step in doing this is virtual addressing, usually achieved by paging and/or segmentation. Virtual addressing allows the kernel to make a given physical address appear to be another address, the virtual address. Virtual address spaces may be different for different processes; the memory that one application accesses at a particular (virtual) address may be different memory from what another application accesses at the same address. This allows every application to behave as if it is the only one (apart from the kernel) running and thus prevents applications from crashing each other.
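By way of illustration only, the per-process address translation described above can be sketched as follows. The page size, the dictionary-based page tables, and the names `translate`, `process_a`, and `process_b` are assumptions introduced for this example and do not describe any particular kernel.

```python
PAGE_SIZE = 4096  # bytes per page; a common value, assumed for this sketch


def translate(page_table, virtual_address):
    """Map a virtual address to a physical address via a per-process page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame_number = page_table[page_number]  # KeyError models an unmapped page
    return frame_number * PAGE_SIZE + offset


# Two hypothetical processes map the same virtual page (2) to different frames.
process_a = {2: 7}
process_b = {2: 11}

virtual_address = 2 * PAGE_SIZE + 100
physical_a = translate(process_a, virtual_address)
physical_b = translate(process_b, virtual_address)
```

In this sketch the same virtual address resolves to different physical memory for each process, which is why one application cannot overwrite the memory of another.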
In many systems, an application's virtual address may refer to data which is not currently in memory. The layer of indirection provided by virtual addressing allows the operating system to use other data stores, like a hard drive, to store what would otherwise have to remain in main memory (RAM). As a result, operating systems can allow programs to use more memory than the system has physically available. When a program needs data which is not currently in RAM, the CPU signals to the kernel that this has happened, and the kernel responds by writing the contents of an inactive memory block to disk (if necessary) and replacing it with the data requested by the program. The application can then be resumed from the point where it was stopped. This scheme is generally known as demand paging.
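A toy model of demand paging, offered only as a sketch, is given below. The `DemandPager` class, the least-recently-used eviction policy, and the 8-byte page contents are assumptions made for the example; real kernels use more elaborate replacement policies and hardware-assisted fault handling.

```python
from collections import OrderedDict


class DemandPager:
    """Toy model: a fixed number of RAM frames backed by a 'disk' dictionary."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.ram = OrderedDict()  # page number -> contents, least recently used first
        self.disk = {}            # page number -> contents that were swapped out
        self.page_faults = 0

    def access(self, page):
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return self.ram[page]
        # Page fault: the requested page is not resident in RAM.
        self.page_faults += 1
        if len(self.ram) >= self.num_frames:
            # Evict the least recently used page and write it out to disk.
            victim, contents = self.ram.popitem(last=False)
            self.disk[victim] = contents
        # Bring the page in from disk, or create a fresh zeroed page.
        self.ram[page] = self.disk.pop(page, bytearray(8))
        return self.ram[page]


pager = DemandPager(num_frames=2)
pager.access(0)[0] = 42     # fault: page 0 is loaded and modified
pager.access(1)             # fault: page 1 is loaded
pager.access(2)             # fault: page 0 is evicted to disk
value = pager.access(0)[0]  # fault: page 0 is restored from disk, page 1 evicted
```

The program touches three pages while only two frames exist, yet the modified byte in page 0 survives the round trip to the backing store.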
Virtual addressing also allows creation of virtual partitions of memory in two disjoint areas, one being reserved for the kernel (kernel space) and the other for the applications (user space). The applications are not permitted by the processor to address kernel memory, thus preventing an application from damaging the running kernel. This fundamental partition of memory space has contributed much to the current design of actual general-purpose kernels and is almost universal.
To perform useful functions, applications need to access peripherals connected to the computer, which are controlled by the kernel through device drivers. As device management is a very OS-specific topic, these drivers are handled differently by each kind of kernel design, but in every case, the kernel has to provide the drivers access to peripherals through some port or memory location.
To actually perform useful work, an application must be able to access the services provided by the kernel. This is implemented differently by each kernel, but most provide a C library or an API, which in turn invokes the related kernel functions. Some examples of invocation include a software-simulated interrupt, a call gate, a special system-call instruction, or a memory-based queue.
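As a minimal sketch of an application reaching kernel services through a library API, the example below uses the Python standard library `os` module, whose `os.pipe`, `os.write`, and `os.read` functions wrap the corresponding operating system calls. The variable names are chosen for this example only.

```python
import os

# Each call below passes through the language runtime into a kernel service;
# the application never touches the kernel's pipe buffer directly.
read_fd, write_fd = os.pipe()                        # kernel creates the pipe
bytes_written = os.write(write_fd, b"hello kernel")  # user buffer -> kernel buffer
data = os.read(read_fd, 64)                          # kernel buffer -> user buffer
os.close(read_fd)
os.close(write_fd)
```

Note that even this small exchange copies the data twice across the boundary between user space and kernel space.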
In computer networking, an Internet socket or network socket is an endpoint of a bidirectional inter-process communication flow across an Internet Protocol-based computer network, such as the Internet. The term Internet sockets is also used as a name for an application programming interface (API) for the TCP/IP protocol stack, usually provided by the operating system. Internet sockets constitute a mechanism for delivering incoming data packets to the appropriate application process or thread, based on a combination of local and remote IP addresses and port numbers. Each socket is mapped by the operating system to a communicating application process or thread.
A socket address is the combination of an IP address (the location of the computer) and a port (which is mapped to the application program process) into a single identity, much like one end of a telephone connection is the combination of a phone number and a particular extension.
An Internet socket is characterized by a unique combination of the following:
a local socket address: Local IP address and port number;
a remote socket address: Only for established TCP sockets, which is necessary since a TCP server may serve several clients concurrently. The server creates one socket for each client, and these sockets share the same local socket address;
a transport protocol (e.g., TCP, UDP), raw IP, or others.
Within the operating system and the application that created a socket, the socket is referred to by a unique integer number called socket identifier or socket number. The operating system forwards the payload of incoming IP packets to the corresponding application by extracting the socket address information from the IP and transport protocol headers and stripping the headers from the application data.
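The socket properties listed above can be observed with a short loopback experiment, sketched below using the Python standard library `socket` module. The loopback address, the use of port 0 (which asks the operating system to pick a free port), and the variable names are assumptions for this example.

```python
import socket

# Listening socket on the loopback interface; port 0 lets the OS choose a port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(2)
server_address = listener.getsockname()

# Two clients connect to the same server address.
clients = [socket.create_connection(server_address) for _ in range(2)]
accepted = [listener.accept()[0] for _ in clients]

# The server holds one socket per client. They share a local socket address
# and differ in remote socket address and in socket identifier.
local_addresses = {conn.getsockname() for conn in accepted}
remote_addresses = {conn.getpeername() for conn in accepted}
client_addresses = {sock.getsockname() for sock in clients}
descriptors = {conn.fileno() for conn in accepted}

for sock in clients + accepted + [listener]:
    sock.close()
```

Here `fileno()` returns the integer by which the operating system and the application refer to each socket.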
A file system is a method of storing and organizing computer files and their data. Essentially, it organizes these files into a database for the storage, organization, manipulation, and retrieval by the computer's operating system. Most file systems make use of an underlying data storage device that offers access to an array of fixed-size physical sectors, generally a power of 2 in size (512 bytes or 1, 2, or 4 KB are most common). The file system is responsible for organizing these sectors into files and directories, and keeping track of which sectors belong to which file and which are not being used. Most file systems address data in fixed-sized units called “clusters” or “blocks” which contain a certain number of disk sectors (usually 1-128). This is the smallest amount of disk space that can be allocated to hold a file. The file system is typically an integral part of any modern operating system.
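The relationship between sectors, blocks, and file data can be illustrated with a little arithmetic. In the sketch below, the sector size of 512 bytes comes from the common sizes named above, while the choice of 8 sectors per block and the helper names `blocks_allocated` and `blocks_touched` are assumptions for this example.

```python
SECTOR_SIZE = 512      # bytes per sector (one of the common sizes)
SECTORS_PER_BLOCK = 8  # assumed for this sketch; yields 4 KB blocks
BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_BLOCK


def blocks_allocated(file_size):
    """Number of whole blocks needed to hold file_size bytes."""
    return -(-file_size // BLOCK_SIZE)  # ceiling division


def blocks_touched(offset, length):
    """Indices of the blocks covered by a write of length bytes at offset."""
    if length <= 0:
        return range(0)
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return range(first, last + 1)
```

For example, a 1-byte file still occupies one full block, and a 200-byte write at offset 4000 spans blocks 0 and 1 because it crosses the 4096-byte boundary.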
FIGS. 1A and 1B are a schematic diagram of a file system writing data into a NAS (prior art). Conventional file systems write into NAS using 6 DRAM accesses, which reduces performance and increases system resource utilization. The data, in the form of Ethernet frames, enters the NAS subsystem through the Ethernet port and is written to DRAM, which results in a memory write (1W). A NAS/Samba server listens on a TCP socket port to receive data (from a client) that is to be written to the file system on the NAS subsystem. After this, the TCP stack performs re-assembly and re-ordering of the data into TCP segments. These TCP segments are copied to DRAM socket buffers in user space in the context of the NAS/Samba server application. This involves a read access and a write access of DRAM (2R and 3W). The NAS/Samba server then performs a file system write of this data at the specified offset within the file denoted by the file descriptor. This involves a read from the user buffers and a write to the kernel file system buffers in DRAM (4R and 5W). Finally, the data from the file system buffers in DRAM is asynchronously written to the file system on disk. This involves a read of the file system buffers in DRAM (6R).
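The user-space portion of this conventional path can be sketched as a receive loop followed by a positioned write. The sketch below is not the Samba implementation; the function `receive_and_write`, its parameters, and the local socket pair standing in for a client connection are assumptions for illustration, and `os.pwrite` is available on POSIX systems only.

```python
import os
import socket
import tempfile


def receive_and_write(conn, fd, offset, length, chunk_size=65536):
    """Copy length bytes from a connected socket into a file at offset."""
    written = 0
    while written < length:
        # Kernel socket buffers -> user-space buffer (the 2R/3W step).
        chunk = conn.recv(min(chunk_size, length - written))
        if not chunk:
            break  # peer closed the connection early
        # User-space buffer -> kernel file system buffers (the 4R/5W step).
        view = memoryview(chunk)
        while view:
            n = os.pwrite(fd, view, offset + written)
            written += n
            view = view[n:]
    return written


# Usage: a local socket pair stands in for the client connection.
client, server = socket.socketpair()
payload = b"x" * 1000
client.sendall(payload)
client.close()

with tempfile.TemporaryFile() as f:
    total = receive_and_write(server, f.fileno(), offset=512, length=len(payload))
    f.seek(512)
    stored = f.read()
server.close()
```

Every byte of the payload passes through the user-space buffer `chunk`, which is the intermediate copy that accounts for two of the six DRAM accesses.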
FIGS. 2A and 2B are a schematic diagram of a conventional zero-copy NAS write mechanism (prior art). In the system of FIGS. 2A and 2B, the number of DRAM accesses is reduced to 4, which significantly improves performance and at the same time reduces the amount of system resources required to perform the file system write. As data, in the form of Ethernet frames, enters the NAS subsystem through the Ethernet port, it is written to DRAM, which results in a memory write (1W). After this, the TCP stack performs re-assembly and re-ordering of the data into TCP segments. The NAS zero-copy logic copies data from multiple TCP segments to the file system buffers in DRAM at the specified offset within the file denoted by the file descriptor. This involves a read from TCP segments in DRAM and a write to file system blocks in DRAM (2R and 3W). Finally, the data from the file system buffers in DRAM is asynchronously written to the file system on disk (i.e., block device or NAS subsystem). This involves a read of the file system buffers in DRAM (4R).
The NAS file system write method of FIGS. 1A and 1B involves 6 accesses to DRAM, while the method shown in FIGS. 2A and 2B involves 4 DRAM accesses. In both of these methods, NAS file system write performance is degraded due to multiple accesses of DRAM.
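The two access counts can be tallied in a simple model, sketched below. The step descriptions paraphrase the two prior-art paths described above; the function `dram_traffic` and the simplifying assumption that every access moves the entire payload (ignoring protocol headers and file system metadata) are introduced for this example only.

```python
# Each entry is (description, access type), in the order the accesses occur.
CONVENTIONAL_PATH = [
    ("Ethernet frames written to DRAM", "W"),
    ("TCP segments read for copy to user space", "R"),
    ("user-space socket buffers written", "W"),
    ("user buffers read by file system write", "R"),
    ("kernel file system buffers written", "W"),
    ("file system buffers read for write to disk", "R"),
]

ZERO_COPY_PATH = [
    ("Ethernet frames written to DRAM", "W"),
    ("TCP segments read by zero-copy logic", "R"),
    ("file system buffers written", "W"),
    ("file system buffers read for write to disk", "R"),
]


def dram_traffic(path, payload_bytes):
    """Bytes moved through DRAM, assuming each access touches the whole payload."""
    return len(path) * payload_bytes


conventional = dram_traffic(CONVENTIONAL_PATH, 1_000_000)
zero_copy = dram_traffic(ZERO_COPY_PATH, 1_000_000)
```

Under these assumptions, a 1 MB file write moves 6 MB through DRAM on the conventional path and 4 MB on the zero-copy path, so a third of the memory traffic is saved but each payload byte still crosses the memory bus four times.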
It would be advantageous if a NAS file system could be made to operate with a reduced number of DRAM accesses.