Data networks continue to evolve with ever-higher speeds and more extensive topologies. In order to improve performance of such networks and troubleshoot problems, it is well known to monitor performance of networks through various techniques.
A Storage Area Network (SAN) is a data network that allows servers to access block-level data from storage devices. The storage devices usually consist of hard disk arrays or other storage devices that communicate with servers via the Small Computer System Interface (SCSI). In a SAN, the idea of one server directly accessing one storage device is expanded upon so that many servers can share disk arrays through multiple connections via switches and other network hardware.
Using the standard SCSI protocol, servers send read and write requests to the storage arrays via the switches and receive read and write responses that include the requested data or a completion status. Many servers can make read and write requests to one storage array and, conversely, one server can make read and write requests to many different storage arrays. The Fibre Channel (FC) protocol is a technology used for high-speed optical communications in the SAN; it delivers the commands that are encoded in the SCSI protocol. A server's SCSI request is encapsulated and converted to an optical signal by a Host Bus Adapter (HBA), travels along the network, being forwarded by one or several switches, and is decoded by the storage device back into SCSI form for processing.
Built into a conventional storage array is a queuing system that allows the storage array to hold multiple requests (from various sources) and prioritize them for efficiency. The increase in efficiency stems from the fact that the storage array reads from and writes to different physical regions of a physical hard drive. Every time the array needs to read from a different area of the physical hard disk, the read/write head must be physically repositioned. This takes a certain amount of time for movement and stabilization. If every read or write is done serially and in a different location, the movement and stabilization delay is incurred between every pair of consecutive read/write jobs. On the other hand, if a large queue of reads and writes is held, the storage array can rearrange the reads and writes and group them by physical hard disk location. This makes it possible to minimize repositioning of the read/write head between subsequent requests, decreasing the number of time delays. For every concurrent access against the same physical disk, the increase in efficiency decreases the service time (the time the disk spends working on the request). Decreasing the service time is generally preferable, as it allows the entire SAN to run faster.
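The benefit of reordering can be illustrated with a minimal sketch. It assumes a highly simplified model in which the delay is proportional to the distance the head travels between block positions; the function name and the block positions are hypothetical and chosen only for illustration, and real arrays use more elaborate scheduling logic.

```python
def total_head_travel(positions, start=0):
    """Sum of absolute head movements when servicing positions in the given order."""
    travel = 0
    head = start
    for pos in positions:
        travel += abs(pos - head)
        head = pos
    return travel


# Hypothetical block positions of five queued requests, in arrival order.
arrival_order = [900, 100, 800, 200, 700]

# Servicing strictly in arrival order forces long back-and-forth movements.
fifo_travel = total_head_travel(arrival_order)

# Grouping by physical location (here: a simple sort) lets the head sweep once.
reordered_travel = total_head_travel(sorted(arrival_order))

print(fifo_travel, reordered_travel)  # 3500 900
```

In this toy model, the reordered queue moves the head roughly a quarter of the distance required by the arrival order, which is the effect the queue is meant to exploit.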
When a read or write request is received and placed in the queue, the queue size increases by one; every time a read or write request is processed, the size of the queue decreases by one. The maximum size this queue can reach is called the “Target Port Queue Depth.” A common value of the Target Port Queue Depth is 2048. If the storage array's Target Port Queue Depth is 2048 and the current target port queue size is 2048, then a subsequent request forces the storage array to bounce the request back (unprocessed and unqueued) to the server with a “QFULL” (Task Set Full) message. This tells the server that the request will not be fulfilled and that the server must retry the request at a later time. Any server that sends a request at that point has the request denied and is sent a QFULL message. The Target Port Queue could have been filled by one server that sent 2048 requests or by 2048 servers that sent one request each. In fact, the storage array's queue can be filled with requests from any number of servers. Since receiving QFULL messages forces an HBA to send the same request more than once, and has other potentially more significant impacts depending on the implementation, filling the Target Port Queue to capacity adds to network inefficiency and typically should be avoided.
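The bounded-queue behavior described above can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, the depth is reduced to 4 so the example stays short, and completion is modeled as first-in, first-out purely for simplicity (a real array may reorder queued requests).

```python
from collections import deque


class TargetPortQueue:
    """Toy model of a storage array target port queue with a fixed depth."""

    def __init__(self, depth=2048):
        self.depth = depth
        self.pending = deque()

    def submit(self, request):
        # A full queue bounces the request back unprocessed and unqueued.
        if len(self.pending) >= self.depth:
            return "QFULL"
        self.pending.append(request)
        return "QUEUED"

    def complete_one(self):
        # Processing a request shrinks the queue by one (FIFO is a simplification).
        return self.pending.popleft() if self.pending else None


port = TargetPortQueue(depth=4)

# Five servers each send one request; the fifth finds the queue full.
results = [port.submit(("server-%d" % i, "READ")) for i in range(5)]

# Once one request has been processed, the rejected server can retry.
port.complete_one()
retry = port.submit(("server-4", "READ"))

print(results, retry)
```

The queue does not distinguish who filled it: the same rejection would occur if a single server had submitted all of the queued requests.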
One metric that is helpful in measuring network inefficiency is the request response time. The request response time is the service time added to the waiting time for a particular request. Whereas decreasing the service time is generally preferable, increasing or decreasing the response time needs to be evaluated on a case-by-case basis according to application requirements. A real-time data processing system might need a very fast response time, whereas a data backup might not be slowed by a very long response time (in the latter case it is the total job time that matters, not the response time of an individual read/write request).
SAN-attached servers have a connected HBA with a built-in configuration setting that can toggle its ability to queue its data requests. This queuing of data requests refers to whether the HBA allows multiple requests to the same target before a response has been received for the previous outstanding requests. Not only can the HBA be configured to allow or prohibit this feature, but the maximum number of unanswered requests per target and logical unit number (LUN) can also be set. This is called the maximum queue depth of the HBA or the LUN queue depth. The LUN queue depth determines how many commands the HBA is willing to accept and process per target-LUN pair. From this point in this documentation, LUN queue depth, maximum queue depth and queue depth will be used interchangeably, all referring to the same setting. A maximum queue depth of one means that at most one request may be outstanding at a time (this is equivalent to having no queue). A maximum queue depth of two means that two outstanding requests may be made to the target before a request is responded to. When the initiator (server) receives a completed response, that request becomes answered and then only one request is outstanding. The server can then make another request, increasing the queue size again (provided the new size does not exceed the maximum queue depth).
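A minimal sketch of this per target-LUN limit is shown below. The class, its method names, and the target identifier are hypothetical and do not correspond to any particular HBA driver interface; the sketch only tracks a count of outstanding requests per target-LUN pair.

```python
from collections import defaultdict


class HbaQueueLimiter:
    """Toy model of an HBA enforcing a LUN queue depth per (target, LUN) pair."""

    def __init__(self, lun_queue_depth):
        self.lun_queue_depth = lun_queue_depth
        self.outstanding = defaultdict(int)

    def try_send(self, target, lun):
        key = (target, lun)
        # Hold the request back if the pair already has the maximum outstanding.
        if self.outstanding[key] >= self.lun_queue_depth:
            return False
        self.outstanding[key] += 1
        return True

    def on_response(self, target, lun):
        key = (target, lun)
        # A completed response frees one slot for that pair.
        if self.outstanding[key] > 0:
            self.outstanding[key] -= 1


hba = HbaQueueLimiter(lun_queue_depth=2)

# With a depth of two, the third request to the same pair must wait.
sent = [hba.try_send("T0", 0) for _ in range(3)]

# After one response arrives, another request may be issued.
hba.on_response("T0", 0)
after_response = hba.try_send("T0", 0)

# A different LUN on the same target has its own, independent count.
other_lun = hba.try_send("T0", 1)

print(sent, after_response, other_lun)
```

Setting `lun_queue_depth=1` in this model reproduces the no-queue case: each request must be answered before the next one is sent.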
The target queue depth and LUN queue depth are variables that directly affect the service time and the response time. The service time and response time are inversely related, i.e., lowering the service time tends to raise the response time and vice versa. To illustrate this relationship, consider that as a server sends more jobs simultaneously, the queue of the storage array will eventually grow to the point that the array cannot handle the requests as they are received. This enlargement of the storage array queue will lower the service time, as discussed, for the storage array globally. The server, though, will experience a larger response time for each request because there are potentially other requests being serviced by the disk first. Even though the individual requests are being satisfied by the disk faster, there are more items in the queue waiting to be serviced.
The disk service queue is not FIFO (first-in, first-out); the command order may be optimized to improve service times. For example, consider the case where the server sends five read requests, denoted A, B, C, D, and E, for data that reside on the same physical spindle. All of these requests are entered in the disk's queue, which subsequently decides, based on its own logic and understanding of the logical block addresses, to process B, C, D, E, and A in that exact order at the very front of the storage array queue. The response time of B is essentially its service time, with no waiting time. The response time of C is its service time plus the time spent waiting for B to be processed. The response time of D is its service time plus the waiting time for B and C, and so forth. The average response time is then the average of all the response times for B through A. Since A had to wait until the end of the queue, its response time is large and raises the average response time for all those jobs. Without a queue, the response time, for an otherwise empty storage array, would be just the service time. This is why the LUN queue depth acts as an optimization condition, with the total effect being to lower the service time, maximize the number of requests satisfied, and raise the response time. This creates a balance between the positive aspects of decreasing the service time and the potentially (sometimes but not always) negative aspect of raising the response time. Because of this, the maximum queue depth limit can be thought of as a knob that can continuously tune the performance of a SAN.
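The example above can be worked through numerically. The sketch below assumes, purely for illustration, that all five requests arrive at the same instant and that each takes the same 5 ms to service; neither the figure nor the function name comes from the text.

```python
def response_times(service_order, service_time_ms):
    """Response time of each request: its own service time plus the wait
    for every request serviced ahead of it."""
    elapsed = 0.0
    result = {}
    for name in service_order:
        elapsed += service_time_ms
        result[name] = elapsed
    return result


# The disk chooses to service the requests in the order B, C, D, E, A.
times = response_times(["B", "C", "D", "E", "A"], service_time_ms=5.0)
average = sum(times.values()) / len(times)

print(times)    # {'B': 5.0, 'C': 10.0, 'D': 15.0, 'E': 20.0, 'A': 25.0}
print(average)  # 15.0
```

Under these assumptions, B responds in one service time, A waits for all four other requests, and the average response time is three times the service time of a single unqueued request.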
Not all SAN administrators have modified the maximum queue depth limit configuration on the HBA (often, sufficient information on the optimal queue depth is unavailable, and so in the absence of guidance it is left at the default). Those who do alter the maximum queue depth limit usually do so following an extremely simplistic mathematical approach. This simplistic approach holds that the queue depth limit of an HBA should be the Target Port Queue Depth divided by the number of paths connected to the target port and further divided by the number of LUNs the host can access from that particular port:
QD_LUN = QD_TARGET / (Paths to Target × LUNs Accessible)
This treats all servers equally and divides the resources equally among all servers. This might be acceptable if the network were perfectly balanced, meaning that every server had the same number of read/write requests to every LUN, consistently at all times and, moreover, that the performance of every server was equally important. This, however, is a gross approximation. First, network traffic is rarely constant at all times, meaning that as one server gets busy other servers might be totally idle. Resources reserved for idle servers are wasted. Second, different servers may be more or less important depending on the applications that the servers host. Therefore, equally distributing the resources to servers deemed less important means that resources are wasted. Third, and possibly most importantly, this simplistic method does not take into account any timing metrics (e.g., response time or service time). Calculating available resources and dividing says nothing about how raising a particular server's queue depth limit will affect the response time of a particular server in the network. This means that the mathematical approach is unsatisfactory but, unfortunately for SAN management, has been the best option available.
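For reference, the simple division rule criticized above can be sketched in a few lines. The function name is hypothetical, the path and LUN counts in the example are made-up values, and rounding down with a floor of one is an assumption, since the text does not say how fractional results are handled.

```python
def simple_lun_queue_depth(target_port_queue_depth, paths_to_target, luns_accessible):
    """Equal-share rule: target port queue depth divided by paths and by LUNs."""
    if paths_to_target <= 0 or luns_accessible <= 0:
        raise ValueError("paths_to_target and luns_accessible must be positive")
    share = target_port_queue_depth // (paths_to_target * luns_accessible)
    # Assumption: never configure a depth below one outstanding request.
    return max(1, share)


# Hypothetical configuration: 2048-deep target port, 4 paths, 16 LUNs per path.
depth = simple_lun_queue_depth(2048, paths_to_target=4, luns_accessible=16)

# With many paths and LUNs the equal share collapses to the minimum.
crowded = simple_lun_queue_depth(2048, paths_to_target=8, luns_accessible=512)

print(depth, crowded)  # 32 1
```

The calculation uses only static counts; no response time, service time, or traffic measurement enters it, which is the limitation the preceding paragraph describes.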
A major roadblock in improving this method is that the SAN operates in many ways as a black box. The response times and the service times are neither measured nor recorded, and neither is the size of the queue. This leaves guesswork as the basis for further improvements. Thus, a challenge remains as to how further improvements in SAN performance can be achieved.
More broadly, challenges of network performance monitoring include minimizing the disruption to the network caused by the monitoring itself. Further challenges come from the complexity of issues relating to network performance and the corresponding ways to manage such performance.