The application relates generally to test equipment and relates more particularly to test equipment for networks, packet data, and communications links.
Historically, and more often before the time of the Internet, many networking and communications protocols were proprietary or at least administered by a controlling entity under circumstances limiting the extent to which third parties could attempt to make compatible equipment and software. In the case of a protocol or standard that was tightly controlled by a particular entity (typically a large corporation), it commonly happened that that entity took it upon itself to certify whether particular third-party equipment was or was not “compatible” with the protocol or standard. In the case of a proprietary standard or protocol it might develop that only the proprietary entity had the ability to make equipment intended to be “compatible,” perhaps due to patents or due to an unpublished “standard” known only to the entity. One natural consequence was that compatibility problems and standards-compliance problems occurred relatively rarely, and another was that barriers to entry often reduced or eliminated competition in particular markets.
In more recent times, there has been a trend toward “open” standards, perhaps paralleled to some extent by “open-source” and general-public-license software. Some open standards have come to be widely adopted, in some cases providing levels of cross-platform connectivity and interoperability which, as recently as twenty-five years ago, would have been difficult to imagine. For example, in recent years it has come to be possible to exchange data between almost any two operating systems with the help of the IP and TCP protocols. Likewise it has come to be possible to exchange email between almost any two email systems, regardless of the designer or the underlying operating system and hardware, with the help of the RFC (request for comments) 822 standard. These trends have led to the presence of multiple suppliers in some markets, and to improved performance and reduced prices in some markets.
If an IP packet is lost, nothing about the IP standard will detect the loss. Likewise, if two IP packets happen to arrive in a different sequence than they were sent, nothing about the IP standard will detect the change in sequence. To the extent that a system designer wishes to detect and deal with lost or reordered packets, there is no choice but to do it at some protocol level above IP. IPv4 packets do carry a header checksum (a 16-bit ones'-complement sum rather than a true cyclic redundancy check), and thus corruption of a packet header will usually be detected within the IP protocol level; corruption of the payload must be detected by a link-level check or by some protocol level above IP.
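By way of illustration only, the following Python sketch shows one way in which a protocol level above IP might detect lost and reordered packets. It assumes (and this assumption is not drawn from any particular standard) that the sender numbers its packets 0, 1, 2, and so on, and that the receiver knows how many were sent; the function name and its parameters are hypothetical.

```python
def classify_arrivals(received_seqs, sent_count):
    """Classify packets as lost or reordered from their arrival order.

    received_seqs: sequence numbers in the order in which they arrived.
    sent_count:    number of packets sent, numbered 0 .. sent_count - 1
                   (a numbering scheme assumed for this sketch).
    Returns a tuple (lost, reordered) of lists of sequence numbers.
    """
    seen = set()
    highest = -1          # highest sequence number seen so far
    reordered = []
    for seq in received_seqs:
        if seq in seen:
            continue      # duplicates are ignored in this sketch
        seen.add(seq)
        if seq < highest:
            # Arrived after a packet that was sent later than it.
            reordered.append(seq)
        else:
            highest = seq
    lost = [s for s in range(sent_count) if s not in seen]
    return lost, reordered
```

For example, if six packets are sent and the arrivals are 0, 1, 3, 2, 5, the sketch reports packet 4 as lost and packet 2 as reordered.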
It is convenient to define some terms that characterize things that can go wrong in communications networks. These include the following:
        Packet loss. This is simply the disappearance of a packet that was transmitted or ought to have been transmitted.
        Some media have mechanisms to recover lost packets (generally with some additional delay). For purposes of this note, packets recovered by such media are not considered lost.
        Packet delay. Delay is the amount of time that elapses between the time a packet is transmitted and the time it is received. There is always delay: the speed of light in a vacuum places a lower limit on how small the packet delay can be. The actual propagation time of a packet across a telecommunication link varies considerably depending on the media.
        Some media have complex access mechanisms. For example, CSMA-controlled media, such as Ethernets, have fairly intricate access procedures that cause delays often larger than the raw propagation time.
        In addition to propagation time, packets are delayed as the bits exit and enter computer or switch/router interfaces, as packets spend time in various queues in various switching and routing devices, and as software (or hardware) in those devices examines the packets, including any option fields, and deals with them.
        Packets are sometimes delayed simply because a computer, router, or switch needs to do something else first, such as doing an ARP handshake to obtain the next hop's MAC address or looking up forwarding information.
        Jitter. Jitter is a measure of the variation in the packet delay experienced by a number of packets. Some packets may experience an unimpeded ride through the net while others may encounter various delays.
        Jitter is often, but not always, accompanied by some degree of packet loss.
        Jitter is important because it has a significant impact on how long software can expect to wait for data to arrive. That, in turn, has an impact on software buffering requirements. And for media streams, such as voice or video, in which late data is unusable, jitter affects algorithms that are used to create elastic shock-absorbing buffers to provide a smooth play-out of the media stream despite the packet jitter.
        There are various formulas used to compute jitter. These formulas vary depending on the emphasis that one wants to put on more recent versus older transit variations.
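As one illustration of such a formula, the following Python sketch computes an exponentially smoothed jitter estimate from per-packet send and receive timestamps. The function name and parameters are hypothetical. The default gain of 1/16 is the value used by the interarrival-jitter estimator of RFC 3550 (RTP); a larger gain places more emphasis on recent transit variations and a smaller gain places more emphasis on older ones. This is a sketch of one possible formula, not the only one in use.

```python
def smoothed_jitter(send_times, recv_times, gain=1 / 16):
    """Exponentially smoothed jitter estimate.

    send_times, recv_times: per-packet timestamps in the same units,
    listed in the same packet order. gain: weight given to the newest
    transit-time variation (an assumption of this sketch; 1/16 by default).
    """
    if len(send_times) != len(recv_times):
        raise ValueError("timestamp lists must have equal length")
    jitter = 0.0
    for i in range(1, len(send_times)):
        transit_prev = recv_times[i - 1] - send_times[i - 1]
        transit_cur = recv_times[i] - send_times[i]
        # Variation in delay between two consecutive packets.
        variation = abs(transit_cur - transit_prev)
        # Move the running estimate a fraction `gain` toward the new sample.
        jitter += gain * (variation - jitter)
    return jitter
```

Packets that all experience the same delay yield a jitter of zero under this formula, however large that delay may be.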
One protocol level that is overlaid above IP and that deals with lost and reordered packets is TCP (transmission control protocol). With TCP, a connection is agreed upon between two nodes. The connection has an explicit start and (barring some extreme timeout or loss of connectivity) an explicit end. For the duration of the connection, all packets that are sent are numbered and acknowledged. For the sending node, there is an obligation to preserve a copy of each packet sent until it has been acknowledged. (There is, by definition, some upper limit on how many unacknowledged packets will be permitted to exist at any given moment.) For the receiving node, there is a need to keep track of any gap in the numbers of the received packets, so that when a missing packet finally arrives it can be placed in proper sequence relative to those numbered above and below it. (There is also, by definition, some upper limit as to how many out-of-sequence packets may be stored while the missing packet or packets are awaited.) As a matter of terminology these functions (and others, such as congestion avoidance) are carried out by what is called a “TCP stack.”
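The receiving-side bookkeeping described above can be sketched as follows, in Python. This is a deliberately simplified illustration and not an implementation of TCP: it numbers whole packets, as the description above does, whereas an actual TCP stack numbers bytes, and it omits acknowledgment, retransmission, and congestion avoidance. The class name, method names, and the `max_pending` limit are hypothetical.

```python
class ReorderBuffer:
    """Holds out-of-sequence packets until the gaps before them are filled."""

    def __init__(self, max_pending=8):
        self.next_seq = 0               # next sequence number to deliver
        self.pending = {}               # seq -> payload, awaiting delivery
        self.max_pending = max_pending  # upper limit on stored packets

    def receive(self, seq, payload):
        """Accept one packet; return the payloads now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []                   # duplicate of a packet already seen
        if seq != self.next_seq and len(self.pending) >= self.max_pending:
            return []                   # storage limit reached: discard
        self.pending[seq] = payload
        delivered = []
        # Deliver the newly contiguous run, if any.
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered
```

For example, if packets 0, 2, and 1 arrive in that order, the first call delivers packet 0, the second delivers nothing, and the third delivers packets 1 and 2 together in proper sequence.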
With the above-mentioned trends come potential problems. Any would-be supplier of a node or network device or system (hereinafter often referred to as a “node”) could offer it to the public as supposedly complying with relevant standards or RFCs, with no choke-point controller (such as a large corporation controlling a standard) to block it. This led to a natural concern as to whether a particular node or device or system was, in fact, compliant with the standard or RFC. As described by Jon Postel in RFC 1025 (September 1987):
        In the early days of the development of TCP and IP, when there were very few implementations and the specifications were still evolving, the only way to determine if an implementation was “correct” was to test it against other implementations and argue that the results showed your own implementation to have done the right thing. These tests and discussions could, in those early days, as likely change the specification as change the implementation.
        There were a few times when this testing was focused, bringing together all known implementations and running through a set of tests in hopes of demonstrating the N-squared connectivity and correct implementation of the various tricky cases. These events were called “Bake Offs.”
With the growth of the Internet, it became impossible to carry out an “N-squared” demonstration in which each of N devices could be tested for interoperability with the other N−1 devices (and indeed with a device of its own kind), due, among other things, to N becoming very large. In the particular case of TCP connectivity, it became clear that while most TCP stacks performed their desired functions fairly reliably when presented with a connection made up of “ordinary” data, some of them did not perform well in the event of connections made up of inputs that, while standards-compliant, were somewhat out of the ordinary.
The traditional approaches to these problems include the following:
        Design reviews. Designers, engineers, and programmers review a design as it is developed, hoping to assure standards compliance.
        Code reviews. The actual code written by one or more programmers is discussed with additional programmers, for example with a “walk-through” of the program flow.
        Protocol test suites. A standardized body of data (e.g. a data file) is fed into a system again and again to test the ability of the system to handle the data. An effort is made to include, within this data file, some fairly wide range of possible inputs.
        Traffic generators. A device may be created that generates traffic (e.g. packets) according to some particular protocol, thereby testing (among other things) the bandwidth of the node being tested as well as its ability to operate for long periods of time (e.g. to test for certain categories of memory leaks).
None of these approaches suffices by itself to test fully the standards compliance of a node, and even these approaches taken together do not lead to complete confidence as to standards compliance. A design review or code review, for example, is performed by humans and thus may completely miss something important that was omitted; it is well known that humans are better at catching something that is visibly incorrect than they are at noticing that something is missing entirely. If a design or body of code is simply lacking a way to test for a boundary condition, for example, this is easy to miss.
A distinction can also be drawn between systems and code that are deterministic (that always have the same outputs given certain highly predictable inputs) and systems and code that must deal with a variety of inputs at various times (e.g. asynchronous inputs) and that must deal with potential race conditions among various circuits or data flow paths. An example of a deterministic code body is software to generate, say, daily credit card statements in batches. In the universe of computer programmers and software designers and electrical engineers and systems engineers, almost all are competent to create and to review deterministic systems and code sets. But the fraction of this universe composed of persons who are very good at reviewing the latter types of systems (systems with timing issues, race conditions, asynchronous inputs) turns out to be extremely small. Experience suggests that the need for such people far exceeds the supply. There are not enough of them, for example, to do even a small proportion of the design reviews and code reviews that would be needed for comprehensive standards-compliance reviews of Internet-related products.
While protocol test suites are an important part of testing nodes, they cannot test all or even most of the ways in which a node may be improperly designed. Traffic generators can also be important but again are unlikely to detect subtle design errors. Some known prior-art traffic generators do not, for example, lose packets, reorder packets, corrupt packets, modify packets, create new packets, or delay transfer of packets. Known prior-art traffic generators furthermore do not do these things based upon past traffic. Such traffic generators are not designed to act as an intermediary between two nodes, are not designed to receive packets from multiple sources, do not modify packets based on predetermined or user-controlled criteria, and do not resend packets which have been intercepted and then modified.
One prior-art approach is described by Postel (id.):
        Some tests are made more interesting by the use of a “flakeway.” A flakeway is a purposely flakey gateway. It should have control parameters that can be adjusted while it is running to specify a percentage of datagrams [packets] to be dropped, a percentage of datagrams to be corrupted and passed on, and a percentage of datagrams to be reordered so that they arrive in a different order than sent.
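Purely by way of illustration, a flakeway of the kind Postel describes might be sketched in Python as follows. Nothing here is taken from RFC 1025 beyond the three percentages; the class name, method names, the choice to corrupt a datagram by inverting one byte, and the choice to reorder by holding one datagram back until the next one passes are all assumptions of this sketch.

```python
import random


class Flakeway:
    """A purposely unreliable forwarder with adjustable percentages."""

    def __init__(self, drop_pct=0.0, corrupt_pct=0.0, reorder_pct=0.0, seed=None):
        # Public attributes, so they can be adjusted while running.
        self.drop_pct = drop_pct
        self.corrupt_pct = corrupt_pct
        self.reorder_pct = reorder_pct
        self.rng = random.Random(seed)
        self.held = None  # one datagram held back to be sent out of order

    def _hit(self, pct):
        # True for roughly pct percent of calls (never at 0, always at 100).
        return self.rng.random() * 100.0 < pct

    def forward(self, datagram):
        """Take one datagram (bytes); return the datagrams to emit now."""
        if self._hit(self.drop_pct):
            return []
        if datagram and self._hit(self.corrupt_pct):
            i = self.rng.randrange(len(datagram))
            # Invert one byte and pass the datagram on.
            datagram = datagram[:i] + bytes([datagram[i] ^ 0xFF]) + datagram[i + 1:]
        if self.held is None and self._hit(self.reorder_pct):
            self.held = datagram  # release it after the next datagram
            return []
        out = [datagram]
        if self.held is not None:
            out.append(self.held)
            self.held = None
        return out

    def flush(self):
        """Release any datagram still held when the test ends."""
        out = [] if self.held is None else [self.held]
        self.held = None
        return out
```

A test harness would pass each datagram from one node through `forward` and deliver whatever is returned to the other node, changing the percentage attributes between calls as desired.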
While such flakeways have been devised and actually used for limited testing of TCP stacks, their function has been confined to testing a single connection over a single protocol (e.g. TCP).
Further, as their function has been limited to simple manipulations only within a single protocol, they do not fully test all of the ways in which a node may fail to handle standards-compliant but infrequent events. With some more recent protocols, there are many options and variants which are within the protocol and yet which are not actually implemented in a first wave of node designs, and which may only come to be implemented in later node designs. This gives rise to a concern that a node from among the first wave may function as expected at first, but may fail to function properly as later-designed nodes commence being put into service.
As an example, in fairly recent times it has been proposed to communicate voice information over IP (VoIP). Large portions of the Internet, and many nodes on the Internet, predate VoIP, and this raises the natural question whether an existing enterprise network can support VoIP with acceptable voice quality. This raises another natural question, namely under what conditions voice quality will be impaired. For example, dynamic routing protocols such as BGP can lead to abrupt changes in the routing of packets during a particular voice conversation, raising the question of how the terminal equipment will handle such changes. Protocols used for VoIP permit the use of any of a number of “codecs” (coders/decoders) which define the manner in which analog signals are converted to digital and later converted back to analog; the protocols further permit shifting dynamically (during a particular voice conversation) from one codec to a different codec. Will such shifts be handled properly? Prior-art test devices do not provide full answers to these questions.
Yet another long-standing problem arises from the development of standards which define more than one connection running in parallel (simultaneously). Consider a protocol which defines one connection to pass audio data and another connection to pass video data. Each connection, somewhat analogously to TCP, has mechanisms for detecting and dealing with missing packets. (Depending on the type of data being passed, such as audio data, the protocol may not actually bring about a retransmission of a dropped packet but may instead take some other action such as interpolating the audio signal for the interval represented by the dropped packet.) But it is not enough for each connection, taken by itself, to deal with dropped or delayed or out-of-order packets. There is an additional need, in the example of an audio path and a video path, for the rendered video (perceived by a human) and the rendered audio (perceived by the same human) to be synchronized. Existing test devices do not fully test this requirement and there is a long-standing need for test devices which would fully test this requirement.
Yet another problem of long standing arises from the fact that there are whole categories of design mistakes that are simply not detected if the suite of tests is limited to dropped, delayed, corrupted, duplicated or reordered packets. There is a long-standing need for test devices which would detect more nearly all of the possible design mistakes in modern nodes that are intended to be compliant with present-day standards.