Apparatuses and methods for a test and measurement instrument are desirable for providing a scalable test and measurement instrument capable of handling the acquisition, transfer, analysis, and display of large quantities of waveform data as well as complex waveforms. Demand for new oscilloscope application features is growing, especially the ability to process ever-greater quantities of waveform data, because signals are becoming increasingly complex. Analyzing complex waveforms generates more intermediate data, which in turn requires more system memory access instances.
Most software applications have enjoyed regular performance gains for several decades, even without significant modifications, merely because of increases in computer hardware performance. Central Processing Unit (CPU) manufacturers and, to a lesser degree, memory manufacturers have reliably increased processing speeds and lowered memory access times. However, performance gains through increasing CPU clock speeds are seriously inhibited by heat generation, electron leakage, and other physical limitations, while system memory speeds have historically doubled only every 10 years.
Since major processor manufacturers and architectures can no longer easily boost straight-line instruction throughput, performance gains in test and measurement instruments, such as oscilloscopes, will have to be accomplished in fundamentally different ways. Because CPU manufacturers have adopted dual core and multicore processors to increase performance, oscilloscope applications will have to enable concurrent processing in order to exploit the CPU performance gains that are becoming available. What is therefore needed is a practical apparatus and a realizable method that provides a scalable test and measurement instrument capable of handling large quantities of waveform data as well as complex waveforms.
The use of oscilloscopes is known in the prior art. For example, oscilloscopes currently manufactured by Tektronix, Inc. of Beaverton, Oreg. ship with a single core 3.42 GHz Pentium® processor from Intel. These prior art oscilloscopes cannot have their performance boosted through use of a faster single CPU because CPUs with higher clock speeds do not presently exist. Furthermore, mere replacement of the single core CPU with a dual core or multicore CPU offers minimal benefit because many of the important operations of an oscilloscope application are not CPU constrained. In an instrument that moves and processes a large quantity of data, system memory access times and/or system bus performance often are the instrument's performance bottleneck.
Existing high-end oscilloscopes, such as those currently manufactured by Tektronix, Inc., already incorporate a sizable system memory (2 GB of system RAM is typical). Because of increasing quantities of data to be processed and stored, next-generation oscilloscope architectures will undoubtedly require additional memory. Since increases in main memory speeds are realized infrequently, the time required to access system memory is likely to continue to dominate many applications' performance. Therefore, the addition of a multicore processor to existing oscilloscope architectures provides minimal benefit because system memory cannot provide data as fast as the processors can process it.
Furthermore, the data acquisition process is an inherently sequential four-step process presenting additional challenges to the adoption of multicore CPU technology in oscilloscope applications. FIG. 1 depicts a single core processor prior art oscilloscope architecture 100 that acquires and stores waveform data from four channels 120-126 into four data records in the system memory 114. Conventionally, waveforms are stored in the local memory 130 of the acquisition hardware 118 in a first step and subsequently transferred serially to the system memory 114 via a Peripheral Component Interconnect (PCI) or Peripheral Component Interconnect Express (PCIe) system bus 116 and bridge 112 in a second step. The CPU 110 then analyzes the waveform data in a third step and causes the results to be shown on a display screen 128 in a fourth and final step. The acquisition hardware 118 may be embodied in a peripheral device attached to the system bus 116 that is operable by the operating system.
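By way of illustration only, the four-step sequence described above may be sketched in software as follows. This is a minimal sketch, not the firmware of any actual instrument: the function names, the synthetic sample values, and the peak-to-peak measurement are all hypothetical and were chosen only to make the ordering of the steps visible.

```python
from typing import List


def acquire(channels: int, record_length: int) -> List[List[int]]:
    """Step 1: fill (simulated) acquisition-local memory, one record per channel.

    The sample values are synthetic placeholders, not real waveform data.
    """
    return [[((ch + 1) * i) % 256 for i in range(record_length)]
            for ch in range(channels)]


def transfer(local_memory: List[List[int]], system_memory: List[List[int]]) -> None:
    """Step 2: copy the records serially, one channel at a time, into system memory."""
    for record in local_memory:
        system_memory.append(list(record))


def analyze(system_memory: List[List[int]]) -> List[int]:
    """Step 3: compute a hypothetical peak-to-peak measurement per record."""
    return [max(record) - min(record) for record in system_memory]


def display(results: List[int]) -> List[str]:
    """Step 4: format the results as they might be shown on a display screen."""
    return [f"CH{idx + 1}: peak-to-peak = {value}" for idx, value in enumerate(results)]


def run_acquisition_cycle(channels: int = 4, record_length: int = 1000) -> List[str]:
    """Run the four steps strictly in order; each step consumes the previous step's output."""
    system_memory: List[List[int]] = []
    local_memory = acquire(channels, record_length)   # step 1
    transfer(local_memory, system_memory)             # step 2
    results = analyze(system_memory)                  # step 3
    return display(results)                           # step 4
```

In this sketch each call depends on the result of the call before it, so within a single acquisition cycle no step can start until its predecessor has finished.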
This four-step process is not easily amenable to parallelization. With this prior art architecture, the four subtasks cannot be run at the same time on four CPU cores because each must be completed before the next can begin. Nor can the four subtasks be pipelined. In this context, a pipeline is a set of data processing elements connected in series so that the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in a time-sliced fashion. However, because three of the steps require access to the system memory to run and to store intermediate data generated as data moves through the pipeline, parallel processing is impossible. Therefore, the inherently sequential nature of the data acquisition process prevents taking full advantage of multicore processor technology.
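The general notion of a pipeline defined above may be sketched as follows. This is a generic illustration under stated assumptions, not a description of the prior art instrument: the helper names `stage` and `run_pipeline` are hypothetical, each element runs in its own thread, and the elements are connected by queues rather than by a shared system memory. Items passed through the pipeline are assumed never to be `None`, because `None` is used as the end-of-stream marker.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

SENTINEL = None  # end-of-stream marker; real items must not be None


def stage(func: Callable[[Any], Any], inbox: queue.Queue, outbox: queue.Queue) -> None:
    """One pipeline element: read from inbox, apply func, write to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next element
            break
        outbox.put(func(item))


def run_pipeline(items: Iterable[Any], funcs: List[Callable[[Any], Any]]) -> List[Any]:
    """Connect the elements in series so each element's output feeds the next."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for thread in threads:
        thread.start()

    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for thread in threads:
        thread.join()
    return results
```

For example, `run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2])` passes each item through both elements in order. While the second element works on one item, the first element is free to work on the next, which is the overlap that the prior art architecture cannot achieve when its steps contend for the same system memory.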
The system memory also creates a bottleneck because it is used for waveform data storage and is shared by several clients, including Analysis, General Purpose Interface Bus, Display, Acquisition, Math, Save/Recall, and Applications. Because these clients must access the data serially from the shared system memory, it is impossible to create parallelism among the clients and run them at the same time. The architecture's data transfer rate and system bandwidth are also limiting factors, and these limitations are likely to worsen. Next-generation real-time data acquisition hardware will have very large record lengths per channel. Existing oscilloscope architectures cannot transfer, analyze, and display that much data in real time.
An initial prior art attempt to address some of these problems was the TDS7000-series oscilloscope manufactured by Tektronix, Inc. whose architecture 200 is depicted in FIG. 2. This architecture employed a dual core processor 210, 212. Although each processor could access the other's memory 214, 228, this was accomplished using the Direct Memory Access (DMA) 230 process over a PCI bus 216, a relatively slow computer bus. An inability to transfer data sufficiently rapidly to continuously occupy both processors left the oscilloscope unable to take full advantage of the presence of two processors.
FIG. 3 shows a prior art oscilloscope system architecture 300 employing a quad core CPU, developed by one of the inventors (M. Sedeh) of the current invention. The quad core CPU 310, 328, 330, and 332 uses the Symmetric Multiprocessor (SMP) architecture, which is the dominant high-performance computer architecture in industry. While the SMP architecture performs adequately in many respects, it unfortunately exhibits architectural limitations. In an SMP-based system, all processors access a shared pool of memory 314 over a central memory bus. While this limited the effectiveness of the dual core system depicted in FIG. 2, an even greater problem with memory access occurs when quad core or higher multicore CPUs are utilized. Because the processors are often contending with each other for access to the single memory bus, a serious bottleneck develops: the time to move data back and forth between the processors 310, 328, 330, and 332 and the system memory 314 increases. This major bottleneck is especially severe in an instrument like a high-end oscilloscope. High-end oscilloscopes require the movement of large amounts of data and utilize processor-intensive applications that create considerable traffic between the processors 310, 328, 330, and 332 and the system memory 314. Data sets in modern high-end oscilloscopes can be so large that they are not entirely cacheable, resulting in many system memory access instances. This problem with memory access times is aggravated by use of the same system bus and memory bus for Input/Output (I/O) and DMA transfer of waveform data from the local memory 334 of the acquisition hardware 318.
Another architectural problem with SMP architecture is that the memory system does not scale up with increasing numbers of processor cores. Memory access occurs via a single memory controller 422 (shown in FIG. 4) for the entire system, no matter how many processor cores 410, 412 are present. This serious problem prevents taking full advantage of multicore CPUs: because memory is a shared resource, the cores cannot obtain enough data in a timely fashion to remain busy at all times. Thus, performance of applications with large memory requirements remains largely constrained by memory access times.
Preliminary performance testing on dual core and quad core high performance oscilloscopes using the architectures depicted in FIGS. 2 and 3 showed no significant performance gains over single core instruments. The lack of performance gains was not surprising because the prior art data acquisition process is sequential in nature. All processor cores must share the system memory, and applications tend to be highly memory intensive. Because the memory system cannot provide data as fast as the application needs it to keep all of the processor cores busy simultaneously, very little parallel processing can occur, leaving the additional processor cores only marginally utilized.
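The reasoning above may be illustrated with a simple bound, offered here only as a rough sketch. It assumes that processing throughput is limited by the smaller of two quantities: the combined compute rate of the cores, and the rate at which the shared memory system can deliver samples. The function names and all numeric values are hypothetical and are not measurements from any instrument.

```python
def achievable_throughput(cores: int,
                          per_core_rate: float,
                          memory_bandwidth: float,
                          bytes_per_sample: float) -> float:
    """Return samples/second under a simple min(compute limit, memory limit) model.

    per_core_rate    -- samples/second one core could process if never starved
    memory_bandwidth -- bytes/second the single shared memory system can deliver
    bytes_per_sample -- bytes that must be moved per processed sample
    """
    compute_limit = cores * per_core_rate
    memory_limit = memory_bandwidth / bytes_per_sample
    return min(compute_limit, memory_limit)


def speedup_over_single_core(cores: int,
                             per_core_rate: float,
                             memory_bandwidth: float,
                             bytes_per_sample: float) -> float:
    """Ratio of multi-core throughput to single-core throughput under the same model."""
    multi = achievable_throughput(cores, per_core_rate, memory_bandwidth, bytes_per_sample)
    single = achievable_throughput(1, per_core_rate, memory_bandwidth, bytes_per_sample)
    return multi / single


# Hypothetical figures: 400e6 samples/s per core, 1e9 bytes/s of shared
# memory bandwidth, 2 bytes per sample (memory limit = 500e6 samples/s).
two_core_gain = speedup_over_single_core(2, 400e6, 1e9, 2)
four_core_gain = speedup_over_single_core(4, 400e6, 1e9, 2)
```

With these illustrative figures, the two core and four core cases yield the same modest gain, because both are capped by the shared memory limit rather than by the number of cores.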
Therefore, a need exists for a new and improved apparatus and method for a test and measurement instrument that can be used for providing a scalable test and measurement instrument capable of handling the acquisition, transfer, analysis, and display of large quantities of waveform data as well as complex waveforms. In this regard, the various embodiments of the present invention substantially fulfill at least some of these needs. In this respect, the apparatus and method for a test and measurement instrument according to the present invention substantially depart from the conventional concepts and designs of the prior art, and in doing so provide an apparatus primarily developed for the purpose of providing a scalable test and measurement instrument capable of handling the acquisition, transfer, analysis, and display of large quantities of waveform data as well as complex waveforms.