The present invention relates to the field of multiprocessing in a shared memory computer architecture, particularly for performing Input/Output (IO) applications. Specifically, it pertains to a method for interfacing transactional and non-transactional IO systems in the multiprocessing environment.
Within the ever-evolving world of computer systems, a particular change has arisen with respect to the design of better and faster systems. Originally, systems were implemented in a uni-processor environment, whereby a single Central Processing Unit (CPU), hereafter referred to as processor, was responsible for all computer operations, including computations and IO. Unfortunately, uni-processor designs have built-in bottlenecks, where the address and data buses restrict data transfer to a one-at-a-time trickle of traffic, and the system program counter forces instructions to be executed in strict sequence. Rather than designing better, faster uni-processor machines, which will never overcome the bottleneck limitation, a different computer system design was adopted in order to effect real improvements in computer performance, specifically the multiprocessor system.
The multiprocessing environment may be a shared memory (tightly coupled) system or a distributed system, and involves the use of more than one processor, also referred to as Processing Element (PE), where these processors share resources, such as IO channels, control units, files and devices. Within a particular distributed multiprocessor system, the processors may be in a single machine sharing a single bus or connected by other topologies (e.g. crossbar, grid, ring), or they might be in several machines using message-passing across a network. In the case of a shared memory multiprocessor system, the processors may be connected to shared memory by a crossbar topology, or they may be using a network. An important capability of the multiprocessor operating system is its ability to withstand equipment failures in individual processors and to continue operation. Although there are different basic operating system organizations for multiprocessor systems, one example is symmetric multiprocessing, where all of the processors are functionally equivalent and can perform IO and computation. In this case, the operating system manages a pool of identical PEs, any one of which may be used to control any IO device or reference any storage unit. Note that the same process may be run at different times by any of the PEs.
The evolution to a multiprocessing environment has brought about a number of changes with respect to the Input/Output system, where this IO system typically provides the interface between programmer applications and IO hardware. It is responsible for attending to individual requirements of the IO devices and for servicing their requirements in an efficient and reliable manner. Furthermore, the IO system hides the details of IO specific implementation from applications, while offering to these applications various IO services, such as mass storage, proprietary messaging, a reset interface and high speed interfaces.
A multiprocessor, shared memory system as disclosed in co-pending U.S. patent application Ser. No. 08/774,548, entitled "Shared Memory Control Algorithm for Mutual Exclusion and Rollback", by Brian Baker and Terry Newell, and incorporated herein by reference, effects certain permanent system changes in "transactions". In this system, multiple processors execute processes that may modify shared memory. Memory changes made by a process executing on a processor do not permanently affect the shared memory until the process successfully completes. During process execution, memory used by a process is "owned" by that process; read and write access by other processes is locked out. If a process does not successfully complete or attempts to access memory owned by another process, the process is aborted and memory affected by the process is "rolled back" to its previous state. Memory changes are only made permanent (or "committed") upon successful process completion. In this context, "transactions" may be considered those intervals between initial system accesses that may ultimately permanently affect the system state, and the "committal" of the state changes to the system. This shared memory system is referred to as a transactional system.
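By way of illustration only, the ownership, rollback and committal behaviour described above may be sketched as follows. This is a minimal single-threaded sketch and not the mechanism of the referenced application: the names SharedMemory, Transaction and OwnershipConflict are hypothetical, and the use of an undo log to restore the previous state is an assumption made for the example.

```python
_MISSING = object()  # marks an address that had no value before the transaction


class OwnershipConflict(Exception):
    """Raised when a transaction touches memory owned by another transaction."""


class SharedMemory:
    """Hypothetical shared memory: values plus a record of who owns each address."""

    def __init__(self):
        self.cells = {}   # address -> current value
        self.owner = {}   # address -> id of the transaction that owns it


class Transaction:
    """Hypothetical process-level transaction over a SharedMemory."""

    def __init__(self, memory, txn_id):
        self.memory = memory
        self.txn_id = txn_id
        self.owned = set()    # addresses this transaction owns
        self.undo_log = {}    # address -> value held before the first write

    def _acquire(self, address):
        holder = self.memory.owner.get(address)
        if holder is not None and holder != self.txn_id:
            # Access to memory owned by another process: abort and roll back.
            self.rollback()
            raise OwnershipConflict(address)
        self.memory.owner[address] = self.txn_id
        self.owned.add(address)

    def read(self, address):
        self._acquire(address)
        return self.memory.cells.get(address)

    def write(self, address, value):
        self._acquire(address)
        if address not in self.undo_log:
            self.undo_log[address] = self.memory.cells.get(address, _MISSING)
        self.memory.cells[address] = value

    def rollback(self):
        """Restore every written address to its previous state."""
        for address, old in self.undo_log.items():
            if old is _MISSING:
                self.memory.cells.pop(address, None)
            else:
                self.memory.cells[address] = old
        self._release()

    def commit(self):
        """Make the changes permanent by discarding the undo log."""
        self._release()

    def _release(self):
        for address in self.owned:
            self.memory.owner.pop(address, None)
        self.owned.clear()
        self.undo_log.clear()
```

In this sketch, writes are applied in place and are hidden from other transactions only because ownership locks those transactions out; a transaction that touches an address owned by another is rolled back before the conflict is reported.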
Further, a multiprocessor, shared memory computing system is disclosed in co-pending U.S. patent application Ser. No. 08/997,776, entitled "Computing System having Fault Containment", by Barry Wood et al. and assigned to Northern Telecom Limited, the contents of which are also herein incorporated by reference. The multiprocessor system comprises a plurality of processing element modules, input/output processor modules and shared memory modules interconnected with the processing elements and input/output processors. The modules are interconnected by point to multi-point communication links. Shared memory is updated and read by exchanging frames forming memory access transactions over these links.
Specific to the IO system for the novel multiprocessor, shared memory computing systems disclosed by Wood et al. and Baker and Newell, multiple Input/Output Processors (IOPs) must share access to certain IO data structures with the various IO software applications running on one or more PEs. Within such a multiprocessor, shared memory architecture, operations on the shared memory data are transactional, meaning that changes are not considered permanent and globally visible until they are committed by an application (i.e. the completion of a transaction). Therefore, IO operations such as sending a message cannot be considered permanent until the IO software application has committed its data. On the other hand, IO events are handled by dedicated IO firmware (where firmware consists of programming instructions stored in a read-only memory unit rather than implemented through software) via exceptions, in which the IO firmware is expected to service the exception to completion before continuing. This is characterized as non-transactional since the event is permanent and non-repeatable.
Unfortunately, problems arise when interfacing transactional and non-transactional systems in the multiprocessing, shared memory environment. A non-transactional system will service an event without waiting for the committal of state changes to the system, thus eliminating the possibility of a state roll-back as required in a transactional system. For example, if an IO event was received and ultimately serviced by an IO software application running on a PE, this IO software would be expected to completely handle the event. However, since the IO software application is transactional, the completion of the event is dependent on the transaction completing. Given that the IOP relies on firmware implemented code, and thereby is non-transactional and non-repeatable, the IOP firmware will simply assume that the event was properly serviced. In the case where the IO software transaction did not complete, the IO event will be lost.
The background information provided above shows that there exists a need in the industry to provide a method and apparatus for interfacing Input/Output transactional and non-transactional systems in a multiprocessing, shared memory environment.
In summary, the present invention provides a machine readable storage medium containing a program element to implement a queuing system, also referred to as a Flexible Input/Output Queuing System (FIQS). Such a queuing system may be used to exchange data elements such as IO commands or application data between application software and an IO service layer. The queuing system includes a queue data structure, preferably a circular queue data structure, and two pointers that control the enqueuing of data elements to the queue and the dequeuing of data elements from the queue. One of the pointers is a write pointer and the other pointer is a read pointer. The circular queue data structure allows running processes to insert data elements into the queue and remove them from the queue without blocking each other.
In a specific example, application software takes ownership of the write pointer to enqueue a data element for service by the IO service layer. Since access to the queue is serialized for a minimal amount of time to get ownership of an individual queue element, the application is required to record the progress of a transaction and release the write pointer, thereby allowing other applications to enter the queue. The write pointer becomes available to the same application software again or to a different application software. A queue element is only considered enqueued if the application has written and committed the data to the element. The read pointer controls the dequeuing of data elements from the queue for processing by either the IO service provider or the software application. Specifically, the read pointer sequentially processes the data elements previously enqueued by operation of the write pointer.
The queuing system provides for numerous application programmable features, making it very flexible for the software application. In a specific example, one such feature ensures that if the read pointer encounters a data element in the queue that is not dequeuable, it skips over it. This prevents the queuing system from becoming blocked. A data element may not be dequeuable for a number of reasons: for example, the application software may have written but not yet committed the data to the queue, or the application software may have stopped running.
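By way of illustration only, the interplay of the write pointer, the read pointer and the skipping of non-dequeuable elements may be sketched as follows. This is a minimal single-threaded sketch under stated assumptions, not the FIQS implementation itself: the names RingQueue, State, reserve, commit and dequeue are hypothetical, the per-element state flag is an assumed way of marking an element as written but not yet committed, and the serialization of pointer updates that a multiprocessor system would need is omitted.

```python
from enum import Enum


class State(Enum):
    FREE = "free"            # element holds nothing
    RESERVED = "reserved"    # claimed by a writer, data not yet committed
    COMMITTED = "committed"  # data committed, ready for the reader


class RingQueue:
    """Hypothetical circular queue with a write pointer and a read pointer."""

    def __init__(self, capacity):
        self.states = [State.FREE] * capacity
        self.data = [None] * capacity
        self.write_ptr = 0   # where the next writer starts looking for an element
        self.read_ptr = 0    # where the reader starts looking for committed data

    def reserve(self):
        """Claim one free element and advance the write pointer past it.

        The write pointer is held only for this call, so other writers can
        reserve elements while this one is still filling in its data.
        Returns the element index, or None if the queue has no free element.
        """
        n = len(self.states)
        for step in range(n):
            index = (self.write_ptr + step) % n
            if self.states[index] is State.FREE:
                self.states[index] = State.RESERVED
                self.write_ptr = (index + 1) % n
                return index
        return None

    def commit(self, index, value):
        """Write and commit data; only now does the element count as enqueued."""
        if self.states[index] is not State.RESERVED:
            raise ValueError("element was not reserved")
        self.data[index] = value
        self.states[index] = State.COMMITTED

    def dequeue(self):
        """Return the next committed element, skipping any that are not dequeuable.

        Skipped elements stay in place and are revisited on a later pass
        around the ring. Returns None if nothing is currently dequeuable.
        """
        n = len(self.states)
        for step in range(n):
            index = (self.read_ptr + step) % n
            if self.states[index] is State.COMMITTED:
                value = self.data[index]
                self.data[index] = None
                self.states[index] = State.FREE
                self.read_ptr = (index + 1) % n
                return value
        return None
```

In this sketch, if a first application reserves an element and a second application reserves and commits the next one, dequeue() passes over the first element and returns the second application's data; the first element is returned by a later call once it has been committed. Strict first-in, first-out order is therefore given up for whichever elements are skipped.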
This novel queuing system is particularly useful for multiprocessor computing platforms. By forcing the IO service layer and the IO software applications to communicate via the queuing system data structure, non-blocking access between the IO software applications is permitted in a multiprocessor system. The queuing system can support multiple IO service calls from different software applications in a robust manner and may avoid becoming blocked by a single software application.
The invention also extends to a method and a system for performing IO services in a multiprocessing, shared memory environment.