The following co-pending applications of common assignee contain some common disclosure:
"Directory-Based Cache Coherency System Supporting Multiple Instruction Processor and Input/Output Caches", filed Dec. 31, 1997, Ser. No. 09/001,598, and incorporated herein by reference in its entirety.
1. Field of the Invention
This invention relates generally to an improved memory management system and method for use in a data processing system; and, more particularly, relates to an improved lock management system and method.
2. Description of the Prior Art
Data processing systems are becoming increasingly complex. Some systems, such as Symmetric Multi-Processor computer systems, couple two or more Instruction Processors (IPs) and multiple Input/Output (I/O) Modules to shared memory. This allows the multiple IPs to operate simultaneously on the same task, and also allows multiple tasks to be performed at the same time to increase system throughput.
As the number of units coupled to a shared memory increases, more demands are placed on the memory and memory latency increases. To address this problem, high-speed cache memory systems are often coupled to one or more of the IPs for storing data signals that are copied from main memory or from other cache memories. These cache memories are generally capable of processing requests faster than the main memory while also serving to reduce the number of requests that the main memory must handle. This increases system throughput.
While the use of cache memories increases system throughput, it causes other design challenges. When multiple cache memories are coupled to a single main memory for the purpose of temporarily storing data signals, some system must be utilized to ensure that all IPs are working from the same (most recent) copy of the data. For example, if a data item is copied, and subsequently modified, within a cache memory, another IP requesting access to the same data item must be prevented from using the older copy of the data item stored either in main memory or the requesting IP's cache. This is referred to as maintaining cache coherency. Maintaining cache coherency becomes more difficult as more cache memories are added to the system since more copies of a single data item may have to be tracked.
Another problem related to that described above involves providing a way to ensure continued access to shared data resources. In a shared memory system, various IPs may require access to common data stored in memory. A first IP that has copied such data within its cache memory may be forced to relinquish control over that data because another IP has requested that same information. If the first IP has not completed processing activities related to that data, the IP is required to re-gain access to it at a later time. In some instances, this is an acceptable way of performing processing activities. In other situations, losing control over a data item in the middle of program execution may result in errors.
The types of errors alluded to in the foregoing paragraph can best be understood by example. Consider a transaction processing system that is transferring funds from one bank account to another. The transaction is not considered complete until both bank account balances have been updated. If the instantiation of the software program, or "thread", which is processing this transaction loses access to the data associated with the account balances at a time when only half of the updates have been completed, the accounts may be in either an under- or over-funded state. To prevent this situation, some mechanism must be used to "lock", or activate, sole access rights to the data until the thread has completed all necessary processing activities. The thread then "unlocks", or deactivates, sole access rights to the data.
Various types of locking mechanisms have been introduced in the prior art. Many of these locking mechanisms use a lock cell or semaphore. A lock cell is a variable that is used to control a software-lock to an associated shared resource such as shared memory data. The state of the lock cell indicates whether the software-lock, and with it the associated, protected shared resource, is currently activated by another thread. Generally, a thread activates the software-lock using a lock-type instruction. As is known in the art, this type of instruction first tests the state of the lock cell. If the state of the lock cell indicates the shared resource is available, the instruction then sets the lock cell to activate the software-lock to the executing thread. These testing and setting operations are performed in an atomic operation by a single instruction to prevent multiple processors from inadvertently gaining simultaneous access to the same lock cell.
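The test-and-set behavior described above can be illustrated with a minimal software model. This is a sketch only, not the hardware instruction itself: the names `LockCell`, `test_and_set`, `clear`, and `run_contenders` are hypothetical, and a `threading.Lock` is assumed here merely to stand in for the atomicity that a single instruction provides in hardware.

```python
import threading


class LockCell:
    """Toy model of a lock cell (hypothetical; not the hardware mechanism)."""

    def __init__(self):
        self._locked = False
        # Assumption: this guard stands in for single-instruction atomicity.
        self._guard = threading.Lock()

    def test_and_set(self):
        """Atomically test the cell and set it if it is free.

        Returns True if the caller activated the software-lock.
        """
        with self._guard:
            if self._locked:
                return False
            self._locked = True
            return True

    def clear(self):
        """Deactivate the software-lock."""
        with self._guard:
            self._locked = False


def run_contenders(n_threads=8):
    """Let n_threads race for one lock cell; return how many succeeded."""
    cell = LockCell()
    winners = []

    def attempt(thread_id):
        if cell.test_and_set():
            winners.append(thread_id)

    threads = [threading.Thread(target=attempt, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(winners)
```

Because the test and the set occur as one indivisible step, exactly one of the contending threads in `run_contenders` activates the lock, however the threads happen to be scheduled.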
The lock cell is generally stored within main memory. As noted above, this lock cell may be a software-lock associated with, and protecting, shared data. By software convention, the shared data must not be accessed without first gaining authorization through the software-lock. Many prior art systems store the lock cell and associated data in the same cacheable entity of memory, or "cache line". As a result, when an IP attempts a lock-type operation on a lock cell, both the lock cell and at least some of the protected data are transferred to the IP's cache. However, if this attempt is made when the software-lock has already been activated by another thread, the transfer of the lock cell and protected data to the new requester's cache temporarily disrupts the processing activity of the IP executing the thread that had activated the software-lock. This reduces execution throughput.
A prior art solution for preventing the foregoing problem was to separate the lock cell into one cache line and the protected data into another cache line. While this solution prevents the thrashing of protected data when another thread attempts the software-lock, the solution causes the IP to acquire two cache lines. First, the IP copies the cache line that contains the lock cell in an exclusive state to attempt the software-lock. If the software-lock is successfully activated, the IP copies the cache line for the protected data upon the first reference to the data. The IP must temporarily suspend processing activities, or "stall", during the time the protected data is copied from memory. Therefore, while this scheme may prevent the thrashing of data during attempted lock activation, it results in cache line access stalls after the software-lock is successfully activated.
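The trade-off between the two prior art layouts can be made concrete with a small address calculation. The sketch below assumes a 64-byte cache line, a value chosen purely for illustration and not taken from any system described herein; `line_of`, `shares_line`, and `lines_needed` are hypothetical helper names.

```python
LINE_SIZE = 64  # bytes; an assumed cache line size used only for illustration


def line_of(address, line_size=LINE_SIZE):
    """Return the index of the cache line that contains a byte address."""
    return address // line_size


def shares_line(lock_addr, data_addr, line_size=LINE_SIZE):
    """True if the lock cell and a data item fall in the same cache line."""
    return line_of(lock_addr, line_size) == line_of(data_addr, line_size)


def lines_needed(lock_addr, data_addrs, line_size=LINE_SIZE):
    """Count the distinct cache lines an IP must acquire to lock and use the data."""
    return len({line_of(a, line_size) for a in [lock_addr, *data_addrs]})
```

With the lock cell at address 0 and data at addresses 8 and 16, `lines_needed` yields one line, so a failed lock attempt by another IP also pulls the protected data away from the lock holder. Moving the data to addresses 64 and 72 yields two lines: the protected data is no longer disturbed, but the second line must be fetched separately after the lock is activated.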
A related problem to the foregoing involves acquiring cache lines using specific cache line states. Some processing systems such as the ES7000™ platform commercially available from the Unisys Corporation copy data to cache in a variety of states according to the first type of reference to the data or according to how the data was last used. For example, the data may be cached in a "shared" state such that the associated processor can read, but not update, this data. When the IP copies data to the cache in a shared state, a subsequent write operation causes the IP cache to acquire an "exclusive" state for the cache line so that the write can be completed. Acquiring the exclusive state after already having the shared state takes nearly as long as initially copying the data. Other data may be initially cached with exclusive state. Prior art locking mechanisms do not take into consideration how the data will be used when acquiring protected data from main memory, resulting in unnecessary disruption of processing activities.
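The cost of a shared-to-exclusive upgrade can be illustrated by counting requests to memory for a single cache line. The following is a simplified two-state model offered as an assumption, not a description of any particular platform's coherency protocol; `memory_transactions` is a hypothetical name, and each initial fetch or upgrade is counted as one transaction.

```python
def memory_transactions(initial_fetch, accesses):
    """Count memory transactions for one cache line under a toy model.

    initial_fetch: state used for the first fetch, "shared" or "exclusive".
    accesses: sequence of "read" / "write" operations by the IP.
    """
    state = "invalid"
    count = 0
    for op in accesses:
        if state == "invalid":
            state = initial_fetch  # first reference copies the line to cache
            count += 1
        if op == "write" and state == "shared":
            state = "exclusive"    # upgrade needed before the write can complete
            count += 1
    return count
```

Under this model a read followed by a write costs two transactions when the line is first fetched as shared, but only one when it is fetched as exclusive, which is the situation a lock mechanism could avoid if it knew in advance how the data would be used.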
Yet another drawback associated with prior art software-lock mechanisms involves the time associated with retrieving software-lock-protected data from main memory once a lock has been activated. The software-lock may be associated with one or more cache lines of data that must be copied from memory during subsequent transfer operations. These operations may each be relatively time consuming, especially if a multi-level caching hierarchy is involved. Some prior art systems force the requesting processor to wait in a stalled state as one or more cache lines of data associated with the software-lock are copied into its cache memory upon first reference to the data. Other prior art systems require that separate instructions be executed to pre-fetch the protected data cache lines once the software-lock has been activated. Since each transfer of this nature may require many instruction cycles to complete, throughput is dramatically impacted.
In view of the foregoing deficiencies in prior art locking systems, an improved locking mechanism is needed.
The current invention provides an improved system and method for locking shared resources. The invention may operate in a data processing environment including a main memory system coupled to multiple instruction processors (IPs). A novel lock-type instruction is included within the hardware instruction set of ones of the IPs. This lock-type instruction is executed to activate an addressed one of various lock cells stored at predetermined locations within the main memory.
After a software-lock has been activated by an executing thread, one or more addresses in the lock cell cache line are used as pointers to retrieve cache lines protected by the software-lock. In one embodiment, three cache lines are retrieved, although fewer or more such cache lines may be protected and automatically retrieved by activating the software-lock. Requests for the protected cache lines are issued automatically by the hardware on behalf of the executing thread. The IP continues instruction execution of the locking thread without stalling for the protected data cache lines. In this way, cache lines associated with the software-lock are automatically pre-fetched from main memory so that they will be available when a subsequent access is made by the thread. These pre-fetched data signals are copied to a cache memory associated with the requesting IP.
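The sequence just described (activate the lock, issue pre-fetch requests from the pointers in the lock cell cache line, and continue without stalling) can be sketched as a software model. Everything here is an illustrative assumption: `ToyProcessor`, `memory_tick`, and the dictionary layout of the lock cell cache line are hypothetical, and a queue stands in for requests that the hardware would have outstanding to main memory.

```python
from collections import deque


class ToyProcessor:
    """Toy model of an IP with a cache and outstanding pre-fetch requests."""

    def __init__(self, memory):
        self.memory = memory    # dict: cache line address -> data
        self.cache = {}         # lines currently held by this IP
        self.pending = deque()  # pre-fetch requests issued but not yet completed

    def lock_and_prefetch(self, lock_line):
        """Try to activate the lock; on success, queue pre-fetches and return at once.

        lock_line is assumed to be a dict with "locked" (bool) and
        "pointers" (list of protected cache line addresses).
        """
        if lock_line["locked"]:
            return False
        lock_line["locked"] = True
        self.pending.extend(lock_line["pointers"])  # issue requests; do not wait
        return True

    def memory_tick(self):
        """Complete one outstanding pre-fetch, as memory would in the background."""
        if self.pending:
            addr = self.pending.popleft()
            self.cache[addr] = self.memory[addr]

    def read(self, addr):
        """Return (data, hit); a miss models the stall of a demand fetch."""
        hit = addr in self.cache
        if not hit:
            self.cache[addr] = self.memory[addr]
        return self.cache[addr], hit
```

In this model `lock_and_prefetch` returns as soon as the requests are queued, so the thread keeps executing; once `memory_tick` has run for each of the three pointers, a later `read` of a protected line is a cache hit rather than a stall.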
According to one aspect of the invention, each of the addresses contained in the lock data packet points to a different cache line of data. The cache lines of the protected data are different from the cache line of the lock cell. This allows an IP to cache the lock cell separately from the protected data so that an activation attempt may be made without disrupting the execution of another IP that has already activated the lock.
According to another aspect of the current invention, each of the pointer addresses is associated with an indicator for the type of access that is to be used when retrieving the protected data that is pointed to by that address. In one embodiment, either shared or exclusive access may be acquired.
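One way to picture a pointer address that carries an access-type indicator is shown below. The encoding is purely an assumption made for illustration: the text above does not specify where the indicator is stored, and this sketch simply places it in the low-order bit of a cache-line-aligned address, which would otherwise be zero. The 64-byte line size and the names `pack_pointer`, `unpack_pointer`, and `EXCLUSIVE_BIT` are likewise hypothetical.

```python
LINE_SIZE = 64       # bytes; assumed cache line size for illustration
EXCLUSIVE_BIT = 0x1  # hypothetical encoding: low bit set means exclusive access


def pack_pointer(line_address, exclusive):
    """Combine a line-aligned address with its access-type indicator."""
    if line_address % LINE_SIZE != 0:
        raise ValueError("address must be cache-line aligned")
    return line_address | (EXCLUSIVE_BIT if exclusive else 0)


def unpack_pointer(word):
    """Split a packed word into (line_address, access_type)."""
    access = "exclusive" if word & EXCLUSIVE_BIT else "shared"
    return word & ~(LINE_SIZE - 1), access
```

For example, `unpack_pointer(pack_pointer(0x1000, True))` recovers the address `0x1000` together with the access type `"exclusive"`, which the pre-fetch logic could then use when requesting the line.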
In one embodiment of the invention, the lock-type instruction may be either a Lock-and-Pre-Fetch-and-Skip instruction or a Lock-and-Pre-Fetch instruction. The Lock-and-Pre-Fetch-and-Skip instruction allows the executing program to determine if the software-lock was successfully activated, or whether the software-lock was already activated by another thread. If an unsuccessful lock attempt occurs, the executing program determines if, and how, multiple attempts to activate the software-lock will be performed. In contrast, the Lock-and-Pre-Fetch instruction may involve only a single activation attempt that generates an interruption to an operating system in the event the attempt fails. When processing the interruption, the operating system may suspend the thread.
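The difference between the two instruction variants can be sketched in software terms. This is an analogy under stated assumptions rather than a definition of either instruction: the function names mirror the instruction names, a Python exception (`LockBusyInterrupt`, a hypothetical name) stands in for the interruption delivered to the operating system, and the bounded retry loop in `acquire_with_retries` is just one policy an executing program might choose.

```python
class LockBusyInterrupt(Exception):
    """Stands in for the interruption delivered to the operating system."""


def lock_and_prefetch_and_skip(cell):
    """Report success or failure to the program; the program decides what next."""
    if cell["locked"]:
        return False
    cell["locked"] = True
    return True


def lock_and_prefetch(cell):
    """Make a single attempt; a failure is handed to the operating system."""
    if not lock_and_prefetch_and_skip(cell):
        raise LockBusyInterrupt()


def acquire_with_retries(cell, max_attempts):
    """Example program-chosen policy: retry a bounded number of times.

    Returns the attempt number that succeeded, or None if all attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        if lock_and_prefetch_and_skip(cell):
            return attempt
    return None
```

With the skip variant the retry policy lives in the program, as in `acquire_with_retries`; with the other variant the program makes no such decision, and the handler for `LockBusyInterrupt` plays the role of the operating system that may suspend the thread.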
The invention may further include a novel Unlock-and-Flush instruction that unlocks, or deactivates, the software-lock after a thread has completed the processing associated with the protected data. This instruction, which may be included as part of the hardware instruction set of ones of the IPs, may flush the predetermined cache lines of the pre-fetched data to main memory. The instruction also unlocks the software-lock and flushes the lock-cell cache line to main memory. In one embodiment, all cache lines of protected data are flushed back to the main memory. In another embodiment, cache lines may be flushed on an individual basis. For example, all cache lines of pre-fetched protected data that are associated with exclusive access may be flushed to main memory, whereas all other pre-fetched cache lines obtained with shared access may be retained in cache memory.
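The selective flushing described above can be modeled as follows. The sketch is illustrative only: `unlock_and_flush`, the `flush_shared` parameter, and the dictionary representation of cache and memory are hypothetical, a value of 0 is assumed to denote an unlocked lock cell, and shared lines are assumed to be unmodified so that they may simply be dropped.

```python
def unlock_and_flush(cache, memory, lock_addr, pointers, flush_shared=True):
    """Deactivate the lock and flush protected lines from a toy cache.

    cache:  dict mapping line address -> {"data": ..., "access": "shared" | "exclusive"}
    memory: dict mapping line address -> data
    """
    for addr in pointers:
        line = cache.get(addr)
        if line is None:
            continue  # line was never cached or has already been evicted
        if line["access"] == "exclusive":
            memory[addr] = line["data"]  # write back data that may have been modified
            del cache[addr]
        elif flush_shared:
            del cache[addr]              # shared lines are clean; drop the copy
    # Flush the lock cell line and record the unlocked state in main memory.
    cache.pop(lock_addr, None)
    memory[lock_addr] = 0                # assumption: 0 denotes "unlocked"
```

Calling the function with `flush_shared=False` corresponds to the embodiment in which only lines held with exclusive access are returned to main memory while shared lines remain cached; the default corresponds to flushing all protected lines.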
According to yet another aspect of the invention, a method for performing locks within a data processing system is disclosed, wherein the data processing system includes a memory system coupled to multiple IPs. The method includes the steps of activating, on behalf of a requesting thread, a software-lock stored within the memory system. The method further includes issuing a request to retrieve cache lines that are pointed to by at least one address associated with the software-lock, and allowing the IP to continue execution without waiting for the requested data to be returned by the memory system.
According to yet another aspect of the invention, a method is provided for performing locks within a data processing system having a memory system and at least two IPs coupled to the memory system. The method includes executing a lock-type instruction that is part of the hardware instruction set of at least one of the IPs. If successful in activating the software-lock, the lock-type instruction uses at least one address associated with the software-lock to issue a request to pre-fetch data from main memory. In one embodiment, this data is stored within one or more cache lines. The IP is allowed to continue execution without waiting for the requested data to be returned from the main memory.
In yet a further embodiment of the invention, a system is provided for managing protected data within a data processor, wherein the data processor includes a memory coupled to multiple IPs. The system includes storage locations within the memory to store a software-lock. An inventive lock circuit is included within at least one of the IPs to execute a hardware lock instruction to activate the software-lock. The circuit further initiates a pre-fetch of associated, protected data from the memory while the IP continues instruction execution. The storage locations of the lock cell and the address of associated protected data may further contain indicators for a type of access rights to be associated with the pre-fetched protected data.
According to another embodiment of the invention, a system for managing protected data is provided. The system includes storage locations within a main memory that store a software-lock, and additional storage locations that store addresses of protected data. A cache memory coupled to the main memory is provided to temporarily cache the protected data after the software-lock has been activated. An inventive unlock circuit included within at least one of the IPs is provided to execute a hardware unlock instruction to deactivate the software-lock and to flush both the pre-fetched protected data and the lock cell from the cache memory back to the main memory.
Still other objects and advantages of the present invention will become readily apparent to those skilled in the art from the following detailed description of the preferred embodiment and the drawings, wherein only the preferred embodiment of the invention is shown, simply by way of illustration of the best mode contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments, and its several details are capable of modifications in various respects, all without departing from the invention. Accordingly, the drawings and description are to be regarded to the extent of applicable law as illustrative in nature and not as restrictive.