1. Field of the Invention
This invention relates to the field of data processing systems. More particularly, this invention relates to the storage of validity status information in respect of data words stored within a data processing system.
2. Description of the Prior Art
It is known from WO-A-00/75785 to provide a hierarchical arrangement for storing valid bits corresponding to cache lines within a data processing system. Within such an arrangement, a single bit may be used to indicate whether or not a word containing a plurality of lower level valid bits is itself valid. This approach is particularly well suited for use in synthesised circuit applications in order to provide a global invalidate function.
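By way of illustration only, the hierarchical arrangement described above may be modelled in software as follows. This is a minimal sketch and not the circuit of the cited document: the class name, the method names and the assumed width of 32 lines per valid word are all hypothetical.

```python
# Illustrative model only: names and sizes are assumptions, not taken from
# the cited document.
LINES_PER_WORD = 32  # assumed number of cache-line valid bits per valid word


class HierarchicalValid:
    def __init__(self, num_words):
        # Lower level: each valid word holds one bit per cache line.
        self.valid_words = [0] * num_words
        # Upper level: a single bit per valid word.
        self.word_valid = [False] * num_words

    def is_line_valid(self, line):
        word, bit = divmod(line, LINES_PER_WORD)
        # A line is valid only if both levels of the hierarchy agree.
        return self.word_valid[word] and bool((self.valid_words[word] >> bit) & 1)

    def mark_line_valid(self, line):
        word, bit = divmod(line, LINES_PER_WORD)
        if not self.word_valid[word]:
            # Stale lower-level bits are discarded before the word is reused.
            self.valid_words[word] = 0
            self.word_valid[word] = True
        self.valid_words[word] |= 1 << bit

    def global_invalidate(self):
        # Only the upper-level bits are touched; the valid words are left as-is.
        self.word_valid = [False] * len(self.word_valid)
```

In this model a global invalidate writes only the upper-level bits, which is what makes the operation fast irrespective of how many lower-level bits exist.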
A problem arises in such systems when the size of the memory for which validity data is being stored is variable. As an example, a single synthesisable microprocessor core may be implemented with different sizes of cache memory. As the cache memory varies in size, so does the amount of validity data that must be stored for the cache lines in that cache memory. In a situation with a valid memory storing valid words having bits representing the validity of individual cache lines, flip-flop circuits may be provided to represent the validity of the valid words themselves. Thus, when a fast high level global clear is desired, such as on a post-boot context switch, all that need be done is to reset all the flip-flop circuits to an invalid state, which consequently indicates that the entire contents of the valid memory are invalid. When the size of the valid memory can vary in dependence upon the size of the corresponding cache memory, there is also a need for a variable number of flip-flop circuits.
One simple approach to this situation would be to provide a number of flip-flop circuits within the synthesisable design that was sufficient to cope with the largest envisaged valid memory. This approach would have the disadvantage of including many redundant flip-flop circuits within implementations having a valid memory smaller than the maximum size, resulting in a disadvantageous increase in circuit size and cost. A further, more subtle problem is that when the values held in a large number of flip-flop circuits must be evaluated in parallel, there typically arises a need for disadvantageously wide multiplexers to select the appropriate signal values for controlling other operations. Wide multiplexers tend to introduce a comparatively large signal propagation delay, and this can have a detrimental impact when such circuits lie on critical timing paths within the system as a whole.
Viewed from one aspect the present invention provides apparatus for processing data, said apparatus comprising:
(i) a valid word memory operable to store a plurality of valid words, each valid word having bits representing whether or not corresponding data storage locations in a further memory are storing valid data; and
(ii) a plurality of flip-flop circuits operable to store values indicative of whether or not corresponding valid words within said valid memory are themselves valid; characterised in that
(iii) a flip-flop circuit is operable to store a value indicative of validity of a number of valid words which varies in dependence upon how many valid words may be stored in said valid memory.
The invention recognises that the number of valid words which correspond to a given flip-flop circuit need not be constant and could be varied in dependence upon the size of the valid memory. Thus, a relatively manageable number of flip-flop circuits may be provided to cope with the majority of valid memory sizes using one flip-flop circuit per valid word, but situations with larger valid memory sizes may be dealt with by arranging for a single flip-flop circuit to correspond to multiple valid words within the valid memory. The increase in circuit complexity needed to deal with configurations having different valid memory sizes is more than offset by the saving in circuit area achieved by not having to provide a large number of flip-flop circuits to cope with the worst case scenario of the largest possible valid memory.
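The variable correspondence between flip-flop circuits and valid words may be sketched as below. The text does not prescribe a particular mapping, so the contiguous grouping, the function names and the assumed fixed count of 32 flip-flop circuits are hypothetical choices made purely for illustration.

```python
NUM_FLOPS = 32  # assumed fixed number of flip-flop circuits in the design


def words_per_flop(num_valid_words, num_flops=NUM_FLOPS):
    # One valid word per flip-flop while the valid memory is small enough;
    # otherwise each flip-flop covers several valid words (ceiling division).
    return max(1, -(-num_valid_words // num_flops))


def flop_for_word(word_index, num_valid_words, num_flops=NUM_FLOPS):
    # Assumed mapping: contiguous groups of valid words share one flip-flop.
    return word_index // words_per_flop(num_valid_words, num_flops)
```

Under these assumptions a valid memory of 32 or fewer valid words uses one flip-flop circuit per valid word, whereas a valid memory of 128 valid words places four valid words beneath each flip-flop circuit.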
Whilst the above technique is useful in a wide variety of situations, the arrangement of flip-flop circuits representing the validity of corresponding valid words is itself particularly well suited to embodiments providing a global invalidate operation whereby all valid words within the valid memory may be indicated as being invalid by forcing appropriate values into what will be a much smaller number of flip-flop circuits.
It will be appreciated that the memory architecture could take a wide variety of forms, but the invention is particularly well suited to situations in which the further memory is a cache memory. Such situations usually require the storage of and control by valid data corresponding to the validity of the data held within particular cache lines.
It will be appreciated that in situations where a single flip-flop circuit corresponds to a plurality of valid words, changing the flip-flop circuit to a valid status may well require changes in multiple corresponding valid words. Since the valid words will usually be accessible only sequentially, particularly in the case of a synthesised design in which the valid memory is a synthesised RAM memory, multiple clock cycles may be needed to make all the changes to the valid words consequential upon a change in a value stored within a flip-flop circuit.
Particularly preferred embodiments of the invention perform such changes to the valid words in parallel with cache line fill operations. Cache line fill operations themselves, by their very nature, are generally slower than operations that are able to be serviced without a cache line fill and accordingly tend already to spread over multiple clock cycles. Thus, the overhead involved in sequentially performing multiple writes to valid words may effectively be hidden within the time that is typically already taken in servicing a cache line fill.
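The sequential updating of valid words during a cache line fill may be modelled as in the following sketch. It assumes, hypothetically, that one valid word can be written per clock cycle and that valid words are grouped contiguously beneath each flip-flop circuit; the class and method names are likewise illustrative only.

```python
class GroupedValid:
    def __init__(self, num_words, num_flops):
        # Number of valid words beneath each flip-flop (ceiling division).
        self.words_per_flop = max(1, -(-num_words // num_flops))
        self.valid_words = [0] * num_words
        self.flops = [False] * num_flops

    def global_invalidate(self):
        # Only the flip-flops are cleared; the valid memory is not written.
        self.flops = [False] * len(self.flops)

    def linefill(self, word, bit):
        """Mark one cache line valid.

        Returns the number of valid-word writes performed, which under the
        one-write-per-cycle assumption is the number of cycles to be hidden
        within the line fill.
        """
        flop = word // self.words_per_flop
        if self.flops[flop]:
            # The group is already valid: a single valid word is updated.
            self.valid_words[word] |= 1 << bit
            return 1
        # The group was invalid: every valid word beneath the flip-flop is
        # rewritten in turn so that no stale bits survive.
        first = flop * self.words_per_flop
        last = min(first + self.words_per_flop, len(self.valid_words))
        for w in range(first, last):
            self.valid_words[w] = (1 << bit) if w == word else 0
        self.flops[flop] = True
        return last - first
```

With 128 valid words and 32 flip-flop circuits, the first line fill into an invalid group performs four writes and later fills into the same group perform one, so the extra writes need only be hidden when a flip-flop circuit changes from invalid to valid.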
When a cache line fill occurs, it may be that only a single valid word is being changed to indicate the storage of a valid cache line, but in other embodiments it is possible that a cache refill operation may return multiple cache lines which need to be marked as valid within multiple valid words beneath a single flip-flop circuit.
Preferred circuit arrangements logically combine valid words with values stored in a plurality of flip-flop circuits, both having been read in parallel. Such arrangements often require the wide multiplexers discussed previously and so are ones to which the present invention is particularly well suited.
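The logical combination referred to above may be pictured as follows. This is a sketch under stated assumptions: the flip-flop values are modelled as a single integer bit vector read in parallel, the valid word width of 32 bits is assumed, and the function name is hypothetical.

```python
WORD_WIDTH = 32  # assumed number of bits in a valid word


def effective_valid_word(valid_word, flop_values, flop_index):
    # Selecting one bit from the parallel-read flip-flop values models the
    # multiplexer; its width grows with the number of flip-flops.
    flop_bit = (flop_values >> flop_index) & 1
    # The selected bit gates every bit of the valid word (a logical AND).
    mask = ((1 << WORD_WIDTH) - 1) if flop_bit else 0
    return valid_word & mask
```

Because the selection in this model spans every flip-flop value, keeping the number of flip-flop circuits small keeps the modelled multiplexer narrow.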
Viewed from another aspect the present invention provides a method of processing data, said method comprising the steps of:
(i) storing a plurality of valid words within a valid word memory, each valid word having bits representing whether or not corresponding data storage locations in a further memory are storing valid data; and
(ii) storing within a plurality of flip-flop circuits values indicative of whether or not corresponding valid words within said valid memory are themselves valid; characterised in that
(iii) a flip-flop circuit stores a value indicative of validity of a number of valid words which varies in dependence upon how many valid words may be stored in said valid memory.
The above, and other objects, features and advantages of this invention will be apparent from the following detailed description of illustrative embodiments which is to be read in connection with the accompanying drawings.