The client/server model, which emerged in the late 1980s, is a versatile and modular software architecture devised to improve usability, flexibility, interoperability, and scalability as compared to the centralized, mainframe, time-sharing computing that was the norm at that time. The client/server architecture has since progressively replaced the previous mainframe software architectures, where all intelligence resided within the central host computer and where users interacted with the host through dumb terminals. Where mainframes are still in use, it is only as powerful servers in various client/server architectures in which dumb terminals have also been replaced by intelligent graphical user interfaces (GUI) capable of processing by themselves the data received from, and transmitted to, the servers.
In modern data processing systems, a widely used client/server architecture capable of supporting a large number of remotely located clients is the so-called 3-tier architecture. An example of such an architecture is illustrated in FIG. 1. The data tier 100 is traditionally built around a master database system 120, possibly a large or very large repository of all the data necessary to the daily operation of a business organization, company or enterprise in order to conduct all sorts of commercial and administrative operations. The database is most often of the relational type, i.e., it is under the control of a relational database management system or RDBMS. It is typically administered through one or more master servers 112 by administrators of the data processing system from GUIs 140. Administrators are generally the sole users of the system authorized to update database contents directly.
The intermediate or middle tier of the exemplary 3-tier system of FIG. 1 is the application tier 200, from which all the specific software applications 240 of the organization that owns the data processing system are run. This collection of specific applications, often globally referred to as the middleware software, is the proprietary software of the organization. It is used to serve all of the organization's remote clients from its repository of data 120 through the master servers 110. Remote clients form the third tier 300 of the 3-tier architecture. Queries from the client tier 300 are thus processed and responded to by the specific applications of the intermediate tier 200 on data fetched from the data tier 100.
In a 3-tier architecture, when a larger number of remote clients need to be served, scalability of the system to maintain overall performance is obtained by adding independent processing nodes in the middle tier so as to increase the overall processing power of the data processing system. Hence, the application tier 200 generally comprises several independent processing nodes that are referred to, in the following description, as slave nodes 210. A common practice to prevent the data tier 100 from being overwhelmed by too many data requests from an increasing number of slave nodes is then to have the applicative processes 240 work on pieces of data brought from the master database and stored in each application node for as long as necessary. In the exemplary system of FIG. 1 this takes the form of cache files 250 on which the applicative processes 240 can work without incurring long delays to get them from the master database through the master servers each time they are needed. In such a data processing system, processing power and software applications are thus distributed, i.e., replicated, on as many nodes 210 as necessary to reach the level of processing power required to serve all remote clients 300 of the system. So are the distributed cache files 250. In each node, cache files 250 are typically shared between all applicative processes 240 running on the node. To this end, cache files are stored as memory-mapped files in shared memory 230 in order to give all applicative processes fast access to the pieces of data, coming from the master database, on which they have to work.
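By way of illustration only, the sharing of a cache file through a memory-mapped file may be sketched as follows in Python. The file name, the 4096-byte size and the function names are assumptions made for this example and do not come from the system of FIG. 1. Two mappings opened in one process stand in for two applicative processes 240 of a node.

```python
import mmap
import os
import tempfile

CACHE_SIZE = 4096  # hypothetical fixed size in bytes, chosen at creation


def create_cache_file(path, size=CACHE_SIZE):
    """Create a zero-filled backing file of exactly `size` bytes."""
    with open(path, "wb") as f:
        f.truncate(size)


def open_mapping(path):
    """Map the whole file; the caller closes both returned objects."""
    f = open(path, "r+b")
    return f, mmap.mmap(f.fileno(), 0)  # length 0 maps the entire file


def share_through_mapping(payload):
    """Write through one mapping and read the same bytes through another."""
    path = os.path.join(tempfile.mkdtemp(), "cache_file.bin")
    create_cache_file(path)
    writer_file, writer = open_mapping(path)
    reader_file, reader = open_mapping(path)
    try:
        writer[0:len(payload)] = payload   # slice length must match payload
        writer.flush()
        return bytes(reader[0:len(payload)])
    finally:
        for handle in (writer, reader, writer_file, reader_file):
            handle.close()
```

In an actual node the reader and the writer would be separate processes mapping the same file. The sketch only shows that data written through one mapping becomes readable through another without any copy through the master servers.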
The operating system of the slave node requires that a memory-mapped file be given its size when created. Thus, the file size remains the same during the whole life of a cache file. As shown in FIG. 2, cache files 250, implemented as memory-mapped files 10, are structured in two parts. The first part is a data area 20 that stores all the applicative data content of a memory-mapped file, while the second part is a control area 30 that holds the control data. The data area is further split into two parts organized as two linked lists of data blocks. One of the linked lists 23 holds a previous level of data, i.e., the old data, in the form of inactive data blocks 24. The other linked list 21 stores the current level of data in active data blocks 22. However, active and inactive linked data blocks share the same memory-mapped file area, i.e., the data area 20.
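The structure described above may be sketched, under stated assumptions, as a fixed-size buffer holding a control area and a data area of linked blocks. The byte layout, the 16-byte payload, the placement of the control area at the start of the buffer and all names below are hypothetical choices made for this example. They are not the actual format of the memory-mapped files 10.

```python
import struct

BLOCK_PAYLOAD = 16                              # hypothetical payload bytes per block
BLOCK = struct.Struct(f"<iH{BLOCK_PAYLOAD}s")   # next index, used length, payload
CONTROL = struct.Struct("<ii")                  # heads of active and inactive lists
END = -1                                        # marks the end of a linked list


def block_offset(index):
    """Byte offset of a block; the control area is assumed to come first."""
    return CONTROL.size + index * BLOCK.size


def write_list(buf, indices, data):
    """Store `data` as a linked list over the given block indices; return its head."""
    chunks = [data[i:i + BLOCK_PAYLOAD] for i in range(0, len(data), BLOCK_PAYLOAD)]
    if len(chunks) > len(indices):
        raise ValueError("not enough blocks for the data")
    used = indices[:len(chunks)]
    for pos, (idx, chunk) in enumerate(zip(used, chunks)):
        nxt = used[pos + 1] if pos + 1 < len(used) else END
        BLOCK.pack_into(buf, block_offset(idx), nxt, len(chunk), chunk)
    return used[0] if used else END


def read_list(buf, head):
    """Follow the linked list from `head` and return the concatenated payloads."""
    out = bytearray()
    idx = head
    while idx != END:
        nxt, length, payload = BLOCK.unpack_from(buf, block_offset(idx))
        out += payload[:length]
        idx = nxt
    return bytes(out)
```

Writing the active list over blocks 0, 2, 4, 6 and the inactive list over blocks 1, 3, 5, 7 of the same buffer illustrates that both lists share one data area: nothing in the layout itself keeps a faulty `next` index of one list from pointing into blocks of the other.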
The control area 30 indicates which linked list contains the active data. A flip-flop mechanism, part of the control area, allows toggling between active 31 and inactive 32 pointers to the linked lists of data blocks so that any applicative process reading a memory-mapped file is always given access to up-to-date data. Hence, during the replication of data from the master database, the data blocks of the inactive part are first cleared and then filled with the new incoming data. At completion of the insertion of new data, the control area is edited so that the above flip-flop mechanism flips between the active and inactive parts within the data area of a memory-mapped file. However, the above current mechanism raises two issues:
A first issue concerns the amount of new data to be stored versus the actual size of the memory-mapped file. As already mentioned above, the size of a memory-mapped file cannot be changed dynamically to follow an increase or a reduction of the data to store. Hence, if the amount of data to store grows beyond the available size, the memory-mapped file cannot actually be updated, and the content of the corresponding cache file becomes outdated. A manual action is then required to correct the problem: the size of the memory-mapped file must be increased before replication is resumed. Conversely, much memory is wasted when memory-mapped files are over-sized. Also, when the amount of data to store decreases, the current mechanism cannot take advantage of it to reduce the size of the memory-mapped file.
A second issue occurs during replication if, for any reason, the process fails to complete normally. Since active and inactive data blocks share the same data area, writing into the inactive part of the data area can also corrupt the active part of the memory-mapped file. If the replication process fails to write the full list of data blocks, the corresponding linked list is indeed corrupted. Unpredictable results must then be expected, such as the addressing of blocks of the active part, which breaks the division of the data area between active and inactive parts.
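The flip-flop mechanism and the first issue discussed above may be sketched as follows. This is a simplified model and not the actual mechanism: two byte strings stand in for the two linked lists, and the class name, the exception and the capacity rule (active data plus new data must fit in the shared data area) are assumptions made for this example.

```python
class CacheFullError(Exception):
    """Raised when new data does not fit in the fixed-size data area."""


class FlipFlopCache:
    def __init__(self, capacity):
        self.capacity = capacity   # fixed at creation, like the mapped file size
        self.lists = [b"", b""]    # stand-ins for the two linked lists
        self.active = 0            # control area: index of the active list

    def read(self):
        """Readers are always directed to the active list."""
        return self.lists[self.active]

    def replicate(self, new_data):
        """Clear and fill the inactive list, then flip the control area."""
        inactive = 1 - self.active
        # Assumption: both lists share one data area, so the active data
        # still occupies room while the new data is being written.
        if len(self.lists[self.active]) + len(new_data) > self.capacity:
            raise CacheFullError("new data does not fit; cache stays outdated")
        self.lists[inactive] = b""        # clear the inactive blocks
        self.lists[inactive] = new_data   # fill them with the incoming data
        self.active = inactive            # flip active and inactive
```

With a capacity of 10 bytes, replicating `b"v1"` and then `b"v2"` flips the active list twice, whereas a later attempt to replicate 20 bytes raises `CacheFullError` and readers keep seeing `b"v2"`. In the model, as in the text above, only recreating the cache with a larger capacity resolves the situation.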
The two above issues are critical for the impacted client applications because of the inevitable service interruptions they cause when they occur. In order to be able to notify the impacted clients and to prevent data corruption from propagating further, a standard practice of the replication process consists in locking the memory-mapped file before writing the data. The lock is released at the end of the replication, unless replication does not end normally. Even though the lock mechanism prevents data corruption from propagating further and makes the corrupted files visible to clients, it does not help to recover in an automated manner.
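The locking practice described above may be sketched as follows, for illustration only. The `CacheFile` class, the `replicate` function and its `fail_after` parameter, which simulates an interruption after a given number of blocks, are hypothetical names introduced for this example.

```python
class CacheFile:
    """Minimal stand-in for a memory-mapped cache file with a lock flag."""

    def __init__(self):
        self.locked = False
        self.blocks = []


def replicate(cache, new_blocks, fail_after=None):
    """Lock the file, write the blocks, and unlock only on normal completion."""
    cache.locked = True                   # lock before writing any data
    cache.blocks = []                     # clear the part being rewritten
    for count, block in enumerate(new_blocks):
        if fail_after is not None and count == fail_after:
            # Simulated failure: the function exits without releasing the lock.
            raise RuntimeError("replication interrupted")
        cache.blocks.append(block)
    cache.locked = False                  # released only when replication ends normally
```

After a failed call the file remains locked and holds a partial list of blocks. Clients can detect the condition from the lock, but nothing in this scheme restores the file without outside intervention.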
The current replication process thus suffers from a lack of resiliency and requires manual actions because, as discussed above:
the size of memory-mapped files is a static parameter that needs to be set manually and which wastes much memory if over-sized for the application;
the flip-flop mechanism between the active and inactive parts within the shared data area does not prevent corruption of the linked lists from happening;
and a file left locked by a replication that did not end normally is not recovered in an automated manner.
It is therefore a general object of the invention to bring a solution to at least some of the above drawbacks and limitations of the current mechanism for replicating cache files from a master database into a shared memory of a middleware processing node. It is a particular object of the invention to ensure that replication of cache files is unconditionally a corruption-free process for existing cache files, even though a replication operation may occasionally fail or not complete as expected.
It is another object of the invention to provide a new cache structure where data versions can be controlled in an easier way.
Further objects, features and advantages of the present invention will become apparent to those skilled in the art upon examination of the following description in reference to the accompanying drawings. It is intended that any additional advantages be incorporated herein.