An abstract representation of the system architecture of the software components of a transaction processing system in a computing system is shown in FIG. 1. A number of clients (that is, a number of users who submit transaction requests to the computing system) interact with a server process via a gateway process. It will be understood that in the present context, a “user” or “client” may denote a person engaging with a terminal or other input device to submit a transaction request, or it may denote another computing system engaging with a server to submit a transaction request, or it may denote a separate software module within the computing system which sends requests to a server process.
It will further be understood that the three software components termed “client”, “gateway” and “server” may refer to separate software modules or software applications, or may be integrated into a single module or application, and that each of the three components may reside on the same item of hardware, on separate items of hardware, or any combination thereof.
The gateway process acts as an intermediary between the client and the server process. That is, the gateway routes each request from a client to an appropriate server, receives the processed transaction from that server, and re-routes the completed transaction to the appropriate client. In other words, the gateway functions as a “housekeeping” process.
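By way of illustration only, the routing role of the gateway may be sketched as follows. The class names `Server` and `Gateway`, the `outboxes` mapping, and the round-robin choice of server are assumptions made for this sketch and do not appear in the described system, which does not specify how an appropriate server is selected.

```python
class Server:
    """Hypothetical server process: carries out one transaction request at a time."""

    def __init__(self, name):
        self.name = name

    def process(self, request):
        # Stand-in for the real operations needed to fulfil the request.
        return {
            "client": request["client"],
            "result": f"{self.name} handled {request['payload']}",
        }


class Gateway:
    """Hypothetical gateway: routes requests to servers and results back to clients."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.next_index = 0
        self.outboxes = {}  # client id -> list of completed transactions

    def submit(self, client_id, payload):
        # Round-robin selection is an assumption of this sketch.
        server = self.servers[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.servers)
        completed = server.process({"client": client_id, "payload": payload})
        # Re-route the completed transaction to the client that submitted it.
        self.outboxes.setdefault(completed["client"], []).append(completed["result"])
        return server.name
```

In this sketch the client never addresses a server directly; it only calls `submit` on the gateway and later reads its own entry in `outboxes`.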
The server processes are arranged to take a transaction request and carry out the appropriate operations necessary to fulfill the transaction request.
In the context of the present specification, the term “server process” is used to denote a software process, not a hardware computing system or component. The server process may interact directly with any appropriate hardware. In the abstract representation of a transaction processing system architecture shown in FIG. 1, a server process is shown which interacts with a database to obtain and store relevant data. However, the server process may also interact with other hardware components, such as a central processing unit, a network interface, a printer, or any other suitable device.
In a transaction processing system such as the type shown in FIG. 1, a “software bottleneck” may be created in certain circumstances. A software bottleneck occurs where transactions take a relatively long time to be processed, due to an inefficiency in the design of the architecture of a computing system, or due to an error or inefficiency in the implementation of a particular architecture.
A simple type of software bottleneck may occur where a transaction processing system is designed with only a small or limited number of server processes. Server processes are generally single-threaded, so only one request may be processed by each server process at any given time, which limits the rate at which transaction requests can be processed. Therefore, if there are many transaction requests and few server processes, a large number of transaction requests will become queued, resulting in a bottleneck. The bottleneck is termed a “software bottleneck” because it is caused by a deficiency in the software design or implementation.
In the simple example given, the bottleneck may be ameliorated by simply adding more server processes.
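The simple case may be illustrated with a small model, offered as a sketch under stated assumptions rather than a description of any particular system. It assumes that all requests arrive at the same moment, that every request takes the same service time, and that each server process handles one request at a time. The function name `drain_time` and its parameters are hypothetical.

```python
import heapq


def drain_time(num_requests, num_servers, service_time):
    """Time until every queued request has been processed.

    Assumes all requests arrive at time 0 and each single-threaded
    server process handles exactly one request at a time.
    """
    free_at = [0.0] * num_servers  # time at which each server next becomes idle
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_requests):
        start = heapq.heappop(free_at)  # earliest available server takes the request
        end = start + service_time
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


few_servers = drain_time(100, 2, 1.0)    # 50.0 time units
more_servers = drain_time(100, 10, 1.0)  # 10.0 time units
```

Under these assumptions, the queue drains in proportion to the number of requests divided by the number of server processes, which is why adding server processes relieves this particular kind of bottleneck.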
However, software bottlenecks may also be introduced at a more fundamental level. For example, a programming error, such as the incorrect or inappropriate use of semaphores, may result in the introduction of a software bottleneck. Large scale transaction processing systems and/or applications are generally multi-threaded, which results in a need to synchronise between different software modules. In order to prevent clashes between different software modules, multitasking systems provide a set of mechanisms (system calls) which allow a process to gain exclusive access to a system resource (e.g. a serial port, a physical disc, etc.).
In the systems described herein, the mechanism by which a process gains exclusive access to a system resource is called a semaphore. In other computing systems, such mechanisms may variously be called locks, mutexes, or critical regions.
It will be understood that the term semaphore is used herein solely as a useful abbreviation for any type of mechanism which allows a process to gain exclusive access to a system resource, and use of the term should not be construed as limiting the invention or any embodiment of the invention to a particular type of computer hardware, operating system, or software application.
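As a minimal sketch of such a mechanism, the following uses a binary semaphore from the Python standard library to give one thread at a time exclusive access to a shared value. The function `run_workers` and the `shared` dictionary standing in for a system resource are assumptions made for illustration; threads are used here in place of separate processes purely for brevity.

```python
import threading


def run_workers(num_workers, increments_each):
    """Update a shared value from several threads under a binary semaphore."""
    resource_guard = threading.Semaphore(1)  # at most one holder at any time
    shared = {"total": 0}  # stands in for a shared system resource

    def worker():
        for _ in range(increments_each):
            with resource_guard:  # acquire on entry, release on exit
                current = shared["total"]
                shared["total"] = current + 1  # read-modify-write is now exclusive

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared["total"]
```

Because every read-modify-write of the shared value happens while the semaphore is held, no update is lost, at the cost of the workers taking turns for that portion of their work.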
It is quite common for software engineers or programmers to introduce serialisation problems unwittingly through unwise use of synchronisation code (i.e. semaphores or similar entities) at incorrect places in the code. Alternatively, software engineers may use semaphores or similar entities in a sub-optimal manner.
Such errors or suboptimal program code do not affect the logic of the application, as all transactions continue to be logically valid. The errors simply decrease the overall system processing rate, and in many cases, such effects will only appear under high loads.
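The effect of semaphore placement may be illustrated with a simplified timing model, given here as a sketch only. It assumes that each transaction consists of some private work followed by a short access to a shared resource, that transactions are handed to whichever server process becomes idle first, and that the semaphore is granted in the order in which transactions are dispatched. The function name `makespan` and all of its parameters are hypothetical.

```python
def makespan(num_transactions, num_servers, private_time, shared_time,
             lock_whole_transaction):
    """Total time to complete all transactions under one shared semaphore.

    lock_whole_transaction=True models a semaphore held around the entire
    transaction; False models a semaphore held only around the shared access.
    """
    server_free = [0.0] * num_servers  # when each server process becomes idle
    lock_free = 0.0                    # when the semaphore is next available
    finish = 0.0
    for _ in range(num_transactions):
        # Dispatch to the server process that becomes idle first.
        s = min(range(num_servers), key=lambda k: server_free[k])
        idle_at = server_free[s]
        if lock_whole_transaction:
            start = max(idle_at, lock_free)  # wait for the semaphore before any work
            end = start + private_time + shared_time
        else:
            ready = idle_at + private_time   # private work needs no semaphore
            start = max(ready, lock_free)
            end = start + shared_time
        lock_free = end
        server_free[s] = end
        finish = max(finish, end)
    return finish


wide = makespan(100, 4, 9.0, 1.0, lock_whole_transaction=True)     # 1000.0
narrow = makespan(100, 4, 9.0, 1.0, lock_whole_transaction=False)  # 253.0
```

Both placements complete the same 100 transactions, so the application logic is unaffected. In this model only the elapsed time differs, and with the wide placement the four server processes perform no better than one.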
In the case where a software bottleneck is primarily caused by inefficient programming, it is not easily detected by the end user or the system administrator of the computing system, as its effects do not become apparent at normal to medium operating loads. Even at high operating loads, where it is common for transaction processing systems to slow considerably, such serialisation problems are commonly misdiagnosed as being due to inadequate processing ability.
In response to the perceived inadequate processing ability, some system administrators will increase the number of server processes or database connections. Such a solution, while seeming to ameliorate the problem, only serves to mask it temporarily. Furthermore, increasing the number of server processes places an extra burden on the computing system hardware, which may lead to further problems, or may not be practical, as the ability of the hardware to handle additional server processes may also be limited. For example, there may not be adequate memory or network bandwidth available to support a larger number of server processes or database connections.
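The reason that additional server processes only mask a serialisation problem can be stated as a simple bound, offered here under simplifying assumptions. Suppose each transaction performs private work of duration $t_p$ and holds a single shared semaphore for duration $t_s$, and that $N$ server processes are available. The symbols $t_p$, $t_s$, $N$ and the throughput $X(N)$ are introduced for this illustration and are not part of the described system.

```latex
X(N) \;\le\; \min\!\left( \frac{N}{t_p + t_s},\; \frac{1}{t_s} \right)
```

The first term reflects that each server process completes at most one transaction every $t_p + t_s$; the second reflects that the semaphore admits only one holder at a time. Once $N$ exceeds $(t_p + t_s)/t_s$, the second term governs and further server processes cannot raise throughput. If the semaphore is mistakenly held for the whole transaction, the hold time becomes $t_p + t_s$ and the bound falls to $1/(t_p + t_s)$ regardless of $N$.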
An ideal solution would be to decouple the incorrectly placed semaphore calls, but such incorrect code must be brought to the attention of the software engineer. As the maintenance of incorrect code is a costly exercise, this solution is generally not undertaken without the positive identification of a software bottleneck.