The present invention relates to improving the system availability of networked computer systems. In particular, it relates to a method and system for operating an input/output circuit for driving peripheral devices within an embedded system.
The present invention is generally applicable in computer networks comprising a plurality of computers. Particular additional advantages can be gained when said plurality of computers has some inner structure of 'competence distribution', in particular a structure comprising a first type of server computers and a second type of more or less dedicated control computers, in particular embedded controllers, which have only reduced technical equipment, e.g., no hard disk, display unit, or keyboard.
Although the present invention has a very broad scope implied by its inherent technical abstractness, it is discussed here with reference to a larger enterprise computer network which is schematically depicted in FIG. 1.
Such a multi-server/multi-user networked environment comprises a huge number of peripheral devices 36, e.g., terminals, printers, storage devices, sensors, actuators and the like, which are connected with and controlled by a server cluster 10 having a plurality of CPUs 11, a memory controller 22 cooperating with a cache device 14, and a plurality of memory cards 12 via a respective system bus or an adapted switching device.
To supervise the communication between said server cluster 10 (left) and said peripheral devices 36 (right), so-called embedded systems 18 are used to sense and control the input/output devices 26, e.g., so-called I/O cards. These embedded systems are hosted on the power/controller cards 18 and are dedicated computing units, for example a so-called Power PC, which is used without the usual man/machine interface.
For the purpose of version consistency, which is required for operating the peripheral devices 36 without major problems, as well as for cost reasons, neither said I/O cards 26 nor the embedded controllers 18 possess their own persistent software storage, such as a hard disk, in which multiple versions of a software could be stored and executed.
With additional reference to FIG. 2, a more detailed schematic representation of a prior art I/O card is given. A controller interface 9 is provided which connects to an ASIC chip 28, in which the control logic is implemented for controlling the operation of the individual drive devices 42, 32 for driving the peripheral devices. In the example depicted in FIG. 2, this is an electrical-to-optical (and vice versa) signal converter 42 cooperating with a plurality of optical device drivers 32. Thus, said system comprises a controller means 22 and an input/output circuit 26 with an ASIC 28 and sensor response means 32 for driving said device 42.
Further, a clock 41 is provided for supplying said converter and the ASIC with a respective clock signal.
The operational signals required to use the peripheral devices are transferred via the functional interface 14.
Via said controller interface, the ASIC 28, the so-called FGA, receives data signals and a clock signal. This implementation makes it possible to communicate with the FGA (ASIC) 28 even when the clock on the card is defective or unpowered due to a short somewhere on the card. In this case the sense and control lines of the FGA can still be used to identify the root cause of the problem.
In order to focus on the disadvantages of the prior art, the system availability in computer system environments like those described above is now addressed in more detail:
Although a variety of efforts are made to minimize the time during which a computer system environment or a subsystem is unable to perform its task due to a software or hardware failure, e.g., redundant controllers, redundant peripheral devices, driver code that runs in only one unique version, etc., system availability is not yet sufficiently provided in the prior art.
From other computer system environments that have real-time requirements, and wherein consequently the system availability is extremely important, various techniques like keeping persistent states, trace points, etc., are known to improve system availability.
These techniques, however, are not applicable to the embedded systems due to the specific hardware configuration of said embedded systems and the intended absence of, e.g., a hard disk and a respective tracing logic in the I/O card itself.
It would be desirable to apply techniques such as keeping persistent states, trace points, etc. to other computer system environments or subsystems as well, for example to the above-mentioned embedded systems, in order to increase their system availability.
It is thus an object of the present invention to improve the system availability in an environment comprising embedded systems.
The foregoing and other objects are achieved by the present invention, which comprises a method and system for operating an input/output circuit for driving peripheral devices controlled by an embedded system. For increasing the overall system availability, the invention proposes to add a limited, repeatedly performed status-storing functionality, preferably into a register storage of the I/O devices. The information can be easily exploited (i.e., read out from outside the input/output devices) via the controller of the embedded system.
Said additional logic "add-on", which is for example implemented in an ASIC in the embedded system, repeatedly generates status information reflecting the status of an associated input/output device, continuously stores said status information in an input/output storing means, for example, in a register included in said ASIC, and keeps said status information available to be requested by a controller communicating with the ASIC logic in the input/output circuit.
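The repeated generate-and-store behaviour of said add-on logic may be illustrated by the following minimal Python sketch. It is a software model only, not the ASIC implementation. The names `StatusRegister`, `update_cycle` and `sample_status`, the 8-bit register width, and the sample values are assumptions introduced for illustration and do not appear in the specification.

```python
from typing import Callable


class StatusRegister:
    """Hypothetical software model of a status register inside the I/O ASIC."""

    def __init__(self) -> None:
        # Arbitrary initial content for this model; the width is an assumption.
        self._value = 0

    def store(self, status: int) -> None:
        # Keep only the low 8 bits (assumed register width).
        self._value = status & 0xFF

    def read(self) -> int:
        # What a controller would obtain when requesting the stored status.
        return self._value


def update_cycle(register: StatusRegister, sample_status: Callable[[], int]) -> None:
    """One iteration of the repeated 'generate status, store status' step."""
    register.store(sample_status())


# Usage: three update cycles with made-up status values.
reg = StatusRegister()
samples = iter([0x01, 0x03, 0x03])
for _ in range(3):
    update_cycle(reg, lambda: next(samples))
latest = reg.read()
```

In this model the register always holds the most recently generated status, so a controller reading it at any time obtains the last stored value.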
Said regular storing of status information then enables the controller, for example in case of a controller reboot or when a redundantly provided controller takes over the job of a first controller which had a breakdown before, to initiate a helpful response to be issued by a sensor response means. For example the response will be from an Optical-to-Electrical signal converter, in a case when an optical peripheral device is to be operated or when a fibre-optic signal transmission is performed by said converter.
Said response is helpful for the purposes of improved system availability because it reflects the current drive status of said exemplary converter device.
Thus, when the controller software reads the (current) status information from said input/output storage means of said input/output circuit, it is able to compare said response with the stored status information. The controller is thereby enabled to continue the operation of said sensor response means depending on the result of the comparison.
When, for example, the freshly sensed status is the same as the one read out from the register, the rebooted controller or the stand-by redundant controller can continue operation without restarting, rebooting or reinitializing the sensor response circuit, which in the worst case would terminate a running communication between server and peripheral devices.
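The comparison performed by a rebooted or stand-by controller may be sketched as follows. This is a minimal Python illustration under stated assumptions: the names `Action` and `decide_after_takeover` are hypothetical, and treating any mismatch as a reason to reinitialize is an assumption, since the description only states explicitly that operation continues when both values agree.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"          # keep the sensor response means running
    REINITIALIZE = "reinitialize"  # assumed fallback when the values differ


def decide_after_takeover(stored_status: int, sensed_status: int) -> Action:
    """Compare the status read from the register with the freshly sensed one."""
    if stored_status == sensed_status:
        # Stored and current state agree: no restart is needed.
        return Action.CONTINUE
    # Assumption: a mismatch means the stored state cannot be relied on.
    return Action.REINITIALIZE


# Usage with made-up status values.
same = decide_after_takeover(stored_status=0x03, sensed_status=0x03)
different = decide_after_takeover(stored_status=0x03, sensed_status=0x01)
```

With matching values the sketch yields `Action.CONTINUE`, which corresponds to the case in which a running communication is not interrupted.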
By the foregoing implementation, time is saved and the system availability is increased. The solution profits from the fact that it is possible for the operating system, and thus for the controller, to read and write the I/O address space. Thus an I/O register or the like can be used as a normal RAM for storing said important status information.
Advantageously, a register is used for storing the status information, because a power drop then has the same effect on the register content as on the current sense information at the sensor response device, (i.e., such that there is no defined status which can be relied on) and thus the logical conclusion that a restart of the device is required is easy and error-free to reach.
Advantageously, a cold-start indicator flag is additionally provided, which is included in said status information. This flag can be evaluated by the controller in the above situation prior to any other information. When the flag is 'on', the controller must initialize the dependent device. In this single case a reboot of the device is required.
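The evaluation of the cold-start indicator flag prior to any other information may be sketched as follows. The bit position `COLD_START_FLAG`, the 8-bit status word, and the function names are assumptions for illustration. Clearing the flag once initialization has completed is likewise an assumption that is not stated in the description.

```python
COLD_START_FLAG = 0x80  # hypothetical bit position within an assumed 8-bit status word


def needs_initialization(status_word: int) -> bool:
    """Return True when the cold-start flag is 'on' in the stored status."""
    return bool(status_word & COLD_START_FLAG)


def clear_cold_start(status_word: int) -> int:
    """Assumed follow-up step: switch the flag 'off' after initialization."""
    return status_word & ~COLD_START_FLAG & 0xFF


# Usage with made-up status words.
after_power_on = 0x83   # flag on, remaining bits are arbitrary
after_init = clear_cold_start(after_power_on)
must_init_first = needs_initialization(after_power_on)
must_init_later = needs_initialization(after_init)
```

Checking the flag first lets the controller skip the comparison of the remaining status bits whenever initialization is required anyway.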
The present invention is thus advantageously applicable when increased, nearly permanent, system availability of the components is required. This is in particular the case in the above-mentioned type of systems when the controllers are configured redundantly.