1. The Field of the Invention
This invention relates to software programming for controlling behavior of hardware devices, and, more particularly, to novel systems and methods for inseparably welding a software layer to a hardware device interface, thus precluding insertion of any other software layer therebetween.
2. Background of the Invention
Computers are now used to perform functions and maintain data critical to many organizations. Businesses use computers to maintain essential financial and other business data. Computers are also used by government to monitor, regulate, and even activate, national defense systems. Maintaining the integrity of the stored data is essential to the proper functioning of these computer systems, and data corruption can have serious (even life threatening) consequences.
Most computer systems include media drives, such as floppy diskette drives for storing and retrieving data. For example, an employee of a large financial institution may have a personal computer that is attached to the main system. In order to avoid processing delays on the mainframe, the employee may routinely transfer data files from a host system to a local personal computer and then back again, temporarily storing or backing up data on a local floppy diskette or other media. Similarly, an employee with a personal computer at home may occasionally decide to take work home, transporting data away from and back to the office on a floppy diskette.
Data transfer to and from media, such as a floppy diskette, is controlled by a device called a Floppy Diskette Controller (“FDC”). The FDC is responsible for interfacing the computer's Central Processing Unit (“CPU”) with a physical media drive. Significantly, since the drive is spinning, it is necessary for the FDC to provide data to the drive at a specified data rate. Otherwise, the data will be written to a wrong location on the media.
The design of an FDC accounts for situations occurring when a data rate is not adequate to support rotating media. Whenever this situation occurs, the FDC aborts the write operation and signals to the CPU that a data under run condition has occurred.
Unfortunately, however, it has been found that a design flaw in many FDCs makes impossible the detection of certain data under run conditions. This flaw has, for example, been found in the NEC 765, INTEL 8272 and compatible Floppy Diskette Controllers. Specifically, data loss and/or data corruption may routinely occur during data transfers to or from diskettes (or tape drives and other media attached via the FDC), whenever the last data byte of a sector being transferred is delayed for more than a few microseconds. Furthermore, if the last byte of a sector write operation is delayed too long then the next (physically adjacent) sector of the media will be destroyed as well.
For example, it has been found that these faulty FDCs cannot detect a data under run on the last byte of a diskette read or write operation. Consequently, if the FDC is preempted or otherwise suspended during a data transfer to the diskette (thereby delaying the transfer), and an under run occurs on the last byte of a sector, the following occur: (1) the under run flag does not get set, (2) the last byte written to the diskette is made equal to either the previous byte written or zero, and (3) a successful Cyclic Redundancy Check (“CRC”) is generated on the improperly altered data. The result is that incorrect data is written to the diskette and validated by the FDC. Herein, references to a floppy diskette may be read as “any media” and a floppy diskette drive is but a specific example of a media drive controllable by an FDC.
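By way of illustration only, the symptom described above (a last byte equal to the previous byte or to zero, with no error reported) might be recognized by comparing the data intended for a sector against the data read back from it. The following is a minimal sketch under stated assumptions; the function name `last_byte_symptom` and its return labels are hypothetical and do not appear in the disclosure.

```python
def last_byte_symptom(intended: bytes, stored: bytes) -> str:
    """Compare a sector as intended with the same sector as read back.

    Hypothetical helper: returns "intact", "last-byte-signature",
    or "other-corruption".
    """
    if len(intended) != len(stored) or len(intended) < 2:
        raise ValueError("sectors must have equal length of at least 2 bytes")
    if intended == stored:
        return "intact"
    # Everything except the final byte must match for the symptom to apply.
    if intended[:-1] == stored[:-1]:
        # The stored last byte repeats the previous byte, or is zero.
        if stored[-1] == stored[-2] or stored[-1] == 0:
            return "last-byte-signature"
    return "other-corruption"
```

Such a comparison cannot flag a sector whose intended last byte already equals the previous byte or zero, since the corrupted and intended data would then be identical.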
Conditions under which this problem may occur have been identified in connection with the instant invention by identifying conditions that can delay data transfer to or from the diskette drive. In general, this requires that the computer system be engaged in “multitasking” operation or in overlapped input/output (“I/O”) operation. Multi-tasking is the ability of a computer operating system to simulate the concurrent execution of multiple tasks.
Importantly, concurrent execution is only “simulated” because only one CPU exists in a typical personal computer. One CPU can only process one task at a time. Therefore, a system interrupt is used to rapidly switch between the multiple tasks, giving the overall appearance of concurrent execution.
MS-DOS and PC-DOS, for example, are single-task operating systems. Therefore, one could argue that the problem described above would not occur. However, a number of standard MS-DOS and PC-DOS operating environments simulate multi-tasking and are susceptible to the problem.
In connection with the instant invention, for example, the following environments have been found to be prime candidates for data loss and/or data corruption due to defective FDCs: local area networks, 327x host connections, high density diskettes, control print screen operations, and terminate and stay resident (“TSR”) programs. The problem also occurs as a result of virtually any interrupt service routine. Thus, unless MS-DOS and PC-DOS operating systems disable all interrupts during diskette transfers, they are also highly susceptible to data loss and/or corruption.
The UNIX operating system is a multi-tasking operating system. It has been found, in connection with the instant invention, how to create a situation that can cause the problem within UNIX. One example is to begin a large transfer to the diskette and place that transfer task in the background. After the transfer has begun, then begin to process the contents of a very large file in a way that requires the use of a Direct Memory Access (“DMA”) channel of a higher priority than that of the floppy diskette controller's DMA channel. These might include, for example, video updates, multi-media activity, etc. Video access forces the video buffer memory refresh logic on DMA channel 1, along with the video memory access, which preempts the FDC operations from occurring on DMA channel 2 (which is lower priority than DMA channel 1).
This type of example creates an overlapped I/O environment and can force the FDC into an undetectable error condition. More rigorous examples include a concurrent transfer of data to or from a network or tape drive using a high priority DMA channel while the diskette transfer is active. Clearly, the number of possible error producing examples is infinite, yet each is highly probable in this environment.
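By way of example and not limitation, the preemption described above can be pictured with a toy model of fixed-priority DMA arbitration, in which the lower-numbered channel wins and the FDC on channel 2 waits for as long as channel 1 keeps requesting. The tick unit, the deadline parameter, and all function names below are assumptions made for illustration; the text states only that a delay of more than a few microseconds on the last byte is harmful.

```python
from typing import Optional, Set


def grant(requests: Set[int]) -> Optional[int]:
    """Fixed-priority arbitration: the lowest-numbered requesting channel wins."""
    return min(requests) if requests else None


def fdc_wait_ticks(ch1_busy_ticks: int) -> int:
    """Count ticks the FDC (channel 2) waits while channel 1 keeps requesting."""
    waited = 0
    # Channel 1 requests on every tick until its burst ends; channel 2 always requests.
    while grant({1, 2} if waited < ch1_busy_ticks else {2}) != 2:
        waited += 1
    return waited


def last_byte_delayed(ch1_busy_ticks: int, deadline_ticks: int) -> bool:
    """True when the wait exceeds the (hypothetical) last-byte deadline."""
    return fdc_wait_ticks(ch1_busy_ticks) > deadline_ticks
```

In this model the FDC's wait simply equals the length of the higher-priority burst, so any burst longer than the deadline produces the delayed last byte that a defective FDC fails to report.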
For all practical purposes the OS/2 and newer Windows operating systems can be regarded as UNIX derivatives. They suffer from the same problems that UNIX does. Two significant differences exist between these operating systems and UNIX.
First, they both semaphore video updates with diskette operations, which tends to avoid forcing the FDC problem to occur. However, any direct access to the video buffer, in either real or protected mode, during a diskette transfer will bypass this feature and result in the same faulty condition as in UNIX.
Second, OS/2 incorporates a unique command that tends to avoid the FDC problem by reading back every sector that is written to the floppy diskette in order to verify that the operation completed successfully. This command is an extension to the MODE command (MODE DSKT VER=ON). With these changes, data loss and/or data corruption should occur less frequently than otherwise. However, the FDC problem may still destroy data that is not related to the current sector operation.
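The limitation noted above, that read-back verification checks only the sector just written, can be sketched with a toy in-memory model. This is not the OS/2 implementation; the media is simulated with a `bytearray`, and the names `write_with_verify`, `good_write`, and `faulty_write` are hypothetical. The faulty writer is assumed to damage the first byte of the physically adjacent sector.

```python
SECTOR = 512  # bytes per sector in this toy model


def good_write(media: bytearray, index: int, data: bytes) -> None:
    """Store one sector at the given sector index."""
    media[index * SECTOR:(index + 1) * SECTOR] = data


def faulty_write(media: bytearray, index: int, data: bytes) -> None:
    """Store the sector correctly, but damage the start of the next sector."""
    good_write(media, index, data)
    media[(index + 1) * SECTOR] ^= 0xFF


def write_with_verify(media: bytearray, index: int, data: bytes, write_fn) -> bool:
    """Write a sector, read it back, and report whether the two match."""
    if len(data) != SECTOR:
        raise ValueError("data must be exactly one sector long")
    write_fn(media, index, data)
    start = index * SECTOR
    return bytes(media[start:start + SECTOR]) == data
```

With `faulty_write`, verification of sector 0 succeeds even though sector 1 has been altered, which is the residual exposure the paragraph above describes.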
A host of other operating systems are susceptible to the FDC problem just as DOS, Windows, Windows 95, Windows 98, Windows NT, OS/2, and UNIX are. However, these systems may not have an installed base as large as DOS, Windows, OS/2 or UNIX, and may, therefore, receive less motivation to address the problem. Significantly, as long as the operating systems utilize the FDC and service system interrupts, the problem can manifest itself. This can occur in computer systems that use virtually any operating system.
Some in the computer industry have suggested that data corruption by the FDC is extremely rare and difficult to reproduce. This is similar to the argument presented during the highly publicized 1994 defective INTEL Pentium scenario. Error rate frequencies for the defective Pentium ranged from microseconds to tens-of-thousands of years! The FDC problem is often very difficult to detect during normal operation because of its random characteristics. The only way to visibly detect this problem is to have the FDC corrupt data that is critical to the operation at hand. However, many locations on the diskette may be corrupted, yet not accessed. In connection with the instant invention, the FDC problem has been routinely reproduced and may be more common than heretofore believed.
Computer users may, in fact, experience this problem frequently and not even know about it. After formatting a diskette, for example, the system may inform the user that the diskette is bad, although the user finds that if the operation is performed again on the same diskette everything is fine. Similarly, a copied file may be unusable, and the computer user concludes that he or she just did something wrong. For many in this high-tech world, it is very difficult to believe that the machine is in error and not themselves. It remains typical, however, that full diskette back-ups are seldom restored, that all instructions in programs are seldom, if ever, executed, that diskette files seldom utilize all of the allocated space, and that less complex systems are less likely to exhibit the problem.
Additionally, the first of these faulty FDCs was shipped in the late 1970's. The devices were primarily used at that time in special-purpose operations in which the FDC problem would not normally be manifest. Today, on the other hand, the FDCs are incorporated into general-purpose computer systems that are capable of concurrent operation (multi-tasking or overlapped I/O). Thus, it is within today's environments that the problem is most likely to occur by having another operation delay a data transfer to a diskette. The more complex a computer system, the more likely it is that one activity will delay another, thereby creating an FDC error condition.
In short, the potential for data loss and/or data corruption is present in all computer systems that utilize the defective version of this type of FDC, presently estimated at about 50 million personal computers. The design flaw in the FDC causes data corruption to occur and manifest itself in the same manner as a destructive computer virus. Furthermore, because of its nature, this problem has the potential of rendering even secure databases absolutely useless.
Moreover, more recent FDC devices may be affected by the use of first-in-first-out (FIFO) devices that alter the usual operation. Whenever the FIFO is enabled, detection of the defective controller may be well nigh impossible. Nevertheless, when the FIFO is not enabled, due to the vagaries of some particular operating system or device driver, the defect may appear and cause data corruption.
Various conventional ways of addressing the FDC problem, such as a hardware recall, have significant associated costs, risks and/or disadvantages. In addition to a solution to the FDC problem, an apparatus and method are needed to accurately, rapidly, reliably, and correctly, identify any defective FDC. The identification of defective FDCs is the first step in attempting to solve the problem of defective FDCs. A solution method and apparatus for repairing a defective FDC are disclosed in U.S. Pat. No. 5,379,414 incorporated herein by reference.
In view of the foregoing, it is a primary object of the present invention to provide a method and apparatus for detecting defective Floppy Diskette Controllers (“FDCs”).
It is another object of the present invention to provide a software (programmatic) solution that may be implemented in a general purpose digital computer, which eliminates the need for visual inspection and identification of the defective FDCs as well as the need for any hardware recall and replacement.
Consistent with the foregoing objects, and in accordance with the invention as embodied and broadly described herein, an apparatus and method are disclosed in one embodiment of the present invention as including data structures, executable modules, and hardware, implementing a detection method capable of immediately, repeatedly, correctly, and accurately detecting defective FDCs.
The apparatus and method may rely on 1) determining whether or not the FDC under test is a new model FDC (potentially non-defective), and 2) if the FDC under test is not a new model FDC, installing an interposer routine to force the FDC to delay a transfer of a last data byte of a sector either to or from the floppy diskette whose controller is tested. A test condition is thus created in the hardware to cause defective FDCs to corrupt the last data byte of the sector. A second portion of an apparatus and method may confirm a diagnosis. Thus the apparatus and method may ensure that old-model non-defective FDCs are not wrongly identified as defective.
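The two determinations and the confirming portion described above might be arranged as in the following minimal sketch. It shows only the ordering of the decisions; the three callables (`is_new_model`, `delayed_transfer_corrupts`, `confirm`) are hypothetical placeholders for hardware-specific routines that the text does not detail here.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    NEW_MODEL = "new model (potentially non-defective)"
    NOT_DEFECTIVE = "old model, not defective"
    DEFECTIVE = "defective"


def diagnose(
    is_new_model: Callable[[], bool],
    delayed_transfer_corrupts: Callable[[], bool],
    confirm: Callable[[], bool],
) -> Verdict:
    """Order the checks: model identification, induced delay, confirmation."""
    if is_new_model():
        return Verdict.NEW_MODEL
    # The interposer delays the last byte of a sector; a defective FDC corrupts it.
    if not delayed_transfer_corrupts():
        return Verdict.NOT_DEFECTIVE
    # A second portion confirms, so a sound old-model FDC is not misidentified.
    return Verdict.DEFECTIVE if confirm() else Verdict.NOT_DEFECTIVE
```

The induced-delay test is reached only for controllers not identified as new models, and a positive result is reported only after the confirming step agrees.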
Chips manufactured in recent years may have the data corrupting defect originally identified. Nevertheless, the defect may manifest itself in other ways. Meanwhile, three needs remain. First, the failure of the chip needs to be detected, including navigation of the masking features that may limit an ability to detect FIFO-enabled chips. Second, correction of the hardware defect by means of a software solution is needed. Finally, detection of corruption resulting from previous failures of a FIFO-enabled chip to detect errors will be required.
A system and method in accordance with the invention may be implemented to provide a software override capability for enforcing a predetermined state for an otherwise hardware-programmable device. Other software may attempt to control a hardware device without being aware of a hardware issue, such as another process or a defect, that requires the device to remain in a certain state.
The technique programmatically maintains a persistent hardware state independent of any other control software. To other software, the software layer of the invention is indistinguishable and inseparable from hardware. Nothing can slip in between. Any insertion attempt will be detected and disallowed. Features of the processor or system chips actually weld the software to the hardware, which disallows any software intervention between the welded software layer and the hardware.
Various uses for this method may include making hardware persistently behave in a given fashion, in spite of ongoing requests from other software to reconfigure the underlying hardware behavior. This may provide a software-only solution to a hardware defect. One may extend hardware capability without replacing hardware, and without concern for insertion of other software layers that would program performance impermissibly if allowed to obtain conventional access, such as by I/O port commands or memory-mapped I/O commands. A capability for monitoring access to and control of an underlying hardware interface is also available.
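The policy of holding hardware in a predetermined state despite reconfiguration requests, while monitoring those requests, can be pictured with the following toy model of an 8-bit control register. It illustrates the policy only and not the welding mechanism, which relies on features of the processor or system chips; the class `EnforcedRegister`, its mask parameters, and its log are assumptions made for this sketch.

```python
class EnforcedRegister:
    """Toy 8-bit register whose locked bits cannot be changed by writers."""

    def __init__(self, locked_mask: int, locked_value: int) -> None:
        self.locked_mask = locked_mask & 0xFF
        self.locked_value = locked_value & self.locked_mask
        self.value = self.locked_value
        self.overridden = []  # requests that tried to alter a locked bit

    def write(self, requested: int) -> int:
        """Apply a write, forcing locked bits back to their enforced state."""
        requested &= 0xFF
        if (requested & self.locked_mask) != self.locked_value:
            self.overridden.append(requested)  # monitoring of access attempts
        unlocked_bits = requested & ~self.locked_mask & 0xFF
        self.value = unlocked_bits | self.locked_value
        return self.value
```

For instance, with bit 7 locked to one, a write of `0b00000101` would be stored as `0b10000101` and recorded in the log, while a write that already respects the locked bit passes through unrecorded.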
One application of a method in accordance with the invention may provide a complete software implementation of overriding a detection process that is capable of detecting defective Floppy Diskette Controllers (“FDCs”) without visual hardware inspection or identification. The approach taken includes a multi-phase strategy incorporating programmatic FDC identification, software DMA shadowing, defect inducement, and use of a software decoding network, all of which allows the implementation of the invention to adjust to a wide range of computer system performance levels.
A method and apparatus for detecting and preventing floppy diskette controller data transfer errors in computer systems is also provided. The approach taken may involve software DMA shadowing and the use of a software decoding network.
In certain embodiments, an apparatus for detecting a defective floppy diskette controller may comprise a computer readable medium storing executable and operational data structures. The data structures may include a determination module for identifying a hardware resource associated with a computer system, a welding module for inseparably connecting a persistent software layer to the hardware resource, and a defense module for resisting attempts by other software to unweld the persistent software layer from the hardware resource.
The apparatus may store data structures including a function module for performing a desired function whenever the hardware resource is accessed by the computer system. The function module may be configured to control the hardware resource to provide a function otherwise unavailable from the hardware resource as manufactured. The data structures may include an unweld module for disconnecting the persistent software layer from the hardware resource. The unweld module may be configured to be embedded in the welding module.
In at least one embodiment, a computer readable medium storing data structures may embody steps for effecting a method providing a computer system comprising a processor operably connected to a first hardware resource. It may include installing a driver corresponding to the first hardware resource, and including a resource identifier for identifying available hardware resources. Further, the method may include identifying, by the resource identifier, the first hardware resource, and executing, on the processor, a welder to inseparably connect a persistent software layer.
The method may include accessing, by the processor, a first hardware interface and automatically engaging the persistent software layer upon accessing, by the processor, the hardware interface. The method may provide a defense module for responding to attempts to unweld the persistent software layer from the first hardware interface, and provide a controller for controlling the first hardware resource.
The persistent software layer may have a function module, executable to perform an extension function, the extension function being beyond the inherent functionality of the controller. The extension function may have a function lock for overriding requests from other software to reconfigure the functionality of the first hardware resource. The function module may be configured to perform a function selected from detection and correction of a hardware defect in the controller.
The function module may be configured to extend the functional capability of at least one of the first hardware resource and the controller, without replacement thereof. The function module may be configured to monitor at least one of access and control of at least one of the first hardware device and the controller.
In one embodiment, a method for welding a software layer to a hardware layer in a computer system having hardware interfaces may include providing a computer system comprising a processor operably connected to a first hardware resource, with a first hardware interface corresponding to the first hardware resource. Then, the method may install a driver corresponding to the first hardware resource, and including a resource identifier for identifying available hardware resources. After identifying the first hardware resource, the method may execute on the processor a welder for inseparably connecting a persistent software layer.
The method may include accessing, by the processor, the first hardware interface; and automatically engaging the persistent software layer upon accessing, by the processor, the hardware interface. The persistent software layer may include a function module configured to monitor at least one of access and control of at least one of the first hardware device and the controller. Inseparably connecting may result in rendering the connection unbreakable by other than the welder. Typically it will render substantially impossible an insertion of an executable between the first hardware resource and the persistent software layer.
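The sequence recited above (weld, automatic engagement upon access, and a connection breakable only by the welder) may be pictured with the following toy model. It is a sketch of the relationships only: the classes `HardwareInterface` and `Welder` are hypothetical, and ordinary Python attributes offer none of the hardware-backed protection that an actual weld would require.

```python
class HardwareInterface:
    """Toy hardware interface; accesses pass through any welded layer."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._layer = None   # persistent software layer, once welded
        self._welder = None  # the only party allowed to unweld

    def access(self, request):
        # The welded layer is engaged automatically on every access.
        if self._layer is not None:
            return self._layer(request)
        return request


class Welder:
    """Connects a persistent layer and is the only party able to remove it."""

    def weld(self, interface: HardwareInterface, layer) -> None:
        if interface._welder is not None:
            raise PermissionError("interface is already welded")
        interface._layer = layer
        interface._welder = self

    def unweld(self, interface: HardwareInterface) -> None:
        if interface._welder is not self:
            raise PermissionError("only the original welder may unweld")
        interface._layer = None
        interface._welder = None
```

In this model a second welder can neither insert its own layer nor remove the existing one, which loosely mirrors the role ascribed to the defense module.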