This invention relates to reducing the possibility of corruption of critical information required in the operation of a computer system. In particular, the invention is aimed at preventing boot-sector computer viruses and protecting critical executable code from virus infection.
The process of starting up a computer, i.e., booting or boot-strapping a computer, is well known, but we describe aspects of it here for the sake of clarity and in order to define certain terms and outline certain problems which are solved by this invention.
FIG. 1 depicts a typical computer system 10 with central processing unit (CPU) 12 connected to memory 14. Display 18, keyboard 16, hard disk drive 17, and floppy disk drive 19 are connected to computer system 10.
A typical computer system such as shown in FIG. 1 uses a program or set of programs called an operating system (OS) as an interface between the underlying hardware of the system and its users. A typical OS, e.g., MS-DOS Version 5.0, is usually divided into at least two parts or levels. The first level of the OS, often referred to as the kernel of the OS, provides a number of low-level functions called OS primitives which interact directly with the hardware. These low-level primitives include, for example, functions that provide the basic interface programs to the computer system's keyboard 16, disk drives 17, 19, display 18, and other attached hardware devices. The OS primitives also include programs that control the execution of other programs, e.g., programs that load and initiate the execution of other programs. Thus, for example, if a user wishes to run a word-processing program or a game program, it is the primitives in the OS kernel which load the user's program from a disk in one of the attached disk drives 17, 19 into the computer system's memory 14 and begin executing it on CPU 12.
The second level of the OS typically consists of a number of executable programs that perform higher-level (at least from a user's perspective) system related tasks, e.g., creating, modifying, and deleting computer files, reading and writing computer disks or tapes, displaying data on a screen, printing data, etc. These second-level OS programs make use of the kernel's primitives to perform their tasks. A user is usually unaware of the difference between the kernel functions and those which are performed by other programs.
A third level of the OS, if it exists, might relate to the presentation of the OS interface to the user. Each level makes use of the functionality provided by the previous levels, and, in a well designed system, each level uses only the functionality provided by the immediate previous level, e.g., in a four level OS, level 3 only uses level 2 functions, level 2 only uses level 1 functions, level 1 only uses level 0 functions, and level 0 is the only level that uses direct hardware instructions.
FIG. 2 depicts an idealized view of a four level OS, with a level for hardware (level 0) 2, the kernel (level 1) 4, the file system (level 2) 6, and the user interface (level 3) 8.
An OS provides computer users with access and interface to a computer system. Operating systems are constantly evolving and developing to add improved features and interfaces. Furthermore, since an OS is merely a collection of programs (as described above), the same computer system, e.g. that shown in FIG. 1, can have a different OS running on it at different times. For example, the same IBM personal computer can run a command-line based OS, e.g., MS-DOS V5.0, or a graphical, icon based OS, e.g., MS-Windows 3.0.
In order to deal with the evolution of operating systems (as well as to deal with possible errors in existing operating systems), computer system manufacturers have developed a multi-stage startup process, or boot process, for computers. Rather than build a version of the OS into the system, the multi-stage boot process works as follows:
A boot program is built into the computer system and resides permanently in read-only memory (ROM) or programmable read-only memory (PROM) (which is part of memory 14) on the system. Referring to FIG. 4, a computer system's memory 14 can consist of a combination of Random Access Memory (RAM) 24 and ROM 26. The ROM (or PROM) containing the boot program is called the boot ROM 28 (or boot PROM). A boot program is a series of very basic instructions to the computer's hardware that are initiated whenever the computer system is powered up (or, on some systems, whenever a certain sequence of keys or buttons is pressed). The specific function of the boot program is to locate the OS, load it into the computer's memory, and begin its execution. These boot programs include the most primitive instructions for the machine to access any devices attached to it, e.g., the keyboard, the display, disk drives, a CD-ROM drive, a mouse, etc.
To simplify boot programs and to make their task of locating the OS easy, most computer system manufacturers adopt conventions as to where the boot program is to find the OS. Two of these conventions are: the OS is located in a specific location on a disk, or the OS is located in a specific named file on a disk. The latter approach is adopted by the Apple Macintosh™ computer, where the boot program looks for a file named "System" (which contains, e.g., Apple's icon-based graphical OS) on disks attached to the computer. The former approach, i.e., looking for the OS in a particular location, e.g., on a disk, is the one currently used by most I.B.M. personal computers (and clones of those systems). In these systems the boot program looks, in a predetermined order, for disks in the various disk drives connected to the system (many computer systems today have a number of disk drives, e.g., a floppy-disk drive, a CD-ROM, and a hard-disk drive). Once the boot program finds a disk in a disk drive, it looks at a particular location on that disk for the OS. That location is called the boot sector of the disk.
Referring to FIG. 3, a physical disk 9 is divided into tracks which are divided up into sectors 11 (these may actually be physically marked, e.g., by holes in the disk, in which case they are called hard-sectored, but more typically the layout of a disk is a logical, i.e., abstract layout). The boot sector is always in a specific sector on a disk, so the boot program knows where to look for it. Some systems will not allow anything except an OS to be written to the boot sector; others assume that the contents of the boot sector could be anything and therefore adopt conventions, e.g., a signature in the first part of the boot sector, that enable the boot program to determine whether or not it has found a boot sector with an OS. If not, it can either give up and warn the user or it can try the next disk drive in its predetermined search sequence.
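The search sequence just described can be sketched as follows. This is a minimal illustration only, not the boot program of any particular machine: the names `has_boot_signature` and `find_boot_drive` are hypothetical, disks are modeled as byte strings, and the signature value and offset follow the IBM PC-style convention (bytes 0x55 0xAA at offsets 510-511 of a 512-byte sector) as an assumption. A system that places its signature elsewhere in the sector would change only the two constants.

```python
SECTOR_SIZE = 512
# Assumption: IBM PC-style marker, bytes 0x55 0xAA at offsets 510-511.
BOOT_SIGNATURE = b"\x55\xaa"
SIGNATURE_OFFSET = 510


def has_boot_signature(sector: bytes) -> bool:
    """Return True if the sector carries the expected boot marker."""
    if len(sector) < SECTOR_SIZE:
        return False
    return sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] == BOOT_SIGNATURE


def find_boot_drive(drives):
    """Walk the drives in their predetermined order.

    `drives` is a list of (name, boot_sector_or_None) pairs, where None
    models a drive with no disk in it. Returns the name of the first
    bootable drive, or None, at which point a real boot program would
    give up and warn the user.
    """
    for name, sector in drives:
        if sector is not None and has_boot_signature(sector):
            return name
    return None


blank = bytes(SECTOR_SIZE)                           # disk with no OS
bootable = bytes(SIGNATURE_OFFSET) + BOOT_SIGNATURE  # disk with the marker
search_order = [("floppy", None), ("cdrom", blank), ("hard disk", bootable)]
print(find_boot_drive(search_order))
```

Note that the check only establishes that the sector looks valid; it says nothing about whether the code in the sector is the intended OS.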
Once the boot program has determined that it has found a boot sector with an OS (or part of an OS), it loads (reads) into memory 14 the contents of the boot sector and then begins the execution of the OS it has just loaded. When the OS begins execution it may try to locate more files, e.g., the second level files described above, before it allows the user access to the system. For example, in a DOS-based system, the program in the boot sector, when executed, will locate, load into memory, and execute the files IO.SYS, MSDOS.SYS, COMMAND.COM, CONFIG.SYS, and AUTOEXEC.BAT. (Similarly, in a multi-level system, each level loads the next one, e.g., the Hewlett-Packard Unix™-like System HPUX has at least 4 levels which get loaded before the user is presented with an interface to the computer system.)
The process of booting a computer system is sometimes called the boot sequence. Sometimes the boot sequence is used to refer only to the process executed by the first boot program.
Computer viruses aimed at personal computers (PCs) have proliferated in recent years. One class of PC viruses is known as boot infectors. These viruses infect the boot-sectors of floppy or hard disks in such a way that when the boot sequence of instructions is initiated, the virus code is loaded into the computer's memory. Because execution of the boot sequence precedes execution of all application programs on the computer, antiviral software is generally unable to prevent execution of a boot-sector virus.
Recall, from the discussion above, that the boot program loads into memory the code it finds in the boot sector as long as that code appears to the boot program to be valid.
In addition to the boot infector class of viruses, there is another class of viruses called file infectors which infect executable and related (e.g., overlay) files. Each class of virus requires a different level or mode of protection.
File infector viruses typically infect executable code (programs) by placing a copy of themselves within the program; when the infected program is executed so is the viral code. In general, this type of virus code spreads by searching the computer's file system for other executables to infect, thereby spreading throughout the computer system.
One way that boot-sector viruses are spread is by copying themselves onto the boot-sectors of all disks used with the infected computers. When those infected disks are subsequently used with other computers, as is often the case with floppy disks, they transfer the infection to the boot-sectors of the disks attached to other machines. Some boot-sector viruses are also file infectors. These viruses copy themselves to any executable file they can find. In that way, when the infected file is executed it will infect the boot sectors of all the disks on the computer system on which it is running.
Recall, from the discussion above, that an OS may consist of a number of levels, some of which are loaded from a boot sector, and others of which may be loaded into the system from other files on a disk. It is possible to infect an OS with a virus by either infecting that part of it that resides in the boot sector (with a boot-sector virus) or by infecting the part of it that is loaded from other files (with a file-infector virus), or both. Thus, in order to maintain the integrity of a computer operating system and prevent viruses from infecting it, it is useful and necessary to prevent both boot-sector and file-infector viruses.
Work to develop virus protection for computers has often been aimed at PCs and workstations, which are extremely vulnerable to virus infection. The many commercial packages available to combat and/or recover from viral infection attest to the level of effort in this area.
Unfortunately, computer virus authors produce new versions and strains of virus code far more rapidly than programs can be developed to identify and combat them. Since viruses are typically recognized by a "signature", i.e., a unique sequence of instructions, new viral code may at times be difficult to identify. Existing signature-based virus detection and eradication programs require knowledge of the signature of a virus in order to detect that virus. Current systems employ different strategies to defend against each type of virus. In one of these strategies to protect against boot infectors, first a clean (uninfected) copy of the boot-sector is made and kept on a backup device, e.g., a separate backup disk. Subsequent attempts to write to the boot-sector are detected by the anti-viral software in conjunction with the OS, and the user is warned of potential problems of viral infection. Since reading from and writing to a disk is a function performed by the OS kernel, it knows when a disk is written to and which part of the disk is being written. Anti-virus software can be used to monitor every disk write to catch those that attempt to modify the boot sector. (Similarly, in systems which keep the OS in a particular named file, every attempt to modify that file can be caught.) At this point, if the boot-sector has been corrupted, the user can replace it with a clean copy from the backup disk.
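The boot-sector strategy described above (keep a clean copy, watch every write, restore on corruption) can be illustrated with a toy model. The class `MonitoredDisk` and its methods are hypothetical names introduced here, the disk is a list of byte strings, and the boot sector is assumed to be sector 0. The sketch refuses unapproved boot-sector writes and records a warning; an actual product might instead prompt the user.

```python
BOOT_SECTOR = 0  # assumption: the boot sector is sector 0 of the disk


class MonitoredDisk:
    """Toy disk whose write primitive is watched by anti-virus logic."""

    def __init__(self, sectors):
        self.sectors = list(sectors)
        # Clean copy of the boot sector, standing in for the backup disk.
        self.boot_backup = self.sectors[BOOT_SECTOR]
        self.warnings = []

    def write(self, sector_no, data, allow_boot_write=False):
        """Kernel-level write; boot-sector writes need explicit approval."""
        if sector_no == BOOT_SECTOR and not allow_boot_write:
            self.warnings.append("blocked write to boot sector")
            return False
        self.sectors[sector_no] = data
        return True

    def boot_sector_is_clean(self):
        """Compare the current boot sector with the clean backup copy."""
        return self.sectors[BOOT_SECTOR] == self.boot_backup

    def restore_boot_sector(self):
        """Replace a corrupted boot sector with the clean backup copy."""
        self.sectors[BOOT_SECTOR] = self.boot_backup


disk = MonitoredDisk([b"clean boot code", b"user data"])
print(disk.write(0, b"viral boot code"))  # refused by the monitor
print(disk.write(1, b"new user data"))    # ordinary write goes through

# A write that bypasses the monitored primitive is not seen at all,
# but the comparison with the backup still reveals the damage.
disk.sectors[BOOT_SECTOR] = b"viral boot code"
print(disk.boot_sector_is_clean())
disk.restore_boot_sector()
print(disk.boot_sector_is_clean())
```

The monitor is only as good as its position in the write path: it sees the calls that go through `write` and nothing else.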
To inhibit file infectors, an integrity check, e.g., a checksum, is calculated and maintained for all executables on the system, so that any subsequent modification may be detected. A checksum is typically an integral value associated with a file that is some function of the contents of the file. In the most common and simple case the checksum of a file is the sum of the integer values obtained by considering each byte of data in the file as an integer value. Other more complicated schemes of determining a checksum are possible, e.g., the sum of the bytes in the file added to the size of the file in bytes. Whatever the scheme used, a change in the file will almost always cause a corresponding change in the checksum value for that file, thereby giving an indication that the file has been modified. If a file is found with a changed checksum, it is assumed to be infected. It can be removed from the computer system and a clean copy can be restored from backup.
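The two checksum schemes mentioned above are simple enough to write out. The following is a minimal sketch with hypothetical function names, operating on in-memory byte strings rather than real files. It also shows why a change "almost always", rather than always, alters the checksum: reordering bytes leaves a plain byte sum unchanged, and appending zero bytes is caught only by the variant that includes the file size.

```python
def byte_sum_checksum(data: bytes) -> int:
    """Simplest scheme: the sum of every byte taken as an integer."""
    return sum(data)


def sum_plus_size_checksum(data: bytes) -> int:
    """Variant: the byte sum added to the size of the data in bytes."""
    return sum(data) + len(data)


def is_modified(data: bytes, recorded: int, checksum=byte_sum_checksum) -> bool:
    """Compare the current checksum with the one recorded earlier."""
    return checksum(data) != recorded


original = b"MOV AX, 1"
recorded_sum = byte_sum_checksum(original)
recorded_sum_size = sum_plus_size_checksum(original)

infected = original + b"\x90"  # one non-zero byte appended
reordered = b"MOV AX ,1"       # same bytes, two of them swapped
padded = original + b"\x00"    # zero byte appended

print(is_modified(infected, recorded_sum))    # detected
print(is_modified(reordered, recorded_sum))   # missed by the byte sum
print(is_modified(padded, recorded_sum))      # missed by the byte sum
print(is_modified(padded, recorded_sum_size, sum_plus_size_checksum))  # detected
```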
Many viruses use the low-level primitive functions of the OS, e.g., disk reads and writes, to access the hardware. As mentioned above, these viruses can often be caught by anti-viral software that monitors all use of the OS's primitives. To further complicate matters, however, some viruses issue machine instructions directly to the hardware, thus avoiding the use of OS primitive functions. Viruses which issue instructions directly to the hardware can bypass software defenses because their activities never pass through the monitored OS primitives. Further, new self-encrypting (stealth) viruses may be extremely difficult to detect, and thus may be overlooked by signature recognition programs.
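The difficulty that self-encrypting code poses for signature recognition can be seen in a small sketch. Everything here is illustrative: the signature bytes are made up, `scan` and `xor_encode` are hypothetical names, and single-byte XOR is used only as a stand-in for whatever encoding a real virus might apply.

```python
def scan(data: bytes, signatures: dict) -> list:
    """Return the names of all known signatures found in the data."""
    return [name for name, pattern in signatures.items() if pattern in data]


def xor_encode(data: bytes, key: int) -> bytes:
    """Stand-in for self-encryption: XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


# Made-up signature database; the byte pattern is not from any real virus.
signatures = {"demo-virus": b"\xde\xad\xbe\xef"}

plain_body = b"header" + b"\xde\xad\xbe\xef" + b"tail"
encoded_body = xor_encode(plain_body, 0x5A)

print(scan(plain_body, signatures))    # the known pattern is found
print(scan(encoded_body, signatures))  # the same code, encoded, is not
```

Applying `xor_encode` with the same key again recovers the original bytes, so the encoded body still carries the full viral code even though the scanner no longer recognizes it.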
One approach to the boot integrity problem is to place the entire operating system in read-only memory (ROM) 26 of the computer 10. This approach prevents modifications to boot information, but at the cost of updatability. Any upgrades to the OS require physical access to the hardware and replacement of the ROM chips. It is also the case that as operating systems become more and more sophisticated, they become larger and larger. Their placement in ROM would require larger and larger ROMs. If user authentication is added to the boot program, passwords may be difficult to change and will operate on a per machine rather than a per user basis.
Some operating systems have so-called login programs which require users to enter a password in order to use the system. These login programs, whether stand-alone or integrated with an antiviral program, suffer from the same timing issue mentioned previously: they run only after the boot sequence has already executed. Also, since most PCs provide a means of booting from alternate devices, e.g., a floppy disk drive, login programs can often be trivially defeated.