A typical computer, whether it is an off-the-shelf or a customized computer for personal, business, specialty, or other uses, has many components. Some of the common components are processors, memories, storage devices, input and output devices, and network interfaces. The operating system, such as Microsoft Windows®, Mac OS®, UNIX, Linux, and the like, is responsible for controlling the components, their functions, and their interactions. In particular, the operating system handles file requests from applications or from the operating system itself. When a file request is received, the operating system attempts to supply the file from a local storage device or from a file server if the computer is connected to a network.
Because there are many different options available for storage devices and network interfaces, the operating system is typically programmed with the characteristics necessary to access almost every storage device and network interface that could be connected to the computer. For instance, most operating systems are programmed to accommodate different storage devices having various storage device types (e.g. magnetic, optical, etc.), interface types (e.g. IDE, ATA, SCSI, SATA, PATA, SAS, etc.), and physical parameters (e.g. number of cylinders, sectors, heads, tracks, etc.). If the computer is connected to a network, the operating system also has to be programmed with the various network interface types (e.g. Ethernet, Token Ring, ATM, etc.), the network protocols (e.g. TCP/IP, IPX/SPX, AppleTalk, SNA, etc.), and any particular methods needed to communicate with network resources (e.g. servers, printers, scanners, storage, etc.).
In addition, the operating system also has to be able to manage files on storage devices or on a file server over a network. On local storage devices, the operating system typically uses lookup tables or indices, usually referred to as file allocation tables, to manage the files. On a network, the operating system has to be programmed to communicate with a file server and retrieve files. It is often advantageous for the operating system to translate the file information received from a file server into a format resembling the file allocation tables to simplify the file retrieval process.
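The lookup-table idea described above can be illustrated with a minimal sketch. It assumes a deliberately simplified, hypothetical table layout (a directory mapping each file name to a first cluster and a size, plus a table mapping each cluster to the next one) and does not reproduce any real on-disk format. All names, including `read_file` and `END_OF_CHAIN`, are illustrative.

```python
# Simplified, hypothetical model of a file allocation table.
# It is NOT the layout of any real file system.
END_OF_CHAIN = -1  # assumed marker for the last cluster of a file


def read_file(directory, fat, clusters, name):
    """Follow the cluster chain for `name` and return the file's bytes."""
    first_cluster, size = directory[name]
    data = bytearray()
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            # A loop in the chain means the table itself is corrupted.
            raise ValueError("corrupt table: cluster chain loops")
        seen.add(cluster)
        data += clusters[cluster]
        cluster = fat[cluster]  # look up the next cluster in the table
    return bytes(data[:size])   # trim padding in the final cluster


# Toy volume with four 4-byte clusters; the file is stored out of order.
clusters = [b"HELL", b"RLD!", b"O WO", b"????"]
fat = [2, END_OF_CHAIN, 1, END_OF_CHAIN]   # cluster 0 -> 2 -> 1 -> end
directory = {"GREETING.TXT": (0, 12)}      # name -> (first cluster, size)

print(read_file(directory, fat, clusters, "GREETING.TXT"))
```

Because the table alone determines which clusters belong to a file, damage to the table makes otherwise intact data unreadable, which is one reason the recovery processes described later can be lengthy.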
The way in which an operating system manages components and their functions adds complexity to a computer. The complexity is easily seen during the setup process of a computer. Typically, a computer goes through a setup process that involves (a) booting from a bootable device that can initiate installation of the selected operating system, (b) partitioning and formatting local storage devices, (c) installing the full operating system, (d) configuring hardware components such as the display card and network interface, and (e) installing and configuring applications and other software, until the computer is prepared for everyday general use.
The complexity is also evident while the computer is in use. Applications and the operating system may require periodic patches or updates, the installation of which frequently involves uninstalling the older versions of the applications or components of the operating system. Additionally, files may be corrupted due to program errors, user errors, or computer viruses. When this happens, the corrupted files need to be replaced or repaired, a process that may involve reinstalling the applications that use the corrupted files or even possibly reinstalling the operating system itself.
The complexity involved in using a computer usually results in high maintenance and support costs. In a business environment, the support costs can easily reach thousands of dollars per user or per computing device. Additionally, the cost of maintaining computers increases because work productivity tends to decrease significantly, often to zero, when computer-related problems arise.
Problems related to hardware malfunction, except problems related to storage devices, may often be resolved within a short amount of time. For instance, a broken component, such as a video card or a network interface card, may be quickly replaced with an identical component. However, computer repair may become a lengthy process if the problems are related to storage devices or the files stored on the storage devices. When a hard drive in a computer malfunctions or corrupted files cause problems, the repair and recovery process might involve reinitializing the hard drive, reinstalling the operating system, and/or reinstalling applications.
Numerous methods are presently available to reduce the complexity of computers, shorten the recovery process when problems occur, or minimize the need for a recovery process altogether. Some of the common methods are cloning the storage device, booting the computer over a network, utilizing specialized computer management software, and/or applying file level security.
By cloning the storage device, the installation process may be shortened. A computer is first completely set up with a full set of applications. Then the storage device is cloned or duplicated as an “image” file. The image file may then be used to reset the computer to its original condition or to set up identically equipped computers. Many consumer-oriented computers come with recovery CDs containing the factory default image that can be used to restore the storage device to its factory default condition. The drawback of this method is that a new image of the storage device has to be created whenever there is a change in the operating system, applications, or any other files stored on the storage device. Complications may arise using this method in instances when it is necessary to apply patches or updates to the installed software after the storage device is restored from an old image. Customization of the final configuration for a user, for instance, might take several hours even if the image is kept up to date.
The network boot method is often used in conjunction with simple computers that download necessary files from a file server on a network. The computer usually uses a well-known network service, such as BOOTP, TFTP, or PXE, to download and execute a small basic portion of an operating system, which in turn can start downloading the rest of the operating system and any applications. The drawback of this method is that if the computer does not have a local storage device, it has to go through the same boot process of downloading needed files whenever it is powered on or reset. If the computer has a local storage device, this process can benefit by storing downloaded files locally. But then the operating system downloaded over the network is, once again, responsible for the complex tasks of managing hardware components and files stored on the local storage device.
The computer management software method is used to enhance the operating system by adding additional software components as agents, daemons, or services. One typical way of using this method is to use anti-virus software that constantly scans stored files for any computer virus or other malware infection. This method may also be implemented by adding a software component that constantly monitors important files on the local storage device and attempts to self-heal any damaged or corrupted files. An additional implementation adds a software component that handles file updates pushed out from a server as a part of a computer management tool. The drawback of this method is that the software components acting as agents, daemons, or services are highly dependent on the operating system. For instance, a software component written for Microsoft Windows would not work under Linux. Thus, the operating system has to provide necessary functions, such as managing local storage devices or network interfaces, for these software components to work properly.
Many operating systems can also apply file level or directory level security to provide a certain level of protection against computer viruses, malware, unauthorized access, user errors, or application errors that can corrupt important files. The drawback of this method is that it is again operating system dependent, and a super user, an administrator, or a process running with full access privileges can accidentally modify, delete, or corrupt important files in the local storage. Moreover, many malicious programs, such as viruses that have gained full access privileges, can inflict serious damage.
The above methods, by themselves or in combination with other methods, provide some help in reducing the complexities involved with computers. However, none of the methods fundamentally changes how the operating system manages the components of a computer. Thus, a new approach is needed for operating computers and simplifying the manner in which files are handled or distributed over a network.
In addition, one of the growing concerns in today's computing environment has to do with data security. Computing devices are often stolen, lost, or thrown out while the storage device in the computing devices may contain sensitive data. Anyone who has gained physical access to the computing device or the storage device can easily access the data stored in the storage device in many ways. For instance, one can boot the computer with an alternate operating system and gain access to the data. One can crack the administrator or user accounts and passwords by using a password recovery tool. Or one can simply remove a storage device from a company's computing device and hand over the storage device with user account and password to a competing company in industrial espionage. If the data are not backed up to or synchronized with a file server, then everything is lost to an unauthorized party.
There are many methods to deter unauthorized access to files and data stored in a storage device. The methods generally fall into two categories: (a) operating system based access controls and (b) external hardware dependent access controls.
In the case of operating system based access controls, the operating system might use an encrypted file system (EFS) to encrypt the data stored in the storage device. But such a method is operating system dependent. If the operating system is compromised (e.g. the encryption key for EFS is exposed), or if the user account and password are cracked or known to an unauthorized party, such operating system based access controls would not be able to safeguard the data.
In the case of external hardware dependent access controls, an external device, such as a SmartCard or a USB key, might be used to control access to the data. For instance, a SmartCard can be used during the boot process before an operating system is fully loaded. If a valid SmartCard is found, then the boot process can continue loading the operating system that would grant access to data, most likely in EFS. This would be more secure than operating system based access controls alone, but it would be dependent on the external hardware. If the hardware is damaged without a backup, then even an authorized user cannot gain access to the data. If a recovery procedure were to exist, then an unauthorized party could use the recovery procedure to gain access to data. Also, the external hardware (e.g., a copy of a SmartCard or a USB key) along with the storage device might be handed over to an unauthorized party in industrial espionage. To prevent this kind of unauthorized access, the operating system may also use a “connectedness policy” in addition to the external hardware. For instance, after a SmartCard is verified during the boot process, the operating system may check with a network server for connectedness before allowing access to the data. But this method is also operating system dependent. That is, if a particular operating system that a user or an application prefers does not provide this kind of access control, then there is no way to benefit from such access control. Therefore, a new approach to safeguarding data in a storage device is needed, preferably one that is independent of the operating system running on the computing device.
Furthermore, a certain degree of asset management (both hardware and software asset management) and support management of end user computing devices is highly desirable in a corporate computing environment. Hardware and software asset management is often accomplished with the use of an “asset management agent” that gets installed on end users' computing devices. Depending on the implementation, the agent can query and report hardware components to a management station. The agent is able to detect any changes in hardware configuration or unauthorized activities such as the computing device case being opened. The agent can also query files stored in local storage devices and report to a management station what kind of software is installed on end users' computing devices. Based on information reported by the agent, hardware or software asset reports can be generated on the management station.
Such asset management agents are again highly operating system dependent. That is, many agents would need to be developed and written for all kinds and versions of operating systems (e.g. Windows XP, Windows Vista, Linux, UNIX, MacOS, etc.). Unless the agent is developed with some level of sophistication (e.g., not only checking the software file names but also checking certain traits of the software, such as an MD5 checksum, or the like), the agent can be fooled into reporting software assets incorrectly. For instance, if an end user wants to use an application that is not officially allowed or supported by the company, the user may rename the application executable to a common one, such as NOTEPAD.EXE, so that the application does not get reported to the management station. In the same way, a malicious program can disguise itself as an authorized application.
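The difference between name-based and checksum-based identification can be sketched as follows. This is an illustrative example under stated assumptions, not the behavior of any particular asset management product: the byte strings stand in for real executable contents, and the helper names (`identify_by_name`, `identify_by_checksum`) and lookup tables are hypothetical. MD5 is used only because it is the trait named above.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


def identify_by_name(filename: str, known_names: dict) -> str:
    """Naive agent: trusts the file name alone."""
    return known_names.get(filename.upper(), "unknown")


def identify_by_checksum(content: bytes, known_checksums: dict) -> str:
    """More careful agent: identifies software by its contents."""
    return known_checksums.get(md5_of(content), "unknown")


# Placeholder contents standing in for real executables (assumption).
notepad_bytes = b"pretend contents of the approved editor"
unapproved_bytes = b"pretend contents of an unapproved program"

known_names = {"NOTEPAD.EXE": "Notepad"}
known_checksums = {md5_of(notepad_bytes): "Notepad"}

# The user renames the unapproved program to NOTEPAD.EXE.
renamed_file = ("NOTEPAD.EXE", unapproved_bytes)

print(identify_by_name(renamed_file[0], known_names))          # reported as Notepad
print(identify_by_checksum(renamed_file[1], known_checksums))  # reported as unknown
```

The checksum-based agent is harder to fool by renaming, but it must carry a checksum for every version of every application it is expected to recognize.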
The user or a malicious program may also fool the agent into reporting that a certain application is running on the system or exists in the user's local storage device. In today's highly security-oriented computing environment, certain firewalls or gateway/proxy devices try to detect whether the end user's computing device has certain applications running (e.g., whether anti-virus software is running). Such firewalls or gateway/proxy devices would use an agent to detect the presence of required applications. Unless the agent can detect all flavors and versions of required applications, the user can change the name of an application and bypass the agent's query. For instance, the user can make a duplicate copy of NOTEPAD.EXE as NTRTSCAN.EXE, then run and minimize the hoax NTRTSCAN.EXE to circumvent the agent's attempt to check whether Trend Micro's OfficeScan application is running.
Such asset management agents are also not able to assess hardware components if the computing device is turned off. The agent has to be running on the computing device for the agent to be able to collect asset information and report to the management station. This means that the computing device has to be fully functional. The operating system has to be running correctly with many dependent services/daemons (such as network interface drivers and network protocols). Otherwise, the agent would not be able to provide any hardware or software asset information to a management station.
There are numerous methods currently available for remotely controlling computing devices (e.g. power on or power off). Wake-On-LAN, for instance, is one such method that an administrator can use to turn on a computing device remotely so that the asset management agent can provide asset information to the management station. But being able to remotely power on a computing device would not do any good if the computing device is unbootable (e.g. the OS is not loaded correctly or there is a faulty hardware component). In this case, the agent would not be loaded and would not be able to provide asset information to the management station. Therefore, a different approach, generally called “out-of-band management”, has been developed for managing and troubleshooting computing devices that do not operate correctly.
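For illustration, a Wake-On-LAN request is conventionally a "magic packet": six bytes of 0xFF followed by sixteen repetitions of the target's six-byte MAC address. The sketch below builds such a packet and, optionally, broadcasts it over UDP. The function names, the example MAC address, the broadcast address, and the choice of port 9 are assumptions made for this example.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-On-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hexadecimal digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the MAC address repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet (address and port are assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Building the packet has no side effects; sending is left to the caller.
packet = build_magic_packet("00:11:22:33:44:55")  # hypothetical MAC address
print(len(packet))
```

The packet only asks the network interface to power the device on. It carries no information about whether the operating system subsequently boots, which is the limitation noted above.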
Out-of-band management is widely used in today's computing environment. In its simplest form, it provides a remote view of the computing device's console for troubleshooting. Many implementations of out-of-band management provide additional functions and information, such as power control and health checks of system components (e.g. whether a local storage device is bad, whether there is a memory error, whether the computing device's internal temperature is too high for it to operate properly, etc.). This kind of management is usually accomplished by using an add-on management board or a management component built into the computing device's system board. Many servers used in the corporate environment have such management options that enable remote administration of servers. Since the add-on management board or the built-in management component can have constant power (either via a dedicated power supply or from the always-on system board), it can provide hardware-related information to the management station even if the computing device is powered off, independent of the operating system or an asset management agent.
Such add-on management boards or built-in management components have at least one major limitation—they cannot query a local storage device and generate a software asset list. This limitation is mainly due to the fundamental architecture of today's computing devices where no more than one operating system can control a local storage device. An operating system that is responsible for managing a local storage device builds a file system unique to the operating system. For instance, Microsoft Windows may use NTFS, and this file system cannot be recognized natively by Linux or MacOS. In addition, Microsoft Windows does not natively recognize file systems created by the Linux or UNIX operating systems. Therefore, if an add-on management board or a built-in management component were to be able to scan a local storage device, the add-on management board or the built-in management component would have to understand the file system in use. That is, the firmware running on the add-on management board or the built-in management component should be compatible with the operating system running on the computing device. This would be a serious limitation since there are many varieties of operating systems a computing device would use. Even if the firmware running on the add-on management board or the built-in management component was made fully compatible with the operating system running on the computing device, sharing the file system (possibly with directory and file level security as well) between the operating system and the firmware would likely introduce additional complications in that the file system could easily be corrupted, requiring significant attention to repair.
Thus, for an administrator to have a complete hardware and software asset management capability, a combination of a hardware-based approach (i.e. add-on management board or built-in management component) and a software-based approach (i.e. asset management agent) is ideal. Because a hardware-based approach would add complexity to the system and because a software-based approach is highly operating system dependent, a new simplified approach is needed.
Fundamentally, the root cause of the problems in today's computing environment is the operating system, which has inherent vulnerabilities arising from its kernel mode, which controls every component of the computing device, and from the user's ability (and necessity) to have elevated access rights (e.g. administrative or root rights). For instance, most, if not all, computing devices are booted with an operating system that controls peripheral devices (e.g. a hard disk drive), and the operating system has an elevated access mode that the operating system itself or a user needs in order to configure the operating system and the peripheral devices. Malicious users or programs are constantly exploiting these elevated access modes to gain unauthorized access or to make the computing device non-functional. Operating systems and applications have become more and more complicated to prevent such exploitation, but fundamentally no effective cure is currently known.
One of the recent developments in an attempt to minimize malicious exploitation and to increase manageability is the virtualization of the user's computing environment. Virtualization is accomplished by booting a computing device with a host operating system (usually a “hardened” operating system) to provide a virtual machine that has virtual storage devices, virtual network connections, etc. The virtual machine is then booted with a guest operating system for the user. For instance, an Apple® Mac is booted with Mac OS®, Virtual PC® is started to create a virtual machine, and then the virtual machine is booted with Windows XP® so that the user can use Windows® applications. Or a server is booted with Windows Server 2003®, VMWare® server is started to create multiple virtual machines, and then the virtual machines are booted with various operating systems. Snapshots of a virtual machine can be taken to save the states of the virtual machine so that the virtual machine can be restored quickly if the virtual machine encounters a problem. If a server is configured to support multiple virtual machines for users, it would fall into centralized server-based computing or thin-client computing as the users can access the virtual machines remotely while transferring keyboard, mouse, and video signals over the network.
Server-based computing or thin-client computing has numerous advantages, such as limited direct user access to the server hardware and software components and the ability to use low-spec devices as user stations. But server-based computing or thin-client computing has several disadvantages as well: the server needs to be powerful enough to support all users and, more restrictively, all user stations need to be connected to the server at all times to enable the users to use the applications available on the server.
Moreover, virtual machines running guest operating systems still have inherent vulnerabilities that can be exploited by malicious users or programs. For instance, the virtual machines would still need anti-virus software and a way to control file permissions, just like regular computing devices. And no matter how much the host operating system is hardened (protected), the host operating system still has elevated access modes that can be exploited. For instance, a malicious user or program that gained administrative rights on the host operating system can corrupt or configure the software components that provide virtual storage devices to virtual machines and render all virtual machines inaccessible. Therefore, it is highly desirable to reduce exposure of the files and data used by the operating system and applications to malicious users or programs.