1. Field of the Invention
This invention relates generally to remote server management in networked computer systems and, more particularly, to improving the capability of remote server management by providing JTAG functionality in a remote server management controller.
2. Background of the Related Art
This section is intended to introduce the reader to various aspects of art which may be related to various aspects of the present invention which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
1. Introduction
Since the introduction of the first personal computer (“PC”) over 20 years ago, technological advances to make PCs more useful have continued at an amazing rate. Microprocessors that control PCs have become faster and faster, with operational speeds eclipsing a gigahertz (one billion cycles per second) and continuing well beyond.
Productivity has also increased tremendously because of the explosion in the development of software applications. In the early days of the PC, people who could write their own programs were practically the only ones who could make productive use of their computers. Today, there are thousands and thousands of software applications ranging from games to word processors and from voice recognition to web browsers.
As integrated circuits and other computer components such as motherboards have become more and more complex, effective and powerful methods to test them have become necessary. Accurate, thorough testing of integrated circuits and circuit boards in the early stages of their manufacture is very important for controlling manufacturing costs and ensuring product quality. If a defective integrated circuit or board is caught early in the manufacturing process, it may be pulled from further expensive processing for repair or scrap without wasting additional time and expense to produce a finished product that would just turn out to be defective in the end.
2. The Development of JTAG
In the mid-1980s, the Joint Test Action Group of the Institute of Electrical and Electronics Engineers (“IEEE”) promulgated an industry standard known as IEEE 1149.1, which was entitled “Test Access Port and Boundary-Scan Architecture.” That specification came to be known by the acronym “JTAG.” The JTAG standard sets out a methodology for performing testing on complex integrated circuits and circuit boards. JTAG provides a strategy to assure the integrity of individual components and the interconnections between them after installation on a printed circuit board (“PCB”). Since it was first promulgated, the JTAG standard has become widely adopted.
As integrated circuits contain more and more functionality, the packages that contain them (“chips”) continue to get bigger and bigger. Modern integrated circuits have dozens or even hundreds of electrical inputs and outputs called “pins.” To thoroughly test these chips, power may be applied and the input pins may be connected to equipment that can provide input signals (“test vectors”). Typically, a device known as a “test bed” or “interposer” is constructed to physically connect the power, input and output signals of the chip to the correct parts of a tester. If the device is functioning properly, the output pins, which are also connected via the test bed or interposer to equipment that can read their response to the test vectors, will respond in a predictable way. If the output pins do not respond correctly to the test vectors, the device fails and is normally removed from further processing until the root cause of the failure can be determined.
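The test-vector methodology described above can be illustrated with a minimal sketch. The device under test is modeled here as a simple two-input AND gate; the device model, vector set, and function names are illustrative assumptions, not part of any actual tester:

```python
# Minimal sketch of test-vector checking: apply input stimuli to a modeled
# device under test (DUT) and compare observed outputs against expectations.
# The AND-gate DUT and all names here are hypothetical, for illustration only.

def and_gate(a, b):
    """Model of the device under test: a 2-input AND gate."""
    return a & b

# Each test vector pairs input pin values with the expected output response.
test_vectors = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]

def run_tests(dut, vectors):
    """Apply each vector to the DUT; collect any mismatched responses."""
    failures = []
    for inputs, expected in vectors:
        observed = dut(*inputs)
        if observed != expected:
            failures.append((inputs, expected, observed))
    return failures

failures = run_tests(and_gate, test_vectors)
print("PASS" if not failures else f"FAIL: {failures}")  # → PASS
```

A real tester applies the same principle at scale: a device that produces any unexpected response is flagged and pulled from further processing.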
The more pins an integrated circuit has, the more closely (densely) packed together those pins have to be in the chip package. Increasing pin density is an ongoing problem for engineers and technicians who design and build integrated circuits. High pin density means that the pins are closer together, which makes it harder to design test beds and interposers that connect to the pins of a chip without shorting them to other pins nearby.
The JTAG testing standard attempts to alleviate the physical access problems caused by high pin density by eliminating the need to connect individual chips being tested to an external connector, such as a test bed or interposer. Instead, each JTAG-compatible integrated circuit has test functionality built directly into its internal workings. After a circuit board is assembled, the test equipment connects to a single connector on the circuit board rather than to each of the individual chips on the board.
The JTAG architecture may be envisioned as a large scan chain, in which the JTAG-compatible chips on a circuit board are all connected in series (like a chain) that is accessible through the test connector on the circuit board. Information from each device in the chain is sequentially made available at the test connector. Thus, information from each device may be shifted from device to device around the scan chain. As the tester works its way around the chain, the output of each device in the chain may be examined and evaluated. To verify the operation of the components in the chain, components may be instructed (via test vectors) to load signals, sample signals, bypass components, or program or alter any device registers that are accessible through the JTAG chain.
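The serial shifting described above can be sketched in a few lines. This is an illustrative model only, not the IEEE 1149.1 protocol itself: each device's boundary-scan register is reduced to a single bit, and the pin names TDI (test data in) and TDO (test data out) follow the standard's terminology, while everything else here is an assumption made for clarity:

```python
# Illustrative model of serial shifting through a boundary-scan chain:
# on each clock, every cell passes its bit toward the test connector,
# so each device's captured value eventually emerges at TDO.

class ScanCell:
    """One device's boundary-scan register, reduced to a single bit."""
    def __init__(self, captured_bit):
        self.bit = captured_bit

def shift_chain(cells, tdi_bits):
    """Shift bits in at TDI; return the bits that emerge at TDO."""
    tdo_bits = []
    for tdi in tdi_bits:
        tdo_bits.append(cells[-1].bit)          # last cell drives TDO
        for i in range(len(cells) - 1, 0, -1):  # ripple each bit toward TDO
            cells[i].bit = cells[i - 1].bit
        cells[0].bit = tdi                      # new bit enters at TDI
    return tdo_bits

# Three devices captured the bits 1, 0, 1; shifting in three filler zeros
# reads the captured values out at the test connector, nearest device first.
chain = [ScanCell(1), ScanCell(0), ScanCell(1)]
print(shift_chain(chain, [0, 0, 0]))  # → [1, 0, 1]
```

The same mechanism works in reverse: bits shifted in at TDI can load new values into device registers, which is how the tester delivers instructions and test vectors around the chain.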
In addition to facilitating the testing of components, JTAG device access may also be used to debug and configure integrated circuits after installation on a circuit board. JTAG devices may be designed to have their own set of user-defined instructions, such as CONFIGURE and VERIFY, which are accessible through a JTAG interface. User-defined characteristics may be used to control a wide variety of operational parameters of the device and to perform failure analysis after a device failure occurs.
Modern microprocessors and chipsets support JTAG and are good examples of devices that allow access and control of many internal functions through an on-board JTAG interface. In fact, Intel Corporation, a major manufacturer of microprocessors and chipsets, has adopted the JTAG architecture as part of its In-Target Probe (“ITP”) and In-Circuit Emulation (“ICE”) test methodologies. ITP and ICE employ standard JTAG signals, plus a few others. The ITP and ICE methodologies are widely used to troubleshoot and configure microprocessors, chipsets and other chips.
In addition to its support of device testing, debugging and configuring, JTAG may be used to program and reprogram a wide range of memory devices such as flash memory, PROMs, CPLDs and FPGAs. Programming and reprogramming of these devices may be done via the JTAG interface after the devices are installed on a circuit board such as a system motherboard.
3. The Growth of Computer Networking and Remote Server Management
In addition to improvements in PC hardware, software and component testing, the technology for making computers more useful by allowing users to connect PCs together and share resources between them has also seen rapid growth in recent years. This technology is generally referred to as “networking.” In a networked computing environment, PCs belonging to many users are connected together so that they may communicate with each other. In this way, users can share access to each other's files and other resources, such as printers. Networked computing also allows users to share internet connections, resulting in significant cost savings. Networked computing has revolutionized the way in which business is conducted across the world.
Not surprisingly, the evolution of networked computing has presented technologists with some challenging obstacles along the way. One obstacle is connecting computers that use different operating systems (“OSes”) and making them communicate efficiently with each other. Each different OS (or even variations of the same OS from the same company) has its own idiosyncrasies of operation and configuration. The interconnection of computers running different OSes presents significant ongoing issues that make day-to-day management of a computer network challenging.
Another significant challenge presented by the evolution of computer networking is the sheer scope of modern computer networks. At one end of the spectrum, a small business or home network may include a few client computers connected to a common server, which may provide a shared printer and/or a shared internet connection. At the other end of the spectrum, a global company's network environment may require interconnection of hundreds or even thousands of computers across large buildings, a campus environment, or even between groups of computers in different cities and countries. Such a configuration would typically include a large number of servers, each connected to numerous client computers.
Further, the arrangements of servers and clients in a larger network environment could be connected in any of a large number of topologies that may include local area networks (“LANs”), wide area networks (“WANs”) and metropolitan area networks (“MANs”). In these larger networks, a problem with any one server computer (for example, a failed hard drive, failed network interface card or OS lock-up, to name just a few) has the potential to interrupt the work of a large number of workers who depend on network resources to get their jobs done efficiently. Needless to say, companies devote a lot of time and effort to keeping their networks operating trouble-free to maximize productivity.
An important aspect of efficiently managing a large computer network is to maximize the amount of analysis and repair that can be performed remotely (for example, from a centralized administration site). Tools that facilitate remotely analyzing and servicing server problems help to control network management costs by reducing the number of network management personnel required to maintain a network in good working order. Remote server management also makes network management more efficient by reducing the delay and expense of analyzing and repairing network problems. Using remote management tools, a member of the network management team may identify problems and, in some cases, solve those problems without the delay and expense that accompanies an on-site service call to a distant location.
Remote management tools can communicate with a managed server using either (1) in-band communication or (2) out-of-band communication. In-band communication refers to communicating with the server over a standard network connection such as the managed server's normal Ethernet connection. In-band communication with the server is, accordingly, only possible when the server is able to communicate over its normal network connection. Practically speaking, this limitation restricts in-band communication to times when the OS of the managed server is operational (online).
Out-of-band communication, which is not performed across the managed server's normal connection to the network, is a much more powerful tool for server management. In out-of-band communication, a “back door” communication channel is established by a remote server management tool (such as a remote console or terminal emulator) using some other interface with the server (such as (1) through the server's modem, (2) via a direct connection to a serial port, (3) through an infrared communication port, or (4) through a management Ethernet interface or the like).
In a sense, out-of-band communication is like opening an unobtrusive window through which the inner workings of the operation of the managed server may be observed. After the out-of-band communication link with the server is established, the remote server management tool communicates with the server to obtain data that will be useful to analyze a problem or potential problem. After a problem has been analyzed, out-of-band communication may be used to control the managed server to overcome the problem or potential problem.
In addition to the distinction between in-band and out-of-band communication with a managed server, another important distinction is whether the managed server is online or offline. The term “online” refers to a managed server in which the OS is up and running. The managed server is said to be “offline” if its OS is not up and running. For the purpose of explaining the present technique, communications with a managed server will take place in one of these four states: (1) in-band online; (2) in-band offline; (3) out-of-band online; and (4) out-of-band offline.
An important goal in the development of remote server management tools is to increase the number of server problems that may be analyzed and repaired remotely (that is, without requiring direct, on-site intervention by a member of the network management team). To facilitate that goal, it is highly desirable to have a network management tool that is able to capture the maximum amount of information from a managed server in the maximum range of operational states of the server (for example, not powered up, fully operational or powered but locked up) and to allow control of the managed server based on that data.
It is also highly desirable to have a remote management tool that is integrated within the managed server and capable of exercising the maximum amount of control over constituent devices within the managed server. The microprocessor and the chipset of the managed server are examples of devices that would be useful to access and control remotely.
Early remote management tools were able to analyze and address a relatively narrow range of managed server problems. One of the first remote server management tools had the ability to reset a managed server remotely by cycling power to turn the server off and on again via an out-of-band communication session over a phone line. In this way, a managed server could be reset whether in an online or offline condition. This tool, however, did not have the ability to gather data about the operation of the managed server or to analyze the cause of the managed server's failure. Accordingly, the principal utility of these early server management tools was to reset the managed server after catastrophic failure. These management tools were not useful for diagnosing subtle problems or preventing future failures.
Later server management tools employed proprietary software agents similar to device drivers to monitor a wide range of conditions in the managed server directly (for example, alerts and management parameters specified by the Simple Network Management Protocol (“SNMP”)). The proprietary software agents in these management tools were designed to pass their data to the OS of the managed server, where it could be retrieved by remote access such as a remote management console application.
The large amount of data accessible by these management tools made them useful for diagnosing the cause of a wide range of server failures and permitting repair of those failures. A shortcoming of these server management tools, however, is that they rely primarily on communication between the managed server's OS and proprietary software agents that monitor conditions in the managed server. This limitation means that the tool is only operational when the managed server is online. Server management tools of this type are, accordingly, of little use in correcting problems in a managed server that is offline.
A still later generation of server management tools relied on a dedicated add-in card comprising an independent processor, memory, and battery backup. The add-in card essentially provided a dedicated management computer for monitoring and controlling the managed server. The dedicated management computer was hosted in the managed server and could communicate with the managed server (host) through an existing communication interface (for example, the PCI bus of the managed server). Such remote management tools could additionally include software agent-based data gathering capability of the type used in earlier agent-based systems previously discussed. In this way, these remote management solutions combine the advantages of deep information gathering capability (software agent-based information gathering technology available when the OS of the managed server is online) with the ability to control the operation of the managed server independently via an out-of-band communication session using the dedicated server management computer system hosted in the managed server.
The add-in card type of remote management tool could also include the capability to capture video data and reset sequences from the managed server for remote display or replay at a later time. The capture of video data is facilitated by the close integration of a remote management tool with the managed server and the ability of the remote management tool to communicate with the managed server over existing communication links (such as an industry standard PCI bus). The ability of a remote management tool to capture video data from a managed server is a particularly powerful analysis tool because it lets a remote user have “virtual access” to the managed server, just as if the user was physically present and inspecting the managed server in person.
In a typical remote management system employing a dedicated server management computer on an add-in card, a user (typically, a member of the network management team) could initiate an out-of-band session with the dedicated server management computer hosted in the managed server via a remote console application program being executed on a client computer. The dedicated management computer could be addressed by the user to control various aspects of the operation of the managed server via control circuitry connected to the embedded server management computer hosted by the managed server.
The present invention is directed to further improvements of remote server management technology.