The present application relates to circuit board run-in, and specifically to run-in of server and/or workstation boards with high-insertion-force connectors.
Manufacturing Computers
In recent decades, rapid progress has been made in developing new “smart” manufacturing techniques. However, manufacturing of computers themselves, in production quantities, poses some additional difficulties.
Changing User Expectations
As the computer industry matures, users have come to increasingly demand turnkey convenience. In the mid-1970s personal computers could be sold as kits, for users to assemble and debug; and the few users who bought personal computers were interested and sophisticated enough to cope with (or at least tolerate) the high demands of such a system. However, as increasingly useful software has entered the computer market, and as the pool of users has steadily increased, this “hobbyist” attitude has become increasingly uncommon. In the 1990s, even a user who is able to cope with the demands of system configuration of a new system will normally not want to do so. (In this respect, computer buyers are becoming more and more analogous to car buyers: even those car buyers who are competent to wire up the ignition system of a new car would not want to do so.)
Background: Board Testing
In high-end personal computers, the boards which are assembled to the system board may be very complex (and may indeed contain a large fraction of the value of the system). The commonest types of such boards include: RAID controllers; remote server management boards; high-end graphics boards (particularly those with accelerated 3D rendering capabilities, and most especially those used for professional animation generation and/or video editing). Other types may include: numeric accelerator boards; hardware emulation units; some real-time control boards; other custom boards with one or more coprocessors; and RAM-based disk emulators.
If such an add-in board fails, the complete system becomes useless for its intended purpose. Moreover, when an add-in board contains a large fraction of the total chips in the system (or a large fraction of the total gates in the system), the likelihood of failure of the add-in board becomes a substantial contributor to the total likelihood of system failure. Thus when such high-end boards are used, system reliability depends critically on the board reliability. To meet high system reliability standards, system manufacturers must therefore assure that the reliability of high-end boards is adequate.
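The reasoning above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every chip fails independently with the same probability over some period and that the system fails if any chip fails; the function names and the chip counts are hypothetical and are not taken from the present application.

```python
def failure_probability(num_chips: int, p_chip: float) -> float:
    """Probability that at least one of num_chips independent chips fails."""
    return 1.0 - (1.0 - p_chip) ** num_chips


def board_share_of_system_failures(board_chips: int,
                                   other_chips: int,
                                   p_chip: float) -> float:
    """Fraction of system failures in which the add-in board has failed.

    Because a board failure always implies a system failure in this
    simplified model, the ratio P(board fails) / P(system fails) is the
    conditional probability that the board failed given a system failure.
    """
    p_board = failure_probability(board_chips, p_chip)
    p_system = failure_probability(board_chips + other_chips, p_chip)
    return p_board / p_system


if __name__ == "__main__":
    # Hypothetical numbers: the add-in board carries 40 of 100 chips.
    share = board_share_of_system_failures(40, 60, p_chip=0.001)
    print(f"Board involved in about {share:.0%} of system failures")
```

Under these assumed numbers the board is involved in roughly four out of ten system failures, which is why the board's own reliability weighs so heavily on the reliability of the system.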
One of the hottest areas in personal computer development is file servers. Personal-computer-based file servers are being adapted to high-reliability and high-bandwidth applications. In many installations the failure of a single file server can stall dozens or hundreds of computer users, so the cost of failure is very high. Servers in such applications often include very complex add-in boards, such as RAID controllers with extensive buffering capacity. Thus in such applications the tolerance for failure is very low, while an add-in board may be a significant part of total system complexity.
As more high-end personal computers are bought by customers who would formerly have bought engineering workstations, this constraint becomes tighter: such customers are both more demanding of reliability and more likely to order high-end add-in boards. These applications too may combine a low tolerance for failure with the use of very complex add-in boards.
Background: “Infant Mortality” in Reliability Statistics
An odd feature of reliability is that many electronic components will fail relatively quickly. This is because a system may have a latent weakness which can pass the standard set of tests, but which becomes worse in service. A common example is an integrated circuit in which a power supply trace is necked down at one point; even though the wiring is not broken, and will pass normal electrical tests, the necked down location will be more susceptible to early failure due to electromigration. Many other failure mechanisms produce similar profiles. Thus it is common to see that a curve of unit failure rate will rapidly DECREASE during the first part of a unit’s life, and then stay relatively constant for a very long time, until units begin to reach normal wearout failure. See generally, e.g., Eugene Hnatek, “Digital Integrated Circuit Testing from a Quality Perspective” (1993); Parag Lala, “Digital Circuit Testing and Testability” (1997); Alexander Miczo, “Digital Logic Testing and Simulation” (1986); van de Goor, Testing Semiconductor Memories; Roy Longbottom, “Computer System Reliability” (1980); National Semiconductor Reliability Handbook (1979; 3rd ed. 1987); Wayne Nelson, Accelerated Testing: Statistical Models, Test Plans, and Data Analyses (1990); Dimitri Kececioglu, Reliability and Life Testing Handbook (1993); Ashok K. Sharma, Semiconductor Memories (1998); John H. Lau et al., Solder Joint Reliability of BGA, CSP, Flip Chip, and Fine Pitch SMT Assemblies (1996); Forrest W. Breyfogle, Statistical Methods for Testing, Development, and Manufacturing (1992); C. E. Mandel, Environmental Stress Screening: A Tutorial (1985); Heinz P. Bloch et al., An Introduction to Machinery Reliability Assessment; Snehesh Kumar Sinha, Life Testing and Reliability Estimation; all of which are hereby incorporated by reference.
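The shape of the failure-rate curve described above can be sketched numerically. The example below is only an illustration: it assumes a Weibull hazard model (a common textbook choice, not one named in the present application), and every parameter value is hypothetical. A Weibull shape parameter below 1 gives a falling failure rate (early failures), and a shape above 1 gives a rising one (wearout).

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard (instantaneous failure rate) at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t: float) -> float:
    """Sum of three assumed failure mechanisms, in failures per hour.

    All numeric values are illustrative assumptions.
    """
    infant = weibull_hazard(t, shape=0.5, scale=1_000.0)     # falls with t
    random_failures = 1e-5                                   # constant floor
    wearout = weibull_hazard(t, shape=5.0, scale=100_000.0)  # rises with t
    return infant + random_failures + wearout


if __name__ == "__main__":
    for hours in (10, 100, 1_000, 10_000, 100_000, 200_000):
        print(f"{hours:>7} h  hazard = {bathtub_hazard(hours):.2e} per hour")
```

In this sketch the hazard falls steeply over the first hours of operation, which is the interval a run-in procedure is meant to cover, and rises again only at very long operating times.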
Because of the “infant mortality” phenomenon, it is common to perform “burn-in” or “run-in” on electronic components. During such procedures, components are “exercised” under conditions which will tend to make latent early failures reveal themselves. (Every “infant mortality” failure which occurs during run-in means one less angry customer.) The components may be continually cycled through tests during run-in, or alternatively the components may simply be subjected to voltage and/or temperature stress under controlled environmental conditions and then retested separately.
Of course, one of the problems associated with burn-in of components is the test equipment necessary to perform this process. For example, if a test server can run diagnostics on only one or two boards at a time, the time and expense of burn-in are much greater than with a test server which can burn in a large number of boards at a time. As the boards themselves become more sophisticated, with specialized connectors and multiple bus connections necessary, supplying a test platform which can handle a large number of these boards becomes more difficult, but more necessary.
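The effect of tester capacity on total run-in time can be shown with simple arithmetic. The sketch below assumes that every batch occupies the test server for a fixed number of hours regardless of how many boards it holds; the lot size, batch sizes, and duration are hypothetical values chosen only for illustration.

```python
import math


def batches_needed(total_boards: int, boards_per_batch: int) -> int:
    """Number of run-in batches needed to process every board."""
    return math.ceil(total_boards / boards_per_batch)


def total_run_in_hours(total_boards: int,
                       boards_per_batch: int,
                       hours_per_batch: float) -> float:
    """Test-server hours consumed when batches run one after another."""
    return batches_needed(total_boards, boards_per_batch) * hours_per_batch


if __name__ == "__main__":
    # Hypothetical lot of 100 boards with an 8-hour run-in per batch.
    for capacity in (2, 16):
        hours = total_run_in_hours(100, capacity, hours_per_batch=8.0)
        print(f"{capacity:>2} boards per batch: {hours:.0f} server-hours")
```

With these assumed figures, a tester holding two boards needs 50 batches (400 server-hours), while a tester holding sixteen boards needs 7 batches (56 server-hours).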
Hot-Swappable Multiboard Run-In Tester
The present application discloses systems, for running in boards-under-test, in which any one (or some, or all) of the following features are present:
the presence or absence of power in a bus socket is monitored, and replicated by paralleled power switching circuitry, to power up the boards-under-test only conditionally, and thereby control the application of power to multiple boards-under-test without overloading the power pins in the bus socket;
a complete operable personal computer is itself used as a testbed, and multiple bus sockets are each bridged to separate subsets of boards-under-test;
a movable extractor mechanism is positioned on a subboard which can receive multiple boards-under-test, so that by a step-and-repeat operation all boards-under-test can be rapidly extracted; and
a special power-up timing relationship is implemented, so that the self-initialization operations (power-on-self-test etc.) of the board-under-test can begin before the circuitry on the testbed’s bridge adaptor has finished its self-initialization operations.
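The first feature listed above, replicating the power state of the bus socket through paralleled switches, can be modeled in a few lines. This is a behavioral sketch only, not the disclosed circuitry: the class names, the `occupied` flag, and the rule that only occupied slots are powered are assumptions introduced here to illustrate one way of powering boards "only conditionally".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoardSwitch:
    """One hypothetical power switch feeding one board-under-test slot."""
    slot: int
    occupied: bool        # assumed condition: a board is seated in the slot
    closed: bool = False  # True when the switch passes power to the board


class PowerReplicator:
    """Mirrors the sensed socket power state onto paralleled switches.

    The bus socket's power pins are only sensed; in this model the
    board current is assumed to come from a separate supply through
    the switches, so the socket pins are not overloaded.
    """

    def __init__(self, switches: List[BoardSwitch]) -> None:
        self.switches = switches

    def update(self, socket_power_present: bool) -> List[int]:
        """Apply the sensed socket state; return the slots now powered."""
        for switch in self.switches:
            switch.closed = socket_power_present and switch.occupied
        return [s.slot for s in self.switches if s.closed]


if __name__ == "__main__":
    replicator = PowerReplicator([
        BoardSwitch(slot=0, occupied=True),
        BoardSwitch(slot=1, occupied=False),
        BoardSwitch(slot=2, occupied=True),
    ])
    print(replicator.update(socket_power_present=True))
    print(replicator.update(socket_power_present=False))
```

When the sensed socket power drops, every switch opens together, so all boards-under-test follow the power state of the single socket that is being monitored.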
A particular advantage of various disclosed embodiments is that computer boards can be run in very efficiently in a manufacturing floor environment.
Another advantage of various disclosed embodiments is that the ergonomics of board insertion and removal are greatly improved. This is highly advantageous in a computer manufacturing environment, where a technician may have to do hundreds of board insertion/removal operations in a single shift.
Another advantage of various disclosed embodiments is that the damage to boards with high-insertion-force connectors is essentially eliminated, since (with the disclosed extractor) connector separation during board removal is always linear.
Another advantage of various disclosed embodiments is that the “reach-up” geometry, positioning the sockets for the boards-under-test atop the computer testbed, permits other sockets in the computer to be used for other purposes if desired. In one example, other slots of the testbed computer can be used for PCI NIC boards.