1. Technical Field of the Invention
This invention most generally relates to data transfer and broadband communication networks within a parallel computing system or a local area network. In particular, the present invention relates to wafer scale integration of optoelectronics.
2. Background of the Invention
Technological advancements have dramatically increased the capabilities and possibilities of computing electronics. The increased bandwidth and data transfer rates have resulted in commercial innovation and scientific advancements in many fields. However, data transfer continues to be a bottleneck. This is true for data transfer within an integrated circuit (IC), from one chip to another, from hybrid circuit to hybrid circuit, from integrated circuit board to another integrated circuit board, and from system to system.
Another driving factor leading to ever increasing demands for faster data transfer rates is the need to do tasks that are more complex, requiring multiple computing nodes to cooperate. Digital signal processing, image analysis, and communications technology all require a greater bandwidth. The demand for increased data transfer capability and greater bandwidth translates into increases in both the speed of the data transfer, and the amount of data that is transferred per unit time.
In general, the problems associated with data transfer within an IC and on a system network are similar. Increasing the data transfer rate can be done in any of several ways. Some increase in the data transfer rate can be obtained by increasing the speed at which signals are communicated from one part of a system or network to another. Presently, the fastest known transfer means is the use of optical signals that operate at the speed of light.
Another means to reduce system delays is to increase the bandwidth being used. In this approach, more information is sent at one time. Since the vast majority of systems and networks now are digital, the measure of the increase in bandwidth is in terms of the number of bits on a bus.
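The two levers discussed above, signalling speed and bus width, can be illustrated with a minimal Python sketch that treats aggregate throughput as their product. The function name and the example figures are assumptions chosen for illustration and are not values taken from this disclosure.

```python
def aggregate_throughput_bps(bus_width_bits: int, transfers_per_second: float) -> float:
    """Bits moved per second when every bus line carries one bit per transfer."""
    if bus_width_bits <= 0 or transfers_per_second <= 0:
        raise ValueError("bus width and transfer rate must be positive")
    return bus_width_bits * transfers_per_second


# Hypothetical figures: a 64-bit bus at 100 million transfers per second.
baseline = aggregate_throughput_bps(64, 100e6)
# Doubling either the width or the rate doubles the throughput.
wider = aggregate_throughput_bps(128, 100e6)
faster = aggregate_throughput_bps(64, 200e6)
```

Under this simple model the two approaches are interchangeable in their effect on throughput; the practical limits on each are what differ.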
There are limitations to the available bandwidth, such as spacing and size requirements, noise problems, reliability of connectors, processing times, buffer size, and the power required to drive multiple lines off-chip. Increasing the transmission speed also has some limitations, as increasing the speed also increases power requirements, introduces timing skew problems across a channel, and usually requires more exotic processing than is standard practice. Combining higher transmission speeds and more bandwidth is exceedingly difficult and impractical.
Whether transferring data within a circuit or connecting system to system, the limited bandwidth of conventional hardware does not satisfy the marketplace. For high data rate transmissions, only fiber optics transmit data at Gigabit data rates. Fiber optic communication systems allow information to be transmitted by means of binary digital transmission. The data or information that is to be transmitted is converted into a stream of light pulses, wherein the presence of a pulse corresponds to the transmission of a binary “one,” and the absence of light corresponds to the transmission of a binary “zero.” An optical receiver is used to convert the stream of light pulses into an electrical signal that is processed to determine the transmitted information.
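The pulse-based binary scheme described above can be sketched in Python as follows. This is a minimal illustration only: the function names, the most-significant-bit-first ordering, and the 0.5 detection threshold are assumptions and are not specified in this disclosure.

```python
def encode_pulses(data: bytes) -> list[int]:
    """Map each bit (most significant first) to 1 = light pulse, 0 = no light."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def decode_pulses(levels: list[float], threshold: float = 0.5) -> bytes:
    """Threshold received light levels back into bits and pack them into bytes."""
    if len(levels) % 8 != 0:
        raise ValueError("expected a whole number of bytes")
    out = bytearray()
    for start in range(0, len(levels), 8):
        byte = 0
        for level in levels[start:start + 8]:
            # A level above the threshold is read as a pulse, i.e. a binary one.
            byte = (byte << 1) | (1 if level > threshold else 0)
        out.append(byte)
    return bytes(out)


# Round trip: the receiver recovers the transmitted bytes from the pulse stream.
recovered = decode_pulses(encode_pulses(b"hi"))
```

The thresholding step stands in for the receiver electronics that convert the detected light into an electrical signal and decide which bit was sent.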
Typically the optical transmitters are light emitting devices such as vertical cavity surface emitting lasers (VCSELS), and the optical receivers are light detecting devices such as photodiodes. The optical transmitters and receivers may be encompassed in a separate chip or fabricated on the same substrate with the accompanying electronics. The fabrication process is well known in the art; U.S. Pat. No. 5,978,401 provides background material and is incorporated by reference.
The transmitters have driver circuitry that drives the VCSELS, while the receivers have receiver circuitry for processing the received signals. The transmitter driver circuitry and the receiver circuitry are usually in the form of ASIC devices. The combination of the VCSELS and photodiodes along with the ASIC driver circuitry is called an optical transceiver. One embodiment for hybridization of the transceiver elements is via flip-chip bonding, which is generally explained in U.S. Pat. No. 5,858,814, incorporated by reference herein.
Optical fibers are used to transmit the optical data off the transceiver device and the fibers mate with the transceiver for data transfer. A spacing problem exists when there are large arrays of transceivers and corresponding optic cables mating to each emitter and detector. The coupling and alignment of these multiple fiber optic cables is exceedingly difficult and there is a high defect rate in large bundles.
In particular, data transfer in and out of a processor is a major concern. If the memory resides off the chip and is connected by traditional electronic means, data access is particularly slow. Even if it is on the chip, the current capabilities of reticles limit the amount of memory that is possible to put on the chip.
In recent years, there has been increased interest in systems on a chip. The logical extension of this idea is the system on a wafer, so called wafer scale integration. There are advantages to integrating an entire system on a wafer. First, the entire mask set can be designed for a particular function, much like current microprocessors, but at a higher level. Second, the entire wafer experiences the same set of process conditions. Many circuits exhibit slight process dependencies such as shifts in threshold voltages in MOSFETs and it is advantageous for all of the MOSFETs in a system to exhibit the same sensitivities. A further benefit of wafer scale integration is that all of the elements of a circuit can be processed at the same time.
However, with existing wafer fabrication technology there are some severe constraints posed by the need to have circuit elements such as memory, and some supporting circuitry, physically close to the processor section of a chip. The reason for this requirement is that as the distance between circuit elements increases, so do the signal propagation delays. The added propagation time increases the delay associated with transferring data to and from memory, as does the need to accommodate current interconnection schemes. To account for the additional signal delay it becomes necessary to slow the data rate into and out of memory.
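The relationship between distance, delay, and usable data rate described above can be sketched as follows. The per-millimetre delay figure and the rule that a full round trip must fit within one access period are simplifying assumptions for illustration; actual on-chip interconnect delay also depends on wire resistance, capacitance, and buffering.

```python
def round_trip_delay_s(distance_mm: float, delay_per_mm_s: float) -> float:
    """Time for a request to reach memory and the reply to return."""
    if distance_mm <= 0 or delay_per_mm_s <= 0:
        raise ValueError("distance and per-mm delay must be positive")
    return 2.0 * distance_mm * delay_per_mm_s


def max_access_rate_hz(distance_mm: float, delay_per_mm_s: float) -> float:
    """Highest access rate at which one round trip still fits in one period."""
    return 1.0 / round_trip_delay_s(distance_mm, delay_per_mm_s)


# Hypothetical delay of 100 ps per mm of interconnect.
near = max_access_rate_hz(1.0, 100e-12)   # memory 1 mm from the processor
far = max_access_rate_hz(10.0, 100e-12)   # memory 10 mm from the processor
```

In this model, moving the memory ten times farther away cuts the sustainable access rate by the same factor, which is the slowdown the paragraph above refers to.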
A prior art example of the spatial relationship between the processor section and the memory is shown in FIG. 1. The central processing unit (CPU) 10 is located in some small portion of the chip 30, while the memory cells 20 are located as near as possible in order to minimize distance and therein minimize propagation delays. It is apparent that only a certain quantity of memory devices may be located in close proximity to the CPU. Additional memory devices may be located at a greater distance on the chip or be located off chip. In either scenario, the increased distance translates into propagation delays.
An additional problem involved in wafer integration deals with the internal connections. The imaging process involved in forming integrated circuits must be done in a ‘step and repeat’ manner because of limitations in imaging extremely fine structures across a large area. In sum, there are challenges associated with reliably making connections from one portion of a wafer to another using conventional lithographic techniques if the distances are too great.
The factors that limit the transferal of data to and from processors on a wafer become even more acute as compared to system level impediments. Though transfer rates within a chip are quite high, the inter-chip data transfer rates are appreciably slower than the intra-chip data transfer rates. This problem is due, in part, to the limited area on the perimeter of integrated circuits, which traditionally contains the Input/Output (I/O) buffers needed to drive signals off-chip. Consequently, there is often a severe limitation on the number of bits available for any external bus. Thus, an external data bus can be a significant bottleneck to improving system performance. At present, the largest bus sizes are only 64 bits wide.
Some attempts have been made to address the aforementioned problems. Considerable work has been done to develop optical interconnect technology for mating transceivers with silicon die, but the current systems are still very limited in bandwidth.
In summary, conventional methods for communicating data between a CPU and memory cells on chips are slow and bandwidth limited. Furthermore, constraints of reticles limit the amount of memory that is possible to place next to a CPU.
In addition, traditional methods for assembling systems consist of producing individual chips that comprise a system, then packaging them, and shipping them to an assembly site where the chips are mounted on a motherboard. This process is inefficient because it requires excessive handling that increases costs and time. Furthermore, in so doing, the communication bandwidth of such systems is reduced significantly.
Finally, imaging in semiconductor manufacturing is limited by the need to step and repeat the exposure of a lithographic image. Alignment errors, though small, accumulate with increasing wafer size. The result is an inability to reliably make connections from one portion of a wafer to another if the sections being connected are located remotely from each other.
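The accumulation of alignment error across repeated exposures can be illustrated with a minimal Python sketch. Both models are assumptions for illustration: the worst case adds the per-field error linearly, while the statistical case treats the per-field errors as independent and combines them in root-sum-square fashion. The per-field error value is hypothetical.

```python
import math


def worst_case_error_nm(fields_crossed: int, per_field_error_nm: float) -> float:
    """Upper bound: every exposure field contributes its full error in the same direction."""
    return fields_crossed * per_field_error_nm


def rss_error_nm(fields_crossed: int, per_field_sigma_nm: float) -> float:
    """Statistical estimate: independent per-field errors add in quadrature."""
    return math.sqrt(fields_crossed) * per_field_sigma_nm


# Hypothetical 10 nm alignment error per field, connection spanning 4 fields.
worst = worst_case_error_nm(4, 10.0)
typical = rss_error_nm(4, 10.0)
```

Under either model the expected misalignment grows with the number of fields a connection must cross, which is why remote sections of a wafer are the hardest to connect reliably.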
Ideally, what is needed is a way of allowing elements of a circuit to be placed at almost arbitrary distances from each other without suffering the adverse propagation delay effects, and without the limitations of the reticles used in current lithography. What is needed is a way to eliminate the need to place memory next to the CPU without suffering latency effects. What is needed is a way of connecting system components together using a system-on-a-chip concept and that does not adversely affect the system bandwidth. What is needed is a way of enabling inter-chip data transfer rates that significantly exceed current capabilities. What is needed is a means for reducing the latency so that it is not a significant factor in limiting data transfer.
The present invention is an apparatus that allows bundles of optical fibers to be connected from one portion of a wafer to another. The result is a dramatic increase in the inter-system bus bandwidth, an increase in overall speed, and an ability to design systems on a wafer with the system elements positioned remote from each other without loss of speed or bandwidth. The invention further comprises a method for assembling optoelectronic interconnects in a system on a wafer.
One embodiment of the present invention encompasses flip-chip mounting of transceiver arrays to a CMOS substrate to enable communication of data from a CPU to memory cells located elsewhere on the chip. The invention exploits optoelectronic techniques to ensure high-speed data transfer with a significant increase in bandwidth over the currently practiced art. The invention further includes a method for fabricating the structure.
One example of the present invention is an apparatus that uses a silicon semiconductor wafer in which sub assemblies are constructed, wherein some of the sub assemblies contain separate areas containing CMOS circuitry to facilitate communication protocol. Transceiver arrays are aligned above the CMOS circuitry, and secured in place using hybrid chip technology known in the art. The transceiver arrays on a III-V substrate are connected electrically using ultra-high-density flip-chip integration, and are mechanically affixed using epoxy. More epoxy is then used to form standoffs positioned above the transceiver arrays, and fiber bundles are aligned with the standoffs, and then connected from one area of the wafer to the other.
Optionally, a face-plate or micro-lens array is included between the epoxy standoffs and the fiber bundle. This process is repeated among all desired sub systems. The fiber bundles then provide a communication pathway between the transceiver arrays from sub-system to sub-system. As a result, communications between disparate portions of a chip are permitted. Since communication between the transceiver arrays is done optically, the delays normally associated with electrical communication between disparate areas on a chip are eliminated. Furthermore, the optical transceiver arrays enable significantly broader bandwidth communication than was heretofore possible.
One object of the present invention is to provide an apparatus, resulting from these elements, that enables optical communication among subsystems in a system on a wafer. Another object of this invention is that the method and apparatus described scale to wafers of arbitrary size, which may encompass varied accompanying subsystems.
One other object of this invention is a process for assembling optical interconnections in a system on a wafer that comprises the steps of forming CMOS circuitry in the silicon substrate to facilitate communication protocol, positioning transceiver arrays above the CMOS circuitry and securing them in place using ball grid solder for electrical contacts and epoxy to ensure mechanical connections, then mounting epoxy standoffs on the transceiver arrays, then aligning one end of a fiber bundle with the transceiver arrays, and securing it in place using epoxy, and securing the other end of the fiber bundle with more epoxy in another epoxy standoff located in another area of CMOS circuitry elsewhere on the wafer.
Another object of the invention is the use of digital signal processing (DSP) chips, or other arithmetic co-processor devices. These chips are specialized data processors that enable very high processing speeds using a limited number of operations. However, to take full advantage of the capabilities of these chips, they must be placed close to the CPU. Using this invention, they can be placed at a considerable distance from the CPU, yet still function at peak performance without the ill effects of excessive latency. This process of connecting the functionality of different portions of a wafer-sized chip is a process of seamless integration.
And another object of this invention is the interconnection of optical transceivers on different portions of a wafer. Another object is the dramatic increase in the chip-to-chip bandwidth within a system on a wafer module.
Yet another object of this invention is the functionality of using optoelectronic interconnects to communicate between portions of a system on a wafer. Another object of the invention is that the optical interconnection means contains optical transceivers for communicating between portions of a system on a wafer.
An additional object of this invention is to provide the huge increase in speed/data transfer rates and bandwidth afforded by having data flow optically between key areas in a system on a wafer. Another object of this invention is that, at least for modest sized arrays, the power requirements not be excessive, especially given the huge increase in speed and bandwidth.
A further object is that the limitations imposed by reticle size are overcome, and that it be possible to connect distant regions or portions of a wafer via optical interconnects. Another object of this invention is the seamless integration of integrated circuits, such as digital signal processing chips, memory caches, etc. into complete systems on a wafer.
And a further object of the invention is to provide the flexibility to accommodate various network architectures by bundling the optic fibers and routing them to various nodes. Yet another object of the invention is to provide fiber bundle connectors for connecting to a backup chip of the wafer for redundancy. Such backup chips would not necessarily be in use, but could be activated if necessary. Such a design ensures flexibility even when using rigid optical connectors in wafer sized systems, and ensures redundancy of the functionality of the overall system.
An object of the invention is the interconnection of a CPU with memory cells using flip-chip optoelectronic interconnects, where the CPU contains CMOS circuitry to interface with the transceiver arrays, the memory cells also contain CMOS circuitry to interface with the transceiver arrays, and fiber optic cables or bundles connect the transceiver arrays at the CPU to those on the memory cells.
Another object of this invention is the process of sending data between a CPU and a memory device located on the same chip but at some distance from the CPU section itself, where the process consists of sending data in a CPU to data ports made of CMOS circuitry, which then couple the data to the transceivers, which in turn send the data over fiber optic bundles to transceivers located on data ports made of CMOS circuitry in the memory devices, and this circuitry finally sends the data to the memory location desired.
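The data path described above (CPU, data port, transceiver, fiber bundle, transceiver, data port, memory) can be modelled with a toy Python sketch. All class and function names are hypothetical, each fiber is assumed to carry one bit of the word, and the address is passed alongside the data for simplicity rather than being sent over the optical link.

```python
from dataclasses import dataclass, field


@dataclass
class DataPort:
    """Stand-in for a CMOS data port together with its transceiver array."""
    width: int

    def transmit(self, word: int) -> list[int]:
        # One emitter per bit: 1 = light pulse, 0 = no light (bit i on fiber i).
        if not 0 <= word < (1 << self.width):
            raise ValueError("word does not fit the port width")
        return [(word >> i) & 1 for i in range(self.width)]

    def receive(self, pulses: list[int]) -> int:
        # One detector per fiber: reassemble the word from the detected bits.
        return sum(bit << i for i, bit in enumerate(pulses))


@dataclass
class FiberBundle:
    """Parallel optical channels linking two transceiver arrays."""
    width: int

    def carry(self, pulses: list[int]) -> list[int]:
        if len(pulses) != self.width:
            raise ValueError("word width must match the number of fibers")
        return list(pulses)


@dataclass
class Memory:
    port: DataPort
    cells: dict[int, int] = field(default_factory=dict)

    def store(self, address: int, pulses: list[int]) -> None:
        self.cells[address] = self.port.receive(pulses)


def cpu_write(cpu_port: DataPort, bundle: FiberBundle, memory: Memory,
              address: int, word: int) -> None:
    """CPU data port -> fiber bundle -> memory data port -> memory location."""
    memory.store(address, bundle.carry(cpu_port.transmit(word)))


memory = Memory(port=DataPort(16))
cpu_write(DataPort(16), FiberBundle(16), memory, address=3, word=0xBEEF)
```

The sketch only traces the order of the hand-offs; it does not model timing, protocol, or the drive electronics.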
And still another object of this invention is to provide the function of communicating between a CPU and memory devices on the same chip using optical interconnect devices, i.e., transceivers and fiber optic bundles to couple data to and from a CPU and memory devices on the same chip. A further object of this invention is the process of communicating between a CPU and memory devices on the same chip using transceivers and fiber optic bundles, where the fiber optic bundles interface with the bulk of the chip (memory or CPU) via local data processing circuitry.
Yet another object of this invention is the reduction of supplemental logic circuitry to manage data flow. Associated with this advantage are fewer layers of memory (and associated circuitry) and the memory management schemes that are required to pass data from on-chip memory caches to disk memory. Another object is the ability to access memory a page at a time by transferring a group of bits read from memory over parallel optical channels simultaneously.
An additional object of the invention is the functionality of swapping out the entire cache memory in one clock cycle, using parallel optical channels to transfer the data simultaneously. Another object of this invention is the elimination of problems associated with the pitch of transceivers by designing data management circuitry and optically connecting the subsystems. The pitch of the currently available transceivers imposes an increase in chip size, which in turn means that the number of chips per wafer is reduced.
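The page-at-a-time and single-cycle cache transfers described above depend on how many optical channels operate in parallel. A minimal Python sketch of that relationship follows, assuming each channel carries one bit per clock cycle; the function name and the example sizes are hypothetical.

```python
def transfer_cycles(total_bits: int, parallel_channels: int) -> int:
    """Clock cycles needed to move total_bits at one bit per channel per cycle."""
    if total_bits < 0 or parallel_channels <= 0:
        raise ValueError("bits must be non-negative and channels positive")
    # Ceiling division: a partially filled final cycle still counts as a cycle.
    return -(-total_bits // parallel_channels)


# Hypothetical 4096-bit page.
over_64_channels = transfer_cycles(4096, 64)      # conventional-width bus
over_4096_channels = transfer_cycles(4096, 4096)  # one channel per bit
```

In this model a transfer completes in a single clock cycle only when the number of parallel channels is at least the number of bits to be moved.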
Yet an additional object of this invention is a structure for communicating data between a processing unit of a chip and memory via flip-chip optical interconnects. The structure consists of a silicon substrate with processing circuitry, one or more flip-chip mounted transceiver arrays, CMOS circuitry that drives the transceivers, and electrical interconnects that connect the CMOS circuitry and the processing circuitry. There are also one or more memory devices to which are flip-chip mounted other transceiver arrays that are controlled by separate CMOS circuitry. Finally, an optical bundle is connected between the transceiver arrays on the processor and the transceiver(s) on the memory structure(s).
Another object of this invention is the function of operating a processor using data accessed from memory using optoelectronic means. An object of this invention is that the structure and process described scale to chips of arbitrary size.
Another object of this invention is the process for assembling the structure, where the process consists of processing the CPU and memory portion of the circuit and separately the transceiver arrays, then flip-chip mounting the transceiver arrays onto the CPU, and then connecting the optical fibers between the CPU and the memory devices.
Additional objects, advantages and novel features of the invention will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
Still other objects and advantages of the present invention will become readily apparent to those skilled in this art from the drawings and from a detailed description, wherein we have shown and described only a preferred embodiment of the invention, simply by way of illustration of the best mode contemplated by us for carrying out our invention. As will be realized, the invention is capable of other and different embodiments, and its several details are capable of modifications in various obvious aspects, all without departing from the invention. Accordingly, the drawings and description will be regarded as illustrative in nature and not as restrictive.