1. Field of the Invention
This invention relates to semiconductor devices that have reduced active region defects and to semiconductor devices that have unique contacting schemes.
2. Discussion of the Related Art
Optical communication systems use near infrared (IR) radiation at wavelengths ranging from about 800 nm to 1600 nm. In particular, important communication bands are around 850 nm for short-range fiber optic communication links and around 1310 nm and 1550 nm for longer-range fiber optic communication links.
Group III-V compound semiconductor photo-detectors (PDs) are currently the photodetectors of choice for optical communications receivers because GaAs-based and InP-based materials are good near IR absorbers. These detectors have absorption lengths (Labs) of about 1 μm or less over the wavelength band of 800 nm to 1600 nm.
Notwithstanding some of the desirable characteristics of Group III-V detectors, it would be advantageous to fabricate PDs in Si-based systems for two reasons: cost and functionality. Whereas Group III-V-based processing is low yield and expensive, Si-based processing is ubiquitous and low cost. Due to its high device yield, Si is the material of choice to realize complex electronic functionality. Low cost opto-electronic subsystems are possible in Si.
Unfortunately, Si is a poor absorber in the IR range of practical interest (e.g., 1100-1600 nm). Si IR detectors for communications can be used only near 850 nm, but even there the absorption length (Labs) of Si is relatively large, greater than 20 μm. Absorption length impacts two important PD properties: quantum yield and frequency response. Quantum yield (QY) is the fraction of incident optical power absorbed by the detector. As light passes through a material of thickness T with a given Labs, the fraction of light transmitted is exp(−T/Labs), so the fraction absorbed is 1−exp(−T/Labs). In order to achieve high QY it is desirable that the thickness of the PD absorption region be greater than or equal to Labs at the wavelength of operation of the particular system.
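The dependence of QY on layer thickness can be illustrated with a short calculation. The following is a minimal sketch, assuming single-pass absorption with no surface reflection, so that the absorbed fraction is 1−exp(−T/Labs) and exp(−T/Labs) is the fraction transmitted. The function names and the round-number absorption lengths (1 μm and 20 μm, taken from the figures quoted above) are illustrative assumptions.

```python
import math

def absorbed_fraction(thickness_um: float, l_abs_um: float) -> float:
    """Single-pass absorbed fraction, 1 - exp(-T/Labs); reflections ignored."""
    return 1.0 - math.exp(-thickness_um / l_abs_um)

def thickness_for_qy(target_qy: float, l_abs_um: float) -> float:
    """Thickness T (um) at which the absorbed fraction reaches target_qy."""
    return -l_abs_um * math.log(1.0 - target_qy)

# Absorption lengths quoted in the text, treated here as round numbers:
# about 1 um for Group III-V absorbers, 20 um (a lower bound) for Si near 850 nm.
L_ABS_UM = {"Group III-V": 1.0, "Si near 850 nm": 20.0}

for name, l_abs in L_ABS_UM.items():
    qy_at_labs = absorbed_fraction(l_abs, l_abs)   # T equal to Labs
    t_half = thickness_for_qy(0.5, l_abs)          # T needed for QY = 50%
    print(f"{name}: QY at T=Labs is {qy_at_labs:.2f}, "
          f"T for 50% QY is {t_half:.1f} um")
```

Under these assumptions a layer one absorption length thick absorbs about 63% of the light, and the thickness needed for a given QY scales linearly with Labs, which is why a long Labs forces a thick absorbing region.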
Frequency response is quantified by the 3 dB frequency (f3). QY and f3 determine the maximum data rate at which the PD can accurately detect. A QY of at least 50% is desirable, and f3 must be larger than half the data rate. Important data rates for commercial IR communication channels are 2.5 GHz, 10 GHz and 40 GHz. Therefore, a minimum of f3=2.5 GHz is required for these relatively high-speed systems. On the other hand, lower speed detectors are useful in some less demanding applications such as IR cameras and wireless IR systems.
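The two acceptance criteria stated above (QY of at least 50% and f3 larger than half the data rate) can be written as a simple check. This is a sketch only; the function names are hypothetical, and the half-rate rule is a lower bound, so a design target such as the 2.5 GHz figure quoted above may be more conservative.

```python
def min_f3_ghz(data_rate: float) -> float:
    """Lower bound on f3 from the half-the-data-rate rule stated in the text."""
    return data_rate / 2.0

def meets_targets(qy: float, f3_ghz: float, data_rate: float,
                  min_qy: float = 0.5) -> bool:
    """True if the detector satisfies both the QY and the f3 criteria."""
    return qy >= min_qy and f3_ghz > min_f3_ghz(data_rate)

# Data rates named in the text, in the units used there.
for rate in (2.5, 10.0, 40.0):
    print(f"data rate {rate}: f3 must exceed {min_f3_ghz(rate)} GHz")
```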
One prior art method employed to address the poor IR properties of Si is to monolithically integrate it with materials that have higher IR absorption. The material of choice for such integration is Si1-xGex, an alloy of Si and Ge having a Ge concentration (molar fraction) of x in Si. Significantly, Si1-xGex processing is compatible with Si processing.
FIG. 1a compares the absorption length of pure Ge (Si1-xGex with x=1) with that of Si and InGaAsP, a standard Group III-V compound semiconductor used in PDs. Labs in Ge is below 2 μm for the entire wavelength range of interest (e.g., 1300-1600 nm). When the Ge concentration of Si1-xGex is such that 0<x<1, the absorption is intermediate between that of Si and Ge. To reach the longer wavelengths from 1310 nm to 1550 nm, nearly pure Ge with x nearly equal to 1 is ideal (e.g., x˜0.8-0.9). However, the thickness of a high quality (low defect density) single crystal Si1-xGex layer that can be grown on a single crystal Si substrate or on a Si epitaxial layer is limited by the 4% lattice constant mismatch between Si and Ge. (See FIG. 1b where the curve represents critical layer thickness as a function of Ge concentration.) A Si1-xGex layer having a larger concentration of Ge has a smaller critical thickness. Layers grown above the critical thickness tend to contain misfit dislocations under equilibrium growth conditions; those grown below do not have misfit dislocations. These defects are a source of extrinsic leakage current (dark current) that adds to the noise of the detector, thereby limiting its overall sensitivity.
Dark current is the current that flows in the detector in the absence of a light signal. In the presence of defects it is proportional to the defect density. Defects also form recombination centers that diminish QY. In the absence of defects, the intrinsic dark current is proportional to exp[−EG(x)/kT], where EG(x) is the bandgap of the absorbing layer, x is the mole fraction of Ge in Si1-xGex, k is Boltzmann's constant, and T is the lattice temperature. EG(x) is a monotonically decreasing function of x, and so larger values of x result in larger intrinsic dark currents. For some applications at shorter wavelengths near 850 nm, a Si1-xGex semiconductor having x<1 may be desirable since Labs is short enough and the intrinsic dark current would be lower. Applications at longer wavelengths require a value of x nearly equal to 1 (nearly pure Ge; e.g., x˜0.8-0.9). However, for any of the IR communication wavelengths of interest, the critical thickness of any Si1-xGex semiconductor with enough Ge to be a good near IR absorber is much smaller than the absorption length in these materials. As a result, near IR Si1-xGex PDs with sufficient performance cannot be made using prior art techniques to directly grow Si1-xGex on Si.
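The strength of the exp[−EG(x)/kT] dependence can be seen by comparing the two end points of the alloy, x=0 (Si) and x=1 (Ge). The sketch below assumes commonly quoted room-temperature bandgaps of about 1.12 eV for Si and 0.66 eV for Ge; these values and the 300 K temperature are not taken from this document, and the prefactor of the dark current is ignored, so only the ratio of the exponential factors is meaningful.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann's constant in eV/K

def intrinsic_dark_current_factor(e_gap_ev: float, temp_k: float = 300.0) -> float:
    """Exponential factor exp(-EG/kT); the prefactor is deliberately omitted."""
    return math.exp(-e_gap_ev / (K_B_EV_PER_K * temp_k))

# Assumed textbook bandgaps near 300 K (not from the source document).
E_GAP_SI_EV = 1.12  # x = 0
E_GAP_GE_EV = 0.66  # x = 1

ratio = (intrinsic_dark_current_factor(E_GAP_GE_EV)
         / intrinsic_dark_current_factor(E_GAP_SI_EV))
print(f"exp(-EG/kT) for Ge relative to Si at 300 K: {ratio:.1e}")
```

Under these assumptions the exponential factor for pure Ge exceeds that for Si by more than seven orders of magnitude, which illustrates why a lower Ge fraction is attractive wherever Labs permits it.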
Several approaches have been proposed in the prior art in attempts to circumvent the critical layer thickness problem, but they all use complicated growth schemes. For example, Ge PDs formed on Si have been reported in the literature using two different approaches known as (1) the graded buffer (GB) method, and (2) the Si/Ge heterojunction (SGH) method.
Graded buffer (GB) method: As shown in FIG. 2a, the GB method involves growing and annealing a graded, multi-layered buffer region of Si1-xGex on a single crystal Si substrate. [See, for example, M. T. Currie et al, Appl. Phys. Lett., Vol. 72, No. 14, p. 1718 (1998), which is incorporated herein by reference.] The concentration of Ge in the buffer region (layers 2-4) is varied monotonically from 0% at the interface with the Si substrate 1 to 100% in the Ge device active (absorbing) layer 6. Since the buffer layers 2-4 contain lower Ge concentration than the top Ge absorbing layer, almost all the light will be absorbed in the Ge layer 6. However, the total layer structure is difficult to integrate with conventional CMOS processing because the layer stack can become quite thick and the annealing steps involved require high temperatures. Also, the best results to date for the quality of the surface layer still incorporate a relatively large density (˜10⁵ cm⁻²) of defects 7, which, for convenience, are shown schematically on only the right hand side of the structure. In fact, the defects exist throughout the graded region.
Si/Ge heterojunction (SGH) method: As shown in FIG. 2b, the SGH method involves direct growth of a pure Ge layer 9 on a single crystal Si substrate 8 followed by a complicated sequence of high temperature, cyclic annealing steps aimed at reducing the dislocation (defect) density in the Ge layer. [See, for example, G. Masini et al, Electronics Letters, Vol. 35, No. 17, p. 1467 (1999) and H-C Luan et al, Appl. Phys. Lett., Vol. 75, No. 19, p. 2909 (1999), both of which are incorporated herein by reference.] Like the GB process, this process poses challenges to CMOS integration due to the required high temperature (900° C.) anneals. [See L. Colace et al, Appl. Phys. Lett., Vol. 76, No. 10, p. 1231 (2000), which is incorporated herein by reference.] Moreover, the best material obtained by this technique still has a relatively high defect density of 2×10⁶ cm⁻².
Low-defect-density (sometimes referred to as defect-free) material for device fabrication is important for reducing noise and increasing sensitivity in PDs. However, prior art techniques are not capable of producing low-defect-density Ge on Si. In addition, any defects that are present should be located in highly doped regions, such as the electrical contact regions, which are not depleted by the electric field. Heavy doping in the defect regions ensures that these regions remain electrically neutral under all bias conditions. Otherwise, generation-recombination current at the defects results in large reverse leakage (dark) current.
The region near the interface region 10 in FIG. 2b and the graded buffer regions 2-4 in FIG. 2a contain the most defects, as pointed out by G. Masini et al, IEEE Trans on Elec. Dev., Vol. 48, No. 6, p. 1092 (2001), which is incorporated herein by reference. It is essential that these regions be highly doped; however, it is not possible to eliminate all of the defect-induced dark current by means of high doping because some region of low-doped Ge is required to absorb the incident light signal. Low doping in the absorbing region ensures that carrier transport is dominated by the fast drift mechanism rather than the slower diffusion process.
Both the GB and SGH methods have been used to form two common types of PDs: a vertical PIN PD (FIG. 3a) and a metal-semiconductor-metal (MSM) PD (FIG. 3b). Both PDs have been designed for use as surface-illuminated detectors in which the signal light impinges upon the top (or bottom) surface of the detector and essentially perpendicular to the primary layers of the device. However, it is possible to use these PDs as edge-illuminated devices in which signal light impinges on an edge of the device and propagates in a direction essentially parallel to the primary layers of the device.
The major conclusions described here pertain to both vertical PIN and MSM IR prior art detectors. These devices suffer from two important limitations: (1) process incompatibility with conventional CMOS processes, and (2) intrinsically poorer performance. In addition, it has not previously been appreciated that these limitations are inherent in the methods of the prior art.
Thus, a need remains in the art for a Si-based near IR PD that exhibits both high speed and high QY.
To clarify the limitations of the prior art, the implementation schemes of both PIN and MSM devices have been analyzed. In the vertical PIN structure shown in FIG. 3a the substrate layer 14 is either a single crystal Si substrate or a Si1-xGex buffer on such a Si substrate. It is non-absorbing in the 1200-1600 nm band. The active device layer 12, where signal light absorption is intended, is undoped Ge. The top, highly doped contact layer 11 is also Ge. In the prior art GB method, the bottom, highly doped contact layer 13 is also Ge, but in the prior art SGH method it is Si. Signal light 19 is incident on the top surface 18. The light penetrates the device layers and is absorbed in the Ge active layer 12. Electron-hole pairs are created in Ge layers 11-13 of the GB method and in layers 11 and 12 of the SGH method, where they are separated by the electric field. The field is generated by connecting a voltage source (not shown) with the indicated polarity across metal contacts 15 and 16. The detector photocurrent flows through a detection circuit (not shown) connected to contacts 15 and 16.
In both of these devices the thickness (T12) of the absorption layer 12 is approximately 1 μm or greater, 1 μm being the absorption length for light between 1310 nm and 1550 nm, and QY is given by 1−exp(−T12/Labs). In a well-designed device, the frequency response is limited by the transit time of the photo-generated electrons and holes. Two different times are important in the vertical PIN structure: the drift time (τd) in the high field (undoped active region 12) and the diffusion time (τdiff) of carriers generated in the low field (highly doped contact regions 11 and 13). Because carriers are generated throughout the Ge layers, there is a distribution of transit times. Calculation of the exact frequency response is complicated, but readily done through simulation. However, a good feel for f3 can be obtained by looking at the longest transit times, which limit the frequency response. The longest drift time is ˜T12/vd where vd is the average drift velocity of carriers in the electric field of layer 12. The longest diffusion time is proportional to the square of the thickness (W) of the doped contact layer 11. The overall transit time (τ) is approximately given by τd+τdiff, and f3 is then approximately 1/(2πτ). Even for W on the order of 0.2 μm, the diffusion time can dominate the overall frequency response.
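The longest-transit-time estimate described above can be sketched numerically. The relations τd˜T12/vd, τ≈τd+τdiff and f3≈1/(2πτ) are taken from the text; everything else is an assumption made for illustration: the drift velocity (6×10⁶ cm/s), the diffusion coefficient (10 cm²/s), and the specific form τdiff=W²/(2D), since the text states only that τdiff is proportional to W². The result is an order-of-magnitude guide, not a substitute for device simulation.

```python
import math

UM_TO_CM = 1e-4

# Illustrative assumptions, not values from the source document.
V_DRIFT_CM_PER_S = 6.0e6   # assumed saturated drift velocity
DIFFUSIVITY_CM2_PER_S = 10.0  # assumed minority-carrier diffusion coefficient

def drift_time_s(t12_um: float) -> float:
    """Longest drift time, T12 / vd."""
    return t12_um * UM_TO_CM / V_DRIFT_CM_PER_S

def diffusion_time_s(w_um: float) -> float:
    """Assumed form W^2 / (2 D); the text gives only the W^2 proportionality."""
    w_cm = w_um * UM_TO_CM
    return w_cm ** 2 / (2.0 * DIFFUSIVITY_CM2_PER_S)

def f3_ghz(t12_um: float, w_um: float) -> float:
    """f3 ~ 1 / (2 pi tau) with tau = tau_d + tau_diff, returned in GHz."""
    tau = drift_time_s(t12_um) + diffusion_time_s(w_um)
    return 1.0 / (2.0 * math.pi * tau) / 1e9

for w in (0.2, 0.4, 0.6):
    print(f"W={w} um, T12=1.0 um: tau_d={drift_time_s(1.0) * 1e12:.1f} ps, "
          f"tau_diff={diffusion_time_s(w) * 1e12:.1f} ps, "
          f"f3~{f3_ghz(1.0, w):.2f} GHz")
```

With these assumed parameters the diffusion time already exceeds the drift time at W=0.2 μm and grows fourfold each time W doubles, which is the behavior the paragraph above describes.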
The MSM structure shown in FIG. 3b differs from the PIN structure in that the electric drift field is parallel to the top surface 28, whereas in the PIN structure of FIG. 3a it is perpendicular to the surface 18. Again the substrate 24 is a single crystal Si substrate in the SGH method and a graded buffer layer in the GB method. The absorption of signal light takes place in the Ge active layer 23. In this case, inter-digitated Schottky barrier electrodes 21 and 22 are disposed directly on the Ge top surface 28 from which the detector photocurrent flows. In this device the normally incident light penetrates the Ge layer 23 and is absorbed there creating electron-hole pairs. As in the vertical PIN structure, QY is determined by the Ge layer thickness T23. The relevant transit time in the MSM structure is given approximately by τd˜(T23+D)/vd where D is the spacing between adjacent electrodes. Unlike the PIN structure, the MSM device has no problem with carrier diffusion times because there are no highly doped, low field regions where carriers can be photo-generated.
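The MSM transit-time relation τd˜(T23+D)/vd can be evaluated in the same spirit. In this sketch the drift velocity (6×10⁶ cm/s) and the example dimensions are assumptions, and converting the transit time to a bandwidth through f3≈1/(2πτd) is also an assumption carried over from the usual transit-time-limited estimate rather than something stated for the MSM device in the text.

```python
import math

UM_TO_CM = 1e-4
V_DRIFT_CM_PER_S = 6.0e6  # assumed drift velocity, not from the source document

def msm_transit_time_s(t23_um: float, spacing_um: float) -> float:
    """tau_d ~ (T23 + D) / vd, with D the spacing between adjacent electrodes."""
    return (t23_um + spacing_um) * UM_TO_CM / V_DRIFT_CM_PER_S

def msm_f3_ghz(t23_um: float, spacing_um: float) -> float:
    """Assumed conversion f3 ~ 1 / (2 pi tau_d), returned in GHz."""
    return 1.0 / (2.0 * math.pi * msm_transit_time_s(t23_um, spacing_um)) / 1e9

# Hypothetical geometry: 1 um Ge layer with 1 um or 2 um electrode spacing.
for spacing in (1.0, 2.0):
    tau_ps = msm_transit_time_s(1.0, spacing) * 1e12
    print(f"T23=1.0 um, D={spacing} um: tau_d~{tau_ps:.1f} ps, "
          f"f3~{msm_f3_ghz(1.0, spacing):.2f} GHz")
```

Because the layer thickness and the electrode spacing enter the transit time additively, thinning the Ge layer alone gives diminishing returns once D dominates the sum.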
The PIN structure is preferable to the MSM structure because, in the MSM structure, the highly defective interfacial region 25 is not highly doped, and therefore the MSM device has relatively large dark currents. For the same reason, in the PIN structure it is preferable to make the bottom contact layer 13 of Ge, as in the devices described using the GB method, in order to ensure that the defect interface 17 is highly doped. Although the prior art SGH method does not suggest forming the Si/Ge heterojunction between the bottom contact layer 13 and the substrate 14, there is no reason why this could not be done in principle. The resulting structure would then be electrically identical to the PIN formed using the GB method and would consequently have the same performance. Therefore, for comparison purposes in the following discussion, we need consider the limitations of only the best of these prior art devices: the PIN structure (FIG. 3a) formed using either the GB or SGH method in which the bottom contact layer 13 is Ge.
We have performed device simulations to assess the ideal device speed of the PIN structures discussed above and have found that the frequency response of these devices is inherently limited by transit time considerations. The results are reported in Table I, below.
TABLE I

Absorption Regions of FIG. 3a PIN Detector    W (μm)    T12 (μm)    f3 (GHz)

(1) Regions 11, 12, and 13
    1st set of simulations                    0.2       2.0         7.0
    1st set of simulations                    0.2       1.5         8.0
    1st set of simulations                    0.2       1.0         8.5
    1st set of simulations                    0.2       0.5         6.5
    1st set of simulations                    0.2       0.3         5.2
    1st set of simulations                    0.2       0.2         4.6
    2nd set of simulations                    0.2       1.0         8.5
    2nd set of simulations                    0.4       1.0         2.4
    2nd set of simulations                    0.6       1.0         1.0

(2) Region 12 only
    3rd set of simulations                    0.2       2.0         8.9
    3rd set of simulations                    0.2       1.5         11.0
    3rd set of simulations                    0.2       1.0         18.0
    3rd set of simulations                    0.2       0.5         36.0
    3rd set of simulations                    0.2       0.3         61.0
    3rd set of simulations                    0.2       0.2         92.0
Simulations were performed on idealized PIN structures as illustrated in FIG. 3a with voltages on contacts 15 and 16 that were large enough to result in saturated drift velocities in active region 12. The first set of results includes photo-generation in all Ge regions, which is what would occur naturally. In these simulations W has been fixed at 0.2 μm (a typical value for good contacting), and the high field region thickness T12 has been varied. To make the detector fast, T12 must be decreased, but it is clear from the table that, once T12 falls below about 1 μm, f3 decreases as the ratio of W to T12 increases. This relationship between T12 and f3 occurs because more of the carriers in the photocurrent response are limited by τdiff than by τd. The second set of simulations varies W but fixes T12 at 1 μm, a value required to give a reasonable QY. Again, as the ratio of W to T12 increases, f3 decreases, this time with an approximate 1/W² dependence, which is expected from diffusion-limited carrier transit. The third set of simulations artificially removes photo-generation in the contact regions 11 and 13 to demonstrate the impact of absorption in these n-type and p-type contact layers. In this case, f3 is limited by carrier transit times in the active region 12 and increases linearly with 1/T12 as expected. It should be noted that in this structure it is not possible to reduce W indefinitely. W is required to be thick enough for good, low leakage contacting and to be thick enough to ensure that all of the defects that exist at the interface 17 between the Ge and Si are completely covered by high doping. If this interfacial region is depleted of free carriers, prohibitively large dark currents will flow, adversely impacting the noise performance. Poor frequency response is the inherent problem in such prior art devices. If dark currents are to be controlled, highly doped contact regions must be formed in the Ge. But this design results in a frequency response limited by the diffusion time τdiff.
Consequently, in the prior art devices it is very difficult to achieve high enough f3 to satisfy the desired data rates of high-speed systems.
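The two scaling trends read off Table I (an approximate 1/W² dependence in the second set of simulations and a linear dependence on 1/T12 in the third set) can be checked directly from the tabulated values. The sketch below fits a straight line to log(f3) against log(W) or log(T12); the data are copied from Table I, while the fitting helper is an illustrative addition and not part of the original simulations.

```python
import math

def log_log_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x), i.e. the power-law exponent."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

# 2nd set of simulations in Table I: W varied, T12 fixed at 1.0 um.
w_um = [0.2, 0.4, 0.6]
f3_second_set_ghz = [8.5, 2.4, 1.0]

# 3rd set of simulations in Table I: absorption in region 12 only, W = 0.2 um.
t12_um = [2.0, 1.5, 1.0, 0.5, 0.3, 0.2]
f3_third_set_ghz = [8.9, 11.0, 18.0, 36.0, 61.0, 92.0]

slope_w = log_log_slope(w_um, f3_second_set_ghz)
slope_t12 = log_log_slope(t12_um, f3_third_set_ghz)

print(f"2nd set: f3 scales as W^{slope_w:.2f} (1/W^2 would give -2)")
print(f"3rd set: f3 scales as T12^{slope_t12:.2f} (1/T12 would give -1)")
```

The fitted exponents come out close to −2 and −1 respectively, consistent with diffusion-limited transit in the contact layers and drift-limited transit in the active region.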