Field of the Disclosure
The present disclosure generally relates to field effect transistors and methods of making and using the same for sequencing, diagnostics, the analysis of biological or chemical materials or reactions and bioinformatics processing. More specifically, the present disclosure relates to one-dimensional and two-dimensional nanomaterial-based field effect transistors useful for chemical and biological analysis.
Description of the Related Art
The detection and sequencing of nucleic acids, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), is a fundamental part of biological discovery. Detection and/or sequencing are useful for a variety of purposes, and are often used in scientific research, drug discovery, medical diagnostics, and in the prevention, monitoring, and treatment of disease. For instance, the genomics and bioinformatics fields, which rely on nucleic acid detection and sequencing techniques, are concerned with the application of information technology and computer science to the field of molecular biology. In particular, bioinformatics techniques can be applied to process and analyze various genomic data, such as from an individual so as to determine qualitative and quantitative information about that data that can then be used by various practitioners in the development of diagnostic, prophylactic, and/or therapeutic methods and products for detecting, preventing, treating, or at least ameliorating disease states, thus improving the safety, quality, and effectiveness of health care. The need for such diagnostic, therapeutic, and prophylactic advancements has led to a high demand for low-cost nucleic acid detection and sequencing methods, devices, and reagents, which in turn have driven, for example, the development of high-throughput sequencing, termed as Next Generation Sequencing (NGS).
Generally, the approach to DNA and/or RNA analysis, such as for genetic diagnostics and/or sequencing, involves nucleic acid hybridization and detection. For example, various conventional hybridization and detection approaches include the following steps. For genetic analysis, an RNA or DNA sample obtained from a subject to be analyzed is isolated and immobilized on a substrate. A detectable probe of a known genetic sequence, e.g., having a nucleotide sequence that corresponds to a disease marker (e.g., a marker evidencing a bacterial, fungal, or viral infection, a single nucleotide polymorphism (SNP) associated with a particular disease such as cancer, an autoimmune disease, etc.) is then added to the substrate, typically in a reaction mixture containing the requisite reagents to allow the probe to interact with its target, if present in the sample. If the disease marker is present, a binding event, e.g., hybridization, will occur, and because the probe is detectable (e.g., via the inclusion in the probe of a detectable label such as a fluorescent dye if the detection scheme is optically-based), the hybridization event can be detected. The detection or non-detection of a signal thereby indicates the presence or absence of the disease marker in the subject's sample.
For DNA/RNA sequencing and/or detection, first, an unknown nucleic acid sequence to be identified, e.g., a single-stranded sequence of DNA/RNA from a subject, is isolated, amplified, and immobilized on a substrate. Next, in the presence of a primer complementary to a portion of the isolated nucleic acid sequence to be sequenced and/or identified, (preferably labeled) nucleotides, and a suitable DNA polymerase, a nucleic acid sequencing and/or detection reaction may take place. In such an instance, where the primer recognizes a corresponding sequence of the isolated and/or bound nucleic acid sequence, the polymerase can begin to add one or more labeled nucleotides to extend the primer in the presence of the unknown nucleic acid sequence, using the unknown nucleic acid sequence as the template. When the primer is extended, the most recently added labeled nucleotide hybridizes via hydrogen bonding to its complementary base in the unknown sequence immobilized on the surface of the substrate, and that nucleotide's addition can then be detected, e.g., optically or electrically. These steps are then repeated until the entire DNA/RNA molecule has been completely sequenced. Typically, these steps are performed on a Next Gen Sequencer wherein thousands to millions of DNA fragments can be sequenced concurrently in the NGS process.
As will be appreciated, a central challenge in DNA sequencing based on the sequencing of numerous short DNA fragments is assembling full-length genomic sequences, e.g., chromosomal sequences, from a sample of genetic material. The sequencing methods used in NGS processes do not produce full-length gene or chromosomal sequences from the sample DNA that can then be used for a desired genetic analysis, such as an assessment of genetic variation or identity between the subject's sample and a reference gene, genome, etc. Rather, sequence fragments, typically from 100-1,000 nucleotides in length, are produced without any indication as to where in the genome they reside. Therefore, in order to generate full-length gene or chromosomal genomic constructs, or determine variants with respect to a reference genomic sequence, such DNA sequence fragments need to be mapped, aligned, merged, and/or compared to a reference genomic sequence. This is true also for SNP genotyping, even though in that case a full-length gene or chromosomal sequence need not be constructed; at least a length of base pairs that encompasses the locus of the SNP must be constructed, e.g., lengths of 250 base pairs, 150 base pairs, or even 50 base pairs may be sufficient for SNP identification. Through such processes the variants of the sample genomic sequences from the reference genomic sequences may be determined by suitable bioinformatics approaches, such as by implementing a suitable variant calling application.
Even so, as the human genome comprises approximately 3.1 billion base pairs, and as each sequence fragment in an NGS process is typically only from about 100 to about 1,000 nucleotides in length, the time and effort that goes into building full-length genomic sequences and determining the genetic variants therein is quite extensive, often requiring the use of several different computer resources applying several different algorithms over prolonged periods of time. This is because in a given NGS analysis, thousands, millions, or even billions of DNA sequences are generated, which sequences must then be aligned and merged in order to construct a genomic sequence that approximates a chromosome or genome in size. A step in this process often includes comparing the DNA fragment sequences to a reference sequence to determine where in the genome the fragments reside.
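By way of illustration only, the step of comparing fragment sequences to a reference sequence to determine where the fragments reside can be sketched as a toy exact-match mapper. This is a minimal sketch under stated assumptions: the function names, the seed length k, and the short example sequences are hypothetical, and practical aligners additionally tolerate mismatches, insertions, and deletions, which this sketch does not.

```python
def build_index(reference: str, k: int) -> dict[str, list[int]]:
    """Map every k-mer of the reference to the positions where it starts."""
    index: dict[str, list[int]] = {}
    for pos in range(len(reference) - k + 1):
        index.setdefault(reference[pos:pos + k], []).append(pos)
    return index


def map_read(read: str, reference: str, index: dict[str, list[int]], k: int) -> list[int]:
    """Return every reference position at which the whole read matches exactly.

    The first k bases of the read are used as a seed to look up candidate
    positions; each candidate is then verified against the full read.
    """
    hits = []
    for pos in index.get(read[:k], []):
        if reference[pos:pos + len(read)] == read:
            hits.append(pos)
    return hits


# Hypothetical toy data; real references span billions of bases.
reference = "ACGTTGCAAGGCTTACGGATCCA"
index = build_index(reference, k=4)
positions = map_read("GGCTTACG", reference, index, k=4)
print(positions)  # [9]
```

Indexing the reference once and looking up each read by a short seed avoids scanning the entire reference for every read, which is one reason seed-based lookup is a common starting point when millions of reads must be placed.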
In order to perform an NGS analysis, genetic material from a subject must be pre-processed. This preprocessing may be done manually or via an automated preparation system. Typically, preprocessing involves obtaining a biological sample from a subject, such as through venipuncture (blood, plasma, serum), buccal swab, urine, saliva, etc., and treating the sample to isolate the DNA therefrom. Once isolated, the DNA is then fragmented and denatured. The DNA (or portions thereof) may then be amplified, e.g., via polymerase chain reaction (PCR), so as to build a library of replicated strands that are now ready to be sequenced, such as by an automated sequencer. The sequencing machine is configured to sequence the amplified DNA strands, e.g., by synthesis of new, complementary strands that include labeled nucleotides, from which the nucleotide sequences that make up the DNA in the sample can be determined.
Further, in various instances, such as in building the library of amplified strands, it may be useful to provide for over-coverage or over-representation when preprocessing a given portion of the DNA. To provide this over-representation, increased sample preparation may be required, thus making the process more expensive, although such steps often yield an enhanced probability of the end result being more accurate.
Once a library of amplified DNA strands has been generated, the strands may be injected into an automated sequencer that can then determine the nucleotide sequences of the strands, such as by synthesis. For instance, amplified single-stranded DNA can be attached to a nano- or microbead and inserted into a test vessel, e.g., an array. All the necessary components for synthesis of its complementary strand, including labeled nucleotides (for adenine (A), cytosine (C), guanine (G), and thymine (T)), are also added to the vessel but in a sequential fashion. In some instances, one or more of the nucleotides that are added, e.g., “A”, “C”, “G”, and “T”, may be configured as reversible terminators, e.g., such that, once incorporated into a growing strand being synthesized, they cause the synthesis reaction for that particular strand to be terminated at that point of incorporation, thereby producing several strands of terminated sequences that collectively represent the entire template nucleic acid sequence. Hence, in performing a nucleic acid synthesis or detection reaction, all of the necessary nucleotide reactants are added, either one at a time or all together, to see which of the nucleotides is used to extend a primer molecule.
Particularly for an optically-based NGS system, after each addition, unincorporated nucleotides are washed away and a light, e.g., a laser, is then shone on the array. If the label fluoresces, that fluorescence can be detected, thereby indicating which nucleotide has been added and, due to the nature of the genetic code, which complementary nucleotide was present in the template DNA fragment in the subject location. In processes where labeled nucleotides are added one at a time, if extension occurs, then its indicative fluorescence will be observed. If extension does not occur, the test vessel may be washed and the procedure repeated until the appropriate one of the four nucleotides binds to its complement and is incorporated by the polymerase into the growing DNA strand at the subject location such that the indicative fluorescence of its label can be detected.
Where all four reversible terminator nucleotides are added at the same time, each may be labeled with a different fluorescent indicator; when the complementary labeled nucleotide binds to its complement in the template DNA strand such that it is then added by the polymerase during the elongation step, the identity of the added, labeled nucleotide at the subject position can then be determined, such as by the color of its label's fluorescence. As will be appreciated, the use of all four labeled nucleotides in a given reaction greatly accelerates the synthesis process.
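As an illustrative sketch only, determining the identity of an added nucleotide from the color of its label can be modeled as picking the brightest of four fluorescence channels in each cycle. The channel-to-base assignment, the ambiguity threshold `min_ratio`, and the intensity values below are hypothetical assumptions, not parameters of any particular sequencing instrument.

```python
# Assumed (hypothetical) order of the four dye channels.
CHANNEL_TO_BASE = ("A", "C", "G", "T")
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def call_base(intensities, min_ratio=1.5):
    """Return the base of the brightest channel, or 'N' if the call is ambiguous.

    A call is treated as ambiguous when the brightest channel is not at least
    `min_ratio` times brighter than the second-brightest channel.
    """
    if len(intensities) != 4:
        raise ValueError("expected one intensity per channel (4 values)")
    ranked = sorted(range(4), key=lambda i: intensities[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if intensities[best] <= 0 or intensities[best] < min_ratio * intensities[runner_up]:
        return "N"
    return CHANNEL_TO_BASE[best]


def call_read(cycles, min_ratio=1.5):
    """Call one base per elongation cycle and join the calls into a read."""
    return "".join(call_base(c, min_ratio) for c in cycles)


def template_bases(read):
    """Position-by-position complement: the template base paired with each call."""
    return "".join(COMPLEMENT[b] for b in read)


# Hypothetical intensities for three cycles; the second cycle is ambiguous.
cycles = [(0.9, 0.1, 0.05, 0.1), (0.1, 0.1, 0.8, 0.7), (0.0, 1.2, 0.1, 0.2)]
read = call_read(cycles)
print(read, template_bases(read))  # ANC TNG
```

Reporting an explicit "N" for ambiguous cycles, rather than forcing a call, is one simple way to keep low-confidence positions visible to downstream analysis.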
After each elongation reaction, the complex is then washed and the synthesis steps are repeated for the next position. This process of elongation and detection is then repeated for all nucleotides for as many positions as are present in the input DNA fragments or for so long as the sequencing machine directs (e.g., 100, 500, 1,000, or more cycles), thereby generating “sequence reads” of the over-sampled nucleic acid segments. The resulting sequence data is collected.
Typically, the length of a sequence replicated in this manner is from about 100 to about 500 or about 1,000 base pairs, such as from about 150 to about 400 base pairs, including from about 200 to about 350 base pairs, such as about 250 base pairs to about 300 base pairs, dependent on the sequencing protocol being employed. Further, the length of these segments may be predetermined, e.g., engineered, to accord with any particular sequencing machinery and/or protocol by which it is run. In any event, the end result is a readout, or “read”, that is comprised of an extended DNA fragment synthesized from an input DNA fragment.
Extended DNA fragments typically range from about 100 to about 1,000 nucleotides in length, and each nucleotide is labeled in such a manner that every nucleotide in the sequence can be identified because of its label. Hence, since the human genome is comprised of about 3.1 billion base pairs, and various known sequencing protocols usually result in labeled replicated sequences, e.g., reads, from about 100 or 101 bases to about 250 or about 300 or about 400 bases, the total number of segments that need to be sequenced, and consequently the total number of reads generated for single read coverage, can be anywhere from about 10,000,000 to about 40,000,000, such as about 15,000,000 to about 30,000,000, dependent on how long the labeled replicated sequences are.
Therefore, the sequencer may typically generate about 30,000,000 reads, such as where the read length is 100 nucleotides in length, so as to cover the genome once. However, to ensure the accuracy of a particular base call (e.g., A, C, G, or T) at a particular nucleotide position, it is desirable that copies of each fragment in a sample be sequenced 5, 10, 20, 30, or more times, in some cases up to 500 or more times. Such over-sampling thus results in even more reads, thereby requiring more analysis. Fragment amplification in the pre-processing phase helps to facilitate such redundancy.
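The read counts discussed above follow from simple arithmetic: the number of reads is approximately the genome length multiplied by the desired coverage and divided by the read length. The following is a minimal sketch of that estimate; the function name is hypothetical, and the estimate assumes uniformly distributed, non-overlapping reads, which real sequencing runs only approximate.

```python
def reads_required(genome_bp: int, read_length_bp: int, coverage: int = 1) -> int:
    """Estimate the reads needed to cover a genome `coverage` times (rounded up)."""
    if read_length_bp <= 0 or coverage <= 0:
        raise ValueError("read length and coverage must be positive")
    total_bases = genome_bp * coverage
    return (total_bases + read_length_bp - 1) // read_length_bp  # integer ceiling


GENOME_BP = 3_100_000_000  # approximate human genome size stated in the text

single_pass = reads_required(GENOME_BP, 100)
thirty_fold = reads_required(GENOME_BP, 100, coverage=30)
print(single_pass)   # 31000000
print(thirty_fold)   # 930000000
```

At a read length of 100 nucleotides, single coverage works out to roughly 31 million reads, consistent with the figure of about 30,000,000 given above, and 30-fold over-sampling multiplies that count accordingly.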
However, in part, due to the need for the use of optically detectable, e.g., fluorescent, labels in the sequencing reactions being performed, the required instrumentation for performing such high throughput sequencing is bulky, costly, relatively slow and not portable. For this reason, a number of new approaches for direct, label-free DNA sequencing have been proposed. For instance, among the new approaches are detection methods that are based on the use of various electronic analytic devices. Such direct electronic detection methods have several advantages over the conventional NGS platform. For example, the detector may be incorporated in or on the substrate of a semiconductor Integrated Circuit (IC) chip itself, such as employing a biosystem-on-a-chip device, such as a Complementary Metal Oxide Semiconductor (“CMOS”) IC device.
More particularly, in using a semiconductor IC device in genetic detection, the output signal representative of a nucleotide's addition in a DNA sequencing reaction can be directly acquired and processed on an IC chip. In such an instance, automatic recognition is achievable in real time and at a lower cost than is currently achievable using conventional NGS processes and equipment. Moreover, due to the maturity and high integration available with semiconductor IC devices, such as CMOS devices, they may be employed for such electronic detection, making the process simple, fast, inexpensive, and portable.
Particularly, in order for NGS methods to become widely used for diagnostic and therapeutic applications in the healthcare industry, sequencing instrumentation will need to be mass produced with a high degree of quality and economy. One way to achieve this is to recast DNA sequencing in a format that fully leverages the manufacturing base created for IC chips, such as CMOS chip fabrication, which is the current pinnacle of high technology large scale, high quality, low-cost manufacturing. To achieve this, ideally, the entire sensory apparatus of the sequencing device should be embodied in a semiconductor IC chip, manufactured in the same fabrication (“Fab”) facilities used for logic and memory chips. Recently, such a sequencing IC chip, and the associated sequencing platform, has been developed and commercialized by Ion Torrent, a division of Thermo-Fisher, Inc. The promise of this idea has not been realized commercially, however, due to the fundamental limits of applying a Metal Oxide Semiconductor Field Effect Transistor, or MOSFET, comprised of a typical semiconductor, such as silicon, as a biosensor. In particular, when a MOSFET is coupled to an ion-sensitive sensing plate used in solution as a biosensor, it is referred to as an ISFET (Ion Sensitive Field Effect Transistor). Particular limitations of ISFET devices include a lack of sensor sensitivity and poor signal-to-noise characteristics as the semiconductor node scales down to lower geometries of the transistor (channel and gate length).
FIG. 1A illustrates an ISFET with a traditional semiconductor FET as the sensing transistor (FET). The ISFET (200) has a semiconductor base (10), e.g., a silicon wafer, within and upon which are formed semiconductor FETs. The semiconductor FETs are comprised of a source (202), drain (204), gate (208) and gate dielectric (210). The source (202) and drain (204) of a traditional semiconductor FET are formed by regions of implanted and diffused species (e.g., boron, arsenic or phosphorus ions) that alter the number of carriers (holes or electrons) within those regions, so for example n-type source and drain regions may be created by implantation and diffusion of arsenic ions in a p-type semiconductor substrate. Contacts, e.g., silicide, are formed to the source and drain diffusions, such as a source contact (212) and a drain contact (214), and metal interconnects (25) couple with the contacts and are used to connect to and between the plurality of transistors. The metal interconnects (25) are embedded in a dielectric layer (20). When an appropriate gate voltage is applied to the gate (208), such voltage being referenced to another component of the transistor, such as the source (202), the FET “turns on” and carriers will flow between the source (202) and the drain (204). The voltage required to turn on the FET is called the threshold voltage. When the traditional semiconductor FET is turned on, the flow of carriers defines a channel region (206). Note that this channel region (206) is not a physical entity, but is a location of charge carriers within the semiconductor. There is no channel when the traditional semiconductor FET is turned off. An electrical characteristic of a FET, such as current flowing between the source (202) and the drain (204), may be modified by changes in the gate voltage, and this forms the basis for using a FET as a sensor, i.e., if the target, analyte or reaction to be detected creates a change in electric field or charge density which in turn changes the gate voltage, then this may be detected by a change in a monitored electrical characteristic of the FET such as drain current. In the case of an ISFET (200) fabricated from traditional semiconductor FETs there must be a way to communicate the sensed electric field or charge density of the target reaction or analyte to the FET transistor. As depicted in FIG. 1, an ISFET (200) may comprise a sensing plate (216), typically a metal plate, that is connected by interconnects (25) to the sensor transistor gate (208). This sensing plate is in communication with or is proximate to a region, such as a chamber (37), where the analyte or reaction to be detected will be present. The chamber (37) is part of a well structure (38) formed from insulating passivation material (35). The chamber (37) has sidewalls (39).
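The sensing principle described above, in which a change in gate voltage is read out as a change in drain current, can be illustrated with the textbook long-channel square-law approximation for a FET in saturation, I_D = (k/2)(V_GS - V_T)^2 for V_GS above the threshold voltage V_T and zero otherwise. This is a sketch under stated assumptions: the threshold voltage, the transconductance parameter k, and the bias values are hypothetical, and the idealized model ignores subthreshold conduction and short-channel effects.

```python
def drain_current(v_gs: float, v_th: float = 0.5, k: float = 2e-4) -> float:
    """Idealized saturation drain current in amperes (square-law model).

    v_gs: gate-to-source voltage in volts
    v_th: threshold voltage in volts (hypothetical value)
    k:    transconductance parameter in A/V^2 (hypothetical value)
    """
    overdrive = v_gs - v_th
    if overdrive <= 0:
        return 0.0  # below threshold the idealized FET is off
    return 0.5 * k * overdrive ** 2


def sensing_signal(v_bias: float, delta_v: float) -> float:
    """Change in drain current caused by a small gate-voltage shift delta_v."""
    return drain_current(v_bias + delta_v) - drain_current(v_bias)


# A hypothetical 10 mV shift in gate voltage caused by an analyte or reaction.
baseline = drain_current(1.0)
signal = sensing_signal(1.0, 0.010)
print(f"baseline = {baseline:.3e} A, signal = {signal:.3e} A")
```

In this idealized model the current change per unit gate-voltage shift grows with the overdrive voltage, which is why the bias point applied to the gate matters for how large a signal a given charge change produces.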
An ISFET relies on a fluid (64) that covers the sensing area and fills the chamber (37) to provide a minimum gate voltage to turn on the sensor transistor. The gate voltage is applied by a reference electrode (66) coupled to the fluid or solution, creating a solution-gated FET. In some instances an analyte or reaction-sensitive layer (218) is formed over the sensing plate (216). The ISFET (200) is typically replicated many times on a semiconductor IC chip to form an array of ISFET sensor cells that comprise a sensor IC chip. In order to read a signal from just one of the ISFETs at a time from the array of ISFETs on the IC chip, an access transistor, one for each ISFET, is used to control access to the selected ISFET to be read.
Thus, for an ISFET based on traditional semiconductor FETs, two transistors (the sensor transistor and the access transistor) are required for each sensing location. The need for two semiconductor transistors per sensor has implications for the size of those transistors (i.e., a cell defined by two transistors will necessarily have smaller transistors than a cell defined by one semiconductor transistor). Smaller transistors create more noise than larger transistors.
Another consideration is that more semiconductor transistors (e.g., 2 per cell versus 1 per cell) require more interconnect connections and will lead to the need for more levels of interconnect wiring to accomplish connecting to all the transistors. Increases in the number of levels of interconnect wiring and increases in the interconnect length have deleterious effects on the noise of the ISFET sensor. As this discussion has highlighted, the noise of an ISFET made from a traditional semiconductor FET may be higher than that of other FETs used for sensing. Noise in a sensor is important since the detection signal must be discriminated from the noise in the sensor. The higher the signal-to-noise ratio, the better the sensitivity of the sensor will be.
As is known, a Field Effect Transistor (FET) manufactured by typical semiconductor IC fabrication processes includes a gate over a channel region, a channel region formed by charge carriers in the semiconductor material connecting source and drain regions when an appropriate gate voltage is applied, and an insulating barrier separating the gate from the channel. The operation of a conventional FET relies on the control of the channel's conductivity, and thus the drain current, by a voltage, designated VGS, applied between the gate and source. For high-speed applications, and for the purposes of increasing sensor sensitivity, FETs should respond quickly to variations in VGS. However, this requires short channel lengths and fast carriers in the semiconductor channel.
Furthermore, for a sensor chip to be used for DNA sequencing, requiring on the order of 30,000,000 reads as previously described, the size of the individual sensors in the sensor array must be made small enough to fit millions of sensors on the chip. In this case there are physical limitations to the chip size due to the photolithography systems available for wafer and chip manufacturing (e.g. a maximum chip size on the order of 25 mm square) which in turn limit the size of the sensors on the chip given the large number of sensors needed in the array.
Another consideration is that an ISFET used for DNA sequencing is arranged as an array of sensors—each of which must be individually addressable to read the signal of any DNA hybridization that is occurring local to that sensor. To achieve both this individual sensor cell addressability as well as sensing function the sensor cell requires at least two CMOS (or other semiconductor) transistors—one to control the access to the cell for reading and the other as the sensor transistor to transduce the DNA binding or hybridization event into an electrical signal. Because of this need for a minimum of two CMOS transistors per sensor cell the transistors must be even smaller, e.g., at least 2 times smaller, than what would be required due to the aforementioned geometrical constraints derived from the maximum chip size and number of sensors in the array. This further constraint on the size of the CMOS transistor used for sensing directly relates to the channel length of that transistor.
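The geometric constraint described above can be made concrete with a rough area budget. The sketch below assumes, for illustration only, that "25 mm square" means a 25 mm by 25 mm chip, that the entire chip area is available to the sensor array (ignoring readout circuitry, pads, and wiring), and that each cell's area is split evenly among its transistors; the function name and the sensor count of 30,000,000 are hypothetical choices.

```python
import math


def cell_budget(chip_side_mm: float, n_sensors: int, transistors_per_cell: int) -> dict:
    """Rough upper bound on the area available per sensor cell and per transistor."""
    chip_area_um2 = (chip_side_mm * 1000.0) ** 2      # mm -> um, then squared
    cell_area_um2 = chip_area_um2 / n_sensors
    return {
        "cell_area_um2": cell_area_um2,
        "cell_pitch_um": math.sqrt(cell_area_um2),    # side of a square cell
        "area_per_transistor_um2": cell_area_um2 / transistors_per_cell,
    }


one_transistor = cell_budget(25.0, 30_000_000, transistors_per_cell=1)
two_transistors = cell_budget(25.0, 30_000_000, transistors_per_cell=2)
print(one_transistor)
print(two_transistors)
```

Under these assumptions each cell has on the order of 20 square micrometers available, and requiring two transistors per cell halves the area available to each transistor, which is the "at least 2 times smaller" constraint noted above.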
Unfortunately, FETs with short channel lengths frequently suffer from degraded electrostatics and other problems (collectively known as short channel effects), such as threshold-voltage roll-off, drain-induced barrier lowering, and impaired drain-current saturation, which result in a decrease in sensor sensitivity. Nevertheless, scaling theory predicts that a FET with a thin barrier and a thin gate-controlled region (measured in the vertical direction) should be robust against short-channel effects down to very short channel lengths (measured in the horizontal direction).
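One commonly used way to express this scaling prediction is the characteristic (natural) length of a thin-body, single-gate FET, lambda = sqrt((eps_ch / eps_ox) * t_ch * t_ox), where t_ch is the thickness of the gate-controlled channel region and t_ox is the thickness of the gate barrier; short-channel effects are generally expected to be suppressed when the channel length is several times lambda. The sketch below is illustrative only: the permittivity values are typical figures for silicon and silicon dioxide, and the thicknesses, including the atomically thin example, are hypothetical.

```python
import math


def natural_length_nm(t_ch_nm: float, t_ox_nm: float,
                      eps_ch: float = 11.7, eps_ox: float = 3.9) -> float:
    """Characteristic scaling length (nm) for a thin-body, single-gate FET.

    t_ch_nm: thickness of the gate-controlled channel region (nm)
    t_ox_nm: thickness of the gate barrier (nm)
    eps_ch, eps_ox: relative permittivities (defaults typical of Si and SiO2)
    """
    return math.sqrt((eps_ch / eps_ox) * t_ch_nm * t_ox_nm)


# Hypothetical thicknesses: a 10 nm body versus an atomically thin (0.35 nm) channel.
thick_body = natural_length_nm(t_ch_nm=10.0, t_ox_nm=2.0)
thin_body = natural_length_nm(t_ch_nm=0.35, t_ox_nm=2.0)
print(f"thick body: {thick_body:.2f} nm, thin body: {thin_body:.2f} nm")
```

Because lambda shrinks with the square root of both thicknesses, thinning the channel and the barrier permits a correspondingly shorter channel length before electrostatic control by the gate degrades.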
Accordingly, the possibility of having channels that have high surface area to volume ratio (e.g. are very thin in the vertical dimension like a 2D nanomaterial or have a small cross-section area like a 1D nanomaterial) yet still allow for high-speed transmission of carriers would allow for increased sensor sensitivity and accuracy. What is needed, therefore, is a FET device that is configured in such a manner and comprises such materials that in combination of structure and materials it offers a FET sensitivity that is higher than is currently achievable in present FET applications. A solution that includes such a FET device designed for use in biological applications, such as for nucleic acid detection, sequencing, and/or other diagnostic applications, would be especially beneficial.