The invention is in the field of devices and methods for sequencing biopolymers.
The prevailing DNA sequencing methods are based on Sanger chemistry and on fragment analysis using gel-based electrophoresis. These methods reveal the base pair sequence of individual fragments of DNA. The fragments are separated by subjecting them, suspended in a slab gel, to an electric field. This causes size-dependent migration and spatial separation of the fragments. Once they have been separated on the gel, the bands corresponding to the individual base pairs are read or digitally scanned to determine the fragment sequence.
Although the results obtained from gel electrophoresis are generally of high quality and reliability, the process is labor intensive and relatively slow. Due to its complexity, gel electrophoresis often requires a skilled technician. Additionally, preparing samples prior to sequencing requires that the target be isolated, purified, amplified, and fragmented into relatively smaller pieces (e.g., about 300 to 500 base pairs). Since the average length of a human gene is over 50,000 bases (i.e., 15,000 to over 1,000,000 bases), considerable sample preparation is necessary to systematically fragment, purify, and amplify the fragments.
To ensure that each fragment is sequenced at least once, the target section is often deliberately overlapped, with the consequence that the same bases may be sequenced ten times or more in the end. Once the sequencing has been completed, the resulting data are processed to deduce the sequence of the original target section.
Improved engineering and automation have resulted in sequencing systems that include such technological advances as automated gel-based electrophoresis or ultra-thin capillary tube electrophoresis. These techniques permit higher speed and lower cost sequencing but are still limited by the fundamental constraints of Sanger-based chemistry and fragment analysis, namely the need for highly trained personnel to prepare samples and the relatively short read lengths obtained. Nonetheless, automated gel electrophoresis is the technique currently used for almost all high throughput commercial sequencing.
Current efforts toward gene discovery, for example those based on "population-based" genetic analysis, can create tremendous demand for cost-effective DNA sequence analysis. For example, recently introduced HIV protease inhibitor medications rely heavily on DNA sequencing of individual patient samples to detect the emergence of resistant strains of HIV and subsequently alter the choice of therapeutic intervention. Indeed, cost-effective DNA sequence analysis methods are likely to prove to be a prerequisite for "individual-based" preclinical and clinical patient studies.
The invention is based on the discovery that the sequence of monomers in a polymeric biomolecule can be determined in a self-contained, high pressure reaction and detection apparatus, without the need for fluid flow into or out from the apparatus. The pressure is used to control the activity of enzymes that digest the polymeric biomolecule to yield the individual monomers in the sequence in which they existed in the polymer. High pressures modulate enzyme kinetics by reversibly inhibiting those enzymatic processes which result in a higher average activation volume, when compared to the ground state, and reversibly accelerating those processes which have lower activation volumes than the ground state. Modulating the pressure allows the experimenter to precisely control the activity of the enzyme. Conditions can be found, for example, where the enzyme removes only one monomer (e.g., a nucleotide or amino acid) from the biomolecule before the pressure is again raised to a prohibitive level. The identity of the single released nucleotide or amino acid can be determined using a detector that is in communication with a probe in the detection zone within the reaction vessel.
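The pressure dependence described above can be illustrated with the standard transition-state relation (d ln k / dP) at constant T = -(activation volume) / (RT). The following is a minimal sketch, not a model of any particular enzyme: the function name, the reference pressure of 0.1 MPa, and the activation volumes of +30 and -30 cm^3/mol are assumptions chosen only for illustration, and the activation volume is treated as independent of pressure.

```python
import math

# Gas constant. 1 J = 1 MPa * cm^3, so the same number serves both unit systems.
R = 8.314  # cm^3 * MPa / (mol * K)


def relative_rate(delta_v_act, pressure_mpa, temp_k=298.15, ref_pressure_mpa=0.1):
    """Return k(P) / k(P_ref) for a process with activation volume delta_v_act.

    delta_v_act is in cm^3/mol and is assumed constant over the pressure range.
    A positive value means the process is slowed by pressure; a negative value
    means it is accelerated.
    """
    exponent = -delta_v_act * (pressure_mpa - ref_pressure_mpa) / (R * temp_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical activation volumes, for illustration only.
    for dv in (+30.0, -30.0):
        for p in (0.1, 100.0, 200.0):
            print(f"dV = {dv:+.0f} cm^3/mol, P = {p:6.1f} MPa, "
                  f"k/k_ref = {relative_rate(dv, p):.3g}")
```

Under these assumptions, a process with a positive activation volume is progressively and reversibly suppressed as the pressure rises, which is the behavior the method exploits to hold the enzyme inactive between digestion steps.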
In general, the invention features an integrated device for sequencing a polymeric biomolecule, including: a reaction vessel that includes a reaction zone and a detection zone; a solid support in the reaction zone for chemical attachment of the polymeric biomolecule; an enzyme that catalyzes the removal of one monomer unit at a time from one end of the polymeric biomolecule; a probe for sensing a characteristic (e.g., fluorescence, mass, impedance, optical, voltammetric or amperometric properties, etc.) of the released monomers positioned within the detection zone of the reaction vessel; and a pressure-control device (e.g., piezoelectric crystal-driven pressure modulation, thin-film-driven pressure modulation, electronic, pneumatic, hydraulic, magnetostrictive, etc.) that controls the pressure at least in the reaction zone of the reaction vessel.
Solid supports are available in many configurations, including beads, filters, membranes, capillaries, and frits. Both organic and inorganic supports can be used. For example, sephadex, agarose, dextran, latex, silica gel, glass, polyacrylamide, polystyrene, polyethylene, polyvinylidenefluoride, and other polymers; collagen and similar gels; and biological substrates can all be suitable.
The solid supports can be activated to form specific bonds with biomolecules. Reagents used to activate the solid supports toward covalent bonding include the cyanogen halides, sulfonyl chloride, periodates, sulfonate glutaraldehyde, and carboxyl functionalized compounds.
The reaction zone can be either spatially separated from the detection zone or not.
The probe can be an optical window of quartz or sapphire.
The device can also include a modulated electrophoretic, electroosmotic, fluid flow, or flow cytometry device to connect the reaction and detection zones. The modulation device can contain a buffer, the pH of which is pressure-sensitive.
The device can include multiple probes within its detection zone, allowing it to sense a characteristic of a multiplicity of monomers (e.g., for parallel analysis of multiple single molecule sequencing reactions).
The probe can be, for example, an optical fiber for fluorescence detection, a fluorescence microscope (e.g., CCD-based or confocal), an infrared or Raman spectrometer, a fluorescence polarimeter, an enzymatic biosensor, or a mass spectrometer.
The polymeric biomolecule can be a nucleic acid or a polypeptide, for example.
At least some of the monomers of the polymeric biomolecule can be labelled with fluorescent tags, and the identity of the fluorescently labelled monomers can then be determined by fluorescence resonance energy transfer between the monomers and the enzyme.
Another embodiment of the invention is a sample plate that includes a solid surface adapted for use in the sequencing device described above, and a linker molecule covalently bonded to the surface and to a primer molecule complementary to a biomolecule to be sequenced.
Still another embodiment of the invention features a method for sequencing a polymeric biomolecule. The method includes the steps of immobilizing the polymeric biomolecule on a solid support in a reaction vessel; associating an ion-dependent, biomolecule-digesting enzyme with the polymeric biomolecule under conditions in which the activity of the enzyme is blocked by an exogenously controllable characteristic that can be altered without adding a reagent from a separate vessel; then, with the pressure in the vessel at a level that inactivates the enzyme, altering the exogenously controllable characteristic to allow the enzyme's biomolecule digesting activity to be functional and to excise the terminal monomer from the polymeric biomolecule; adjusting the pressure to activate the enzyme to dissociate one monomer from the polymeric biomolecule; determining the identity of the dissociated terminal monomer with a probe for sensing a characteristic of the dissociated monomer within the reaction vessel; adjusting the pressure to inactivate the enzyme; and repeating the pressure adjusting and identity determining steps. The biomolecule immobilizing and enzyme associating steps can be performed in either order, so long as they precede the characteristic altering step.
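The order of the method steps can be sketched as a simple control loop. This is an idealized illustration only: the class `ReactionVessel`, its methods, the two pressure set points, and the rule that exactly one terminal monomer is released per permissive pulse are all hypothetical simplifications, not part of the described apparatus. A real digestion is stochastic and the probe reads a physical signal rather than a letter.

```python
from dataclasses import dataclass, field

# Hypothetical set points, for illustration only.
INHIBITORY_MPA = 200.0   # pressure at which the enzyme is assumed inactive
PERMISSIVE_MPA = 50.0    # pressure at which the enzyme is assumed active


@dataclass
class ReactionVessel:
    # Polymer attached to the support by its left end; the right end is free.
    immobilized: str
    pressure_mpa: float = INHIBITORY_MPA
    cofactor_available: bool = False   # the exogenously controllable characteristic
    released: list = field(default_factory=list)

    def enzyme_active(self) -> bool:
        return self.cofactor_available and self.pressure_mpa <= PERMISSIVE_MPA

    def set_pressure(self, pressure_mpa: float) -> None:
        self.pressure_mpa = pressure_mpa
        # Idealization: one permissive pulse excises exactly one terminal monomer.
        if self.enzyme_active() and self.immobilized:
            self.released.append(self.immobilized[-1])
            self.immobilized = self.immobilized[:-1]

    def read_probe(self):
        # Stand-in for the probe: report the oldest undetected monomer, if any.
        return self.released.pop(0) if self.released else None


def sequence(vessel: ReactionVessel) -> str:
    """Run the cycle: unblock, then pulse pressure and detect until done."""
    # Alter the exogenous characteristic while the pressure is still inhibitory.
    vessel.cofactor_available = True
    reads = []
    while vessel.immobilized:
        vessel.set_pressure(PERMISSIVE_MPA)   # activate: release one monomer
        monomer = vessel.read_probe()         # identify the released monomer
        vessel.set_pressure(INHIBITORY_MPA)   # inactivate before the next cycle
        if monomer is None:
            break
        reads.append(monomer)
    return "".join(reads)


if __name__ == "__main__":
    print(sequence(ReactionVessel("ACGT")))
```

Because digestion proceeds from the free end, the monomers are read in the reverse of the order in which they are written in `immobilized`.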
The polymeric biomolecule can be either a nucleic acid molecule (e.g., DNA or RNA) or a polypeptide, for example. The enzyme can thus be an exonuclease or an exopeptidase (e.g., carboxypeptidase C), respectively.
The exogenously controllable characteristic can be ion concentration (e.g., Mg2+, Mn2+, Co2+, Cr3+, K+, Zn2+, or ethylenediamine2+ concentration), pH, or a photolabile protecting group, for example. Such a photolabile protecting group can be attached to either the enzyme or the polymeric biomolecule, or can be a cage-like chelating molecule that sequesters divalent metal ions necessary for the enzyme's biomolecule digesting activity. The photolabile protecting group can be removed by an intense pulse of light.
Examples of photolabile protecting groups include, but are not limited to: 4,5-dimethoxy-2-nitrobenzyl bromide or trans-2-(4,5-dimethoxy-2-nitrophenyl)-1-nitroethene (suitable if the enzyme contains a catalytically important thiol); 1-(4,5-dimethoxy-2-nitrophenyl)ethyl (DMPNE) (suitable if the enzyme or the biomolecule contains a catalytically important carboxylate); NVOC chloride (suitable if the enzyme contains a catalytically important amine); or 1-(4,5-dimethoxy-2-nitrophenyl) EDTA (suitable for sequestering magnesium ions).
Ion concentration (e.g., [Mg++]) and pH can be altered by a pressure change (for example, through the use of buffers with high absolute ionization volumes, such as borate, 1,3-"bis-tris" propane, or pyrophosphate ion), electronically, or by an intense pulse of light. Depending on the enzyme being used, a pH in the range of 3 to 11 can be permissive. Magnesium ion concentrations of 0.1 to 50 mM are often suitable.
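The effect of a buffer's ionization volume on pH under pressure can be estimated from the thermodynamic relation (d ln Ka / dP) at constant T = -(ionization volume) / (RT). The sketch below is a rough estimate only: the function name, the reference pKa of 9.2, and the ionization volume of -30 cm^3/mol are placeholder values assumed for illustration, the ionization volume is treated as independent of pressure, and activity effects are ignored. For a buffer held near pH = pKa, the pH shift is approximately the pKa shift.

```python
import math

# Gas constant. 1 J = 1 MPa * cm^3, so the same number serves both unit systems.
R = 8.314  # cm^3 * MPa / (mol * K)


def pka_at_pressure(pka_ref, delta_v_ion, pressure_mpa,
                    temp_k=298.15, ref_pressure_mpa=0.1):
    """Estimate the pKa of a buffer at elevated pressure.

    delta_v_ion is the ionization volume in cm^3/mol, assumed constant.
    A negative value (ionization shrinks the system) lowers the pKa as the
    pressure rises; a positive value raises it.
    """
    shift = delta_v_ion * (pressure_mpa - ref_pressure_mpa) / (
        R * temp_k * math.log(10))
    return pka_ref + shift


if __name__ == "__main__":
    # Placeholder buffer parameters, for illustration only.
    for p in (0.1, 100.0, 200.0):
        print(f"P = {p:6.1f} MPa, estimated pKa = "
              f"{pka_at_pressure(9.2, -30.0, p):.2f}")
```

With these placeholder values the estimated pKa falls by about one unit between ambient pressure and 200 MPa, which illustrates why buffers with large absolute ionization volumes are useful as pressure-actuated pH switches.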
The polymeric biomolecule and the solid support can both be in the organic phase of a biphasic solvent system, while the dissociated monomers are detected in the aqueous phase of the biphasic solvent system.
pH can also be altered by the release of a proton from 2-hydroxyphenyl-1-(2-nitrophenyl)ethyl phosphate.
The monomers can be labelled with fluorescent tags to facilitate identification. In some instances, the identity of the fluorescently labelled monomers is discriminated by fluorescence resonance energy transfer between the monomers and the enzyme. For example, the inherent fluorescence of the enzyme can be the energy donor and the fluorescently labelled monomers can be the energy acceptors. Alternatively, the enzyme can be derivatized with a fluorescent label and can thereafter act as the energy donor or acceptor.
The dissociated monomer can also be transported from the reaction site to a detection zone (e.g., 1 μm to 1 mm away, or further) within the reaction vessel. This transporting step can be effected by intermittent electrophoresis, electroosmosis, fluid flow, or flow cytometry. A gel can be used, although it is not necessary.
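The distance sensitivity that makes energy transfer between enzyme and monomer useful for discrimination follows from the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the pair. The sketch below simply evaluates that relation; the function name and the Förster radius of 5 nm are assumptions for illustration and do not correspond to any specific label named in this document.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Return the Forster energy transfer efficiency for a donor-acceptor pair.

    Both arguments are in nanometres. The efficiency is 0.5 when the
    distance equals the Forster radius and falls off with the sixth power
    of the distance beyond it.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Forster radius in nm, for illustration only
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm, E = {fret_efficiency(r, r0):.3f}")
```

The steep fall-off means that a labelled monomer contributes appreciable transfer signal only while it is within a few nanometres of the enzyme, for example while it is bound or immediately after release.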
The dissociated monomer can be detected at the reaction site.
The pressure can be modulated using feedback from the probe or another detection device.
A detection device can rely on pH titration of the spectroscopic or electrophoretic properties of labelled or unlabelled adenosine monophosphate, guanosine monophosphate, cytidine monophosphate, and thymidine monophosphate.
The pressure can be increased to a level sufficient to allow direct identification of the dissociated nucleic acid monomer by inherent fluorescence properties in organic solvent, for example.
A pulsed current can be used to induce the activity of the enzyme, or a continuous current can be supplied to extend and relax polymeric biomolecule binding to facilitate enzyme activity. For example, strands of DNA can be straightened by electricity (or by exposure to organic solvent or flowing aqueous solvent).
A fluorophore can be introduced to allow indirect detection of monomer absorbance properties.
The solid support can be a plate. The size of the plate depends on the sensitivity of the probe, and whether the sequencing is carried out in a parallel (i.e., multiple probes) or in a serial manner. The plate can be made from various materials (e.g., glass, metal, semiconductor, insulator, polymer, etc.). The plates can be either disposable or reusable. The pressure exerted in the vicinity of the plate (or in other locations within the reaction vessel) can be the same as the pressure throughout the entire vessel or can be generated locally. For example, a magnetostrictive device (i.e., one relying on elastic deformation of a ferromagnetic material such as nickel upon application of a magnetizing force), a quartz crystal, a ceramic transducer, or an ultrasonic (>20 kHz) transducer can be used; local pressure changes can also be effected with a piezoelectric device, wherein a mechanical deformation is produced by the interaction of an applied electric field with an anisotropic crystal. Localized pressure allows, for example, the reaction vessel to be in communication with other vessels at enzyme operative pressures. The plate can be a microfabricated or micromachined circuit (suitable if electrophoresis is used).
The enzyme can have a net neutral charge, either naturally or due to a mutation. In some cases, the enzyme is not metal ion-dependent. For example, the enzyme can be an acid exonuclease, or a phosphodiesterase (e.g., snake venom phosphodiesterase). A continuous supply of metal ions (e.g., divalent magnesium ions) can be introduced into the reaction zone in some cases.
Many classes of nucleases can be used in the new devices and methods. Examples include the 3′-exonucleases such as snake venom phosphodiesterase and E. coli exonucleases VII (EC 3.1.11.6) and III (EC 3.1.11.2), λ-exonuclease (EC 3.1.11.3), T7 Gene 6 exonuclease, and the exonuclease activities of certain polymerases (e.g., T4, Taq, Deep Vent, and Vent DNA polymerases) or subunits of polymerases (e.g., T7 subunit of DNA polymerase III of E. coli). Other enzymes useful in sequencing include mung bean nuclease, Bal-31 exonuclease, S1 nuclease, S. aureus micrococcal nuclease, DNase I, and other exodeoxyribonucleases producing 5′-phosphomonoesters (EC 3.1.11.X; X can be any integer corresponding to a known enzyme).
For sequencing RNA, ribonuclease A and T1 micrococcal nuclease are suitable. Other exoribonucleases (EC 3.1.13.X and 3.1.14.X) and exonucleases active with either DNA or RNA (EC 3.1.15.X and 3.1.16.X) can also be used.
For sequencing polypeptides, suitable peptidases include: exopeptidases, carboxypeptidases A, B, and C (EC 3.4.17.1, EC 3.4.17.2, and EC 3.4.16.1, respectively), serine-type carboxypeptidase C (EC 3.4.16.5), aminopeptidase M (EC 3.4.11.2) and other aminopeptidases (EC 3.4.11.X), dipeptidases (EC 3.4.13.X), dipeptidyl- and tripeptidylpeptidases (EC 3.4.14.X), peptidyl-dipeptidases (EC 3.4.15.X), serine-type carboxypeptidases (EC 3.4.16.X), metallocarboxypeptidases (EC 3.4.17.X), cysteine-type carboxypeptidases (EC 3.4.18.X), omega peptidases (EC 3.4.19.X), serine endopeptidases (EC 3.4.21.X), cysteine endopeptidases (EC 3.4.22.X), aspartic endopeptidases (EC 3.4.23.X), metalloendopeptidases (EC 3.4.24.X), and other endopeptidases of unknown catalytic mechanism (EC 3.4.99.X).
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described below. All publications, patent applications, patents, technical manuals, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present application, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
An advantage of the new devices and methods is the capability to rapidly and economically sequence nucleic acid molecules or polypeptides in a single apparatus. Because the devices require only pressure modulation to control, digest, separate, and analyze, there is no need for hazardous chemicals (e.g., acrylamide or cross-linking agents) to be used. Furthermore, no waste is generated other than the digestion products. By combining the reaction and detection functions in a single unit, the new devices and methods can also save time.
Another advantage is that the new devices and methods eliminate the need to add additional reagents to the reaction vessel under high pressure conditions. Still another advantage is that the new devices and methods utilize a probe within the detection zone of the reaction vessel, allowing detection at high pressure nearly concurrent with the digestion reactions.
Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.