Rapid, accurate diagnosis of acutely ill patients is critical for their survival. Typically, thousands of dollars in diagnostic tests are performed within 24 hours of a patient's admission to hospital. Accurate diagnosis of a treatable condition allows appropriate therapy to be started and unnecessary, potentially harmful medications to be stopped. On the other hand, some diagnostic tests require days to complete, some are invasive or even dangerous to perform, and all contribute to the upward spiral of medical costs. Selection and interpretation of appropriate tests in the appropriate order is therefore a highly valued skill, necessary to the physical health of the patient and the financial health of the care provider.
Requiring fast, accurate responses for an ever-expanding list of diagnostic questions, clinical laboratories turn more and more frequently to answers from molecular genetics. This rapidly evolving discipline comprises the study of gene structure and function at the molecular level. The most straightforward diagnostic application of this approach is to search clinical specimens for the presence of a particular gene or a particular allele (one variety of a particular gene). It is possible to use this direct approach to diagnose genetically transmitted diseases such as Huntington's chorea (by detecting the disease-causing allele), or to diagnose occult infections with agents such as Bartonella henselae, the agent of cat-scratch disease (by detecting genes specific for that organism). Gene detection tests such as these have already found a welcome place for themselves within the vast arsenal of tests offered by reference laboratories. In some cases (notably the detection of herpes simplex virus in cerebrospinal fluid or Chlamydia trachomatis in genital specimens) amplification and detection of genes have become the front-line standard diagnostic tests for conditions difficult to diagnose by other means. These tests require less than 24 hours from specimen to final result, and replace less sensitive methods with a turnaround time of several days or even weeks. Gene detection tests of this kind remain expensive, however, and have to be tailor-made for one or two organisms at a time. They are not useful for diagnosing disease caused by certain organisms such as bacteria of the genus Staphylococcus, which is normally present on the skin but which can also cause life-threatening disease.
A less obvious application of molecular genetics to clinical diagnosis requires analysis of gene transcription rather than the presence or absence of a particular gene. Disease-associated genes are present in all living things, including human hosts and parasites of all kinds (worms, protozoa, fungi, bacteria and viruses). In some cases, the mere presence of genetic material in a human specimen is enough to signify disease: the presence of genes specific for human immunodeficiency virus, for example, or trisomy 21 for a diagnosis of Down's syndrome. In other cases, however, a "pathological" gene may be present, but clinically silent for a variety of reasons. Examples include the defective hemoglobin gene which causes sickle cell anemia when two copies are present, but minimal disease when one copy is transcribed along with the normal hemoglobin allele, and no disease at all when the sickle cell allele is present but not transcribed. Moreover, genes for certain components of the immune system are present in every cell, but are only transcribed (that is, copied from DNA into RNA) when the host organism is diseased. The presence of these genes is universal, but transcription of them usually indicates a disease state.
In addition to associations between diseases and transcription of particular genes, we find that various combinations of gene transcription are required for specific pathologic outcomes. An example is found in certain lymphomas derived from B lymphocytes. Transcription of the myc gene when the bcl gene is not transcribed in these cells leads to limited proliferation followed by self-destruction of the proliferating cells. If the myc gene is transcribed in the presence of bcl transcription, however, the cell's proliferation is unrestrained, and malignancy may result. The number of two-fold gene interactions is large enough to daunt even the most stout-hearted diagnostic molecular geneticist. When one contemplates the possibility of three-fold, four-fold or more complicated gene interactions, however, it becomes quite impossible to analyze all the possible interactions using methods which detect transcription of only one gene at a time.
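The growth in the number of possible gene interactions can be illustrated with a short calculation. The following is only a sketch: the function name `interaction_counts` and the gene count of 1,000 are assumptions chosen for illustration, not figures taken from this description.

```python
from math import comb


def interaction_counts(n_genes: int, max_order: int) -> dict[int, int]:
    """Count the distinct k-gene combinations for k = 2 .. max_order."""
    return {k: comb(n_genes, k) for k in range(2, max_order + 1)}


# Hypothetical panel of 1,000 genes (an illustrative assumption).
for order, count in interaction_counts(1000, 4).items():
    print(f"{order}-fold combinations among 1000 genes: {count:,}")
```

Each additional gene in a combination multiplies the number of cases to be examined, which is why methods that detect one transcript at a time do not scale to such analyses.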
Identification of transcription products typically involves five steps: RNA extraction, amplification, hybridization, labeling, and detection, with labeling usually performed during the hybridization or amplification steps. The researcher disrupts the sample in the presence of enzymes which inhibit degradation of RNA. Organic solvents remove protein and lipids, while differential acid and salt concentrations enrich RNA in and deplete DNA from the sample. One can amplify the extracted RNA by reverse transcription (making DNA from an RNA template) followed by the polymerase chain reaction (PCR, which makes a double-stranded DNA product), or by transcription-mediated amplification (TMA, which makes a single-stranded RNA product). At this stage a "label" may be incorporated into the amplified product. Labels are small molecules bound to the components of nucleic acids. Ideally, the label does not interfere with nucleic acid chemistry. The label allows detection in one of four ways: it can emit radiation; it can serve as substrate for an enzyme which makes a colored product; it can emit light itself (luminescence or fluorescence); or it can serve as antigen for an antibody bound to a larger molecule which has one of the first three functions. The resulting product, with or without label, is then hybridized to a nucleic acid "probe" of known sequence. In general, either the probe is bound to a fixed surface and the amplified target is labeled, or the amplified target is bound to a fixed surface and the probe is labeled. Following hybridization, the label produces radiation, light or a color reaction. By measuring this signal, one identifies the presence of nucleic acid complementary to the probe in the target amplified from the original specimen.
To better understand the transcription process, and more specifically hybridization, an individual must understand the roles of nucleic acids in the process. Nucleic acids are chains of subunit molecules called nucleotides, which can be assembled in any order. The length of a chain is denoted by a number followed by the suffix '-mer', hence, dimer, trimer, tetramer, and so on to decamer, with longer chains denoted by '11-mer', '24-mer' etc. DNA is made of the nucleotides deoxyadenosine (A), deoxycytidine (C), deoxyguanosine (G) and deoxythymidine (T) while RNA is made of the nucleotides adenosine (a), cytidine (c), guanosine (g) and uridine (u). Nucleic acid sequences are written as strings of letters, for example ACGT, a DNA tetramer. Nucleic acids 'hybridize', or form double-stranded molecules, in a defined pattern. A or a lines up opposite T or u, and C or c lines up opposite G or g. Nucleotides which line up opposite each other according to this scheme are called 'complements'. The RNA complement to the above-mentioned DNA sequence ACGT would be ugca, while the DNA complement would be TGCA. Hybridization is most specific (that is, a nucleic acid hybridizes solely to its complement) when the temperature is high, the salt concentration low, and the nucleic acid long, typically ≥15 nucleotides.
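The pairing scheme described above can be sketched in a few lines of code. This is a minimal illustration only: the names `DNA_COMPLEMENT`, `RNA_COMPLEMENT` and `complement` are hypothetical, and the complement is taken position by position, as in the ACGT example, without reversing the strand.

```python
# Pairing rules from the text: A pairs with T (or u), C pairs with G (or g).
# Upper case denotes DNA and lower case denotes RNA, as in the text.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
RNA_COMPLEMENT = {"A": "u", "C": "g", "G": "c", "T": "a"}


def complement(dna: str, table: dict[str, str]) -> str:
    """Return the position-by-position complement of a DNA sequence."""
    return "".join(table[base] for base in dna)


print(complement("ACGT", DNA_COMPLEMENT))  # TGCA
print(complement("ACGT", RNA_COMPLEMENT))  # ugca
```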
Single strands of nucleic acids make stable duplexes by hydrogen bonding with strands of complementary bases. Hybridization is the process of forming these duplexes from complementary single-strands of nucleic acids. Both deoxyribonucleic acids (DNA) and ribonucleic acids (RNA) may be hybridized.
Thus a probe sequence, for example DNA, may be immobilized on a solid surface (the probe surface) and used as a probe for a complementary DNA sequence (the target sequence). In general, the probe surface is exposed to a solution, the solution is removed and the surface is washed, and a DNA-detection method is used to determine if complementary target DNA has hybridized to the probe DNA on the probe surface. Such procedures are routine.
Probes for nucleic acids hold great promise for solving a wide-ranging variety of problems in the field of human medicine, veterinary medicine, and plant husbandry. Also, embodiments of the present invention may be used to detect the presence of organisms in other fields, such as the food, cosmetics and drug industries. A further embodiment of the present device may be utilized as a method of separation (e.g. in the context of a gene discovery or a transcript discovery strategy). Indeed, any living organism with RNA or DNA is potentially the subject of technologies that probe for nucleic acids. Key areas include: clinical diagnosis, for instance diagnosing a disease; transcriptional event discovery, for instance discovering that a certain RNA sequence is expressed in a given biochemical or physiological circumstance; and epidemiological tracking, for instance testing people for the presence and/or type of a virus.
When attempting to solve the wide range of problems described above, it is particularly useful to use a patterning approach. Different probe sequences are immobilized in probe patterns on the surface so that the location and identity of each sequence corresponds to a particular known location, or address, on the probe surface. A probe sequence is a sequence immobilized to the surface. The group of sequences that is used in the probe pattern is termed an array. Arrays may consist of a complete or a partial set of sequences; a set of sequences is the set of all possible combinations of sequences for a given condition. For instance, an oligonucleotide sequence that is eleven bases long and made of two unique bases, say adenine and cytosine, makes a set of 2^11, or 2,048, sequences; the array could have fewer than 2,048 sequences; and the probe pattern includes the array and its addresses. A detection means is used to detect the presence and/or quantity of a nucleic acid strand that hybridizes to a sequence in the probe pattern. The detection means are well known to those skilled in the art and include fluorescent, enzyme-based, staining, radioactive, and colorimetric means.
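The relationship between a set of sequences, an array, and a probe pattern can be sketched as follows. The helper names `full_set` and `probe_pattern`, and the choice of a 64-column grid for the addresses, are assumptions made purely for illustration.

```python
from itertools import product


def full_set(bases: str, length: int) -> list[str]:
    """Enumerate every sequence of the given length over the given bases."""
    return ["".join(p) for p in product(bases, repeat=length)]


def probe_pattern(sequences: list[str], columns: int) -> dict[tuple[int, int], str]:
    """Assign each sequence a (row, column) address on a rectangular surface."""
    return {(i // columns, i % columns): seq for i, seq in enumerate(sequences)}


# Eleven bases long, two unique bases (A and C): 2^11 sequences.
sequences = full_set("AC", 11)
pattern = probe_pattern(sequences, columns=64)  # grid width is an assumption

print(len(sequences))   # 2048
print(pattern[(0, 0)])  # AAAAAAAAAAA
```

A partial array would simply pass a subset of `sequences` to `probe_pattern`; the addresses remain known for every sequence that is placed.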
The patterning approach is in commercial use and has been described in patents, for instance those assigned to Affymetrix, such as U.S. Pat. Nos. 5,837,832 and 5,770,456. Thus it is possible, for example, to immobilize oligonucleotide arrays on a chip surface, expose the patterned chip surface to RNA derived from a human tissue sample, and examine which addresses on the chip have hybridized with target RNA sequences. Such an approach is thought to be most useful if the target nucleic acid sequence is known.
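Reading out which addresses have hybridized can be sketched as below. This is a deliberately simplified model and not the method of the cited patents: it assumes perfect-match hybridization only, ignores strand orientation, temperature and salt effects, and uses hypothetical names (`PAIRS`, `hybridized_addresses`) and made-up example sequences.

```python
# DNA probe base -> complementary RNA base (upper case DNA, lower case RNA).
PAIRS = {"A": "u", "C": "g", "G": "c", "T": "a"}


def hybridized_addresses(
    pattern: dict[tuple[int, int], str], target_rna: str
) -> list[tuple[int, int]]:
    """Return the addresses whose probe has a perfect complement in the target."""
    hits = []
    for address, probe in pattern.items():
        expected = "".join(PAIRS[base] for base in probe)
        if expected in target_rna:
            hits.append(address)
    return sorted(hits)


# Illustrative three-probe pattern and target; the sequences are invented.
pattern = {(0, 0): "ACGT", (0, 1): "GGCC", (1, 0): "TTAA"}
target = "ccugcaaauu"

print(hybridized_addresses(pattern, target))  # [(0, 0), (1, 0)]
```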
Indeed, if current hybridization technology is to be used, it is necessary to know what target DNA or RNA sequence to probe for. A known target sequence is crucial for clinical diagnosis, for transcriptional event discovery, and for epidemiological tracking. To diagnose a disease by hybridization, it would be necessary to discover some sequence that is expressed differently in the diseased patient as compared to a healthy patient or, in the case of an invasive organism such as a bacterium, to know a target sequence specific for the pathogenic strain of that bacterium. In the case of transcriptional event discovery, it is necessary to discover and synthesize a probe sequence for every allele of every gene of interest in order to learn how a certain biological event affects transcription globally. To perform epidemiological studies to detect the presence of a pathogen, for instance, a virus or a bacterium, it is necessary to know the pathogen's sequence and to understand how that sequence may be affected by mutation or normal variation as the pathogen moves through the populace.
But disease specific target sequences are generally not known; current technology is limited to the special circumstances in which a target sequence is already known. In the vast majority of situations the target sequence is unknown because the human genome is only partially sequenced and the activity and function of the known sequences is understood only in a very limited way. Therefore even if every gene were sequenced and every biomolecule made by every gene were known, it would still not be clear what each gene does or how to take advantage of that knowledge.
Further, genes often work together to produce an effect. But assaying for the presence and expression of multiple genes or gene sets is prohibitively difficult. So even if all the human genes were sequenced and their products were known, and the biochemical functions of their products were known, it would generally be impossible to predict exactly which genes to probe for in a given situation. And even if the key genes were known, it is likely that the level of their expression would be important, i.e., whether the gene's product is present in a relatively high or low amount. Therefore, current hybridization-based technologies, which are based on the need to know the target sequence, are inadequate for use with conditions requiring knowledge of how genes work together.
And sequencing projects comparable to the Human Genome Project for obtaining genetic libraries for many plants and animals of interest are not even contemplated. Nor are sequences for pathogens commonly known; indeed, the variation and mutation of viruses cause their sequences to be in flux over time.
Thus, the current potential of technologies based on hybridization techniques is limited by the need to know a target sequence. The present invention addresses this limitation.
The present invention comprises a non-cognate hybridization system (NCHS). The NCHS generally includes a hybridization technology that is simply and economically used to probe for non-cognate nucleic acid sequences, i.e., for nucleic acid strands without known target sequences. NCHS causes nucleic acids, bound to a probe surface, to create a hybridization pattern that provides information about the presence and/or quantity of the nucleic acid sequences in a sample. The NCHS results normally orient the examiner towards a small number of specific diagnoses across a wide variety of diagnostic categories (including but not limited to infections, neoplasms and autoimmune diseases). The test will also identify final-common-pathway syndromes such as sepsis, anaphylaxis and tumor necrosis. While the test utilizes genetic information, it does not depend on prior knowledge of the genes involved in a particular disease or syndrome.
The test should be used whenever the following three conditions apply: 1) substantial diagnostic uncertainty; 2) illness severe enough to limit activities of daily living; and 3) possibility of a treatable diagnosis. These criteria usually apply in critical care admissions, as well as many emergency room visits and some chronic disease states. There are approximately 22 million hospital admissions in the United States each year, and it is estimated that at least one in ten such admissions would meet the criteria outlined above, or about 2.2 million admissions. Rounding downward, and discounting the possible use of the test in follow-up or in outpatient settings, it is estimated that approximately two million tests per year are indicated in the United States alone. In addition to clinical demand, a smaller demand for use of the system as a hypothesis-generating research tool is anticipated.
Embodiments of the present invention include probes for nucleic acids, which hold great promise for solving a wide-ranging variety of problems in the field of human medicine, veterinary medicine, and plant husbandry. Also, embodiments of the present invention may be used to detect the presence of organisms in other fields, such as the food, cosmetics and drug industries.
The present invention also includes embodiments which can be utilized to detect and identify living and dead organisms, for example in the analysis of an environment, such as "water sources" or "sterilized (or clean) rooms", for bacteria. Furthermore, embodiments of the present invention include software, which can be utilized to recognize patterns in addition to hybridization patterns and assess their relatedness to known patterns (e.g. 2D gel electrophoresis patterns).
A further embodiment of the present device may be utilized as a method of separation, differentiation and prognostication (e.g. in the context of a gene discovery or a transcript discovery strategy). Prognostication would include the potential to predict and diagnose genetic and other health-related issues in living organisms. With regard to separation, the physical surface of the array can be utilized to identify and specifically analyze the nucleic acids hybridized on relevant addresses. This can be used for gene discovery purposes or transcript expression purposes.
The present invention can be utilized in many other types of applications. Therefore, these and other aspects of the invention will be evident upon reference to the following detailed description.