The present invention relates to the identification of groups within a species, and in particular, methods and compositions for determining a statistically significant number of different strains within a species of bacteria indicative of the species population structure as a whole in order to permit the evaluation of a vaccine target.
Bacterial infections continue to account for a considerable amount of human illness. While antibiotic therapy is clearly one of the great success stories of modern medicine, the development of antibiotic resistant strains of important human pathogens has called into question the use of antibiotics as the first line of defense against bacterial pathogens.
Vaccines to a variety of bacteria have been attempted. The best results thus far have involved vaccines directed to specific toxins of the organism (e.g. diphtheria toxoid, tetanus toxoid, etc.). Considerably less favorable results have been achieved with whole organism (“killed bacteria”) vaccines (e.g. Bordetella pertussis, Vibrio cholerae, etc.). Indeed, immunity induced by vaccination with killed organisms such as V. cholerae persists for only a few months and therefore is of very limited value.
One important problem with current approaches to vaccine development stems from the range of variability within a species of any particular surface antigen considered as a possible vaccine target. This accounts for the fact that only a few important, new bacterial vaccines have been produced in the last 30 years (i.e. for Haemophilus influenzae type b, a major cause of meningitis). Moreover, development of even these recent few successful vaccines was a tedious and haphazard endeavor with little progress seen for many years.
What is needed is a more efficient approach to vaccine development. Importantly, the new approach should be one that takes into account the variability in surface antigens within a species.
The present invention relates to the identification of groups within a species, and in particular, methods and compositions for determining a statistically significant number of different strains within a species of bacteria indicative of the species population structure as a whole in order to permit the evaluation of a vaccine target. The present invention employs a method comprising the grouping of strains within a species to approximate the minimum variability in any vaccine target. This permits the evaluation of the vaccine target in a more limited number of bacterial isolates (as opposed to the two extremes of 1) using but a single isolate and 2) testing hundreds of isolates at random).
In one embodiment of the method of the present invention, the present invention contemplates analysis of the flanking sequences of one or more so-called Ribosomal RNA Operons, each comprising three genes arranged in the order 16S-23S-5S, with “spacer” DNA separating each gene (hereinafter represented by: 5′-16S-spacer-23S-spacer-5S-3′). The present invention contemplates the analysis of these flanking sequences in a statistically significant number (e.g. greater than one hundred, and more preferably greater than three hundred, and most preferably greater than five hundred) of clinical isolates of a particular bacterial or fungal species.
It is not intended that the present invention be limited by the technique by which the flanking sequences of such operons are analyzed. In one embodiment, primers directed to these sequences can be employed in an amplification reaction (such as PCR). On the other hand, these flanking sequences can conveniently be analyzed with restriction enzymes. Specifically, the present invention contemplates digesting bacterial or fungal DNA with one or more restriction enzymes which will produce a piece of nucleic acid of which at least a portion is outside (not bounded by) the 5′ and 3′ ends of the operon. For the convenience of detecting such digestion products by gel electrophoresis, it is preferred that the digestion product (due to the relatively limited resolution level of gel electrophoresis) be at least 200 bp in size (and more preferably between 400 and 30,000 bp in size).
In one embodiment, the present invention contemplates digestion of such DNA with restriction enzymes that cut only once in the DNA encoding 16S ribosomal RNA and only once in the DNA encoding 23S ribosomal RNA. In a preferred embodiment, the present invention contemplates digestion of bacterial DNA using a single restriction enzyme which cuts only once in the DNA encoding 16S ribosomal RNA and only once in the DNA encoding 23S ribosomal RNA.
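The selection criterion described above (an enzyme that cuts exactly once in the 16S gene and exactly once in the 23S gene) can be illustrated with a minimal sketch. The function names and the toy sequences below are hypothetical and are not real rRNA gene sequences; the search is performed on one strand only, which is sufficient for a palindromic site such as GAATTC but is an assumption for other sites.

```python
def count_sites(seq: str, site: str) -> int:
    """Count occurrences (including overlapping ones) of a recognition site."""
    seq, site = seq.upper(), site.upper()
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def cuts_once_in_each(site: str, gene_16s: str, gene_23s: str) -> bool:
    """True if the site occurs exactly once in each of the two gene sequences."""
    return count_sites(gene_16s, site) == 1 and count_sites(gene_23s, site) == 1


# Hypothetical toy sequences, for illustration only.
toy_16s = "ATGCGAATTCGGTACCA"      # one GAATTC site
toy_23s_a = "TTGAATTCAAGGAATTCC"   # two GAATTC sites
toy_23s_b = "TTGAATTCAAGG"         # one GAATTC site

print(cuts_once_in_each("GAATTC", toy_16s, toy_23s_a))  # False
print(cuts_once_in_each("GAATTC", toy_16s, toy_23s_b))  # True
```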
In one embodiment, the present invention contemplates a method for vaccine development, comprising: a) providing a plurality of isolates of a single bacterial species, said isolates comprising DNA; b) examining said DNA from said isolates under conditions such that a phylogenetic tree is produced defining one or more phylogenetic subsets of said isolates; and c) evaluating a vaccine target antigen in said subset of isolates for variability.
In one embodiment, the present invention contemplates a method for vaccine development, comprising: a) providing a plurality (e.g. a panel) of clinical isolates of a single bacterial species; b) isolating bacterial DNA from each of said clinical isolates under conditions such that a DNA preparation is produced for each isolate, said DNA preparation comprising DNA flanking the DNA encoding 16S and 23S rRNA; c) digesting said DNA preparations with one or more restriction enzymes under conditions such that restriction fragments are produced, said restriction fragments comprising a digestion product for each of said isolates, said digestion product comprising a portion of said DNA encoding 16S rRNA or 23S rRNA and a portion of said DNA flanking said DNA encoding 16S rRNA or 23S rRNA; d) separating said restriction fragments (e.g. by gel electrophoresis); e) detecting said digestion products of each of said isolates; f) grouping said isolates based on the number of digestion products having identical size to define one or more subsets of isolates; and g) evaluating a vaccine target antigen in said subset of isolates for variability [e.g. examining the gene(s) encoding the antigen or the gene(s) encoding essential enzymes in the biosynthesis of the antigen].
It is not intended that the present invention be limited to the method by which the results are evaluated and grouped as set forth in step (f) above. A variety of types of phylogenic analysis can be employed. What is important is to use the phylogeny of the species of interest and look for antigen-encoding conserved genes that may be important in developing a vaccine.
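By way of illustration only, the grouping of step (f) can be sketched as follows. The Dice similarity coefficient, the 0.7 threshold, the single-linkage merging and the isolate names are all assumptions chosen for this sketch and are not prescribed by the present specification; band sizes are treated as exactly comparable, whereas real gel data would need a size tolerance.

```python
from itertools import combinations


def dice_similarity(bands_a, bands_b):
    """Dice coefficient: 2 * shared bands / total bands in both profiles."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def group_isolates(profiles, threshold=0.7):
    """Single-linkage grouping: isolates whose similarity meets the
    threshold end up in the same subset (union-find merge)."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for x, y in combinations(names, 2):
        if dice_similarity(profiles[x], profiles[y]) >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical digestion product sizes (bp) for three isolates.
profiles = {
    "isolate_A": [1200, 2300, 4100, 9000],
    "isolate_B": [1200, 2300, 4100, 8800],
    "isolate_C": [1500, 3000, 5200, 7000],
}

print(group_isolates(profiles))
# [['isolate_A', 'isolate_B'], ['isolate_C']]
```

A representative isolate from each resulting subset could then be carried forward to the antigen evaluation of step (g).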
It is not intended that the present invention be limited by the nature of the sample. The terms “sample” and “specimen” in the present specification and claims are used in their broadest sense. On the one hand they are meant to include a specimen or culture. On the other hand, they are meant to include both biological and environmental samples. These terms encompass all types of samples obtained from humans and other animals, including but not limited to, body fluids such as urine, blood, fecal matter, cerebrospinal fluid (CSF), semen, and saliva, cells as well as solid tissue (including both normal and diseased tissue). These terms also refer to swabs and other sampling devices which are commonly used to obtain samples for culture of microorganisms.
It is not intended that the present invention be limited by the means of detection or the means of comparing digestion products. In one embodiment, said digestion products that are separated by gel electrophoresis are probed with a labeled oligonucleotide in a hybridization reaction.
It is not intended that the present invention be limited by the number of samples compared. A large number of clinical samples of a particular species are specifically contemplated within the scope of the present invention.
To facilitate understanding of the invention, a number of terms are defined below. “Nucleic acid sequence” and “nucleotide sequence” as used herein refer to an oligonucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and represent the sense or antisense strand.
Prokaryotic ribosomes are constructed from 50S and 30S subunits that join together to form a 70S ribosome. The large subunit comprises a single “23S rRNA” molecule and a “5S rRNA” molecule, while the small subunit comprises a single “16S rRNA” molecule.
As used herein, the terms “complementary” or “complementarity” are used in reference to “polynucleotides” and “oligonucleotides” (which are interchangeable terms that refer to a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “C-A-G-T,” is complementary to the sequence “G-T-C-A.”
Complementarity can be “partial” or “total.” “Partial” complementarity is where one or more nucleic acid bases is not matched according to the base pairing rules. “Total” or “complete” complementarity between nucleic acids is where each and every nucleic acid base is matched with another base under the base pairing rules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
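The base-pairing rules in the example above can be expressed as a short sketch. The function names are hypothetical. The example pairs the two sequences position by position; when both strands are instead written 5′ to 3′, the pairing partner is the reverse complement, which is shown as well.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Position-by-position complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (both strands 5' to 3')."""
    return complement(seq)[::-1]


print(complement("CAGT"))          # GTCA
print(reverse_complement("CAGT"))  # ACTG
```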
Ribosomal RNA molecules are characterized by the presence of numerous sequences that can form complementary base pairs with sequences located elsewhere in the same molecule. Such interactions cause rRNA molecules to fold into three-dimensional configurations that exhibit localized double-stranded regions.
As used herein, the term “gene” means the deoxyribonucleotide sequences comprising the coding region and including sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences which are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ non-translated sequences. The sequences which are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ non-translated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene.
The chromosomal DNA of prokaryotic cells contains multiple copies of the genes coding for rRNAs. For example, the bacterium E. coli (“EC”) contains seven sets of rRNA genes. In the rRNA transcription unit of E. coli, the three genes are arranged in the order 16S-23S-5S, with “spacer” DNA separating each gene.
The terms “homology” and “homologous” as used herein in reference to nucleotide sequences refer to a degree of complementarity with other nucleotide sequences. There may be partial homology or complete homology (i.e., identity). A nucleotide sequence which is partially complementary, i.e., “substantially homologous,” to a nucleic acid sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target sequence under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target sequence which lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.
Low stringency conditions comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5×Denhardt's reagent [50×Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.
Other equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol), as well as components of the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, conditions which promote hybridization under conditions of high stringency can be used (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).
When used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term “substantially homologous” refers to any probe which can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.
When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe which can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.
As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids using any process by which a strand of nucleic acid joins with a complementary strand through base pairing to form a hybridization complex. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids.
As used herein the term “hybridization complex” refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary G and C bases and between complementary A and T bases; these hydrogen bonds may be further stabilized by base stacking interactions. The two complementary nucleic acid sequences hydrogen bond in an antiparallel configuration. A hybridization complex may be formed in solution (e.g., C0t or R0t analysis) or between one nucleic acid sequence present in solution and another nucleic acid sequence immobilized to a solid support [e.g., a nylon membrane or a nitrocellulose filter as employed in Southern and Northern blotting, dot blotting or a glass slide as employed in in situ hybridization, including FISH (fluorescent in situ hybridization)].
As used herein, the term “Tm” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the Tm of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm=81.5+0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl [see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)]. Other references include more sophisticated computations which take structural as well as sequence characteristics into account for the calculation of Tm.
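The simple estimate above can be worked through in a brief sketch. The function names and the four-base test sequence are hypothetical; the formula is the one stated above (1 M NaCl, result in degrees Celsius) and none of the more sophisticated corrections are applied.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def simple_tm(seq: str) -> float:
    """Simple Tm estimate: Tm = 81.5 + 0.41 * (% G+C), at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


# A toy sequence with 50% G+C gives 81.5 + 0.41 * 50 = 102.0.
print(round(simple_tm("ATGC"), 1))
```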
As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. “Stringency” typically occurs in a range from about Tm−5° C. (5° C. below the Tm of the probe) to about 20° C. to 25° C. below Tm. As will be understood by those of skill in the art a stringent hybridization can be used to identify or detect identical polynucleotide sequences or to identify or detect similar or related polynucleotide sequences.
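As a small worked illustration of the range just described, the temperature window can be computed from a given Tm. The function name is hypothetical, and the lower bound uses the 25° C. figure as an assumption for the widest window.

```python
def stringency_window(tm_celsius: float) -> tuple:
    """Return (lowest, highest) hybridization temperature in the
    approximate range from Tm - 25 to Tm - 5 degrees Celsius."""
    return (tm_celsius - 25.0, tm_celsius - 5.0)


# For a probe with a Tm of 70 degrees Celsius:
print(stringency_window(70.0))  # (45.0, 65.0)
```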
As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids which may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”
As used herein, the term “sample template” refers to nucleic acid originating from a sample which is analyzed for the presence of a target sequence of interest. In contrast, “background template” is used in reference to nucleic acid other than sample template which may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
“Amplification” is defined as the production of additional copies of a nucleic acid sequence and is generally carried out using polymerase chain reaction technologies well known in the art [Dieffenbach CW and GS Dveksler (1995) PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.]. As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis U.S. Pat. Nos. 4,683,195 and 4,683,202, hereby incorporated by reference, which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. The length of the amplified segment of the desired target sequence is determined by the relative positions of two oligonucleotide primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified”.
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
Amplification in PCR requires “PCR reagents” or “PCR materials”, which herein are defined as all reagents necessary to carry out amplification except the polymerase, primers and template. PCR reagents normally include nucleic acid precursors (dCTP, dTTP etc.) and buffer.
As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
As used herein, the term “probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labelled with any “reporter molecule,” so that it is detectable using any detection system including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
As used herein, the terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence. Such enzymes can be used to create Restriction Fragment Length Polymorphisms (RFLPs). RFLPs are, in essence, unique fingerprint snapshots of a piece of DNA, be it a whole chromosome (genome) or some part of this, such as the regions of the genome that specifically flank ribosomal RNA operons. All such RFLP fingerprints are indicative of the random mutations in all DNA molecules that inevitably occur over evolutionary time. Because of this, if properly interpreted, evolutionary relatedness of any two genomes can be compared based on the fundamental assumption that all organisms have had a common ancestor. Thus, the greater the difference in RFLP fingerprint profiles, the greater the degree of evolutionary divergence between them (although there are exceptions). With such an understanding, it then becomes possible, using appropriate algorithms, to convert RFLP profiles of a group of organisms (e.g. bacterial isolates) into a phylogenic (evolutionary) tree.
RFLPs are generated by cutting (“restricting”) a DNA molecule with a restriction endonuclease. Many hundreds of such enzymes have been isolated, as naturally made by bacteria. In essence, bacteria use such enzymes as a defensive system, to recognize and then cleave (restrict) any foreign DNA molecules which might enter the bacterial cell (e.g. a viral infection). Each of the many hundreds of different restriction enzymes has been found to cut (i.e. “cleave” or “restrict”) DNA at a different sequence of the 4 basic nucleotides (A, T, G, C) that make up all DNA molecules, e.g. one enzyme might specifically and only recognize the sequence A-A-T-G-A-C, while another might specifically and only recognize the sequence G-T-A-C-T-A, etc. Dependent on the unique enzyme involved, such recognition sequences vary in length, from as few as 4 nucleotides (e.g. A-T-C-C) to as many as 21 nucleotides (A-T-C-C-A-G-G-A-T-G-A-C-A-A-A-T-C-A-T-C-G). From here, the simplest way to consider the situation is that the larger the recognition sequence, the fewer restriction fragments will result, because the larger the recognition site, the lower the probability that it will repeatedly be found throughout the genomic DNA.
In one embodiment, the present invention utilizes the restriction enzyme called EcoRI which has a 6 base pair (nucleotide) recognition site. Thus, given that there exist but 4 nucleotides (A, T, G, C), the probability that this unique 6 base recognition site will occur is 1 in 4^6, or once in every 4,096 nucleotides. Given that the H. influenzae (“Hi”) genome (chromosome) is approximately 2×10^6 bp (base pairs) in length, digestion of this DNA with EcoRI theoretically should yield 488 fragments. This varies significantly from isolate to isolate of H. influenzae because of “random mutations” that inevitably occur over evolutionary time, some of which either destroy an EcoRI sequence cutting site, or create a new one. As such, the degree of variation in EcoRI RFLP profiles among a series of isolates within a given species such as H. influenzae, is indicative of the degree of genetic relatedness of these isolates (although there are exceptions). Using appropriate algorithms, such RFLP profiles are readily converted to “phylogenetic trees” which are simply diagrammatic figures indicating the evolutionary divergence of isolates from some theoretically common ancestor.
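The arithmetic behind the figure of 488 fragments can be checked with a minimal sketch. The function names are hypothetical, and the calculation assumes, as the estimate above does, that the four nucleotides are equally frequent and independently distributed along the genome.

```python
def expected_site_spacing(site_length: int, alphabet_size: int = 4) -> int:
    """Average number of nucleotides between occurrences of a site
    of the given length: alphabet_size ** site_length."""
    return alphabet_size ** site_length


def expected_fragments(genome_bp: int, site_length: int) -> float:
    """Expected number of restriction fragments for a genome of genome_bp."""
    return genome_bp / expected_site_spacing(site_length)


print(expected_site_spacing(6))                  # 4096
print(round(expected_fragments(2_000_000, 6)))   # 488
```

The same functions show why a shorter recognition site yields many more fragments: a 4 base site is expected once every 256 nucleotides.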
Once the genomic (chromosomal) DNA of a bacterial isolate has been isolated, it is then digested (cut) with an enzyme such as EcoRI. Following the digestion, the resultant individual fragments are separated from one another based on their sizes. This can be done by using agarose gel electrophoresis. In essence, during electrophoresis the smaller molecules (DNA fragments) move faster than larger ones and thus the resultant separation is a gradient from the largest to the smallest fragments. These can easily be visualized as bands down the electrophoresis gel, from the top to the bottom with the smallest fragments bottom-most.
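The digestion and size separation just described can be simulated in a short sketch. The function names and the toy sequence are hypothetical. The sketch treats the DNA as a linear molecule, which is a simplifying assumption, and places the cut after the first base of the GAATTC site, which is where EcoRI cleaves.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Cut a linear sequence at every occurrence of the site.
    cut_offset is the cut position within the site (1 = after the first G)."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def gel_order(fragments: list) -> list:
    """Fragment sizes as read from the top of the gel to the bottom:
    largest first, smallest (fastest-migrating) last."""
    return sorted((len(f) for f in fragments), reverse=True)


# Hypothetical toy molecule containing two EcoRI sites.
toy = "AAAA" + "GAATTC" + "C" * 9 + "GAATTC" + "TT"

print(gel_order(digest(toy)))  # [15, 7, 5]
```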
Using ribotyping methodology, DNA fragments involving the multiple (6 for the case of H. influenzae, 7 for the case of E. coli, etc) ribosomal RNA operons and the immediately flanking DNA sequences (genes) can be distinguished by hybridization of the resultant electrophoresis separated DNA fragments with a radioactively labeled ribosomal operon DNA probe. This then reduces the total number of visualized DNA fragments (predicted above to be approximately 488 restriction fragments) to those only including or immediately flanking the RNA operons, about 14 fragments in toto for H. influenzae. Nonetheless, because of inevitable random background mutation indicative of evolutionary time, with the exception of very recently evolved clones, every independent isolate of H. influenzae will have a variant EcoRI ribotype RFLP profile. And the more variant, the more distantly related will be any two isolates so compared.
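The reduction from all restriction fragments to only those detected by the ribosomal operon probe can be sketched as follows. Hybridization is approximated here by coordinate overlap between a fragment and an operon, which is an assumption made purely for illustration; the function name and all coordinates are hypothetical.

```python
def ribotype_bands(fragments, operons):
    """Return sizes (largest first) of fragments that overlap any operon.
    fragments and operons are lists of half-open (start, end) coordinates."""
    bands = []
    for f_start, f_end in fragments:
        overlaps = any(f_start < o_end and o_start < f_end
                       for o_start, o_end in operons)
        if overlaps:
            bands.append(f_end - f_start)
    return sorted(bands, reverse=True)


# Hypothetical coordinates: four fragments, one operon spanning a cut site.
fragments = [(0, 3000), (3000, 9000), (9000, 12000), (12000, 20000)]
operons = [(2500, 7500)]

print(ribotype_bands(fragments, operons))  # [6000, 3000]
```

Because the operon spans one cut site, two fragments are detected, each carrying part of the operon together with flanking DNA.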
DNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring. An end of an oligonucleotide is referred to as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of another mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand.
As used herein, the term “an oligonucleotide having a nucleotide sequence encoding a gene” means a nucleic acid sequence comprising the coding region of a gene, i.e. the nucleic acid sequence which encodes a gene product. The coding region may be present in either a cDNA, genomic DNA or RNA form. When present in a DNA form, the oligonucleotide may be single-stranded (i.e., the sense strand) or double-stranded.
As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.
The term “Southern blot” refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size, followed by transfer and immobilization of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, N.Y., pp 9.31-9.58].
The term “Northern blot” as used herein refers to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) supra, pp 7.39-7.52].
The term “reverse Northern blot” as used herein refers to the analysis of DNA by electrophoresis of DNA on agarose gels to fractionate the DNA on the basis of size followed by transfer of the fractionated DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled oligo-ribonucleotide probe or RNA probe to detect DNA species complementary to the ribo probe used.
The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is nucleic acid present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA which are found in the state they exist in nature.
As used herein, the term “purified” or “to purify” refers to the removal of undesired components from a sample.
As used herein, the term “substantially purified” refers to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated. An “isolated polynucleotide” is therefore a substantially purified polynucleotide.
The term “sample” as used herein is used in its broadest sense and includes environmental and biological samples. Environmental samples include material from the environment such as soil and water. Biological samples may be animal, including human, fluid (e.g., blood, plasma and serum), solid (e.g., stool), tissue, liquid foods (e.g., milk), and solid foods (e.g., vegetables).
The terms “bacteria” and “bacterium” refer to all prokaryotic organisms, including those within all of the phyla in the Kingdom Procaryotae. It is intended that the term encompass all microorganisms considered to be bacteria including Mycoplasma, Chlamydia, Actinomyces, Streptomyces, and Rickettsia. All forms of bacteria are included within this definition including cocci, bacilli, spirochetes, spheroplasts, protoplasts, etc. Also included within this term are prokaryotic organisms which are gram negative or gram positive. “Gram negative” and “gram positive” refer to staining patterns with the Gram-staining process which is well known in the art [Finegold and Martin, Diagnostic Microbiology, 6th Ed. (1982), CV Mosby St. Louis, pp 13-15]. “Gram positive bacteria” are bacteria which retain the primary dye used in the Gram stain, causing the stained cells to appear dark blue to purple under the microscope. “Gram negative bacteria” do not retain the primary dye used in the Gram stain, but are stained by the counterstain. Thus, gram negative bacteria appear red.