1. Field of the Invention
The present invention relates to methods of analyzing gene expression that require only small amounts of biological samples.
2. Description of the Background
Several methods are now available for monitoring gene expression on a genomic scale. These include DNA microarrays (1, 2) and macroarrays (3, 4), expressed sequence tag (EST) determination (5, 6), and serial analysis of gene expression (SAGE) (7). Such methods have been designed, and are still used, for analyzing macroamounts of biological material (1-5 µg of poly(A) mRNAs, i.e. ~10^7 cells). However, mammalian tissues consist of several different cell types with specific physiological functions and gene expression patterns. This heterogeneity complicates the interpretation of large-scale expression data in higher organisms. It is therefore most desirable to develop methods suitable for the analysis of defined cell populations.
SAGE has been shown to provide rapid and detailed information on transcript abundance and diversity (7-10). It involves several steps for mRNA purification, cDNA tag generation and isolation, and PCR amplification. It was reasoned that increasing the yield of the various extraction procedures, together with slight modifications in the number of PCR cycles, could extend the applicability of SAGE to smaller samples.
SAGE was first described by Velculescu et al. in 1995 (U.S. Pat. No. 5,695,937, incorporated herein by reference, and ref. 7), and rests on 3 principles which have now all been corroborated experimentally: a) short nucleotide sequence tags (10 bp) are long enough to be specific for a transcript, especially if they are isolated from a defined portion of each transcript; b) concatenation of several tags within a single DNA molecule greatly increases the throughput of data acquisition; c) the quantitative recovery of transcript-specific tags allows representative gene expression profiles to be established.
However, this method was designed to study macroamounts of biological materials (5 µg of poly(A) RNAs, i.e. about 10^7 cells). Since mammalian tissues consist of several different cell types with specific physiological functions and gene expression patterns, it is most desirable to scale down the SAGE approach for studying well delineated tissue fragments or isolated cell populations.
Accordingly, there remains a need for a process that may be conducted using smaller amounts of the biological sample as compared to the SAGE method described above.
It is an object of the present invention to provide a method of analyzing gene expression which requires only small quantities of biological samples.
The present invention is based on the discovery of a microadaptation of SAGE, referred to herein as SADE since, in contrast to SAGE, the inventive method described herein provides quantitative gene expression data on a small number (30,000-50,000) of cells.
The object of the present invention above, and others, may be accomplished with a method of obtaining a library of tags able to define a specific state of a biological sample, such as a tissue or a cell culture, comprising the following successive steps:
(1) extracting mRNA in a single step from a small amount of a biological sample using an oligo(dT), e.g., oligo(dT)25, covalently bound to paramagnetic beads,
(2) generating a double-strand cDNA library, from the mRNA according to the following steps:
(a) synthesizing the 1st strand of the cDNA by reverse transcription of the mRNA template into a 1st complementary single-strand cDNA, using a reverse transcriptase lacking RNase H activity,
(b) synthesizing the 2nd strand of the cDNA by nick translation of the mRNA, in the mRNA-cDNA hybrid form by an E. coli DNA polymerase,
(3) cleaving the obtained cDNAs with an anchoring enzyme, corresponding to a restriction endonuclease with a 4-bp recognition site,
(4) separating the cleaved cDNAs into two aliquots,
(5) linking or ligating the cDNA contained in each of the two aliquots with different oligonucleotide linkers comprising a type IIS recognition site,
(6) digesting the products obtained in step (5) with a type IIS restriction enzyme and obtaining two different tags,
(7) blunt-ending the tags with a DNA polymerase, e.g., T7 DNA polymerase or Vent polymerase, and mixing the tags ligated with the different linkers,
(8) ligating the tags obtained in step (7) to form ditags with a DNA ligase, and
(9) determining the nucleotide sequence of at least one tag of the ditag to detect gene expression.
The object of the invention may also be accomplished with a more specific method of obtaining a library of tags able to define a specific state of a biological sample, such as a tissue or a cell culture, comprising the following successive steps:
(1) extracting mRNA in a single step from a small amount of a biological sample using an oligo(dT), e.g., oligo(dT)25, covalently bound to paramagnetic beads,
(2) generating a double-strand cDNA library, from the mRNA according to the following steps:
(a) synthesizing the 1st strand of the cDNA by reverse transcription of the mRNA template into a 1st complementary single-strand cDNA, using a reverse transcriptase lacking RNase H activity,
(b) synthesizing the 2nd strand of the cDNA by nick translation of the mRNA, in the mRNA-cDNA hybrid form by an E. coli DNA polymerase,
(3) cleaving the obtained cDNAs using the restriction endonuclease Sau3A I as anchoring enzyme,
(4) separating the cleaved cDNAs into two aliquots,
(5) ligating the cDNA contained in each of the two aliquots via the Sau3A I restriction site to a linker consisting of one double-strand cDNA molecule having one of the following formulas:
GATCGTCCC-X1 (SEQ ID NO:1) or GATCGTCCC-X2 (SEQ ID NO:2),
wherein X1 and X2, which comprise 30-37 nucleotides and are different, include a 20-25 bp PCR priming site with a Tm of 55° C.-65° C., and
wherein GATCGTCCC corresponds to a Sau3A I restriction site joined to a BsmF I restriction site,
(6) digesting the products obtained in step (5) with the tagging enzyme BsmF I and releasing linkers each carrying an anchored short piece of cDNA corresponding to a transcript-specific tag, the digestion generating BsmF I tags specific for the initial mRNA,
(7) blunt-ending the BsmF I tags with a DNA polymerase, preferably T7 DNA polymerase or Vent polymerase, and mixing the tags ligated with the different linkers,
(8) ligating the tags obtained in step (7) to form ditags with a DNA ligase,
(9) amplifying the ditags obtained in step (8) with primers comprising 20-25 bp and having a Tm of 55°-65° C.,
(10) isolating the ditags having between 20 and 28 bp from the amplification products obtained in step (9) by digesting the amplification products with the anchoring enzyme Sau3A I and separating the digested products by appropriate gel electrophoresis,
(11) ligating the ditags obtained in step (10) to form concatemers, purifying the concatemers and separating the concatemers having more than 300 bp,
(12) cloning and sequencing the concatemers and
(13) analyzing the different obtained tags.
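Steps (11) to (13) can be illustrated in silico. The following Python sketch is offered only as an illustration and is not part of the described method: it assumes that a sequenced concatemer consists of ditags separated by the Sau3A I site (GATC), that each ditag is 20-28 bp long as stated in step (10), and that each tag is 10 bp long, the tag size quoted above for SAGE. The function names and the tag-length parameter are hypothetical. A ditag whose tag junction happens to create an internal GATC would be split incorrectly by this simplified parser.

```python
# Illustrative sketch only; names and TAG_LEN are assumptions, not from the source.
ANCHOR_SITE = "GATC"   # Sau3A I recognition site
TAG_LEN = 10           # assumed tag length (the 10 bp quoted for SAGE tags)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tags_from_concatemer(concatemer, min_len=20, max_len=28):
    """Split a concatemer at anchoring sites and return the tags it contains.

    Each ditag of acceptable length yields two tags: the first TAG_LEN bases
    read as is, and the last TAG_LEN bases read on the opposite strand.
    """
    tags = []
    for ditag in concatemer.upper().split(ANCHOR_SITE):
        if not (min_len <= len(ditag) <= max_len):
            continue  # skip empty pieces and fragments outside the ditag size range
        tags.append(ditag[:TAG_LEN])
        tags.append(reverse_complement(ditag[-TAG_LEN:]))
    return tags


example = "GATC" + "AAAAAAAAAACCCCCCCCCCCC" + "GATC" + "ACGTACGTAAGGTTCCTTGGAA" + "GATC"
print(tags_from_concatemer(example))
```

Reading the second tag of each ditag as a reverse complement reflects the tail-to-tail orientation in which two blunt-ended tags are ligated.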
The object of the present invention is also accomplished with the use of a library of tags obtained according to the methods described above, for assessing the state of a biological sample, such as a tissue or a cell culture.
The present invention also includes the use of the tags obtained according to the methods described above as probes.
The subject of the present invention is also a method of determining a gene expression profile, comprising:
performing one of the methods defined above, and
translating cDNA tag abundance into a gene expression profile.
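By way of illustration only, the translation of tag abundance into an expression profile can be sketched as a simple count of each distinct tag, expressed both as an absolute count and as a fraction of all tags sequenced. The function name below is hypothetical, and no correction for PCR distortion or sequencing error is attempted.

```python
from collections import Counter


def expression_profile(tags):
    """Map each distinct tag to (count, fraction of all tags), most abundant first.

    Hypothetical helper for illustration; an empty tag list gives an empty profile.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: (n, n / total) for tag, n in counts.most_common()}


profile = expression_profile(["AAAAAAAAAA"] * 3 + ["CCCCCCCCCC"])
for tag, (count, fraction) in profile.items():
    print(tag, count, fraction)
```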
The present invention also includes a kit useful for detection of a gene expression profile, characterized in that the presence of a cDNA tag, obtained from the mRNA extracted from a biological sample, is indicative of expression of a gene having the tag sequence at an appropriate position, i.e. immediately adjacent to the most 3′ Sau3A I site in the cDNA obtained from the mRNA, the kit comprising, in addition to the usual buffers for cDNA synthesis, restriction enzyme digestion, ligation and amplification,
containers containing a linker consisting of one double-strand cDNA molecule having one of the following formulas:
GATCGTCCC-X1 (SEQ ID NO:1) or GATCGTCCC-X2 (SEQ ID NO:2),
wherein X1 and X2, which comprise 30-37 nucleotides and are different, include a 20-25 bp PCR priming site with a Tm of 55° C.-65° C., and
wherein GATCGTCCC corresponds to a Sau3A I restriction site joined to a BsmF I restriction site, and
containers containing primers comprising 20-25 bp and having a Tm of 55°-65° C.
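The tag position referred to above, immediately adjacent to the most 3′ Sau3A I site, can be located in a known cDNA sequence as sketched below in Python. This is a minimal illustration under stated assumptions: the cDNA is given as its sense strand in the 5′ to 3′ direction, the tag length of 10 bp is taken from the SAGE tag size quoted in the background, and the function name is hypothetical.

```python
def virtual_tag(cdna, anchor_site="GATC", tag_len=10):
    """Return the tag immediately 3' of the last anchoring site, or None.

    None is returned when the cDNA has no anchoring site or when fewer than
    tag_len bases follow the last site. tag_len is an assumed parameter.
    """
    seq = cdna.upper()
    pos = seq.rfind(anchor_site)          # most 3' occurrence of the site
    if pos == -1:
        return None
    start = pos + len(anchor_site)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


print(virtual_tag("TTGATCAAAGATCACGTACGTACGGGAAAAAA"))
```

A table of such predicted tags, built from known cDNA sequences, is one possible way to relate an observed tag to the gene from which it derives.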
As compared to SAGE, the inventive SADE method includes the following features: 1) single-step mRNA purification from tissue lysate; 2) use of a reverse transcriptase lacking RNase H activity; 3) use of a different anchoring enzyme; 4) modification of procedures for blunt-ending cDNA tags; and 5) design of new linkers and PCR primers.
FIG. 1, modified from the original studies of Velculescu et al., summarizes the different steps of the SADE method, which is a microadaptation of SAGE. Briefly, as described above, mRNAs are extracted using oligo(dT)25, covalently bound to paramagnetic beads. Double-strand cDNA is synthesized from mRNA using oligo(dT)25 as primer for the 1st strand synthesis. The cDNA is then cleaved using a restriction endonuclease (anchoring enzyme: Sau3A I) with a 4-bp recognition site. Since such an enzyme cleaves DNA molecules every 256 bp (4^4) on average, virtually all cDNAs are predicted to be cleaved at least once. The 3′ end of each cDNA is isolated using the property of the paramagnetic beads and divided in half. Each of the two aliquots is ligated via the anchoring enzyme restriction site to one of the two linkers containing a type IIS recognition site (tagging enzyme: BsmF I) and a priming site for PCR amplification. Type IIS restriction endonucleases display recognition and cleavage sites separated by a defined length (14 bp for BsmF I), irrespective of the intercalated sequence. Digestion with the type IIS restriction enzyme thus releases linkers with an anchored short piece of cDNA, corresponding to a transcript-specific tag. After blunt ending of tags, the two aliquots are linked together and amplified by PCR. Since all targets are of the same length (110 bp) and are amplified with the same primers, potential distortions introduced by PCR are greatly reduced. Furthermore, these distortions can be evaluated, and the data corrected accordingly (7, 8). Ditags present in the PCR products are recovered through digestion with the anchoring enzyme and gel purification, then concatenated and cloned.
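The 256 bp figure follows from a 4-bp recognition site occurring on average once every 4^4 positions. The Python sketch below illustrates this arithmetic and estimates the probability that a cDNA of a given length is cleaved at least once. It rests on simplifying assumptions that are not taken from the source: the four bases are equally frequent, and candidate positions are treated as independent. The function names are hypothetical.

```python
def expected_site_spacing(site_len=4):
    """Average spacing, in bp, between occurrences of a site of site_len bases."""
    return 4 ** site_len


def prob_cut_at_least_once(cdna_len, site_len=4):
    """Approximate probability that a random sequence contains the site.

    Assumes equal base frequencies and independent positions (a simplification).
    """
    positions = max(cdna_len - site_len + 1, 0)   # possible start positions
    p_site = 1 / 4 ** site_len                    # chance of the site at one position
    return 1 - (1 - p_site) ** positions


print(expected_site_spacing())
for length in (500, 1000, 2000):
    print(length, round(prob_cut_at_least_once(length), 4))
```

Under these assumptions the probability rises quickly with cDNA length, which is consistent with the statement that virtually all cDNAs are predicted to be cleaved at least once.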