With the advent of the Human Genome Project, one is confronted with voluminous information demonstrating that biological systems may be controlled by hundreds of genes working in concert. A single glance at the ever-increasing number of genes involved in signal transduction makes one wonder just how many genes are needed to choreograph the implementation of a signal, from receptor-ligand binding to the nuclear response of transcriptional activation. During the 1980s and early 1990s, biologists were busy dissecting the functions of single genes from the reductionist point of view. This approach, while thorough in its exact methodological analysis of genetic impact, lacks the expanded vision of how each single gene functions in the context of many sister genes or partners to accomplish a biological task. Thus, it is not surprising that the technology of high-throughput gene screening is emerging rapidly, in the attempt to identify tens or hundreds of genes whose changes, viewed as composite genetic signatures, define a particular physiological state. This gene signature approach, complemented by single gene analysis, provides a vertical, in-depth analysis of an individual gene's function, as well as a comprehensive picture of the pattern of gene expression in which the particular gene functions. The notion of a genetic signature can be further generalized to address the question of inter-individual variance, by comparing individuals from cohorts of hundreds or thousands.
The daunting task of comparing several dozen single nucleotide polymorphisms (SNPs) in a hundred people can now be approached readily by DNA biochip technology (Wang, et al. Science 280:1077-1082 (1998)). For example, a p53 DNA chip is widely used for gene screening and the identification of unique cancer risks, both to discover new SNPs and to screen for known SNPs. Either task needs a fast, multiplex approach handling data on the scale of hundreds or thousands of entries, a demand that can only be met by high-throughput technology. The presently available microarray biochip technology is the method of choice for addressing this complexity, and for the previously impossible task of defining a genetic signature for a unique person in a cohort with an accuracy and speed unattainable by conventional diagnostic approaches. Therefore, from bench-side researchers to bedside physicians, there is intense interest in the technology of microarray analysis for screening or identifying tens or hundreds of genes related to disease or normal states of a given person or biological system.
cDNA and oligonucleotide microarrays are becoming increasingly powerful tools for investigating gene expression patterns. In spite of the fast progress in this field, some limitations of the technique persist. One of the major obstacles is the requirement for a large amount of mRNA. Another problem with existing microarray systems is data mining: while information on the expression of tens of thousands of genes is vital for estimating the functions of new genes, in some instances a researcher is interested in the expression profile of only a subset of genes under many physiological conditions.
It is an object of the present invention to provide a method and materials for the rapid analysis of genetic information based on a common regulatory feature.
It is a further object of the present invention to provide a method and materials for sensitive and quick analysis of genetic information present in very small quantities.
Microarray technology is a fast-growing field of biomedical research, aiming to investigate changes in molecular features of hundreds of genes. The parallel processing of information generated from matrices of huge numbers of loci on a solid substrate has allowed the gathering of gene signatures defining specific biological states. A new approach has been developed to facilitate this process, wherein genes of the same regulatory modality are selected. The transcriptional regulation of these genes is related to the same control element, the E-box, defined by the sequence CACGTG. PCR products of selected regions of all known genes whose products bind to this sequence or whose expression is dependent on this binding, as well as genes interacting with E-box-binding genes and control genes, are arrayed on a nylon membrane or other appropriate microchip substrate, which is then used as an E-box-specific microarray. The transcriptionally regulated profile of E-box-related genes specific to a given cultured cell sample is then determined using unique labeled cDNA probes produced from RNAs isolated from the culture of interest.
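The role of the E-box sequence as a selection criterion can be illustrated with a minimal sketch. It assumes, purely for illustration, that candidate genes are represented by short regulatory sequences and that a gene is kept when its sequence contains CACGTG. The function names and the toy sequences are hypothetical and are not part of the described method, which selects genes from what is known about E-box binding and E-box-dependent expression rather than from a sequence scan.

```python
EBOX = "CACGTG"  # E-box core sequence named in the text


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_ebox_sites(sequence, motif=EBOX):
    """Return 0-based start positions of every motif occurrence.

    CACGTG equals its own reverse complement, so scanning one strand
    also covers the opposite strand.
    """
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def select_ebox_genes(regulatory_regions):
    """Return the sorted names of entries containing at least one E-box."""
    return sorted(
        name for name, seq in regulatory_regions.items()
        if find_ebox_sites(seq)
    )


if __name__ == "__main__":
    # Hypothetical toy sequences, not real promoter data.
    toy_regions = {
        "gene_A": "ttgaCACGTGccat",
        "gene_B": "aaaattttccccgggg",
        "gene_C": "CACGTGaaaaCACGTG",
    }
    print(select_ebox_genes(toy_regions))
    print(find_ebox_sites(toy_regions["gene_C"]))
```

Because the motif is palindromic, a single-strand scan is sufficient here; a non-palindromic control element would require scanning the reverse complement as well.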
The production of E-box microarrays provides an approach to custom-adapt the gene screening task to the analysis of a subgroup of genes whose expression is controlled by the same molecular modality. E-box binding-related genes represent a specific group of basic helix-loop-helix/leucine zipper transcription factors that recognize the core binding site CACGTG. They play important roles in the regulation of basic cellular functions, such as proliferation and apoptosis (c-Myc) or tissue-specific differentiation (MyoD). As demonstrated by the example, careful selection of genes for the microarray allowed the extraction of E-box gene-specific signatures of HeLa cells and normal human lymphocytes. Significant differences in the expression of 3-6 genes out of 61 are far more manageable to interpret than the results obtained from ordinary microarrays carrying massive numbers of genes, in the hundreds or thousands.
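The comparison of two expression signatures on a small focused array can be sketched as follows. This is only an illustration under stated assumptions: the signal intensities are invented placeholder values, the gene labels are hypothetical, and the two-fold cutoff is an assumed threshold, not a criterion taken from the example referred to above.

```python
import math


def log2_ratio(signal_a, signal_b, floor=1.0):
    """Log2 ratio of two spot intensities, floored to avoid division by zero."""
    return math.log2(max(signal_a, floor) / max(signal_b, floor))


def differential_genes(sample_a, sample_b, min_fold=2.0):
    """Return {gene: log2 ratio} for genes changed by at least min_fold.

    Only genes measured in both samples are compared.
    """
    cutoff = math.log2(min_fold)
    hits = {}
    for gene in sample_a.keys() & sample_b.keys():
        ratio = log2_ratio(sample_a[gene], sample_b[gene])
        if abs(ratio) >= cutoff:
            hits[gene] = ratio
    return hits


if __name__ == "__main__":
    # Placeholder intensities for two samples (hypothetical values).
    sample_1 = {"gene_A": 400.0, "gene_B": 100.0, "gene_C": 50.0}
    sample_2 = {"gene_A": 100.0, "gene_B": 110.0, "gene_C": 200.0}
    for gene, ratio in sorted(differential_genes(sample_1, sample_2).items()):
        print(gene, round(ratio, 2))
```

With only a few dozen spots on the array, the resulting list of changed genes stays short enough to inspect directly, which is the practical advantage the focused array is meant to provide.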