1. Field of the Invention
The present invention relates to DNA superfragments, a method of preparing DNA superfragments, and a method of cloning which utilizes DNA superfragments.
2. Discussion of the Background
The primary goal of most genetic laboratories is to identify, isolate (clone) and characterize genes. This primary function is severely hampered in eukaryotes by the complexity (size) of the genome (e.g., the human genome contains 3×10^9 base pairs on each set of chromosomes), and by the fact that transcribed (functioning) genes represent only a small percentage of the genetic material, which is hidden among the remaining DNA. While great progress has been made in the isolation of genes involved in metabolic pathways, because of the abundance of the proteins encoded by such genes and the availability of biochemical selection for the property of the gene product (i.e., assays that allow the elimination of cells that do not contain the desired gene product), cloning of most genes currently requires more tedious "reverse genetics" approaches.
These approaches rely on the localization of a gene to a particular chromosome or chromosomal region, and then "walking" along a chromosome from a known DNA marker to the desired gene. There are several problems inherent in reverse genetic approaches. These include but are not limited to the following:
(i) Currently, if one knows the approximate chromosomal location of a gene, one can look for the nearest chromosomal marker and then walk along the chromosome to the gene (Molecular Cloning, A Laboratory Manual, 2nd Ed., J. Sambrook, E. F. Fritsch, and T. Maniatis, Cold Spring Harbor Laboratory Press (1989)). There are 3 billion base pairs of DNA along the length of the genome. Typically, available DNA markers are sparsely distributed (H. Donis-Keller et al, "A Genetic Linkage Map of the Human Genome", Cell, vol. 51, pp. 319-337 (1987)). For example, only seven markers have been mapped over a range of 15 centimorgans (approximately 15 million base pairs) in a region of chromosomal band 11p15 that may harbor a tumor suppressor gene. Worse, these markers are often not equally spaced and there are often large regions that lack DNA markers. In order to walk from one chromosomal location to another, one uses a DNA marker or probe to screen a genomic library containing the entire human genome or a large portion of the chromosome of interest. One identifies clones that contain this marker. A typical bacteriophage library contains DNA fragments of approximately 15,000 base pairs. Thus, each chromosomal walk will allow isolation of new fragments at an average distance of 15,000 base pairs. A total of over 60 such walks (at best requiring 1 month each) is necessary to walk each one million base pairs. Specialized methods have been developed to shorten this process. The most successful of these approaches, yeast artificial chromosome (YAC) cloning (Burke, D. T.; Carle, G. F.; and Olson, M. V., Science, vol. 236, pp. 806-812 (1987)), allows one to walk through a library in increments of approximately 250,000 base pairs on average. In the example of chromosome 11, this process would still require over 500 YACs to encompass the chromosome, and several times that number to account for overlap.
Furthermore, one may walk for quite a distance and then get "stuck", if there is a gap in the representation of the library, which often occurs.
(ii) One often does not know which of many transcribed sequences in a walking effort represents the gene of interest, since there is no convenient way to assay for the gene's properties. This is because in conventional reverse genetic approaches, only a portion of the gene is isolated and considerable additional effort is necessary to obtain a functionally useful clone. This requires a second effort of cloning, in this case using probes derived from genomic walking to screen a cDNA library. The entire cDNA must be isolated, which is often difficult, and it must be placed into the appropriate expression vector in order to assay for the phenotypic properties of its expression. Because of this, complementation studies (screening tests demonstrating the functional properties of the gene) commonly are done later than gene cloning, and they are not helpful in actually isolating the gene. The identity of the gene is thus usually confirmed by identification of mutations within candidate genes in disease states (Riordan, J. R. et al, "Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA", Science, vol. 245, pp. 1066-1073 (1989)), association with chromosomal alterations (Friend et al, "A Human DNA Segment with Properties of the Gene that Predisposes to Retinoblastoma and Osteosarcoma", Nature, vol. 323, pp. 643-646 (1986)), or other cumbersome means. Since one often does not know the precise location of the gene of interest and must begin with a candidate region of millions of base pairs or even a whole chromosomal arm or entire chromosome, there is often no clear starting point and no well-defined endpoint.
(iii) Current strategies do not permit one to screen clones for the biological property that the gene of interest confers. For the purposes of the present application, a screening process is any process by which it is possible to determine if a cell expresses a particular gene product, and which is not based on the selective survival of the cell or the ability to isolate the cell from all other cells because of the expression of the particular gene product. In contrast, for the purposes of the present application, a selection process is any process by which it is possible to determine if a cell expresses a particular gene product, and which is based on the selective survival of the cell or the ability to isolate the cell from all other cells because of the expression of the particular gene product. Thus, there is currently no generalized strategy that would allow one to screen for a gene, enabling one to know which of many clones contain the gene with the properties one desires, unless there is a way to select for that gene as well. This is a common problem in molecular biology. One can easily construct a "library" of tens of thousands of genes, each in a bacterial colony, but unless one has a way of selecting those colonies that express the gene of interest, one cannot find the "needle in a haystack", namely the colony containing the gene of interest among the many thousands of other genes.
For example, one can screen for a cell containing a tumor suppressor gene, because the growth of that cell is inhibited by the gene. However, one cannot select for such a cell, because by definition, cells containing this gene will grow more slowly than cells not containing the gene, and thus selection strategies will select against the gene of interest. There have been several failed efforts to circumvent this problem, and current cloning strategies for tumor suppressor genes rely on reverse genetic methods, i.e., they do not depend upon a screening test for the gene.
Other examples of genes for which screening is possible but selection is not include the following:
(a) Genes that cause chromosomal breakage. There are several human clinical disorders that predispose one to chromosomal breakage, birth defects, and cancer. Indeed, one of these, ataxia telangiectasia, is thought to account for as much as 15% of the incidence of breast cancer in the general population.
(b) Very large genes that are difficult to transfer intact using conventional vectors such as a phage or cosmid.
(c) Genes which encode integral membrane proteins or receptors that are difficult to purify.
(d) Genes which cause cellular aging.
(e) Trans-acting genes that regulate, directly or indirectly, other genes.
Thus, there remains a need for a method of cloning genes which is free of the above-mentioned drawbacks. In particular, there remains a need for a method of cloning genes for which there is no method of selection, such as tumor suppressor genes, genes that cause chromosomal breakage, very large genes, genes which encode integral membrane proteins or receptors that are difficult to purify, genes which cause cellular aging, and trans-acting genes.
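The scale of the walking problem described in item (i) above can be checked with a short calculation. The following is only an illustrative sketch: the insert sizes and the one-month-per-walk figure are taken from the discussion above, while the length assumed for chromosome 11 (about 135 million base pairs) and all variable and function names are assumptions introduced here for illustration.

```python
# Illustrative arithmetic for the chromosome-walking estimates in item (i).
# Figures from the text: ~15,000 bp per bacteriophage clone, ~250,000 bp per YAC,
# about 1 month per walk at best, and a 15 million bp candidate region in 11p15.
# ASSUMPTION (not stated in the text): chromosome 11 is about 135 million bp long.

PHAGE_INSERT_BP = 15_000
YAC_INSERT_BP = 250_000
MONTHS_PER_WALK = 1
CANDIDATE_REGION_BP = 15_000_000
CHROMOSOME_11_BP = 135_000_000  # assumed length, for illustration only


def steps_to_cover(distance_bp: int, step_bp: int) -> int:
    """Minimum number of non-overlapping steps needed to span a distance."""
    return -(-distance_bp // step_bp)  # integer ceiling division


# Walks needed per one million base pairs with a bacteriophage library.
phage_walks_per_mb = steps_to_cover(1_000_000, PHAGE_INSERT_BP)

# Walks (and months, at best) needed to cross the 15 million bp candidate region.
phage_walks_for_region = steps_to_cover(CANDIDATE_REGION_BP, PHAGE_INSERT_BP)
months_for_region = phage_walks_for_region * MONTHS_PER_WALK

# Minimum number of YACs needed to span chromosome 11, ignoring overlap.
yacs_for_chromosome_11 = steps_to_cover(CHROMOSOME_11_BP, YAC_INSERT_BP)

print(phage_walks_per_mb)       # 67, i.e. "over 60" walks per million base pairs
print(phage_walks_for_region)   # 1000 walks across the candidate region
print(months_for_region)        # 1000 months at one month per walk
print(yacs_for_chromosome_11)   # 540, i.e. "over 500" YACs before overlap
```

These counts ignore the overlap between adjacent clones and any gaps in library representation, both of which raise the real effort further.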