The characterization of the structure of human chromosomes and the elucidation of their various encoded activities are major interests of modern biology and medicine. The past decade and a half in molecular biology has been dominated by a "hit-and-run" approach to cloning, sequencing, and analyzing individual genes of specific interest. However, there is a need for a comprehensive approach to the study of human chromosomes. For example, of the estimated 100,000 human genes, only some 3,000 are represented as sequenced genes, mapped markers, cloned fragile sites, and neoplastic breakpoints. V. McKusick, N. Engl. J. Med. 320: 910-915 (1989). Much less is known about chromosomal regions with other basic functions such as DNA replication, chromatin packaging, and chromosomal segregation. Thus, for progress in human physiology and pathology, it would be extremely valuable to have a complete physical map and nucleotide sequence of the human genome.
The recent construction of a detailed linkage map of the human genome with a resolution of 1-10 megabases is an important first step toward the localization of genes and other functional chromosomal regions. H. Donis-Keller et al., Cell 51: 319-337 (1987). To increase the resolution of such a map to a range suitable for rapid cloning and sequencing, an average spacing of 100 Kb has been estimated, which would require the mapping of 30,000 linearly ordered human DNA clones (a haploid genome of roughly 3 x 10^9 base pairs divided into 100 Kb intervals). M. Olson et al., Science 245: 1434-1435 (1989). To construct such a physical map with 100 Kb resolution, new mapping approaches such as the Sequence-Tagged-Site (STS) and Repetitive-Sequence-Fingerprinting (RSF) mapping methodologies are being developed to allow computer-mediated storage and retrieval of specific and unique human sequences. See M. Olson et al., supra; R. Stallings et al., Proc. Natl. Acad. Sci. USA 87: 6218-6222 (1990). However, for a map of 100 Kb resolution with good practical coverage, a key problem is finding a vector with suitable capacity.
Common cloning systems allow human DNA inserts to be propagated in bacteria or yeast. A problem with the bacterial cosmid system, however, is that it has only limited cloning capacity (about 40 Kb). Two newly developed prokaryotic vectors, the P1 cloning system and the mini-F based plasmid vector, provide the opportunity to propagate larger DNA fragments in bacteria. See N. Sternberg et al., Proc. Natl. Acad. Sci. USA 87: 103-107 (1990); M. O'Connor et al., Science 244: 1307-1313 (1989). The P1 cloning system can clone up to 100 Kb of DNA, and the mini-F based plasmid vector has the potential for cloning 136 Kb of DNA. Yeast artificial chromosomes (YACs) have the capacity to carry exogenous DNA fragments in the megabase range. D. Burke et al., Science 236: 806-812 (1987). A problem with these systems is that human genomic DNA propagated in heterologous organisms such as bacteria or yeast can be subject to sequence rearrangement, particularly if it carries highly repetitive sequences such as SINEs, LINEs or VNTRs. Furthermore, human genetic imprinting marks such as 5-methylcytosine will not be maintained faithfully in these single-celled organisms.
Accordingly, there is a need for a cloning system which accommodates large inserts and which can be used in mammalian, particularly human, cells. The present invention is based on continuing research into solutions to this problem.