The invention relates to molecular biology, and more specifically, to genetic engineering techniques for isolating samples of cloned vectors or cloned cells containing recombinant DNA. Cloned vectors are isolated from "shotgun" vector libraries, or vector libraries of any sort, and cloned cells include bacterial, yeast, fungal or other eukaryotic cells.
When DNA is digested with restriction enzymes or sheared mechanically and the fragments cloned into suitable viral or phage vectors or inserted into cells by any means, a heterogeneous collection of vectors or cells results. These vectors or cells may then be individually isolated or cloned and grown to provide masses of particles which are identical, and which amplify the recombinant DNA present in the original individual isolated vector or phage or cell.
The amount of DNA in one vector is limited, averaging 4.1 kilobases for vectors in present use, such as Chiron II and similar vectors, and around 40 kilobases for cosmid vectors. Larger fragments may be grown in yeast, while masses of single human chromosomes may be produced in Chinese hamster cells, for example. Since the human genome contains approximately 3.4 billion base pairs of DNA, one million different lambda-scale phage clones or 100,000 cosmids would be required to span the entire human genome.
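The clone counts quoted above follow from dividing the genome size by the insert size. The following minimal sketch reproduces that arithmetic using the figures given in the text; the function name min_clones is hypothetical, and the calculation assumes perfect end-to-end tiling with no overlap between inserts, so it gives only a lower bound that the text rounds up to one million and 100,000.

```python
# Lower bound on the number of clones needed to span a genome,
# assuming non-overlapping inserts laid end to end (an idealization).

GENOME_BP = 3_400_000_000   # genome size stated in the text
PHAGE_INSERT_BP = 4_100     # average phage-vector insert stated in the text
COSMID_INSERT_BP = 40_000   # approximate cosmid insert stated in the text


def min_clones(genome_bp: int, insert_bp: int) -> int:
    """Return ceil(genome_bp / insert_bp) using integer arithmetic."""
    return -(-genome_bp // insert_bp)


if __name__ == "__main__":
    print("phage clones :", min_clones(GENOME_BP, PHAGE_INSERT_BP))
    print("cosmid clones:", min_clones(GENOME_BP, COSMID_INSERT_BP))
```

In practice more clones than this minimum are needed, since inserts overlap and are picked at random.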
When amplification involves the use of virus or phage particles, cells must be provided in which these non-living particles can grow and multiply. Where the DNA is inserted directly into a cell under conditions in which it can multiply free from viral genome influence or control, then the requirement is to clone and select these cells directly. As mentioned, DNA may be directly inserted into yeast, for example, and human chromosomes can be arranged to multiply in cells of other species. In the latter case, cells containing one known chromosome are selected and cloned. Therefore, both cloning of infectious particles (vectors, viruses, phage, etc.) using living host cells and cloning of non-virus infected cells containing specific desired DNA inserts or chromosomes are required.
For a large-scale human DNA sequencing project, several different libraries of vectors containing DNA fragments produced by several different restriction enzymes will be required. The isolation, growth, processing, storage and analysis of large numbers of individual clones from large recombinant DNA vector libraries will therefore be an important activity as a first step in large-scale DNA sequencing, and completely automatic and robotic methods will be required for clone isolation.
To isolate vector clones using standard methods, the initial sample from the recombinant DNA vector library is diluted and applied to a lawn of E. coli or other organism in which the vector used will grow and produce cell lysis. Small clear areas are produced where lysis occurs. The vector, usually a phage, is recovered from these clear lytic spots, and grown en masse in additional cell cultures to yield the DNA required for sequencing. This is customarily done by hand: the petri dishes are poured and infected manually, the phage is diluted and applied manually (or added to the original bacterial cell suspension), and a human operator observes the phage "colonies" visually and recovers the phage for further multiplication. To improve efficiency, the vector may be genetically engineered to carry a gene for an enzyme which is active in those colonies which contain recombinant DNA inserts. This enzyme is chosen to yield an identifying color in the lytic spot when a suitable substrate is present. With this method, only colored spots are chosen. The enzyme product may also be fluorescent, in which case the choice is made on the basis of fluorescence. Detection may thus be by absence of light scatter in a clear zone, or by light absorption or fluorescence by an enzyme-produced dye.
If the amount of DNA is such that one million individual clones are produced in the initial library (termed a "heap" since it is not ordered or indexed), and if a million clones are isolated at random from this heap, then on the basis of pure statistical probability one would expect a little less than two-thirds (about 63 percent) of all distinct clones to have been isolated. The remaining isolates are duplicates (and a few triplicates, etc.) of clones already picked. Therefore, to approach isolation of all individual clones in the heap, repeated sets of a million clones would have to be isolated and intercompared. These considerations lead to the conclusion that many tens of millions of clones will require isolation as part of any attempt to sequence the entire human genome, and that clone isolation on this scale will require the application of automation and robotics.
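The expected coverage from random picking can be checked with a short calculation. The sketch below assumes that every clone in the heap is equally represented and that picks are independent (sampling with replacement); both are idealizations not stated in the text, and the function names are hypothetical. Under those assumptions the expected fraction of distinct clones seen after k picks from a library of N clones is 1 - (1 - 1/N)^k, which approaches 1 - 1/e, or about 63 percent, when k equals N.

```python
import math


def expected_fraction_seen(library_size: int, picks: int) -> float:
    """Expected fraction of distinct clones isolated after `picks` random picks.

    Computes 1 - (1 - 1/N)**k via log1p/expm1 for numerical accuracy.
    Assumes equal representation and independent picks.
    """
    return -math.expm1(picks * math.log1p(-1.0 / library_size))


def picks_for_fraction(library_size: int, fraction: float) -> int:
    """Smallest number of picks whose expected coverage reaches `fraction`."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return math.ceil(math.log1p(-fraction) / math.log1p(-1.0 / library_size))


if __name__ == "__main__":
    n = 1_000_000
    print(f"coverage after {n} picks: {expected_fraction_seen(n, n):.4f}")
    for target in (0.90, 0.99):
        print(f"picks for {target:.0%} coverage: {picks_for_fraction(n, target)}")
```

Because coverage rises ever more slowly as it nears completeness, each additional percentage point costs more picks than the last, which is why repeated sets of isolates are needed for a single library.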
The entire sequencing project becomes more efficient the larger the size of the initial fragments cloned, hence the interest in starting with either chromosome sorting, or with cells containing only one human chromosome. Such cells are ideally chosen for having small numbers of indigenous chromosomes, and for having chromosomes from which the human chromosome carried may be easily isolated. Cloning such foreign chromosome-containing cells may be done with the systems described here, but does not generally require large-scale cloning.
Fragments smaller than chromosome size may be inserted into yeast and the yeast grown to yield quantities of the insert. The fragments are generally produced by shearing or by the use of restriction enzymes which cut at rare sites (so-called "eight cutters," for example). Alternatively, digestion with restriction enzymes may not be carried to completion, yielding some long fragments which can be further cut by additional enzyme action. Cloning and selection of cells from such host yeast preparations will require cloning on a smaller scale than that required for various vectors, but even with yeast there is still a substantial amount of effort required to obtain a complete set of all possible clones.