As is now well known, deoxyribonucleic acid (DNA) exists as a long unbranched molecule consisting of many similar units known as nucleotides. The individual nucleotides are arranged into two large polymeric chains which are interwound to form the so-called double helical structure of DNA. The DNA nucleotides are generally of four types, each characterized by possessing one of four organic heterocyclic ring moieties often referred to as bases. Two of the bases, adenine (A) and guanine (G), belong to the class of heterocyclic ring compounds known as purines, while the other two bases, thymine (T) and cytosine (C), belong to the pyrimidine class of heterocyclic rings. In addition to a base, each nucleotide contains a five-carbon sugar (pentose) called deoxyribose and a phosphate (PO₄) group.
A part of one chain of DNA may be represented by the structure: ##STR1##
By convention, the carbon atoms which comprise the deoxyribose moiety are given the designations 1' to 5'. The polymer is formed through diester linkages of the phosphate group to the 3' and 5' carbon atoms of adjacent pentose residues. This configuration results in the chain possessing a free phosphate group at its 5' terminus and a free OH group at its 3' terminus. Because of this arrangement of atoms, the polynucleotide chain is said to have polarity; that is, one end of the molecule is distinguishable from the other, much as the two ends of a bar magnet are distinguishable.
In its native form, DNA is composed of two polynucleotide chains arranged such that the bases of the two chains are oriented towards the center of the molecule and the sugar-phosphate groups are oriented towards the outside of the molecule.
More specifically, the bases are oriented in a complementary fashion so that the purine G is always opposite the pyrimidine C and the purine A is always opposite the pyrimidine T. The A=T and G≡C base pairs are stabilized by two and three hydrogen bonds, respectively. The sugar-phosphate groups, often referred to as the "backbone" of the molecule, are arranged in an antiparallel fashion; that is to say, if one chain is oriented 5'→3', the other chain is oriented 3'→5'. This specific arrangement is illustrated as follows: ##STR2##
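The pairing and antiparallel rules just described can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are assumptions introduced here, not part of the disclosure.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair: two for A=T, three for G≡C.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def partner_strand(strand_5_to_3: str) -> str:
    """Return the complementary chain, also written 5'->3'.

    The two chains are antiparallel, so the partner is read in the
    opposite direction: complement each base, then reverse the order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3))


def hydrogen_bond_count(strand_5_to_3: str) -> int:
    """Total hydrogen bonds in the duplex formed by this chain and its partner."""
    return sum(H_BONDS[base] for base in strand_5_to_3)


if __name__ == "__main__":
    top = "ATGCGT"  # hypothetical example sequence
    print(partner_strand(top))       # ACGCAT
    print(hydrogen_bond_count(top))  # 15
```

Applying `partner_strand` twice returns the original chain, which is one way to see that each chain fully determines the other.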
As mentioned above, in its native form DNA exists as a double helix; this is a consequence of the fact that each base pair is displaced slightly (approximately 36°) in axial rotation from the base pair adjacent to it. The molecule thus makes one complete spiral turn every ten base pairs, resulting in the well-known double helical structure shown in FIG. 6.
DNA molecules are large, chemically stable and easily replicated, and as such are ideally suited to function as the storage form of genetic information. For example, most of the genetic repertoire of the bacterium E. coli is contained within a single DNA molecule composed of approximately 4.2 × 10⁶ nucleotide base pairs.
The flow of genetic information in cells is well known. The information directing the biosynthesis of any protein is encoded in a sequence of DNA nucleotides known as a gene.
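As a rough worked example, the rotation of about 36 degrees per base pair and the E. coli figure of about 4.2 million base pairs can be combined to estimate the number of helical turns in that molecule. The sketch below uses only those two approximate values from the text; the function names are assumptions made for illustration.

```python
DEGREES_PER_BASE_PAIR = 36.0  # approximate axial rotation quoted in the text


def base_pairs_per_turn(degrees_per_bp: float = DEGREES_PER_BASE_PAIR) -> float:
    """Number of base pairs needed for one full 360-degree turn."""
    return 360.0 / degrees_per_bp


def helical_turns(n_base_pairs: int,
                  degrees_per_bp: float = DEGREES_PER_BASE_PAIR) -> float:
    """Approximate number of complete turns in a duplex of the given length."""
    return n_base_pairs * degrees_per_bp / 360.0


if __name__ == "__main__":
    print(base_pairs_per_turn())     # 10.0
    print(helical_turns(4_200_000))  # 420000.0
```

Under these approximations the E. coli molecule would contain on the order of 420,000 helical turns.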
Transcription is the process by which the retrieval of information is begun. Transcription involves the resynthesis of the information in the form of another type of nucleic acid called ribonucleic acid (RNA). One type of RNA, messenger RNA (mRNA), transports the information to the site of protein synthesis called a ribosome.
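Transcription can be sketched as copying one DNA chain into a complementary RNA chain. The example assumes two well-established facts not spelled out above: RNA carries uracil (U) where DNA carries thymine (T), and the mRNA is built antiparallel to the DNA chain it copies (the template strand). The function name and sequence are hypothetical.

```python
# DNA template base -> RNA base paired opposite it (U replaces T in RNA).
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3: str) -> str:
    """Return the mRNA (written 5'->3') copied from a DNA template strand.

    The template is given 5'->3'; because the new RNA chain is antiparallel
    to it, the complemented bases are emitted in reverse order.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template_5_to_3))


if __name__ == "__main__":
    template = "ACGCAT"  # hypothetical template strand
    print(transcribe(template))  # AUGCGU
```

The resulting mRNA has the same sequence as the non-template DNA chain (here ATGCGT), with U in place of T.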
Once the mRNA is synthesized from the gene, the process of protein synthesis may begin. This process is essentially one of molecular decoding, in which the nucleotide sequence of the mRNA provides a template for the synthesis of a particular protein. Since there is a change from a nucleic acid language into a protein language, this process of protein synthesis is appropriately referred to as translation. Continuing the analogy a bit further, it would be appropriate to think of the constituents of the nucleic acids, the nucleotides, as representing the alphabet of the nucleic acid language, and the amino acids, the building blocks of proteins, as representing the alphabet of the protein language. During the process of translation, not only are the languages changing but the alphabets are changing as well. This is a particularly complex process which is known to involve over 100 types of molecules. As the mRNA is passed through the ribosome (much like tape through a tape recorder), groups of three nucleotides (codons) are positioned so as to orient accessory RNA molecules, known as transfer RNA (tRNA), each carrying a single amino acid, into the proper alignment for the addition of the amino acid to the growing protein chain.
Of special interest with respect to the subject invention is the relationship that the structure and function of DNA have to the application of recombinant DNA (genetic engineering) technology.
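The codon-by-codon decoding described above can be illustrated with a small lookup table. The table below is a deliberately tiny subset of the standard genetic code, which is not reproduced in this text; the function name is hypothetical, and reading is assumed to start at the first nucleotide rather than at a located start codon.

```python
# Small subset of the standard genetic code (mRNA codon -> amino acid).
# None marks a stop codon, which ends the protein chain.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly",
    "GCU": "Ala", "AAA": "Lys", "UGG": "Trp",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna: str) -> list[str]:
    """Read an mRNA three nucleotides at a time and return the amino acids.

    Simplification: reading begins at position 0, and any codon missing
    from the partial table raises KeyError.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon reached
            break
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    print(translate("AUGUUUGGCAAAUAA"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Each step of the loop plays the role of one tRNA delivering a single amino acid for the codon currently positioned in the ribosome.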
One of the main objectives of genetic engineering experiments is to provide to a recipient organism a source of genetic information which will permit the recipient organism to perform a new function. Generally, this is accomplished by providing the genetic information in the form of a piece of DNA which has been isolated from another organism and chemically integrated into DNA which normally exists within the recipient organism. The result of such a procedure is a molecular hybrid, often referred to as a chimeric DNA molecule (in Greek mythology, the Chimera was a fire-breathing monster usually represented as a composite of a lion, a goat and a serpent). Since the chimeric molecule is often replicated (i.e., found in multiple copies) within the recipient organism, the DNA is said to have been cloned. The construction of stable, functioning genetic chimeras by means of genetic engineering techniques involves a series of in vitro and in vivo steps.
The source of DNA to be cloned may include viruses, bacteria, fungi, plants or animals. This DNA is generally referred to as donor DNA and contains the desired genetic information to be propagated. This DNA represents one component of the chimera.
The other component of the chimera, the vector, is a segment of DNA into which the donor DNA is integrated. This vector DNA, also referred to as the cloning vehicle, is a segment of non-chromosomal DNA that is capable of independent replication when placed within a microbe. The cloning vehicles commonly used are derived from viruses, bacteria, fungi, plants or a combination thereof.
For example, an early step in the genetic engineering process involves integrating a fragment of donor DNA containing the desired genetic information into an appropriate vector. Generally, this involves treating both the vector DNA and the donor DNA with an enzyme (a restriction endonuclease) which cleaves only at specific sites within the two DNAs. Since the termini of the cleaved molecules are complementary, due to the action of the restriction enzyme, the foreign DNA may be integrated at a particular point within the vector. Optionally, this site of integration will itself have been previously "engineered" so as to lie near the appropriate control sequences which will ensure the successful expression (i.e., transcription and translation) of the integrated DNA. The last step in the integration involves the enzymatic sealing of the phosphodiester backbone of the DNA molecule by the enzyme DNA ligase.
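The cut-and-join procedure can be sketched on plain strings. The example assumes, purely for illustration, the well-known restriction endonuclease EcoRI, which recognizes GAATTC and cleaves after the G; the text does not name any particular enzyme. Only the top chain of each linear molecule is modeled, which is adequate here because the EcoRI site reads the same on both chains. All sequences and function names are hypothetical.

```python
SITE = "GAATTC"  # EcoRI recognition sequence (illustrative assumption)
CUT_OFFSET = 1   # cleavage falls after the first base: G^AATTC


def digest(seq: str, site: str = SITE, cut_offset: int = CUT_OFFSET) -> list[str]:
    """Cleave a linear top chain at every recognition site; return the fragments."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def ligate(left: str, insert: str, right: str) -> str:
    """Join fragments end to end, standing in for sealing by DNA ligase."""
    return left + insert + right


if __name__ == "__main__":
    vector = "CCCGAATTCGGG"           # hypothetical vector with one site
    donor = "TTGAATTCACGTACGAATTCAA"  # hypothetical donor with two sites

    left_arm, right_arm = digest(vector)
    donor_fragment = digest(donor)[1]  # the piece lying between the two cuts

    chimera = ligate(left_arm, donor_fragment, right_arm)
    print(chimera)  # CCCGAATTCACGTACGAATTCGGG
```

The recognition site is regenerated at both junctions of the chimera, reflecting the complementary termini produced when vector and donor are cut with the same enzyme.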
During the course of some recombinant DNA experiments, it is necessary to generate a single-stranded DNA molecule from a double-stranded DNA molecule. In addition, it is often desirable to asymmetrically decrease the length of a double-stranded molecule in a progressive, controlled manner. The instant invention provides a rapid and generally applicable method for performing either of these manipulations.