A great deal of our knowledge of origins of replication (ORIs) comes from the study of ORIs in animal viruses. The SV40 and polyomavirus origins of replication occur in highly accessible, nucleosome-free regions of about 450 base pairs (bp) that are activated by the T antigen encoded by the viral genome. This nucleosome-free gap contains both transcription- and replication-controlling elements. Large T antigen alone (90 kDa) initiates each round of viral DNA synthesis. However, several transcription factors are known to bind to the 21 bp repeats and the 72 bp enhancers of the SV40 ORI; these transcription factors could interact with large T to enhance initiation of replication.
Three rather extensive regions within the SV40 ORI, termed sites I, II and III, bind large T (Hay 1982). The SV40 core ORI consists of three domains with a total length of 64 bp: (i) a sequence with an imperfect inverted repeat; (ii) a palindrome with four GAGGC pentanucleotide repeats recognized by T antigen; and (iii) a 17-bp AT-rich segment (Deb, et al., 1986a; Parsons, et al., 1990). T antigen contacts and melts bases in the imperfect inverted repeat, structurally distorts the GAGGC pentanucleotide domain, and bends and untwists the AT-rich domain (Deb, et al., 1986b; Borowiec and Hurwitz, 1988; Parsons, et al., 1990). All three domains of the core ORI are protected by T antigen against digestion by DNase I (Deb and Tegtmeyer, 1987).
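As an illustration of how the GAGGC pentanucleotide repeats described above might be located in a sequence, the following minimal sketch scans both strands for exact matches. The function names and the toy sequence are hypothetical and are not taken from the cited studies; the toy sequence is not the SV40 ORI sequence, and exact string matching is only a stand-in for the footprinting experiments that defined the binding sites.

```python
import re

MOTIF = "GAGGC"  # pentanucleotide recognized by T antigen, per the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_pentanucleotides(seq, motif=MOTIF):
    """Return sorted (position, strand) pairs for each motif occurrence.

    Positions are 0-based on the given (top) strand. A '-' hit means the
    motif is present on the complementary strand at that position.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A zero-width lookahead also reports overlapping matches.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


# Invented toy sequence for illustration only (NOT the SV40 core ORI).
toy_sequence = "AAGAGGCTTGCCTCAAGAGGCTT"
print(find_pentanucleotides(toy_sequence))
```

A palindromic arrangement of repeats, as described for the core ORI, would show up in such a scan as hits on both the `+` and `-` strands.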
In the presence of ATP, twelve molecules of large T antigen are assembled in the form of two hexamers on the SV40 core ORI (Mastrangelo, et al., 1989; Tsurimoto, et al., 1989a, 1989b). Assembly of a hexamer occurs first on the early half of the core ORI and then on the late half; formation of these hexamers melts the early half and untwists the late half of the core ORI; melting and untwisting release large T antigen molecules from the GAGGC pentanucleotides to act as helicases at flanking DNA regions (Parsons, et al., 1991).
The B subunit of DNA polymerase α (68 kDa) mediates the assembly of the 180 kDa subunit with T antigen, acting as a molecular tether linking the two proteins (Collins, et al., 1993). Thus, T antigen enhances the ability of DNA polymerase α to prime the synthesis of new DNA chains and to extend pre-existing DNA chains (Erdile, et al., 1991; Collins, et al., 1993). In the absence of T antigen, DNA polymerase α synthesizes about 7-13 nt per binding event and then dissociates from DNA (Copeland and Wang, 1991). The ability of T antigen to translocate in the 3' to 5' direction along the DNA in an ATP-consuming process, together with the linkage of T antigen to the 180 kDa catalytic subunit of DNA polymerase α via the 70 kDa B subunit, would be expected to hold the polymerase on the DNA and make single-stranded template available, thus increasing enzyme processivity (Collins, et al., 1993).
Unlike E. coli, which uses a single start point to replicate its DNA, eukaryotic cells use multiple replication origins (Huberman and Riggs, 1968; Linskens and Huberman, 1990). The existence of multiple replicons--chromosomal segments that are replicated from a single origin and whose size, number and temporal order of replication are cell type- and developmental stage-specific (Edenberg and Huberman, 1975; Hand, 1978)--has promoted the idea that chromatin is compartmentalized into domains. The 300 kb locus comprising the murine immunoglobulin heavy chain gene segments is a single replicon (Brown, et al., 1987).
Both RNA and DNA viruses, whether they proliferate in prokaryotic or eukaryotic cells, usually possess a single origin of replication or, in some cases (e.g., HSV), two or three. For example, both the DNA of SV40 (5 kb), a virus causing cancer in monkeys, and the genome of E. coli (3 × 10⁶ bp) are replicated from a single origin. However, eukaryotes, owing to their vast DNA content, require multiple origins of replication. For example, the genome of the fruit fly Drosophila melanogaster (~10⁸ bp) and the DNA in haploid human nuclei (~3 × 10⁹ bp) use about 5,000 and 60,000 start points of DNA replication, respectively.
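The figures quoted above imply an average spacing between origins, which the following short sketch works out. It uses only the approximate genome sizes and origin counts given in the text and assumes, purely for illustration, that origins are evenly spaced; the function name is hypothetical.

```python
# Approximate figures quoted in the text: (genome size in bp, number of origins).
GENOMES = {
    "Drosophila melanogaster": (1e8, 5_000),
    "Homo sapiens (haploid)": (3e9, 60_000),
}


def mean_origin_spacing_bp(genome_bp, n_origins):
    """Average distance between origins, assuming uniform spacing."""
    return genome_bp / n_origins


for name, (genome_bp, n_origins) in GENOMES.items():
    spacing_kb = mean_origin_spacing_bp(genome_bp, n_origins) / 1_000
    print(f"{name}: ~{spacing_kb:.0f} kb per origin")
```

Under this uniform-spacing assumption the averages come to roughly 20 kb per origin for Drosophila and 50 kb per origin for the haploid human genome. Actual replicon sizes vary with cell type and developmental stage, as noted above.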
Of the 60,000 or so ORIs from human cells, five specific ORIs have apparently been identified as of June 1996. The known human ORIs are:
(i) that of the β-globin gene complex (Kitsberg, et al., 1993; Boulikas, 1993). Earlier studies on the replication of the β-globin multigene cluster showed temporal directionality and led to the identification of potential initiation sites for the replication of the β-globin gene complex (Dhar, et al., 1988);
(ii) that of the c-myc gene (Iguchi-Ariga, et al., 1988; Ariga, et al., 1989; Vassilev and Johnson, 1990);
(iii) the ORI in the 18S/28S ribosomal DNA 44 kb repeating unit (Little, et al., 1993);
(iv) the ORI of the human HSP70 gene (Taira, et al., 1994); and
(v) the ORI of the CHAT gene (Boulikas, et al., 1996).
These ORIs are presumably activated by transcription factors (TFs) and replication initiator proteins, which may include ssDNA-binding proteins (Bergemann, et al., 1992) and cruciform DNA-binding proteins (Pearson, et al., 1995). One TF involved in replication initiation may be the oncoprotein c-myc, which promotes cellular DNA replication by binding to a cloned human putative ORI sequence (Iguchi-Ariga, et al., 1987a); c-myc can also substitute for SV40 large T antigen in an in vitro SV40 replication system (Iguchi-Ariga, et al., 1987b; Classon, et al., 1987). A region approximately 2 kb upstream of the transcription start site of the human c-myc gene contains a putative ORI of 210 bp (Boulikas, 1996, herein incorporated by reference), which is also a transcription enhancer containing c-myc binding sites (Iguchi-Ariga, et al., 1988a; Umekawa, et al., 1988). This fragment contains the 22-nucleotide binding site determined by DNase footprinting and mobility shift assays; this interaction is involved in upregulating both transcription and replication of the c-myc gene domain (Ariga, et al., 1989).
DNA sequences enriched in origins of replication, termed ors, have been isolated by extrusion of single-stranded, newly synthesized DNA at the replication fork from actively replicating monkey cells in culture (Zannis-Hadjopoulos, et al., 1985). pBR322 plasmids harboring several cloned ors sequences have been shown to replicate autonomously and extrachromosomally after transfection into HeLa cells (Frappier and Zannis-Hadjopoulos, 1987; Rao, et al., 1990; Landry and Zannis-Hadjopoulos, 1991).
Two chromosomal origins of replication within the DHFR amplicon (240 kb), mapped by two different approaches (Anachkova and Hamlin, 1988; Leu and Hamlin, 1989), are located about 20 kb from one another. One of these origins had been localized to a 4.3 kb fragment (Burhans, et al., 1986) and was narrowed down to a 450 bp fragment by mapping the site where the strand specificity of the Okazaki fragments switches (Burhans, et al., 1990). The presence of two independent origins for the amplified DHFR locus was confirmed by Handeli, et al. (1989) using a novel mapping procedure. Multiple initiation sites lying within a 28 kb region are apparently used for the replication of this gene repeat (Vaughn, et al., 1990). These data do not contradict the model that precise DNA sequences are used for the initiation of DNA replication in mammalian cells, since ~1000 copies of the DHFR gene, which may include DNA elements that function in replication initiation, are present within an amplicon.
Plasmids carrying the cad gene and flanking regulatory sequences were able to function as autonomously replicating episomes in mammalian cells (Carroll, et al., 1987). A bidirectional origin has been found in the native locus and in episomally amplified murine adenosine deaminase loci (Carroll, et al., 1993). The region of DNA replication in the murine immunoglobulin heavy chain gene has been identified, and the octamer motif has been suggested as a putative DNA replication origin in mammalian cells (Iguchi-Ariga, et al., 1993). Similar studies are consistent with the presence of a replication origin in the mdr-1 gene (Ruiz, et al., 1989). The chicken has one H5 gene, which displays a polarity with respect to its replication in expressing and non-expressing cell types; these data are compatible with replication from an origin in the 5' flanking region, as used for the replication of the avian β-globin gene, in erythroid cells and from an origin in the 3' flanking region in nonerythroid cell types (Trempe, et al., 1988). Shotgun cloning experiments aimed at identifying mammalian genomic sequences with ARS activity in yeast have identified autonomously replicating sequences (Roth, et al., 1983; Montiel, et al., 1984; Ariga, et al., 1985).