1. Reference to a Sequence Listing Provided on Compact Disc
This application refers to a “Sequence Listing”, which is provided as an electronic document on two identical compact discs (CD-R), labeled “Copy 1” and “Copy 2.” These compact discs each contain the electronic document, filename “PB275C1.ST25.txt” (2,259,883 bytes in size, created on Nov. 14, 2002), which is hereby incorporated in its entirety herein.
2. Field of the Invention
The present application discloses the complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements. Also identified are 1738 predicted protein-coding genes.
3. Related Background Art
The view of evolution in which all cellular organisms are in the first instance either prokaryotic or eukaryotic was challenged in 1977 by the finding that on the molecular level life comprises three primary groupings (Fox, G. E., et al., Proc. Natl. Acad. Sci. USA 74:4537 (1977); Woese, C. R. and Fox, G. E., Proc. Natl. Acad. Sci. USA 74:5088 (1977); Woese, C. R., et al., Proc. Natl. Acad. Sci. USA 87:4576 (1990)): the eukaryotes (Eukarya) and two unrelated groups of prokaryotes, Bacteria and a new group now called the Archaea. Although Bacteria and Archaea are both prokaryotes in a cytological sense, they differ profoundly in their molecular makeup (Fox, G. E., et al., Proc. Natl. Acad. Sci. USA 74:4537 (1977); Woese, C. R. and Fox, G. E., Proc. Natl. Acad. Sci. USA 74:5088 (1977); Woese, C. R., et al., Proc. Natl. Acad. Sci. USA 87:4576 (1990)). Several lines of molecular evidence even suggest a specific relationship between Archaea and Eukarya (Iwabe, N., et al., Proc. Natl. Acad. Sci. USA 86:9355 (1989); Gogarten, J. P., et al., Proc. Natl. Acad. Sci. USA 86:6661 (1989); Brown, J. R. and Doolittle, W. F., Proc. Natl. Acad. Sci. USA 92:2441 (1995)).
The era of true comparative genomics has been ushered in by complete genome sequencing and analysis. We recently described the first two complete bacterial genome sequences, those of Haemophilus influenzae and Mycoplasma genitalium (Fleischmann, R. D., et al., Science 269:496 (1995); Fraser, C. M., et al., Science 270:397 (1995)). Large scale DNA sequencing efforts also have produced an extensive collection of sequence data from eukaryotes, including Homo sapiens (Adams, M. D., et al., Nature 377:3 (1995)) and Saccharomyces cerevisiae (Levy, J., Yeast 10:1689 (1994)).
M. jannaschii was originally isolated by J. A. Leigh from a sediment sample collected from the sea floor surface at the base of a 2600 m deep “white smoker” chimney located at 21° N on the East Pacific Rise (Jones, W., et al., Arch. Microbiol. 136:254 (1983)). M. jannaschii grows at pressures of up to more than 500 atm and over a temperature range of 48-94° C. with an optimum temperature near 85° C. (Jones, W., et al., Arch. Microbiol. 136:254 (1983)). The organism is autotrophic and a strict anaerobe; and, as the name implies, it produces methane. The dearth of archaeal nucleotide sequence data has hampered attempts to begin constructing a comprehensive comparative evolutionary framework for assessing the molecular basis of the origin and diversification of cellular life.
The present invention is based on whole-genome random sequencing of an autotrophic archaeon, Methanococcus jannaschii. The M. jannaschii genome consists of three physically distinct elements: (i) a large circular chromosome; (ii) a large circular extrachromosomal element (ECE); and (iii) a small circular extrachromosomal element (ECE). The nucleotide sequences generated, the M. jannaschii chromosome, the large ECE, and the small ECE, are respectively provided on pages 153-586 (SEQ ID NO:1), pages 586-601 (SEQ ID NO:2), and pages 602-606 (SEQ ID NO:3).
The present invention is further directed to isolated nucleic acid molecules comprising open reading frames (ORFs) encoding M. jannaschii proteins. The present invention also relates to variants of the nucleic acid molecules of the present invention, which encode portions, analogs or derivatives of M. jannaschii proteins. Further embodiments include isolated nucleic acid molecules comprising a polynucleotide having a nucleotide sequence at least 90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99% identical, to the nucleotide sequence of a M. jannaschii ORF described herein.
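The identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two nucleotide sequences have already been aligned to equal length by some external alignment program, with "-" marking gaps. The function names percent_identity and meets_threshold are hypothetical, and the convention used (matches divided by aligned columns, with a gap counted as a mismatch) is only one common choice, not necessarily the comparison method contemplated in this application.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap. Columns where both sequences have a gap
    are skipped; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from the Sequence Listing.
    reference = "ATGCATGCAT"
    candidate = "ATGCATGCAA"
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, 95.0))
```

Under this convention, a candidate sequence differing from a 10-base reference at one position would satisfy the 90% threshold but not the 95% threshold. Other conventions (for example, excluding gapped columns from the denominator) would give different values for gapped alignments.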
The present invention also relates to recombinant vectors, which include the isolated nucleic acid molecules of the present invention, host cells containing the recombinant vectors, as well as methods for making such vectors and host cells for M. jannaschii protein production by recombinant techniques.
The invention further provides isolated polypeptides encoded by the M. jannaschii ORFs. It will be recognized that some amino acid sequences of the polypeptides described herein can be varied without significant effect on the structure or function of the protein. If such differences in sequence are contemplated, it should be remembered that there will be critical areas on the protein which determine activity. In general, it is possible to replace residues which form the tertiary structure, provided that residues performing a similar function are used. In other instances, the type of residue may be completely unimportant if the alteration occurs at a non-critical region of the protein.
In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a polypeptide of the invention. The epitope-bearing portion is an immunogenic or antigenic epitope useful for raising antibodies.