1. Field of the Invention
This invention relates generally to the determination of the biological function of molecules and, more specifically, to a method of predicting the biological activity of compounds using models of nucleic acids, including DNA, RNA and DNA-RNA complexes.
2. Description of the Prior Art
DNA (deoxyribonucleic acid) is a repeating polymeric structure which has two primary components: a deoxyribose-phosphate backbone and a series of nucleic acid bases stacked in a helical pattern. The DNA molecule contains the genetic code, which is generally recognized as a universal language used in all living systems. The code is divided into triplet sections, each section formed from a sequence of three (3) nucleic acid bases and each section influencing the coding for a specific amino acid. In double-stranded nucleic acids, the bases are paired, giving rise to a coiled, double helical structure.
Twenty years have passed since the discovery of the genetic code. The exact nature of the relationship between the sequences of three consecutive nucleotide bases known as codons and the unique group of twenty L-amino acids involved in protein synthesis is, however, still uncertain. While there have been many descriptions of physicochemical relationships between amino acids and the purine and pyrimidine moieties of their codons, a satisfactory stereochemical explanation of the code remains to be established. Thus, how each of the amino acids in a protein sequence came to be related to three (3) nucleic acid bases in a nucleic acid sequence has not been elucidated. This state of affairs prompted Crick to propose that the code might be a "frozen accident" of the evolutionary process while nonetheless advising that "it is therefore essential to pursue the stereochemical theory." Crick, F. H. (1968), J. Mol. Biol., vol. 38, pages 367-379.
In the search for various stereochemical approaches to the genetic code, e.g., Hendry, L. B., et al., (1979), Persp. Biol. Med., vol. 22, pages 333-345, it was discovered that many of the R groups of the twenty L-amino acids are similar in structure to the purine (adenine and guanine) and pyrimidine (thymine and cytosine) bases of DNA (FIG. 1A). When the α-amino group of an amino acid is positioned at N-9 of a purine or N-1 of a pyrimidine as shown in FIG. 1B, the R group can assume a conformation in which the atomic arrangements are like those of a purine or a pyrimidine in a complementary Watson-Crick base pair. Amino acids with hydrophilic moieties appear to be capable of forming complementary hydrogen bonding pairs with nucleic acid bases which are analogous to those in base pairs. Hydrophobic amino acids can, in many cases, form complementary Van der Waals surfaces with one of the bases (FIG. 1B). The complementarity of Watson-Crick nucleic acid base pairs and the putative complementary pairing of structurally analogous amino acids with bases are illustrated in A-F of FIG. 1B: (A) cytosine(C)-guanine(G) base pair; (B) cytosine-arginine(ARG) pair; (C) proline(PRO)-guanine pair; (D) thymine(T)-adenine(A) pair; (E) thymine-histidine(HIS) pair; and (F) isoleucine(ILE)-adenine pair. With fourteen of the twenty L-amino acids, only one complementary amino acid-base pair is possible; in each case, the base is the second base of the amino acid's anticodon.
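The relationship between the four amino acid-base pairs named above and the second anticodon base can be checked with a short script. The following is only an illustrative sketch, not part of the described method: the function names and the choice of one representative codon per amino acid are assumptions made here, the codons are taken from the standard genetic code, and the DNA alphabet (T rather than U) is used to match the text.

```python
# Watson-Crick complements in the DNA alphabet (T is used instead of U,
# following the text; this is an assumption of the sketch).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# One representative DNA codon per amino acid from the standard genetic
# code, chosen here for illustration only.
EXAMPLE_CODONS = {"ARG": "CGT", "PRO": "CCT", "HIS": "CAT", "ILE": "ATT"}


def anticodon(codon: str) -> str:
    """Return the anticodon (reverse complement) of a codon, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


def second_anticodon_base(codon: str) -> str:
    """Return the middle (second) base of the anticodon."""
    return anticodon(codon)[1]


if __name__ == "__main__":
    for amino_acid, codon in EXAMPLE_CODONS.items():
        print(amino_acid, codon, anticodon(codon), second_anticodon_base(codon))
```

For these four codons the second anticodon base is C for ARG, G for PRO, T for HIS and A for ILE, which agrees with pairs (B), (C), (E) and (F) listed above.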
The above-described structural analogies between L-amino acids and nucleic acid bases suggested that it might be possible to employ modelling techniques to incorporate amino acids directly into DNA as if they were bases with apparent stereochemical specificity and without disrupting the double helix configuration.