Several classes of small molecules that interact with double-stranded DNA have been identified. Many of these small molecules have profound biological effects. For example, many aminoacridines and polycyclic hydrocarbons bind DNA and are mutagenic, teratogenic, or carcinogenic. Other small molecules that bind DNA include: biological metabolites, some of which have applications as antibiotics and antitumor agents including actinomycin D, echinomycin, distamycin, and calicheamicin; planar dyes, such as ethidium and acridine orange; and molecules that contain heavy metals, such as cisplatin, a potent antitumor drug.
The sequence binding preferences of most known DNA-binding molecules have not, to date, been identified. However, several small DNA-binding molecules have been shown to preferentially recognize specific nucleotide sequences, for example: echinomycin has been shown to preferentially bind the sequence [(A/T)CGT]/[ACG(A/T)] (Gilbert et al.); cisplatin has been shown to covalently cross-link a platinum molecule between the N7 atoms of two adjacent deoxyguanosines (Sherman et al.); and calicheamicin has been shown to preferentially bind and cleave the sequence TCCT/AGGA (Zein et al.).
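The preferred sites quoted above can be located in a test sequence by simple pattern matching. The following is a minimal sketch, not a method taken from the cited references: the dictionary `PREFERRED_SITES` and the function `find_sites` are hypothetical names introduced for illustration, and the sketch assumes that each site is written 5' to 3' and that both strand notations of a site (for example TCCT and AGGA) should be searched on the strand as given.

```python
import re

# Preferred sites quoted in the text, one pattern per strand notation.
# ACG[AT] is the reverse complement of [AT]CGT, and AGGA that of TCCT,
# so each pair describes the same duplex site read from either strand.
PREFERRED_SITES = {
    "echinomycin": ("[AT]CGT", "ACG[AT]"),
    "calicheamicin": ("TCCT", "AGGA"),
}


def find_sites(sequence, ligand):
    """Return sorted 0-based start positions of the ligand's preferred site.

    The sequence is read 5' to 3' as written; a hit for either strand
    notation is reported at its position on the given strand.
    """
    seq = sequence.upper()
    starts = set()
    for pattern in PREFERRED_SITES[ligand]:
        # A lookahead is used so that overlapping occurrences are all found.
        for match in re.finditer("(?=(%s))" % pattern, seq):
            starts.add(match.start())
    return sorted(starts)


if __name__ == "__main__":
    test_sequence = "GGACGTCCTAA"
    for name in PREFERRED_SITES:
        print(name, find_sites(test_sequence, name))
```

Such a scan only reports where a preferred site occurs; it says nothing about binding affinity, which must still be determined experimentally.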
Many therapeutic DNA-binding molecules (such as distamycin) that were initially identified based on their therapeutic activity in a biological screen were later determined to bind DNA. There are several examples in the literature referring to synthetic or naturally-occurring polymers of DNA-binding drugs. Netropsin, for example, is a naturally-occurring oligopeptide that binds to the minor groove of double-stranded DNA. Netropsin contains two 4-amino-1-methylpyrrole-2-carboxylate residues and belongs to a family of similar biological metabolites from Streptomyces spp. This family includes distamycin, anthelvencin (both of which contain three N-methylpyrrole residues), noformycin, amidomycin (both of which contain one N-methylpyrrole residue) and kikumycin (which contains two N-methylpyrrole residues, like netropsin) (Debart, et al.). Synthetic molecules of this family have also been described, including the above-mentioned molecules (Lown, et al. 1985) as well as dimeric derivatives (Griffin et al., Gurskii, et al.) and certain analogues (Bialer, et al. 1980, Bialer, et al. 1981, Krowicki, et al.).
Molecules in this family, particularly netropsin and distamycin, have been of interest because of their biological activity as antibacterial (Thrum et al., Schuhmann, et al.), antiparasitic (Nakamura et al.), and antiviral drugs (Becker, et al., Lown, et al. 1986, Werner, et al.).
Among the synthetic analogs of netropsin and distamycin are oligopeptides that have been designed to have sequence preferences different from their parent molecules. Such oligopeptides include the "lexitropsin" series of analogues. The N-methylpyrrole groups of the netropsin series were systematically replaced with N-methylimidazole residues, resulting in lexitropsins with increased and altered sequence specificities from the parent compounds (Kissinger, et al.). Further, a number of poly(N-methylpyrrolyl)-netropsin analogues have been designed and synthesized which extend the number of residues in the oligopeptides to increase the size of the binding site (Dervan, 1986).
There are several different approaches that could be taken to look for small molecules that specifically inhibit the interaction of a given DNA-binding protein with its binding sequence (cognate site). One approach would be to test biological or chemical compounds for their ability to preferentially block the binding of one specific DNA:protein interaction but not others. Such an assay would depend on the development of at least two, preferably three, DNA:protein interaction systems in order to establish controls for distinguishing between general DNA-binding molecules (polycations like heparin or intercalating agents like ethidium) and DNA-binding molecules having sequence binding preferences that would affect protein/cognate binding site interactions in one system but not the other(s).
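The control logic described above, in which two or three DNA:protein systems are compared to separate general DNA binders from sequence-preferential ones, can be sketched as follows. This is an illustrative sketch only: the function name `classify_compound`, the system names, and the 0.5 inhibition threshold are assumptions introduced here, not values given in the text.

```python
def classify_compound(inhibition, threshold=0.5):
    """Classify a test compound from its effect on several DNA:protein systems.

    inhibition: dict mapping a system name to the fractional inhibition
                (0.0 to 1.0) of protein binding observed with the compound.
    threshold:  assumed cut-off above which a system counts as blocked.
    Returns a (label, blocked_systems) tuple.
    """
    if len(inhibition) < 2:
        # With a single system, a general binder cannot be told apart
        # from a sequence-preferential one.
        raise ValueError("at least two DNA:protein systems are needed")

    blocked = sorted(name for name, frac in inhibition.items()
                     if frac >= threshold)

    if not blocked:
        return "inactive", blocked
    if len(blocked) == len(inhibition):
        # Blocks every system, as a polycation or intercalator might.
        return "general DNA binder", blocked
    # Blocks some systems but not the other(s).
    return "sequence-preferential candidate", blocked


if __name__ == "__main__":
    # Hypothetical readings for three assay systems.
    print(classify_compound({"system A": 0.9, "system B": 0.1, "system C": 0.05}))
    print(classify_compound({"system A": 0.8, "system B": 0.9, "system C": 0.85}))
```

A compound labelled as a candidate by such a rule would still need confirmation, since differences between systems could arise from causes other than sequence preference.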
One illustration of how this system could be used is as follows. Each cognate site could be placed 5' to a reporter gene (such as genes encoding β-galactosidase or luciferase) such that binding of the protein to the cognate site would enhance transcription of the reporter gene. The presence of a sequence-specific DNA-binding drug that blocked the DNA:protein interaction would decrease the enhancement of the reporter gene expression. Several DNA enhancers could be coupled to reporter genes, then each construct compared to one another in the presence or absence of small DNA-binding test molecules. In the case where multiple protein/cognate binding sites are used for screening, a competitive inhibitor that blocks one interaction but not the others could be identified by the lack of transcription of a reporter gene in a transfected cell line or in an in vitro assay. Only one such DNA-binding sequence, specific for the protein of interest, could be screened with each assay system. This approach has a number of limitations including limited testing capability and the need to construct the appropriate reporter system for each different protein/cognate site of interest.
Another example of a system to detect sequence-specific DNA-binding molecules would involve cloning a DNA-binding protein of interest, expressing the protein in an expression system (e.g., bacterial, baculovirus, or mammalian expression systems), preparing a purified or partially purified sample of protein, then using the protein in an in vitro competition assay to detect molecules that blocked the DNA:protein interaction. These types of systems are analogous to many receptor:ligand or enzyme:substrate screening assays developed in the past, but have the same limitations as outlined above in that a new system must be developed for every different protein/cognate site combination of interest. The capacity for screening numerous different sequences is therefore limited.
Another example of a system designed to detect sequence-specific DNA-binding drugs would be the use of DNA footprinting procedures as described in the literature. These methods include DNase I or other nuclease footprinting (Chaires, et al.), hydroxyl radical footprinting (Portugal, et al.), methidiumpropyl EDTA(iron) complex footprinting (Schultz, et al.), photofootprinting (Jeppesen, et al.), and bidirectional transcription footprinting (White, et al.). These procedures are likely to be accurate within the limits of their sequence testing capability but are seriously limited by (i) the number of different DNA sequences that can be used in one experiment (typically one test sequence that represents the binding site of the DNA-binding protein under study), and (ii) the difficulty of developing high-throughput screening systems.